Data Engineer
SeatGeek operates a unique business model in a complicated, opaque market. The questions we seek answers to are often not simple or clear, and the data we rely on is complex. Even the simple questions depend on careful curation of the underlying data. That’s where data engineers come in.

Data engineers are members of SeatGeek’s data science team. As a team, we share common views on experimental rigor, pragmatism, and software quality. Data engineers focus on maintaining and expanding our ecosystem of data pipelines. These pipelines provide the basis for many crucial downstream processes: complex analyses, data products, and business performance measurement.   

What You’ll Do

  • Investigate and prototype different task dependency frameworks to understand the most appropriate design for a given use case
  • Collaborate closely with our marketing team to understand and accurately integrate marketing and application data sources into new tables for measuring customer acquisition
  • Build data warehouse monitoring tools on data usage patterns to identify inefficient query behavior or candidates for archival
  • Build a new stream ingestion process to efficiently send and store currently live listings as users see them
  • Spearhead the design of the core data vocabulary used for analyzing user activity, and enforce consistency across both data producers and consumers

What We're Looking For

The ideal candidate can relay complex concepts to both technical and non-technical audiences, enjoys partnering with both business users and application developers, and has a proven ability to build systems that bring greater clarity, consistency, and efficiency to how data is used. Experience with specific tools is less important than aptitude and drive, but at a minimum, we would expect:

  • 3-5 years of experience developing and maintaining ETL systems, and a high-level understanding of different data processing systems
  • Experience working cross-functionally with different business units to translate business problems into data problems and solve them
  • Comfort turning ideas into code (bonus points for experience with Python or Scala)
  • Proficiency with analytical and operational SQL queries, and knowledge of how to optimize and troubleshoot them

Bonus points for candidates who have experience with, or a desire to learn, any of the following:

  • Dimensional modeling, especially in e-commerce or consumer tech
  • Streaming data/message queues (Reactive Extensions, Spark Streaming, Akka-streams, Kafka, RabbitMQ)
  • AWS infrastructure (we use Redshift, S3, EMR, Kinesis, Lambda, and RDS)
  • Database internals and tuning, trade-offs and benefits of different datastores (both analytical and transactional)
  • Optimization of complex data workflows that reconcile batch and streaming data
  • Dependency management at scale across multiple classes of data source

The Tools We Use

You absolutely do not need experience with all of these, but we thought you might be curious. Tools can be learned, so we care much more about your general engineering skill than knowledge of a particular language.

  • Languages: Scala and Python for general purpose development, R for analysis and prototyping
  • Frameworks: Luigi and Airflow for dependency management, Spark on EMR for map/reduce
  • Streaming: Spark Streaming, Kinesis (+Firehose), RabbitMQ
  • Datastores: MySQL and Postgres in production, Redshift and S3
  • Other: AWS Lambda, Git, dbt

Perks

  • A competitive base salary and an equity stake in a well-funded growth-stage company
  • A laid-back, fun workplace designed to facilitate collaboration and company-wide events
  • $120/mo to spend on live event tickets
  • A superb benefits package that covers health, dental, and vision
  • A focus on transparency. We have regular team lunches and Q&A panels where employees can chat openly with teams across SeatGeek, our co-founders, and external guests from the industry
  • Annual subscriptions to Citi Bike, Spotify, and meditation services