
Principal Data Engineer

Job Description

We are seeking an experienced engineering professional to lead by example in building, maintaining, and using our platforms and services to manage data at MLB. We are looking for those who can demonstrate and guide the implementation of our Infrastructure as Code, automated testing, and modern observability strategies. This would be done through both reference implementations and individual contributions. In addition, you would be responsible for growing an engineering culture of craftsmanship through mentoring, technical leadership, and formal and informal presentations across MLB. We value winning as a team and communicating inclusively, professionally, and effectively. The platforms we create and manage securely move, transform, and enrich data for both analytic consumption and the delivery of actionable intelligence throughout the MLB organization. These platforms and services create a single source of truth of well-curated data domains that MLB depends on.

You will have the opportunity to contribute to many different projects and turn your ideas into solutions for some of the most meaningful data problems in MLB!


●    Connect to a wide variety of source and target systems

●    Move data between systems

●    Transform / Enrich data

●    Create and manage MLB data products

●    Organize disparate data into curated business domains

●    Coordinate services


We regularly use Python modules such as the GCP libraries, tox, pytest, and pandas. In addition, we want to use frameworks like dbt and Meltano. Kafka and containers will be a regular part of our infrastructure. Some of the GCP services we use cover storage, containers, secrets, BigQuery, and monitoring.


We use tools like GitHub, Terraform, Ansible, Bash and Docker Compose.


We are an agile shop that believes in Infrastructure as Code (IaC) and in crafting lightweight asynchronous services with clean code that is tested in an automated way. We believe being active on pull requests helps us win as a team!


The ideal candidate should have 8+ years in software engineering, designing and delivering services with mastery of at least one language. You must also be proficient in Python and SQL. You can lead discussions surrounding IaC, SDLC, observability, and how to build highly available services. You have created reference implementations as well as highly available platforms that move and transform massive data sets. We are looking for an experienced engineer who can wear many hats and is confident as an individual contributor.


Lead, mentor, and participate in the following:


●    Analyze, design, code, test, configure and modify software for our platform, integrations and services using various programming languages, technologies and development methodologies.

●    Design, develop, test, debug and implement platforms, pipelines, solutions and/or software tools, and utilities for the purpose of assuring acceptable performance and service levels.

●    Automate delivery of software using source control and IaC throughout the entire delivery model.

●    Ensure that implemented platforms, pipelines and solutions are optimally monitored, with relevant alerts, logging and tracing that guarantee the durability, availability and performance of our services.

●    Organize data into well-curated data domains designed for consumption and performance in GCP to provide MLB with a governed single source of truth.

●    Complete documentation that contributes value, including but not limited to testing, training and software delivery


What You Should Know