Data Infrastructure Engineer
Medallia • New York City, NY, United States
About General Intuition
We are the frontier research lab dedicated to building foundation models for environments that require deep spatial and temporal reasoning. For the past year, we've been pushing the forefront of AI across agents capable of navigating space and time, world models that provide training environments for those agents, and video understanding models with a focus on transfer to the real world.
We raised a seed round of $133M from General Catalyst and Khosla to discover the next generation of intelligence.
The Role
Every day, our Medal platform ingests gameplay video that is raw, unfiltered, and packed with insights. We're looking for a seasoned engineer to take full ownership of our Clips/ML data infrastructure, building the next generation of scalable, real-time pipelines that power everything from user-facing discovery to machine learning research. You'll lead the architecture, operations, and performance of data systems that sit at the heart of our product, influencing everything from content indexing to model training.
If you're passionate about building petabyte-scale video pipelines, love working on low-latency systems, and are excited to help define the future of real-time gaming insights, we want to hear from you.
You Will
Architect and operate petabyte-scale ingestion pipelines
Design automated QA guard-rails (schema validation, anomaly detection, deduplication)
Build high-performance ETL and feature-extraction jobs to process and index hundreds of millions of clips into columnar/video-native formats
Own the end-to-end data ingestion stack (desktop, web & mobile clients)
Establish real-time monitoring, lineage, and “five-nines” SLAs, driving continuous improvement across storage, compute, and network layers
Partner with research and product to curate high-signal data slices, define data-health metrics, and accelerate model experimentation
Champion security, privacy, and governance: implement robust RBAC, audit trails, and compliant retention policies for sensitive gameplay footage and user inputs
Mentor and uplevel engineers (including internal Medal platform talent), fostering a culture of craftsmanship, documentation, and ruthless focus on data excellence
Qualifications
5+ years of experience in data engineering, backend systems, or related roles. Experience with video data or ML infrastructure is a plus.
Deep knowledge of ETL/ELT pipelines, distributed systems, and streaming data architectures (e.g., Kafka, Spark, Flink)
Strong proficiency with Python, Java, Go, or similar languages used in data-intensive environments
Experience with cloud infrastructure (e.g., AWS, GCP), provisioning tools (e.g., Terraform, Pulumi), and modern data stack tools (e.g., dbt, Airflow, Parquet, Arrow)
Track record of designing systems with extreme scale and performance requirements
Experience consolidating data from diverse sources into unified data models
Deep understanding of data QA methodologies, anomaly detection, and automated testing in production systems
Passion for mentorship and team development; able to upskill engineers and advocate for engineering excellence
A bias toward ownership, urgency, and a desire to build systems that just work, even at scale
Benefits
Competitive salary and meaningful equity
Comprehensive health insurance including dental and vision insurance
401(k)