zaimler

Data Infrastructure Engineer

zaimler • IN
Python • Scala
About zaimler

AI agents can't reason over data they don't understand. Enterprise data today is fragmented across dozens of systems with no shared context, meaning, or structure, and that's why most enterprise AI is failing. The shift from copilots to autonomous agents is creating an entirely new infrastructure layer, and we're building it.

zaimler is the context infrastructure for the agentic era: a platform that automatically discovers domain knowledge, maps relationships, and gives AI agents the semantic understanding to operate with precision at scale. Imagine knowledge graphs that support real-time inference, built for systems that need to reason, not just retrieve.

zaimler was founded by Biswajit Das (ex-VP Engineering, Truera), a data infrastructure veteran and former Chief Architect at Visa, and Sofus Macskassy (ex-Director of Engineering, LinkedIn), who built one of the industry's largest production knowledge graphs at LinkedIn. We're a small, senior team at the seed stage, deploying with major enterprises across insurance, travel, and technology. If you want to build infrastructure that the next decade of AI runs on, we'd love to talk.

The Role

We’re looking for a Data Infrastructure Engineer to help build the foundational distributed data layer that powers our semantic platform. You’ll design, build, and scale systems for high-throughput data ingestion, transformation, and real-time processing, shaping the backbone that makes our knowledge layer possible.

As one of the early members of our Bangalore office, you’ll play a key role in setting the technical direction, culture, and standards for our growing team.

What You’ll Do

  • Build and operate large-scale data pipelines on Spark, Kafka, and Ray.
  • Design fault-tolerant streaming and batch systems that move terabytes reliably.
  • Optimize data workflows for performance, cost, and latency.
  • Collaborate with ML and product engineers to ensure data is discoverable, structured, and queryable.
  • Automate deployments with Kubernetes, Terraform, and CI/CD pipelines.
  • Monitor, debug, and improve distributed jobs in production.

What We’re Looking For

  • Deep experience with distributed data systems (Spark, Kafka, Flink, Ray).
  • Strong programming skills (Python, Scala, or Java).
  • Comfort with Kubernetes and cloud environments (AWS/GCP/Azure).
  • Solid understanding of streaming vs. batch tradeoffs, state management, and scaling patterns.
  • Ability to collaborate across data, infra, and ML teams.

Why Join

  • A rare opportunity to be an early engineer in our Bangalore office, shaping both the company’s direction and the core product from the ground up.
  • Competitive salary and meaningful equity.
  • Work alongside engineers and researchers from LinkedIn, Visa, Meta, and Branch.
  • An onsite, high-collaboration culture designed for deep technical work and fast iteration.
  • Comprehensive benefits package (health insurance, meals, equipment, and other local perks).