Founding Software Engineer, Data Infrastructure
Airweave • Amsterdam, North Holland, Netherlands

We're building the connective tissue between AI and user data. Today's AI agents are brilliant but mostly blind: they can reason, plan, and execute, but they can't see what's happening across users' work. Every app is a silo. Every database is an island. We're changing that with a focused technical team and great customers, including one of the world's leading AI labs.
Airweave turns scattered information into searchable and learned intelligence. We give agents the context they need by ensuring the right information is available when decisions are made.
The next wave of software will connect dots across systems and work alongside people naturally. That won't come from a bigger model. It will come from the hard work of connecting intelligence to the data in our work and world. We're building the infrastructure to make that possible.
The Role
We're looking for a founding engineer to own Airweave's data and infrastructure layer, the systems that make our distributed search and data pipelines scalable, reliable, and observable.
At Airweave, you'll build and operate the platform that thousands of AI agents depend on. That means distributed sync pipelines pulling data from dozens of sources, vector databases powering LLM search, and the orchestration layer that keeps it all running. You'll work closely with the product team, but your focus is on the foundation: making sure data flows reliably at scale, LLM inference stays fast, and the whole system holds up under real production load.
This is early-stage infrastructure work. The architecture is still being shaped, and your decisions will define how we scale.
What you'll work on
Design and scale distributed data pipelines that sync hundreds of millions of documents from dozens of sources into advanced search indexes
Build and improve Temporal workflows for parallel sync orchestration: retries, backpressure, and failure recovery across workers
Own our Kubernetes deployments with Helm charts: autoscaling and resource management for bursty search, sync, and LLM workloads
Scale PostgreSQL for high throughput: connection pooling, read replicas, and partitioning (we ask a lot of this database)
Manage our vector database (Vespa) infrastructure: sharding, replication, and backup strategies for large-scale agentic search
Orchestrate and optimize LLM inference pipelines: batching, caching, and provider failover
Build monitoring and alerting with Prometheus, Grafana, and custom instrumentation for cluster health
Manage our core infrastructure as code with Terraform
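To make the retry and failure-recovery work above concrete, here is a minimal sketch of the backoff pattern a sync worker relies on. This is illustrative only: in practice a Temporal `RetryPolicy` would handle this rather than hand-rolled code, and every name here is hypothetical.

```python
import random
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                       sleep=time.sleep):
    """Call fn(), retrying on exception with exponential backoff and full jitter.

    Illustrative sketch of the pattern only; in a real sync pipeline the
    workflow engine (e.g. Temporal's RetryPolicy) owns retries. All names
    are hypothetical.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential growth capped at max_delay, with full jitter so
            # that many workers retrying at once don't stampede the source.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))

# Hypothetical usage: a flaky source API that succeeds on the third call.
calls = {"n": 0}

def fetch_page():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient upstream error")
    return "page-1"

result = retry_with_backoff(fetch_page, sleep=lambda _: None)  # → "page-1"
```

Injecting `sleep` keeps the helper testable; full jitter (rather than a fixed backoff) is the standard choice for avoiding synchronized retry storms across parallel workers.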
You might be a fit if
You've built or operated data pipelines at scale: ETL, event processing, streaming, or sync infrastructure
You're comfortable with Kubernetes, Terraform, and infrastructure as code
You've scaled databases and understand the tradeoffs (pooling, replication, sharding)
You have experience with distributed systems: workflow orchestration, message queues, and eventual consistency
You're interested in LLM infrastructure: embeddings, vector search, inference optimization
You like building reliable systems and have opinions about observability
You're drawn to early-stage environments where you own the whole problem
Nice to have
Experience with Temporal, Airflow, or similar workflow engines
Background in scaling search engines (Elasticsearch, Qdrant, Pinecone, Weaviate)
Familiarity with LLM inference
What we offer
Customers including one of the world's leading AI labs
Competitive salary (€80K–€100K) with meaningful equity (0.25%–0.75%)
Health, dental, and vision coverage
Work in-person in San Francisco with a highly-skilled, technical team
Direct impact on architecture and infrastructure decisions from the first week
Hiring Process
Founders call (45 min) – learn about our vision, share your story.
Paid work trial (3 days, on-site) – work on a real problem with the team. You'll get a feel for the codebase, how we collaborate, and whether this is the right fit for both sides. We cover your travel expenses.
Offer – we decide fast (usually within 24 hours).
Our Values
The systems we build support real AI infrastructure. They handle complexity without adding noise, surface what matters, and hold up when it counts.
We don't wait for specs or permission. High agency means finding what needs to be done and doing it, whether that's a gnarly bug, a missing abstraction, or a conversation no one wants to have. We zoom out to first principles and zoom in to the details, where the real problems live.
We have bold opinions, loosely held. We say what we think, change our minds when the evidence points somewhere new, and don't pretend to know things we don't. The best idea wins, not the loudest voice.
We bring these values to our product every day.
Join Us
We're a small team tackling a big problem. If you want to work on the systems that thousands of AI agents depend on, ship code that matters, and help define how intelligent software interacts with the world, we'd love to hear from you.
The stack is young. The problems are real. The architecture is yours to shape.