DVC.ai

Software Engineer (AI Data Engine, Staff/Senior, Open Source, SaaS)

DVC.ai • US
Go Remote
About Us
At iterative.ai, we build open-source tools for machine learning, including DVC (12k+ ⭐ on GitHub), and enterprise-grade data infrastructure solutions. We also offer Studio, a SaaS solution for team collaboration. We're a well-funded (Series A), remote-first team (50+ employees) on a mission to solve the complexities of managing datasets, ML infrastructure, the ML model lifecycle, and other ML and data-centric workflows.
We value great collaboration and communication skills, both among internal teams and in how we interact with our users. We take care to balance and be responsive to the needs of our open source community as well as our enterprise customers.
Check us out in other places:
🖥 Website 📂 Docs 👾 GitHub 🖊 Blog ⏯️ YouTube 💬 Discord

Job Description

"... competitive advantage in AI goes not so much to those with data but those with a data engine: iterated data acquisition, re-training, evaluation, deployment, telemetry. And whoever can spin it fastest. " - A. Karpathy

We are building DVCx, the next generation of DVC, which will serve as a core infrastructure component for managing large amounts of unstructured data (e.g. on the scale of the LAION 5B dataset). How do you create or improve a dataset in minutes when there are millions or billions of objects in a bucket? How do you add additional signals (e.g. embeddings) at scale to a dataset like LAION 5B?

Join us if you have experience building big-data, distributed data processors (Spark, Ray, etc.), using data infrastructure like that used for self-driving cars, or doing similar work, and you want to make these unstructured data management tools available as open source and SaaS.

Responsibilities

  • Own large new areas within our data management software, and build them from ground up
  • Participate in the entire product lifecycle from concept through production
  • Be able, and willing, to multi-task and learn new technologies quickly

Must Have

  • 5+ years of industry experience as a software engineer
  • Experience building or working with AI infrastructure at scale (similar to Tesla's data engine, Waymo, etc.), or comparable experience
  • Solid knowledge of Python
  • At least one year of experience with file systems, concurrency, multithreading, and server architectures
  • Passionate about building highly reliable system software

Great to Have

  • Experience working remotely
  • Experience working on high-performance database internals or heavily distributed server backends
  • Prior startup experience
  • Experience at other API technology companies
  • Command of modern system-level languages like Go or Rust