We are looking for a Software Data Engineer to join our growing Data Team! Reporting to
the Engineering Manager, you will evolve our data models in several styles of datastores and
operationalize production-grade data pipelines. As part of this role, you'll collaborate with a
world-class team, experience growth and mentorship, and apply data engineering solutions
to shape the future of scientific discovery.
Pay range: $110,000 - $135,000
We know compensation is an important part of choosing your next role. The range shown reflects our target hiring range, informed by market data, internal equity, and the role’s current scope. Offers tend to fall near the middle of the range, but individual offers may vary based on experience and skills.
You Will:
- Collaborate with Machine Learning, Full-stack engineers and Science to solve complex document mining challenges, helping us capture and model additional scientific experiments
- Scale data pipelines to allow our data to go from research to platform quickly and reliably
- Work with sources that contain both semi-structured and unstructured data
- Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
- Architect and maintain robust data pipelines that ingest diverse sources and utilize LLMs for high-fidelity entity extraction into structured formats
- Implement evaluation frameworks to monitor the accuracy, drift, and hallucination rates of extraction models within the production pipeline
- Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci
- Leverage a deep understanding of the business context and the team’s goals to unlock independent technical decisions in the face of open-ended requirements
- Proactively identify new opportunities (from both internal and external sources) and advocate for and implement improvements to the current state of projects
- Respond with urgency to operational issues and drive urgency in your own team, owning resolution within your sphere of responsibility
- Challenge the status quo and propose newer technologies or ways of working
You Have:
- A degree in Computer Science/Engineering or a related field within science
- 3+ years of experience working as a software developer in the industry
- Proficiency with Python
- Proficiency with SQL
- Experience using LLMs for structured data extraction
- Experience with event-driven architecture with Pub/Sub
- A track record in building high-quality, maintainable code
- Experience with cloud computing (for example: GCP, Azure, AWS)
Nice To Have:
- ML/data science exposure
- Experience with Auth0 and Terraform
- Experience with data warehouse solutions like BigQuery, and databases including AlloyDB and Spanner
- Experience with agentic development and AI-based tools like Cursor or Claude Code
- Experience with building conversational AI solutions