Senior Software Engineer, Voice Agent

Decagon • San Francisco, CA, United States

Python

About Decagon

Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.

Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.

We’re building a future where customer experience moves from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.

We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.

About the Team

The Voice Agent team builds the real-time systems that allow Decagon agents to carry natural conversations across every channel (phone, web, mobile, etc.). We work across speech understanding, audio streaming, synthesis quality, and the voice-specific execution logic that governs timing, pacing, and responsive dialogue.

Our systems must remain accurate, responsive, and stable at scale. Voice is one of the most technically challenging surfaces in conversational AI, and our small team owns this entire capability end to end.

About the Role

As a Senior Software Engineer focused on Voice Agent, you will design and improve the systems that power Decagon’s live voice agents. You will work across streaming audio, transcription, synthesis, timing, and real-time orchestration.

You will collaborate closely with Research to bring new models to production, with Infra to optimize performance, and with Product to unlock new conversational capabilities.

In this role, you will

Build the real-time voice runtime that powers natural customer conversations
Improve speech understanding and synthesis quality while keeping latency low
Design systems that manage timing, interruptions, and streaming audio reliably
Create tools that make voice interactions easy to debug, test, and observe at scale

Your background looks something like this

5+ years of experience in software engineering
Proficiency with TypeScript, Python, and asynchronous programming
Experience with asynchronous or streaming systems
Strong debugging skills across audio, networking, or real-time pipelines
An interest in speech, audio, or multimodal AI

Even better

Experience with speech recognition or synthesis systems
Experience with VAD, streaming protocols, or other real-time audio systems
Experience designing or maintaining LLM-driven applications
Experience optimizing performance for low-latency use cases

Benefits

Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners, and snacks in the office to keep you at your best

Compensation

$250K – $330K + equity
