Engineering Manager, Model Serving

Company: Together AI
Location: San Francisco
Posted on: April 2, 2026

Job Description:

Together AI is building the AI Inference & Model Shaping Platform that brings the most advanced generative AI models to the world. Our platform powers multi-tenant serverless workloads and dedicated endpoints, enabling developers, enterprises, and researchers to harness the latest LLMs, multimodal models, image, audio, video, and reasoning models at scale.

We are looking for an exceptional Engineering Lead to partner closely with our cross-functional engineering, infrastructure, research, and sales teams to ensure the excellence of our ML API offerings. Your primary focus will be delivering world-class inference and fine-tuning in our public APIs and customer deployments by building automation and operations processes.

This role is ideal for a highly motivated and technically adept individual who excels in fast-paced, dynamic environments. You will be in charge of designing and scaling our ML processes and tooling at production scale, optimizing operations to ensure the availability and reliability of our services across differing tenants and user loads in a multi-cluster deployment. You will serve as a passionate advocate for internal and external customers, providing feedback to the wider engineering and infrastructure teams to improve our systems and core business metrics.

If you thrive in a collaborative, problem-solving environment and are driven to deliver operational excellence, we encourage you to apply for this exciting opportunity.

Key Responsibilities:

- Own availability and performance SLAs for production inference and fine-tuning services across serverless and dedicated deployments
- Own and improve testing, deployment, configuration management, and monitoring practices for multi-cluster ML infrastructure, partnering closely with Infra SREs
- Build self-serve tooling and automation to reduce operational toil and enable self-serve offerings
- Define and enforce configuration best practices for inference engines (SGLang, TRT-LLM, vLLM, etc.) to prevent runtime issues
- Lead incident response, conduct postmortems, and drive reliability improvements
- Mentor team members and potentially grow into hiring/team building as the organization scales
- Partner with infrastructure and ML engineering teams to improve system reliability and cost efficiency

Required Qualifications:

- 5 years operating production ML inference or training systems at scale
- 2 years in senior IC or tech lead roles, with demonstrated mentorship and technical leadership experience; having built or scaled teams is a plus
- Deep expertise with Kubernetes, multi-cluster orchestration, and ML serving frameworks
- Experience with multi-tenant SaaS platforms
- Proven track record of SLA ownership with specific metrics (99.9% uptime, p99 latency targets)
- Customer escalation and incident communication experience
- Experience with LLM inference serving systems (SGLang, vLLM, TRT-LLM, or similar)
- Ability to influence cross-functional teams and make deployment/architecture decisions

Nice to Have:

- Experience building internal developer platforms or self-serve tooling
- Background in cost optimization for GPU infrastructure
- Contributions to open-source ML infrastructure projects

About Together AI:

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation:

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $250,000 - $300,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity:

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy


