Senior ML Performance Engineer

Company: Lemurian Labs
Location: Santa Clara
Posted on: February 18, 2026

Job Description:

At Lemurian Labs, we're on a mission to bring the power of AI to everyone without leaving a massive environmental footprint. We care deeply about the impact AI has on our society and planet, and we're building a rock-solid foundation for its future, ensuring AI grows sustainably and responsibly. Because let's face it, what good is innovation if it doesn't help the world?

We are building a high-performance, portable compiler that lets developers "build once, deploy anywhere." Yes, anywhere. We're talking about seamless cross-platform compatibility, so you can train your models in the cloud, deploy them to the edge, and everything in between, all while optimizing for resource efficiency and scalability.

If the idea of sustainably scaling AI motivates you and you're excited about making AI development both powerful and accessible, then we'd love to have you. Join us at Lemurian Labs, where you can have fun building the future without leaving a mess behind.

The Role

We're looking for a Senior ML Performance Engineer to architect and lead our Performance Testing Platform from the ground up. You'll be the technical authority on how we measure, validate, and optimize the performance of large language models (Llama 3.2 70B, DeepSeek, and others) before and after compiler optimization on modern GPU architectures.

This is a high-impact role where you'll directly influence our product quality and our customers' success. You'll work at the intersection of ML systems, GPU architecture, and performance engineering, building the infrastructure that proves our compiler delivers real value.

Here is what you will do:

Design and build a comprehensive performance testing platform for evaluating LLM inference workloads across GPU clusters
Define and implement the benchmarking methodology, metrics, and test suites that measure latency, throughput, memory utilization, power consumption, and model accuracy
Establish baseline performance for unoptimized models (Llama 3.2 70B, DeepSeek, etc.) and validate post-optimization improvements
Develop automated testing pipelines for continuous performance validation across compiler releases and model updates
Investigate performance bottlenecks using profiling tools (ROCm profilers, GPU traces, system-level monitoring) and work with the compiler team to drive optimizations
Create dashboards and reporting that provide clear visibility into performance trends, regressions, and wins
Collaborate cross-functionally with compiler engineers, ML engineers, and DevOps to ensure performance testing is integrated into our development workflow
Document best practices for performance testing and optimization of ML workloads on GPU hardware

Essential Skills and Experience:

BS degree in computer science, computer engineering, electrical engineering, or equivalent practical experience
7 years of experience in performance engineering, benchmarking, or systems engineering roles
Deep understanding of ML inference workloads, particularly transformer-based models and LLMs
Hands-on experience with GPU programming and optimization (CUDA, ROCm, or similar)
Strong programming skills in Python and C/C++
Proven track record of building performance testing infrastructure or benchmarking platforms from scratch
Experience with ML frameworks (PyTorch, TensorFlow, ONNX Runtime, vLLM, TensorRT-LLM, etc.)
Proficiency with profiling and debugging tools for GPU workloads
Strong analytical skills with the ability to design experiments, analyze results, and communicate findings clearly
Experience with CI/CD systems and test automation frameworks

Preferred Skills and Experience:

Master's or PhD degree in computer science, computer engineering, electrical engineering, or equivalent practical experience
Experience with AMD GPUs (MI200/MI300 series) and the ROCm ecosystem
Knowledge of compiler optimization techniques and their impact on performance
Experience with distributed inference and multi-GPU workloads
Familiarity with ML model quantization, pruning, and other optimization techniques
Background in high-performance computing or systems-level optimization
Experience with infrastructure-as-code (Kubernetes, Docker, Terraform)
Contributions to open-source ML or systems projects

Personal Attributes:

Obsessive about details: you notice the 2% regression that others miss
Self-driven: you take ownership and don't wait for permission to solve problems
Collaborative mindset: you work well across teams and help others succeed
Passionate about sustainability: you care about making AI more efficient and environmentally responsible
Clear communicator: you can explain complex technical concepts to both engineers and stakeholders

Salary depends on experience and geographical location. This salary range may be inclusive of several career levels and will be narrowed during the interview process based on a number of factors, such as the candidate's experience, knowledge, skills, and abilities, as well as internal equity among our team. Additional benefits for this role may include equity; company bonus opportunities; medical, dental, and vision benefits; a retirement savings plan; and supplemental wellness benefits.
Lemurian Labs ensures equal employment opportunity without discrimination or harassment based on race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity or expression, age, disability, national origin, marital or domestic/civil partnership status, genetic information, citizenship status, veteran status, or any other characteristic protected by law. EOE


