Who this seat is for
We are not looking for a backend engineer who "added AI." We are not looking for someone whose LLM experience amounts to internal tooling and a side project.
We are looking for an ML-leaning engineer who has built, shipped, and stayed on call for AI systems inside a growth-phase AI SaaS company: somewhere AI was the product, not a feature added later.
You think like an ML engineer first. The model call is the easy part. The interesting part is everything that wraps it.
What you will own
- Design and evolution of agentic reasoning flows inside the control plane
- Retrieval quality, embedding strategy, grounding discipline
- Evaluation pipelines and the metrics they hold us to
- Hallucination detection and how we react to it in real time
- Structured outputs that downstream services can actually trust
- Human-in-the-loop escalation logic — when Lena should not act
- Cost and latency optimization under real production traffic
- Model selection and the discipline around when to switch
You will work directly with the AI Team Lead, the EVP Product and Engineering, the backend engineers who own action execution, and the UX team that shapes how operators see Lena. This seat shapes how intelligence behaves in the platform.
Hard requirements
- 5+ years of ML engineering, or a combined 5+ years across ML and applied AI
- 2+ years building and shipping LLM-powered systems inside a growth-phase AI SaaS company
- Hands-on RAG ownership: vector databases, embedding tuning, retrieval optimization, grounding strategy, and the failure modes of each
- Built evaluation pipelines for LLM performance and reliability that measured something real and changed a decision
- Strong Python in production systems, not in notebooks
- Comfort operating under live constraints: latency, cost, observability, safety
If your LLM experience is experimentation, side projects, or non-production work, this is not the right level. Honest answer: spend a year working at this level somewhere else first, then come back. We will still be here.
High-signal indicators
- Designed agentic workflows that measurably improved on a baseline
- Came from an AI-native company that scaled from early traction into growth
- Reduced hallucination or improved grounding in production, with the numbers to show it
- Cost optimization at scale — caching, prompt redesign, retrieval rework
- Built or operated AI logging that an enterprise security team would sign off on
What "senior" means here
You can take an ambiguous AI goal and turn it into a structured system design. You define the evaluation before the feature ships. You anticipate the failure mode before it happens. You hold the line between probabilistic reasoning and deterministic safeguards. You improve systems with data, not with vibes. You bring the next step, not just the problem.
You are comfortable being accountable for what the AI did at 2am.
Who should not apply
- Backend engineers looking to pivot into AI
- Research-focused profiles without production ownership
- Engineers whose AI work never met a paying customer
- Candidates without exposure to growth-stage product pressure
Why this seat matters
We are building a defensible AI architecture inside a category that is still being named. The person in this seat shapes how Lena reasons. How reliable our automation becomes. How much cost we burn to get there.
The work compounds. So does the responsibility.