Who this seat is for
We are not looking for a backend engineer who "added AI." We are not looking for someone whose LLM experience amounts to internal tooling and a side project.
We are looking for an ML-leaning engineer who has built, shipped, and stayed on call for AI systems inside a growth-phase AI SaaS company: somewhere AI was the product, not a feature added later.
You think like an ML engineer first. The model call is the easy part. The interesting part is everything that wraps it.
What you will own
- Design and evolution of agentic reasoning flows inside the control plane
- Retrieval quality, embedding strategy, grounding discipline
- Evaluation pipelines and the metrics they hold us to
- Hallucination detection and how we react to it in real time
- Structured outputs that downstream services can actually trust
- Human-in-the-loop escalation logic — when Lena should not act
- Cost and latency optimization under real production traffic
- Model selection and the discipline around when to switch
You will work directly with the AI Team Lead, the EVP Product and Engineering, the backend engineers who own action execution, and the UX team that shapes how operators see Lena. This seat shapes how intelligence behaves in the platform.
Hard requirements
- 5+ years of ML engineering, or a combined 5+ years across ML and applied AI
- 2+ years building and shipping LLM-powered systems inside a growth-phase AI SaaS company
- Hands-on RAG ownership: vector databases, embedding tuning, retrieval optimization, grounding strategy, and the failure modes of each
- Built evaluation pipelines for LLM performance and reliability that measured something real and changed a decision
- Strong Python in production systems, not in notebooks
- Comfort operating under live constraints: latency, cost, observability, safety
If your LLM experience is experimentation, side projects, or non-production work, this is not the right level. Honest answer: spend a year working at this level somewhere else first, then come back. We will still be here.
High-signal indicators
- Designed agentic workflows that measurably improved on a baseline
- Came from an AI-native company that scaled from early traction into growth
- Reduced hallucination or improved grounding in production, with the numbers to show it
- Cost optimization at scale — caching, prompt redesign, retrieval rework
- Built or operated AI logging that an enterprise security team would sign off on
What "senior" means here
You can take an ambiguous AI goal and turn it into a structured system design. You define the evaluation before the feature ships. You anticipate the failure mode before it happens. You hold the line between probabilistic reasoning and deterministic safeguards. You improve systems with data, not with vibes. You bring the next step, not just the problem.
You are comfortable being accountable for what the AI did at 2am.
Who should not apply
- Backend engineers looking to pivot into AI
- Research-focused profiles without production ownership
- Engineers whose AI work never met a paying customer
- Candidates without exposure to growth-stage product pressure
Why this seat matters
We are building a defensible AI architecture inside a category that is still being named. The person in this seat shapes how Lena reasons. How reliable our automation becomes. How much cost we burn to get there.
The work compounds. So does the responsibility.