NetSpeek

FILE 07.02 / SCENARIO

SIMULATED · FOR CANDIDATE EVALUATION · NOT REPRESENTATIVE OF PRODUCTION SYSTEMS

Lena Was Wrong

A confident AI diagnosis that turned out to be wrong. Find the flaw in the reasoning, then propose how we'd catch it next time.

FILE 07.02.1 / SETUP

Last quarter, on a routine Tuesday, Lena handled a "camera offline" incident in Conference Room 4B.

She was confident: 91%. She had four retrieved documents that all pointed at the same answer: the PoE switch was dropping power to the camera. Her proposed action was to cycle PoE power on switch port 14. That action is safe to take: it's reversible, isolated to one device, and the runbook explicitly endorses it as a first move.

The operator on duty did the human thing: she read Lena's diagnosis, glanced at the evidence, and… overrode. She'd seen something Lena hadn't. The actual root cause was a failed HDMI capture card on the encoder appliance. No PoE involvement at all. The operator pushed a replacement encoder, and the room was back up in 26 minutes.

So Lena was wrong. And she was confidently wrong — which is the harder failure mode to catch.

We do postmortems on every incident where Lena's diagnosis didn't match ground truth. Below is the snapshot from that workflow: Lena's diagnosis, the retrieved evidence she used, and the ground truth (revealed after operator override). Read it carefully — there's a specific failure mode here that has a name.

We want you to find the name.

FILE 07.02.2 / TELEMETRY

What is on the operator's screen.

Real-shaped operational data. Anonymized device IDs, real-shaped timing. The same view an on-call engineer would see in the moment.

SOURCE · 01 / lena_diagnosis · LIVE
workflow_id: incident-2c9d11
diagnosis: Conference room 4B camera failure due to PoE switch drop
confidence: 0.91
model_class: reasoning-llm
retrieved_docs: 4
tokens_in: 6210
tokens_out: 480
latency_ms: 1832
lena.diagnosis · snapshot
SOURCE · 02 / retrieved_evidence · LIVE

doc_1
title: PoE troubleshooting guide
chunk_id: poe-tg-ch7
relevance_score: 0.88

doc_2
title: Camera-offline runbook
chunk_id: runbook-camera-offline-step4
relevance_score: 0.86

doc_3
title: PoE troubleshooting guide
chunk_id: poe-tg-ch7
relevance_score: 0.85

doc_4
title: Network architecture overview (stale)
chunk_id: net-arch-section2
relevance_score: 0.71

retrieved evidence used for diagnosis
SOURCE · 03 / ground_truth · LIVE
root_cause: Capture appliance failed on the encoder side; no PoE involvement.
time_to_real_resolution_min: 26
operator_override_at_min: 4
operator_correctness_check: correct
what actually happened (revealed after operator override)
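
As a rough illustration of the postmortem trigger described in the setup (diagnosis does not match ground truth), here is a minimal Python sketch. It is not the NetSpeek workflow. The `IncidentSnapshot` type, the normalized labels `poe_switch_drop` and `encoder_capture_failure`, and the 0.80 cutoff are all assumptions made for the example; only `workflow_id` and `confidence` come from the snapshot above.

```python
from dataclasses import dataclass

# Assumption: cutoff above which a diagnosis counts as "confident".
# The scenario does not state one.
CONFIDENT_THRESHOLD = 0.80


@dataclass(frozen=True)
class IncidentSnapshot:
    """Hypothetical record joining SOURCE 01 and SOURCE 03."""
    workflow_id: str
    confidence: float
    diagnosis_label: str      # hypothetical normalized label for Lena's diagnosis
    ground_truth_label: str   # hypothetical normalized label for the real root cause


def needs_postmortem(s: IncidentSnapshot) -> bool:
    # A postmortem is opened whenever the diagnosis and ground truth differ.
    return s.diagnosis_label != s.ground_truth_label


def confidently_wrong(s: IncidentSnapshot,
                      threshold: float = CONFIDENT_THRESHOLD) -> bool:
    # The harder case: a mismatch reported with high confidence.
    return needs_postmortem(s) and s.confidence >= threshold


snapshot = IncidentSnapshot(
    workflow_id="incident-2c9d11",
    confidence=0.91,
    diagnosis_label="poe_switch_drop",
    ground_truth_label="encoder_capture_failure",
)

print(needs_postmortem(snapshot), confidently_wrong(snapshot))
```

This only selects which incidents enter the postmortem queue. It says nothing about why the diagnosis was wrong, which is the question this scenario asks.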

FILE 07.02.3 / LENA REASONING

What Lena would recommend right now.

LENA · DIAGNOSIS // CONFIDENCE 91% · FLAGGED

PoE switch is dropping power to the conference room camera; restart the switch port.

Proposed action

Cycle PoE power on switch port 14 for conference room 4B.

Cited evidence

  • 01 Two retrieved docs agree on the PoE-switch-drop pattern.
  • 02 Switch port 14 telemetry shows a transient power drop in the last hour.
  • 03 Camera-offline runbook step 4 calls for PoE cycle as first action.
CONFIDENCE 91% · INTENTIONALLY FLAWED · FIND THE FLAW

FILE 07.02.4 / YOUR RESPONSE

Tell us what you would do.

Short and specific beats long and vague. The next step is the application form — we save what you have written here so you do not lose it.

Was it retrieval, grounding, evaluation, prompt design, or something else? We want the technical name for the failure, not just 'Lena was confident but wrong.'


A name, what it measures, and roughly how you'd compute it. We don't need pseudocode.

ROLE / TARGET · REQUIRED

This scenario maps to 2 roles. Pick the one you want your application attached to.


Fill in each prompt to continue. The soft minimums are guidance, not gatekeeping.