Objective

Rule

If a failure is not diagnosable by a human, it is not ready for an agent.

Definition of done

Artifact contract

Canonical location: test-results/
Failure records: test-results/failures/<test-id>.json
Triage output: test-results/triage/triage-report.json
Supporting evidence: trace, screenshot, video, stdout, stderr, JUnit, Playwright results

Recommended structure

test-results/
  junit.xml
  results.json
  failures/
    <test-id>.json
  triage/
    triage-report.json
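
The sample test ID used in this document contains `::`, `/`, and spaces, so it cannot be used verbatim as a file name under failures/. The sketch below shows one possible way to derive a safe record path. The helper name `failureRecordPath`, the slug rule, and the hash suffix are all assumptions for illustration, not a convention this document defines.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Hypothetical helper: map a test ID to a file-system-safe failure record path.
function failureRecordPath(testId, root = 'test-results') {
  // Collapse every run of characters outside [a-z0-9] into a single dash.
  const slug = testId
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  // A short hash of the original ID keeps two IDs with the same slug apart.
  const hash = crypto.createHash('sha1').update(testId).digest('hex').slice(0, 8);
  // path.posix keeps forward slashes so paths look the same on every OS.
  return path.posix.join(root, 'failures', `${slug}-${hash}.json`);
}

console.log(
  failureRecordPath(
    'auth-chromium::specs/sprint5/article.publish.spec.js::user can publish article'
  )
);
```

The hash suffix matters because slugging is lossy: `a::b` and `a/b` both become `a-b`, and without the suffix one record would overwrite the other.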

Canonical failure schema

{
  "schemaVersion": "1.0",
  "testId": "auth-chromium::specs/sprint5/article.publish.spec.js::user can publish article",
  "title": "user can publish article",
  "project": "auth-chromium",
  "file": "specs/sprint5/article.publish.spec.js",
  "line": 42,
  "environment": "qa",
  "status": "failed",
  "startedAt": "2026-03-28T10:00:00Z",
  "durationMs": 18432,
  "errorType": "TimeoutError",
  "errorMessage": "Timeout 10000ms exceeded while waiting for getByRole('button', { name: 'Publish Article' })",
  "step": "Click Publish Article",
  "evidence": {
    "trace": "test-results/traces/publish-article-trace.zip",
    "screenshot": "test-results/screenshots/publish-article.png",
    "video": "test-results/videos/publish-article.webm",
    "stdout": "test-results/logs/publish-article.stdout.log",
    "stderr": "test-results/logs/publish-article.stderr.log"
  },
  "networkContext": {
    "webBaseUrl": "http://sut.testlab:3000",
    "apiBaseUrl": "http://sut.testlab:3001/api",
    "ciNatMode": false
  },
  "classification": {
    "primary": "LOCATOR",
    "secondary": "TIMING",
    "confidence": 0.86,
    "ruleId": "locator_not_found_role_name"
  }
}
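
A record is only useful to a triage step if it actually has this shape. The following is a minimal validation sketch, assuming the fields in the sample above are required except `step` and `networkContext`, and assuming `confidence` is a value between 0 and 1. Neither assumption is stated by the schema itself, and `validateFailureRecord` is a hypothetical name.

```javascript
// Assumed required fields and their JavaScript types, taken from the sample record.
const REQUIRED_FIELDS = {
  schemaVersion: 'string',
  testId: 'string',
  title: 'string',
  project: 'string',
  file: 'string',
  line: 'number',
  environment: 'string',
  status: 'string',
  startedAt: 'string',
  durationMs: 'number',
  errorType: 'string',
  errorMessage: 'string',
  evidence: 'object',
  classification: 'object',
};

// Returns a list of human-readable problems; an empty list means the record passed.
function validateFailureRecord(record) {
  if (record === null || typeof record !== 'object') {
    return ['record is not an object'];
  }
  const problems = [];
  for (const [field, type] of Object.entries(REQUIRED_FIELDS)) {
    const value = record[field];
    if (value === undefined || value === null) {
      problems.push(`missing field: ${field}`);
    } else if (typeof value !== type) {
      problems.push(`wrong type for ${field}: expected ${type}, got ${typeof value}`);
    }
  }
  // Assumption: confidence is a probability-like score in [0, 1].
  const confidence = record.classification?.confidence;
  if (typeof confidence === 'number' && (confidence < 0 || confidence > 1)) {
    problems.push('classification.confidence must be between 0 and 1');
  }
  // startedAt should be parseable as a timestamp (the sample uses ISO 8601 UTC).
  if (typeof record.startedAt === 'string' && Number.isNaN(Date.parse(record.startedAt))) {
    problems.push('startedAt is not a parseable timestamp');
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a human see everything wrong with a record in a single pass.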

Failure taxonomy

Deterministic rules first

// Order matters if the first match is taken as the primary category, which is
// consistent with the sample record (LOCATOR ahead of TIMING).
const FAILURE_RULES = [
  // Environment: the target host cannot be resolved or reached.
  { id: 'dns_lookup_failed', match: /ENOTFOUND|getaddrinfo/i, primary: 'ENV' },
  { id: 'connection_refused', match: /ECONNREFUSED/i, primary: 'ENV' },
  // \b keeps 401/403 from matching inside longer numbers such as "14010ms".
  { id: 'unauthorized_forbidden', match: /\b(?:401|403)\b|unauthorized|forbidden/i, primary: 'AUTH' },
  { id: 'duplicate_or_exists', match: /duplicate|already exists|unique constraint/i, primary: 'DATA' },
  { id: 'locator_not_found_role_name', match: /locator|getByRole|getByLabel|strict mode violation|not found/i, primary: 'LOCATOR' },
  { id: 'timeout_waiting', match: /Timeout|waiting for/i, primary: 'TIMING' },
  { id: 'assertion_failure', match: /expect\(|toBe|toEqual|received|expected/i, primary: 'ASSERTION' }
];
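
The rule table does not say how matches become a classification. One reading that is consistent with the sample record is: the first matching rule gives `primary` and `ruleId`, and the next match with a different category gives `secondary`. The sketch below follows that reading. The function name `classify` and the `UNKNOWN` fallback are assumptions, and the rule list is an abbreviated copy so the example runs on its own. How `confidence` is computed is not specified, so it is left out.

```javascript
// Abbreviated copy of the rule table so this sketch is self-contained.
const FAILURE_RULES = [
  { id: 'connection_refused', match: /ECONNREFUSED/i, primary: 'ENV' },
  { id: 'locator_not_found_role_name', match: /locator|getByRole|getByLabel|strict mode violation|not found/i, primary: 'LOCATOR' },
  { id: 'timeout_waiting', match: /Timeout|waiting for/i, primary: 'TIMING' },
];

// Assumed semantics: first match is primary, next distinct category is secondary.
function classify(errorMessage, rules = FAILURE_RULES) {
  // The regexes have no `g` flag, so .test() keeps no state between calls.
  const hits = rules.filter((rule) => rule.match.test(errorMessage));
  if (hits.length === 0) {
    // Hypothetical fallback category for messages no rule recognises.
    return { primary: 'UNKNOWN', secondary: null, ruleId: null };
  }
  const [first] = hits;
  const next = hits.find((rule) => rule.primary !== first.primary);
  return {
    primary: first.primary,
    secondary: next ? next.primary : null,
    ruleId: first.id,
  };
}

const sample =
  "Timeout 10000ms exceeded while waiting for getByRole('button', { name: 'Publish Article' })";
console.log(classify(sample));
// → { primary: 'LOCATOR', secondary: 'TIMING', ruleId: 'locator_not_found_role_name' }
```

Under this reading, moving `timeout_waiting` above the locator rule would flip the sample to TIMING first, so any reordering of the table is itself a rule change.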

Commands

Validate artifact surface

find test-results -maxdepth 3 -type f | sort
jq . test-results/failures/sample-failure.json
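
The same surface check can be scripted so it fails loudly in CI instead of relying on someone reading `find` output. The following is a sketch under assumptions: the expected paths are taken from the recommended structure above, and the function names `missingArtifacts` and `unparseableFailures` are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Paths from the recommended structure; adjust if your layout differs.
const EXPECTED = ['junit.xml', 'results.json', 'failures', 'triage/triage-report.json'];

// Lists expected artifacts that are absent under the given root.
function missingArtifacts(root = 'test-results') {
  return EXPECTED.filter((rel) => !fs.existsSync(path.join(root, rel)));
}

// Lists failure record files that are not valid JSON.
function unparseableFailures(root = 'test-results') {
  const dir = path.join(root, 'failures');
  if (!fs.existsSync(dir)) return [];
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.json'))
    .filter((name) => {
      try {
        JSON.parse(fs.readFileSync(path.join(dir, name), 'utf8'));
        return false;
      } catch {
        return true;
      }
    });
}

// Example wiring for a CI step:
//   const problems = [...missingArtifacts(), ...unparseableFailures()];
//   if (problems.length > 0) { console.error(problems); process.exitCode = 1; }
```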

Human override rule

If the classification is wrong, fix the rule or add a rule. Do not patch output manually and call that “agent intelligence.”
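
One way to make "fix the rule or add a rule" routine is to pin each corrected case as a fixture and re-run all fixtures whenever the table changes. The sketch below assumes first-match-wins classification. The `browser_connection_refused` rule, the fixture messages, and the helper names are hypothetical illustrations, not part of the rule table in this document.

```javascript
// Two rules copied from the table, plus one hypothetical addition.
const RULES = [
  { id: 'connection_refused', match: /ECONNREFUSED/i, primary: 'ENV' },
  // Hypothetical new rule: the browser-side spelling of a refused connection,
  // which the ECONNREFUSED pattern above does not match.
  { id: 'browser_connection_refused', match: /ERR_CONNECTION_REFUSED/i, primary: 'ENV' },
  { id: 'timeout_waiting', match: /Timeout|waiting for/i, primary: 'TIMING' },
];

// Each fixture pins a message to the category a human reviewer agreed on.
const FIXTURES = [
  { message: 'connect ECONNREFUSED 127.0.0.1:3001', expected: 'ENV' },
  { message: 'page.goto: net::ERR_CONNECTION_REFUSED at http://sut.testlab:3000', expected: 'ENV' },
  { message: 'Timeout 10000ms exceeded while waiting for selector', expected: 'TIMING' },
];

// Assumed semantics: the first rule whose pattern matches decides the category.
function firstMatch(message, rules) {
  const rule = rules.find((r) => r.match.test(message));
  return rule ? rule.primary : null;
}

// Returns the fixtures whose classification disagrees with the expected category.
function failingFixtures(rules, fixtures) {
  return fixtures.filter((f) => firstMatch(f.message, rules) !== f.expected);
}

console.log(failingFixtures(RULES, FIXTURES).length); // → 0
```

Dropping the added rule makes the second fixture fail, which is the point: the correction lives in the rule table and is checked on every change, not patched into a single output file.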

Expected outputs

Failure modes

Operational value