Runbook 04 — CI artifacts and “debug without rerun”

Validate that CI publishes sufficient artifacts to diagnose failures without re-executing tests. Artifact paths must be deterministic, discoverable, and usable by another engineer.

Artifacts in scope:

  • blob report (per shard)
  • merged HTML report
  • screenshots
  • video
  • trace
  • junit (optional)

Definition: “Debug without rerun” means you can determine why a test failed using CI logs + artifacts only.

Objective + success criteria

Objective: Verify that CI pipelines upload artifacts required to diagnose failures without rerunning the pipeline, and that artifact links remain available for review.

Rule: Every failing test must produce at least one of screenshot, trace, or video, plus a human-readable error context file if the suite implements one.

Success criteria:

  • Shard jobs upload deterministic per-shard output (blob-report-*/)
  • Merge job publishes a merged HTML report (playwright-report/)
  • Failure diagnosis is possible using artifacts + CI logs only (no re-execution required)
  • Artifacts are attached to jobs and accessible via GitLab UI

Artifact contract (what must exist)

Job type | Must upload | Why it exists
proof_sharded (each shard) | blob-report-*/, test-results/ (optional but recommended) | Preserves per-test evidence (trace/video/screenshot) and blob report inputs for merging.
proof_merge_and_publish | playwright-report/, junit.xml (optional) | Creates one merged report for reviewers; supports MR widgets / trend tooling via JUnit.

If the suite writes error-context.md (as the Sprint 6 logs show), treat it as part of the evidence contract.

Debug without rerun (how to use CI artifacts)

1) Open merged report (fastest path)

  • In GitLab job page → Artifacts → browse playwright-report/
  • Open index.html (or GitLab’s artifact viewer if configured)
  • Identify the failing test, then open its attachments (trace/video/screenshot)

2) Download artifacts and replay trace locally

This is the “no rerun” power move: replay CI’s exact browser timeline.

# After downloading and unzipping the artifact archive(s) from GitLab:
# Example trace path (from logs). In bash, ** only recurses when globstar is enabled.
shopt -s globstar
npx playwright show-trace blob-report-2/**/trace.zip
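When a shard produced several failures, it can help to list every trace before opening one. The following is a minimal sketch, assuming the artifacts were already unzipped into a local directory. The helper name list_traces is hypothetical; it relies only on find, so it does not depend on shell glob settings.

```shell
# list_traces [DIR]: print every trace.zip under DIR, one per line, sorted.
# Hypothetical helper; assumes the artifacts were already unzipped into DIR.
list_traces() {
  find "${1:-.}" -type f -name 'trace.zip' | sort
}

# Usage (the directory name is the example path used above):
#   list_traces blob-report-2
#   npx playwright show-trace "$(list_traces blob-report-2 | sed -n 1p)"
```

Opening one trace at a time keeps the viewer focused on a single failing test.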

3) Review the error context file (if present)

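# Many suites write an additional error context markdown file.
# Example path (from logs); find avoids relying on ** glob support.
# Prints the first 200 lines of each match, with a file name header when there are several.
find blob-report-2 -name 'error-context.md' -exec head -n 200 {} +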

4) Confirm it is not an environment illusion

If the failure smells like routing/env mismatch, validate the job’s runtime contract echoed in logs (WEB_BASE_URL / API_BASE_URL) before touching test code.
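One way to do this offline is to save the raw job log and filter it for the two variables. This is a sketch under assumptions: the job echoes the variables in plain text, and the helper name show_runtime_contract and the log file name are hypothetical.

```shell
# show_runtime_contract LOGFILE: print (with line numbers) the lines of a
# saved CI job log that mention the base-URL variables.
# Assumes the job echoes WEB_BASE_URL / API_BASE_URL in plain text.
show_runtime_contract() {
  grep -nE 'WEB_BASE_URL|API_BASE_URL' "$1" || {
    echo "no WEB_BASE_URL / API_BASE_URL lines found in $1" >&2
    return 1
  }
}

# Usage (job.log is whatever name you saved the raw log under):
#   show_runtime_contract job.log
```

If nothing is printed, the job is not echoing its runtime contract, which is itself worth fixing before debugging the test.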

Exact commands

Local reproduction of artifact generation (developer machine or runner)

Prefer configuring artifacts in playwright.config.* so CI and local behavior match. CLI flags can vary by Playwright version.

# Run the same project CI uses (examples: chromium, auth-chromium)
npx playwright test --project=auth-chromium --reporter=line

Recommended Playwright config snippet (artifact-friendly)

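// playwright.config.js (example)
const { defineConfig } = require('@playwright/test');

module.exports = defineConfig({
  // 'on-first-retry' only records when a failed test is retried,
  // so retries must be at least 1 for trace and video to appear.
  retries: 2,
  use: {
    screenshot: 'only-on-failure',
    video: 'on-first-retry',
    trace: 'on-first-retry',
  },
});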

Open local report

npx playwright show-report

Locate local artifacts

ls -la playwright-report || true
ls -la test-results || true

CI-side enforcement (recommended)

These checks fail the job if evidence directories are missing. Put shard checks in shard jobs; merge checks in merge job.

# Shard job: must produce blob-report-* (GitLab sets CI_NODE_INDEX when the job uses parallel:)
test -d "blob-report-${CI_NODE_INDEX}" || { echo "missing blob-report-${CI_NODE_INDEX}" >&2; exit 1; }

# Merge job: must produce merged HTML report
test -d playwright-report || { echo "missing playwright-report" >&2; exit 1; }
test -f playwright-report/index.html || { echo "missing playwright-report/index.html" >&2; exit 1; }
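The merge job can also confirm that it received every shard's input before merging. A minimal sketch, assuming the shard artifacts land as blob-report-* directories in one parent directory. The helper name require_blob_reports and the expected count are hypothetical values you supply.

```shell
# require_blob_reports EXPECTED [DIR]: fail unless DIR holds exactly EXPECTED
# blob-report-* directories. EXPECTED is a value you pass in (for example,
# the shard count configured for the sharded job).
require_blob_reports() {
  expected="$1"
  dir="${2:-.}"
  found="$(find "$dir" -maxdepth 1 -type d -name 'blob-report-*' | wc -l | tr -d ' ')"
  if [ "$found" -ne "$expected" ]; then
    echo "expected $expected blob-report-* directories in $dir, found $found" >&2
    return 1
  fi
  echo "ok: $found blob-report-* directories"
}

# Usage in the merge job (4 is an example shard count):
#   require_blob_reports 4 . || exit 1
```

A missing shard input otherwise produces a merged report that looks complete but silently omits tests.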

Expected outputs (logs, screenshots, artifacts)

Merged HTML report: playwright-report/index.html exists and can be opened/downloaded from GitLab artifacts.
Blob inputs: At least one blob-report-* directory exists in shard job artifacts (more when sharded).
Failure attachments: Failed tests include screenshot and trace/video based on config (trace.zip, video.webm, test-failed-*.png).
Diagnosis proof: A reviewer can identify root cause using report + attachments + logs (no rerun required).
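The failure-attachment expectation can be spot-checked against a downloaded test-results/ directory. This is a sketch, assuming each immediate subdirectory of test-results/ is one test's output directory and that evidence files match trace.zip, *.webm, or *.png. The helper name check_evidence is hypothetical.

```shell
# check_evidence [DIR]: for each immediate subdirectory of DIR (assumed to be
# one test's output directory), report whether it holds a trace, video, or
# screenshot. Returns 1 if any subdirectory has none of them.
check_evidence() {
  results_dir="${1:-test-results}"
  missing=0
  for d in "$results_dir"/*/; do
    [ -d "$d" ] || continue
    evidence="$(find "$d" -type f \( -name 'trace.zip' -o -name '*.webm' -o -name '*.png' \))"
    if [ -n "$evidence" ]; then
      echo "ok      ${d%/}"
    else
      echo "MISSING ${d%/}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_evidence test-results
```

A MISSING line points at a test whose output directory exists but carries nothing a reviewer can open.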

Failure modes + how to diagnose

Artifacts missing from CI job

Symptom: job fails or completes but no artifacts are attached.

Diagnosis: check artifacts:paths and working directory paths.

# In CI logs (add this near end of the job)
pwd
ls -la
# Parentheses make -type d apply to all three names, not only the first.
find . -maxdepth 2 -type d \( -name "blob-report-*" -o -name "playwright-report" -o -name "test-results" \) | sed -n '1,200p'

Action: correct artifact paths and ensure they are relative to the job working directory.
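A quick guard for this is to reject absolute paths and paths that climb out of the working directory before they reach the artifacts configuration. A minimal sketch; the helper name is_relative_artifact_path is hypothetical, and it checks only the shape of the path string, not whether the path exists.

```shell
# is_relative_artifact_path PATH: succeed only if PATH is non-empty, relative,
# and does not contain a ".." component that would leave the working directory.
is_relative_artifact_path() {
  case "$1" in
    "" | /* | .. | ../* | */.. | */../*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   for p in playwright-report/ "blob-report-${CI_NODE_INDEX}/"; do
#     is_relative_artifact_path "$p" || echo "bad artifact path: $p" >&2
#   done
```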

Artifacts generated locally but not in CI

Symptom: local run produces trace/video; CI run does not.

Diagnosis: ensure CI is using the same Playwright config and project.

cat playwright.config.* 2>/dev/null || true
node -e 'console.log("CI=", !!process.env.CI, "PW_VERSION=", require("@playwright/test/package.json").version)'

Action: compare the printed config and Playwright version with the local run, then align the CI job's config file and --project value with what was used locally.

Artifacts exist but aren’t useful

Symptom: report shows failure but no trace/video/screenshot and no context.

Diagnosis: artifact settings too strict (e.g., trace off, video off) or retries disabled.

Action: enable trace: "on-first-retry" and video: "on-first-retry", and keep retries at 1 or higher in CI, because on-first-retry records only when a failed test is retried.

Artifacts expire before review

Symptom: artifacts missing due to expiration.

Action: increase the artifact retention window (in GitLab, artifacts:expire_in) so it covers the full review workflow, especially MR review + triage.

Why it matters (production relevance)

Artifact-first pipelines reduce cycle time and eliminate rerun dependence. When failures are diagnosable from artifacts alone, teams triage accurately, reduce flaky reruns, and maintain trust in CI as a release signal.