The Replay API runs historical traces back through your agent under a manifest version you choose, typically a different one from the original run. Use it to confirm empirically whether a change breaks what the regression check flagged as risky, or to resolve traces that the pre-deploy check marked medium_risk, where structural reasoning can only say “might differ”.

When to use replay

Confirm a structural prediction

The regression check returned medium_risk, for example after a model swap or a prompt rewrite. Replay runs the affected traces through the new manifest so you can compare the new outputs side by side with the originals.

Reproduce a flaky bug

A trace failed in production. Replay it against the same manifest to see if the failure is deterministic, then against your fix branch to verify it’s resolved.

Build training data from drift

Replay flagged-for-repair traces against a known-good manifest, then export as JSONL for SFT. This is the bridge between trace history and the Datasets API.

Re-score with a new evaluator

After adding a new @eval function, replay traces under the same manifest to re-score them without re-running the agent.
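
The flaky-bug workflow above amounts to creating two batches over the same traces, one per manifest. A minimal sketch of the request bodies, assuming the field names used in the Quick start (`agent_name`, `target_manifest_id`, `trace_filter`); the helper `build_batch_request` and both manifest IDs are hypothetical placeholders:

```python
from typing import Any


def build_batch_request(
    agent_name: str,
    target_manifest_id: str,
    eval_verdict: str = "drop",
    since_days: int = 7,
) -> dict[str, Any]:
    """Build the JSON body for POST /api/v1/replay/batches."""
    return {
        "agent_name": agent_name,
        "target_manifest_id": target_manifest_id,
        "trace_filter": {"eval_verdict": eval_verdict, "since_days": since_days},
    }


# Hypothetical manifest IDs: the one the trace originally ran under, and the fix.
same_manifest = build_batch_request("support-agent", "mfst_v3_current")
fix_manifest = build_batch_request("support-agent", "mfst_fix_branch")

# Both batches select the same traces; only the target manifest differs.
assert same_manifest["trace_filter"] == fix_manifest["trace_filter"]
```

If the failure reproduces under the first body and disappears under the second, that suggests the bug is deterministic and the fix addresses it.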

Lifecycle

Endpoints at a glance

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /api/v1/replay/batches | Create a new replay batch (selects which traces to run) |
| GET | /api/v1/replay/batches/{batch_id} | Track progress and retrieve aggregate results |
| GET | /api/v1/replay/export | Export trace prompts or replay results as JSONL |
| POST | /api/v1/replay/tasks/{task_id}/submit | Submit a single task’s result (called by replay workers) |
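
The export endpoint returns JSONL, which is one JSON object per line. A minimal sketch of building the export URL and parsing such a body with the standard library only. The `batch_id` query parameter, the ID `rb_123`, and the record fields in the sample are assumptions for illustration, not documented names:

```python
import json
from urllib.parse import urlencode

EXPORT_URL = "https://api.decimal.ai/api/v1/replay/export"


def export_url(**params: str) -> str:
    """Append query parameters to the export endpoint (parameter names are assumptions)."""
    return f"{EXPORT_URL}?{urlencode(params)}" if params else EXPORT_URL


def parse_jsonl(body: str) -> list[dict]:
    """Parse a JSONL payload into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


# Hypothetical sample body; real records carry whatever fields the API defines.
sample = '{"trace_id": "t1", "output": "hi"}\n\n{"trace_id": "t2", "output": "bye"}\n'
records = parse_jsonl(sample)
url = export_url(batch_id="rb_123")
```

Parsing line by line keeps memory use flat for large exports if you later switch from a string to a streamed response.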

Quick start

import time

import httpx

API_BASE = "https://api.decimal.ai/api/v1"
HEADERS = {"Authorization": "Bearer dai_sk_..."}

# 1. Create a replay batch: every "drop"-verdict trace from the last 7 days,
#    re-run against manifest v4.
resp = httpx.post(
    f"{API_BASE}/replay/batches",
    headers=HEADERS,
    json={
        "agent_name": "support-agent",
        "target_manifest_id": "mfst_v4_abc",
        "trace_filter": {"eval_verdict": "drop", "since_days": 7},
    },
)
resp.raise_for_status()  # fail fast on a 4xx/5xx instead of a KeyError below
batch_id = resp.json()["batch_id"]

# 2. Poll every 5 seconds until the batch completes.
while True:
    poll = httpx.get(f"{API_BASE}/replay/batches/{batch_id}", headers=HEADERS)
    poll.raise_for_status()
    status = poll.json()
    if status["status"] == "completed":
        break
    time.sleep(5)

# 3. Print the aggregate results.
print(f"{status['kept']} kept · {status['repaired']} repaired · {status['still_failing']} still failing")
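
The polling loop above never exits if the batch does not reach `completed`. A sketch of a bounded variant follows. Here `fetch_status` is a hypothetical callable that wraps the GET request and returns the parsed JSON, and the `"running"` status in the usage lines is an assumed non-terminal value:

```python
import time
from typing import Any, Callable


def wait_for_batch(
    fetch_status: Callable[[], dict[str, Any]],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
) -> dict[str, Any]:
    """Poll until status is "completed", or raise TimeoutError once timeout_s would be exceeded."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status["status"] == "completed":
            return status
        # Give up if the next sleep would carry us past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"batch still {status['status']!r} after {timeout_s}s")
        time.sleep(interval_s)


def summarize(status: dict[str, Any]) -> str:
    """Format the aggregate counts the same way as the quick start."""
    return f"{status['kept']} kept · {status['repaired']} repaired · {status['still_failing']} still failing"


# Offline usage: canned responses stand in for the HTTP call.
responses = iter([
    {"status": "running"},  # assumed non-terminal value
    {"status": "completed", "kept": 3, "repaired": 1, "still_failing": 0},
])
final = wait_for_batch(lambda: next(responses), timeout_s=1.0, interval_s=0.01)
summary = summarize(final)
```

Passing the fetch step in as a callable keeps the waiting logic independent of the HTTP client, so it can be exercised without network access.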