If you ship agent code to users — and your job is to make sure agent changes don’t silently break in production — this is your reading path. The goal: by the end you have automatic, pre-deploy structural impact analysis on every PR, and post-deploy regression bisect if something does break.
This page is a curated reading path, not new material. Every link goes to an existing guide. Treat it as the “what to read in what order” map.

The problem this page solves

Agents are non-deterministic by nature, but their configuration (prompts, tools, models, output schemas, sub-agent topology) is deterministic. DecimalAI fingerprints that configuration into a manifest, attaches the manifest to every trace, and uses manifest diffs to predict which code changes will break which production traffic. You don’t write eval cases up front. You don’t pause to define “correctness.” You ship code and get a structural impact report on the PR.
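To make the fingerprint-and-diff idea concrete, here is a minimal sketch. It assumes a manifest can be modeled as a plain dict hashed over its canonical JSON form. The field names and the `fingerprint` and `diff_tools` helpers are hypothetical illustrations, not DecimalAI's actual manifest format or API.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Hash a config dict into a stable digest (dict key order does not matter)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_tools(old: dict, new: dict) -> dict:
    """Report which tool names were removed or added between two configs."""
    old_tools, new_tools = set(old["tools"]), set(new["tools"])
    return {
        "removed": sorted(old_tools - new_tools),
        "added": sorted(new_tools - old_tools),
    }


# Hypothetical configs: v2 drops the "refund" tool.
v1 = {"model": "model-a", "prompt": "You are a support agent.", "tools": ["search", "refund"]}
v2 = {"model": "model-a", "prompt": "You are a support agent.", "tools": ["search"]}

print(fingerprint(v1) == fingerprint(v2))  # the configs differ, so the digests differ
print(diff_tools(v1, v2))
```

Because the hash covers only configuration, two runs of the same agent version share a fingerprint even when their outputs differ.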

Reading path (in order)

1

Step 0 — see the payoff on seeded data (2 minutes)

Before touching your agent, run the sandbox so you know what you’re wiring up:
pip install decimalai          # Python 3.10+
decimalai demo regression      # → a live impact report on a seeded v1→v2 change
The 2-Minute Demo walkthrough explains what the report is telling you.
2

Get your first trace in under 5 minutes

Quickstart: install, init, see a trace land in the dashboard.
Pick the tab for your framework. If you’re using something we don’t have a tab for, see generic OTel.
3

Understand what's actually being captured

Tracing guide: what counts as a trace, what counts as a span, what gets serialized and what doesn’t.
Manifests guide: the deterministic fingerprint of your agent. Read this carefully; it’s the foundation everything else builds on.
4

Wire the GitHub Action

Regression Check guide: the GitHub Action that posts a manifest-impact comment on every PR.
This is the workflow that gives you the killer outcome: a PR that changes the agent’s tools gets 🔴 HIGH IMPACT — 247 traces will break (called the removed tool).
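The "traces will break" count can be pictured as a join between a manifest diff and historical traces. The sketch below assumes each trace records the tool names it called. The `Trace` class and `affected_traces` function are hypothetical and only illustrate the idea, not how the action is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    trace_id: str
    tools_called: list[str] = field(default_factory=list)


def affected_traces(traces: list[Trace], removed_tools: list[str]) -> list[Trace]:
    """Return the traces that called at least one tool the change removes."""
    removed = set(removed_tools)
    return [t for t in traces if removed & set(t.tools_called)]


# Hypothetical production history.
traces = [
    Trace("t1", ["search"]),
    Trace("t2", ["search", "refund"]),
    Trace("t3", ["refund"]),
    Trace("t4", []),
]

hits = affected_traces(traces, removed_tools=["refund"])
print(f"{len(hits)} of {len(traces)} traces called a removed tool")
```

No agent is re-run here. The check is purely structural, which is why it can run on every PR.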
5

Understand the severity bands

Compatibility Policies guide: how high_risk / medium_risk / low_risk are computed and how to tune the thresholds.
Read this before you start ignoring the action’s verdicts. The defaults are conservative on purpose.
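As a rough mental model of threshold-based bands, the sketch below maps the share of affected traces to a band. The `classify` function and its cut-off values are invented for illustration. The real computation and defaults are the ones described in the Compatibility Policies guide, and they may weigh more than trace share.

```python
def classify(affected: int, total: int, high_at: float = 0.10, medium_at: float = 0.01) -> str:
    """Map the share of affected traces to a severity band (hypothetical thresholds)."""
    if total <= 0:
        raise ValueError("total must be positive")
    share = affected / total
    if share >= high_at:
        return "high_risk"
    if share >= medium_at:
        return "medium_risk"
    return "low_risk"


print(classify(247, 1000))  # a large share of traffic is affected
print(classify(20, 1000))   # a small but non-trivial share
print(classify(0, 1000))    # nothing in history touches the change
```

Lowering `high_at` makes the check stricter, and raising it makes it more permissive.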
6

Set up post-deploy bisect

Post-deploy bisect guide: what to do when a regression slips through and you need to find which manifest version introduced the bad behavior.
This is the workflow you’ll reach for at 3am when your agent suddenly stops calling a critical tool.
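Bisecting over manifest versions is conceptually a binary search. The sketch below assumes versions are ordered oldest to newest and that once the bad behavior appears it persists in every later version. `first_bad_version` and the `is_bad` predicate are hypothetical names, not part of the DecimalAI SDK.

```python
from typing import Callable, Optional, Sequence


def first_bad_version(versions: Sequence[str], is_bad: Callable[[str], bool]) -> Optional[str]:
    """Binary-search for the earliest version where is_bad() is true.

    Assumes versions are ordered oldest to newest and that every version
    after the first bad one is also bad.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is strictly after mid
    return versions[lo] if lo < len(versions) else None


# Hypothetical history: the agent stopped calling a critical tool from m4 onward.
versions = ["m1", "m2", "m3", "m4", "m5"]
known_bad = {"m4", "m5"}

print(first_bad_version(versions, lambda v: v in known_bad))
```

In practice `is_bad` would inspect traces tagged with each manifest, for example checking whether the critical tool still appears in them.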
7

Wire replay for the medium-risk cases

Replay guide — for the changes the regression check can only label medium_risk, replay surfaces the actual behavioral diff by re-running historical inputs through both manifests.
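The replay idea can be sketched as running the same historical inputs through two agent versions and keeping the cases where behavior differs. The example models an agent as a function from input text to the list of tools it calls. `replay_diff` and the toy agents are hypothetical stand-ins, and a real replay would also need to account for run-to-run noise.

```python
from typing import Callable

# Hypothetical agent signature: input text -> names of the tools it called.
Agent = Callable[[str], list[str]]


def replay_diff(inputs: list[str], old_agent: Agent, new_agent: Agent) -> list[dict]:
    """Run each input through both agents and collect the behavioral differences."""
    diffs = []
    for text in inputs:
        before, after = old_agent(text), new_agent(text)
        if before != after:
            diffs.append({"input": text, "before": before, "after": after})
    return diffs


def agent_v1(text: str) -> list[str]:
    return ["search", "refund"] if "refund" in text else ["search"]


def agent_v2(text: str) -> list[str]:
    return ["search"]  # this version no longer has the refund tool


historical_inputs = ["where is my order", "I want a refund"]
diffs = replay_diff(historical_inputs, agent_v1, agent_v2)
for d in diffs:
    print(d)
```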

What you can skip (for now)

| Feature | Skip if… | Read it when… |
| --- | --- | --- |
| Evaluations | You only want deploy-safety value. The 5 built-in checks run automatically on every trace and give you keep / repair / replay / drop verdicts, with no eval cases required. | You have a specific quality regression you want to write a check for. See the evaluations guide. |
| Datasets | You’re not training models. Datasets are the export pipeline (trace → JSONL → fine-tuning). | You want to turn traces into training data. See the datasets guide. |
| Skills | You have a single agent. Skills are reusable agent knowledge files, an optional add-on. | You have multiple agents that share knowledge. See the skills guide. |

The five files you’ll actually touch

| File | What you change it for |
| --- | --- |
| scripts/init_for_decimal.py | Once. Calls your agent factory so the regression check knows your manifest. |
| .github/workflows/decimal.yml | Once. Wires the GitHub Action. |
| Your agent code | Add decimalai.init(api_key=..., <framework>=True) once. Never again. |
| compatibility-policy.yaml (optional) | When you want to tune what counts as high_risk. |
| pyproject.toml / requirements.txt | When you upgrade the SDK. |

That’s the entire integration surface for the deploy-safety workflow.

Failure modes and where to look

| Symptom | Where to start |
| --- | --- |
| Action doesn’t comment on PR | Regression Check guide → Troubleshooting |
| Action says “no baseline” | You haven’t ingested a manifest in production yet. Deploy once, then run the action. |
| Action says high_risk on a no-op refactor | Manifest is hashing something it shouldn’t (e.g. dynamic prompt). See Manifests guide → False drift |
| Production trace volume dropped to zero | Check Webhooks for alert config; check the dashboard’s volume chart |
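
The "high_risk on a no-op refactor" symptom is easier to recognize with a toy example of false drift. The sketch assumes the fingerprint behaves like a content hash. It shows that hashing a rendered prompt containing a date changes the digest every day, while hashing the template does not. All names here are hypothetical, and the actual remedy is the one in the Manifests guide.

```python
import hashlib
from datetime import date

# Hypothetical prompt template with a dynamic field.
TEMPLATE = "Today is {today}. You are a support agent."


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def rendered_digest(day: date) -> str:
    """Hash the prompt after substitution: changes whenever the date changes."""
    return digest(TEMPLATE.format(today=day.isoformat()))


def template_digest() -> str:
    """Hash the prompt before substitution: stable until the template is edited."""
    return digest(TEMPLATE)


monday, tuesday = date(2024, 1, 1), date(2024, 1, 2)
print(rendered_digest(monday) == rendered_digest(tuesday))  # drift with no code change
print(template_digest() == template_digest())               # stable across days
```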

When to talk to the platform-team page instead

If your job is more about operating DecimalAI itself — wiring webhooks to PagerDuty, configuring teams + RBAC, debugging multi-agent flows — read DecimalAI for Platform Teams.

What’s next

Quickstart

Start at the top.

Regression Check

The killer workflow for engineers.