See it first — 2 minutes, no waiting for your own data
The impact report and the skills leaderboard are most convincing on real data, so both ship with a one-command sandbox that seeds a realistic agent and trace corpus into your workspace. See the payoff before you instrument anything.
For engineers
Catch regressions before they ship.
Seeds a v1→v2 agent change and links you straight to the impact report: which production traces the change would break, which may behave differently, and which are unaffected.
For prompt engineers
Find skills that actually work.
Seeds three skills with real, varied effectiveness and links you to the ranked registry: per-model pass rates and cross-org activation, not download counts.
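You’ll do five things in this guide:
1. Install the SDK
2. Get your API key
3. Instrument your agent
4. View your traces
5. Add the regression check to your PRs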
By the end, your team’s next agent change will get an automatic structural impact analysis on the PR — without you writing any eval cases. Here’s the kind of report that lands on the PR:
Run this in Colab
Run the SDK portion interactively — no local setup required.
1. Install the SDK
2. Get Your API Key
Sign in to the DecimalAI Dashboard and navigate to Settings → API Key. Or generate one via the API:
3. Instrument Your Agent
- LangChain
- OpenAI Agents
- LlamaIndex
- CrewAI
- AutoGen / AG2
- Any Framework
- Environment Variables
Run this in Colab
Live notebook, no setup — just paste your API key.
Auto-detection depth varies by framework. LangChain and OpenAI Agents (with explicit install(agent=...)) extract full tool schemas; LlamaIndex / CrewAI / AutoGen extract tool names only. See the capability matrix before deciding which integration to commit to.
4. View Your Traces
Open the Traces page in the dashboard. Your first trace should appear within seconds. Each trace is auto-tagged with an identifier for the agent that produced it, which is what powers the regression check in the next step.
5. Add the Regression Check to your PRs (recommended)
Now wire DecimalAI into your CI so every PR gets a manifest impact report. This is the most-used capability for engineering teams. Three things, all copy-pasteable below: a tiny scripts/init_for_decimal.py that calls your agent factory, a .github/workflows/decimal.yml that runs it under DECIMALAI_MODE=manifest_only, and your DECIMAL_API_KEY in GitHub Secrets. Here’s what runs on every PR:
1. Add scripts/init_for_decimal.py — five lines that import and call your existing agent factory. In manifest_only mode the SDK reads tools, prompts, and models from the runtime objects, then exits without any LLM calls:
scripts/init_for_decimal.py
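The script itself only needs to import and call your agent factory. To illustrate what "reading tools, prompts, and models from the runtime objects" can look like, here is a minimal sketch that inspects an agent object without ever running it. Everything in it is a stand-in: `Tool`, `Agent`, `build_agent`, and `extract_manifest` are hypothetical names chosen for this example, not the DecimalAI SDK's API, and the dictionary shape is not the real manifest format.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Tool:
    """Stand-in for a framework tool: a name, an argument schema, a callable."""
    name: str
    schema: dict
    fn: Callable[..., str]


@dataclass
class Agent:
    """Stand-in for the object your agent factory returns."""
    model: str
    system_prompt: str
    tools: list[Tool] = field(default_factory=list)


def build_agent() -> Agent:
    """Hypothetical agent factory; yours would live in your own codebase."""
    def search(query: str) -> str:
        # Never reached in this sketch: extraction does not call tools.
        raise RuntimeError("tools are not called during manifest extraction")

    return Agent(
        model="example-model",
        system_prompt="You are a support agent.",
        tools=[Tool("search", {"query": "string"}, search)],
    )


def extract_manifest(agent: Agent) -> dict:
    """Read structure off the runtime object; no agent run, no LLM call."""
    return {
        "model": agent.model,
        "system_prompt": agent.system_prompt,
        "tools": {tool.name: tool.schema for tool in agent.tools},
    }


if __name__ == "__main__":
    print(extract_manifest(build_agent()))
```

The point of the sketch is that constructing the agent is enough: the tool function is never invoked, which is why a CI job of this kind can run without model credentials or token spend.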
2. Add .github/workflows/decimal.yml:
.github/workflows/decimal.yml
3. Add the DECIMAL_API_KEY secret in Settings → Secrets and variables → Actions → New repository secret, with the value from app.decimal.ai/settings.
That’s the whole setup. On your next PR you’ll get a comment like this within ~30 seconds:
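The comment itself comes from DecimalAI's analysis. As a rough illustration of the three buckets this guide describes (traces the change would break, traces that may behave differently, and traces that are unaffected), the sketch below classifies traces by the tools they used. It is a toy model under stated assumptions, not DecimalAI's algorithm: `classify_trace`, the bucket labels, and the schema dictionaries are all hypothetical, and a real analysis would likely weigh prompts and models too.

```python
def classify_trace(tools_used: set[str], old_tools: dict, new_tools: dict) -> str:
    """Classify one trace against a v1 -> v2 tool change.

    old_tools and new_tools map tool name -> argument schema.
    """
    # Tools that exist in v1 but are gone in v2.
    removed = old_tools.keys() - new_tools.keys()
    # Tools present in both versions whose schema differs.
    changed = {
        name
        for name in old_tools.keys() & new_tools.keys()
        if old_tools[name] != new_tools[name]
    }
    if tools_used & removed:
        return "breaks"
    if tools_used & changed:
        return "may_differ"
    return "unaffected"


# Hypothetical v1 and v2 tool sets: "refund" is removed, "search" gains an argument.
v1 = {
    "search": {"query": "string"},
    "refund": {"order_id": "string"},
    "lookup": {"id": "string"},
}
v2 = {
    "search": {"query": "string", "limit": "integer"},
    "lookup": {"id": "string"},
}

# Hypothetical production traces, reduced to the set of tools each one called.
traces = {
    "trace-a": {"refund", "lookup"},
    "trace-b": {"search"},
    "trace-c": {"lookup"},
}

report = {trace_id: classify_trace(used, v1, v2) for trace_id, used in traces.items()}
```

Here `trace-a` lands in "breaks" because it called the removed `refund` tool, `trace-b` in "may_differ" because the `search` schema changed, and `trace-c` in "unaffected".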
Here for skills instead?
The steps above wire up the regression capability, which is what most teams start with. The skills workflow is a separate, shorter track (no GitHub Action needed):
Browse the registry
Find skills ranked by SkillScore in the public registry, no signup required.
Prove one helps
A/B-benchmark a skill with skillevaluation (pip install "skillevaluation[runner]") to measure its lift on your own cases.
Install it
Fork it into your workspace and write it to disk with router.install(...). See the Skills guide.
Next Steps
Regression Check Guide
Full configuration, troubleshooting, and severity tuning for the GitHub Action.
Manifests Guide
What manifests capture, how diffs work, and the compatibility policy model.
Concepts
How traces, manifests, evals, and datasets connect.
Training Pipeline
End-to-end: trace → evaluate → fine-tune.