This page is a curated reading path, not new material. Every link goes to an existing guide. Treat it as the “what to read in what order” map.
The problem this page solves
You have engineering teams shipping agents on DecimalAI. Your job is to make sure:
- When something goes wrong, the right person gets paged (webhooks).
- When something goes wrong across multiple agents, you can trace the cross-agent flow (multi-agent debugging).
- The right people have access to the right agents (teams + RBAC).
- The deployment satisfies your org’s security and compliance bar.
Reading path (in order)
Step 0 — new to DecimalAI? (2 minutes)
Before wiring anything up, get a feel for what DecimalAI does by running the 2-minute demo. The 2-Minute Demo walkthrough explains what the report is telling you. Prefer prose? Skim the Introduction first.
Understand the multi-agent debugging model
Multi-agent guide: how orchestrator → sub-agent handoffs get traced, how parent_trace_id links them, and what the dashboard shows for cross-agent flows.
Read this even if your engineers don’t think they have a multi-agent system. Any agent that calls another agent (e.g. a router calling specialists) shows up here.
Wire webhooks for production alerts
Webhooks guide: outbound HTTP callbacks. The event registry includes regression.detected, regression.resolved, manifest.changed, usage.warning, usage.limit_reached, payment.failed, and payment.confirmed.
The common pattern: webhook → PagerDuty / Opsgenie / Slack. Specifically, subscribe to regression.detected to feed your existing incident system, and to regression.resolved to auto-close.
Set up teams and per-agent RBAC
Teams guide: workspaces, teams, per-agent role assignments, the audit log.
The common shape: one workspace per business unit, one team per product, agents owned by the team that ships them. The platform team gets workspace-admin; product engineers get team-member.
Review the security model
Security page: encryption at rest / in transit, data residency, retention policies, SOC-2 status, and the redaction / PII handling model.
Pay special attention to the redaction section if your agents see PII. DecimalAI’s default is to store full payloads; redaction is opt-in via SDK config.
Configure trace retention + budgets
Pricing page for the per-tier retention limits.
Trace volume budgets are set per workspace. Set conservative budgets early: you can raise them later, but you can’t easily un-page someone who got an after-hours bill alert.
Operational dashboards you’ll live in
| Dashboard | What it tells you |
|---|---|
| Traces volume chart (/traces) | Daily ingest rate per agent. Watch for sudden drops (instrumentation broken) or spikes (someone reran a backfill). |
| Eval verdict mix (/) | What fraction of traces are landing as keep / repair / replay / drop. Drift in this mix usually means a model regression or a policy change. |
| Manifest timeline (per agent) | Which manifest is active. When you see two manifests both labeled “active”, that’s a red flag (rollback in progress, or stale traffic from a previous deploy). |
| Regression alerts (/alerts) | Currently-firing alerts and their dismissal history. |
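The "sudden drops or spikes" check from the traces volume row can also be approximated offline. The following is a minimal sketch, assuming you already have daily trace counts per agent (for example from an export). The function name and both ratio thresholds are illustrative choices, not DecimalAI settings.

```python
from statistics import median


def classify_volume(history, today, drop_ratio=0.5, spike_ratio=2.0):
    """Compare today's trace count with the median of recent days.

    history: daily trace counts for one agent, oldest first.
    Returns "drop", "spike", or "normal".
    """
    if not history:
        return "normal"  # nothing to compare against yet
    baseline = median(history)
    if baseline == 0:
        # Any traffic after a silent baseline counts as a spike.
        return "spike" if today > 0 else "normal"
    ratio = today / baseline
    if ratio <= drop_ratio:
        return "drop"   # possible broken instrumentation
    if ratio >= spike_ratio:
        return "spike"  # possible backfill rerun
    return "normal"
```

Using the median rather than the mean keeps one earlier outlier day from shifting the baseline.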
Integration patterns
Page on regression-check failure (PagerDuty)
Subscribe a webhook to regression.detected with severity threshold = high_risk. The webhook fires to the PagerDuty Events API. Auto-page the on-call for the affected agent’s team.
See Webhooks → Event types.
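As a rough illustration of the translation step between the webhook and PagerDuty, here is a sketch in Python. The shape of the incoming event (the `type`, `agent_name`, `severity`, and `summary` fields) is an assumption made for this example; check the Webhooks guide for the real payload. The outgoing body follows the PagerDuty Events API v2 format. The sketch pages only when severity equals high_risk; if your severity scale has levels above that, widen the check.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def to_pagerduty_event(event, routing_key):
    """Translate a webhook event into a PagerDuty Events API v2 body.

    The incoming event shape is assumed, not taken from the DecimalAI docs.
    Returns None for events that should not page.
    """
    event_type = event.get("type")
    agent = event.get("agent_name", "unknown-agent")
    # Reusing one key per agent lets the resolve event close the same incident.
    dedup_key = f"decimalai-regression-{agent}"

    if event_type == "regression.detected":
        if event.get("severity") != "high_risk":
            return None  # below the paging threshold
        return {
            "routing_key": routing_key,
            "event_action": "trigger",
            "dedup_key": dedup_key,
            "payload": {
                "summary": event.get("summary", f"Regression detected on {agent}"),
                "source": agent,
                "severity": "critical",
            },
        }
    if event_type == "regression.resolved":
        return {
            "routing_key": routing_key,
            "event_action": "resolve",
            "dedup_key": dedup_key,
        }
    return None


def send_to_pagerduty(body):
    """POST the body to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Handling regression.resolved in the same function covers the auto-close case: the matching dedup_key tells PagerDuty which open incident to resolve.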
Sync DecimalAI alerts to Slack
Webhook → simple Cloud Run function → Slack incoming webhook for the team channel. Filter by agent_name to route to the right team’s channel.
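The routing step inside such a function might look like the sketch below. The event fields (`type`, `agent_name`, `summary`) and the mapping from agent name to webhook URL are assumptions for illustration; the only Slack-specific detail relied on is that an incoming webhook accepts a JSON body with a `text` field.

```python
import json
import urllib.request


def route_alert(event, channel_webhooks, default_webhook):
    """Pick the Slack webhook URL for an event and build the message body.

    channel_webhooks: dict mapping agent_name to that team's webhook URL.
    default_webhook: fallback URL for agents with no explicit mapping.
    """
    agent = event.get("agent_name")
    url = channel_webhooks.get(agent, default_webhook)
    text = "[{}] {}: {}".format(
        event.get("type", "unknown"),
        agent or "unknown agent",
        event.get("summary", "no summary provided"),
    )
    return url, {"text": text}


def post_to_slack(url, message):
    """POST the message to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping a default webhook means an alert for a newly added agent still lands somewhere visible instead of being dropped.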
Export traces to your data warehouse
Two options: (1) periodic JSONL export via the Datasets API into your bucket, or (2) webhook-based per-trace streaming via the Webhooks guide. Option 1 is cheaper for daily/weekly aggregates; option 2 is needed for real-time dashboards.
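For option 1, the aggregation on your side can stay small. The sketch below counts traces per day and agent from a JSONL export. The record fields `timestamp` (an ISO 8601 string) and `agent_name` are assumptions; adjust them to whatever the Datasets API export actually contains.

```python
import json
from collections import Counter


def daily_counts(jsonl_lines):
    """Count traces per (date, agent_name) from an iterable of JSONL lines.

    Assumes each record has "timestamp" (ISO 8601) and "agent_name" fields.
    """
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        day = record["timestamp"][:10]  # "YYYY-MM-DD" prefix of the timestamp
        counts[(day, record["agent_name"])] += 1
    return counts
```

Because the function takes any iterable of lines, you can pass an open file handle directly, which avoids loading a large export into memory at once.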
Mirror DecimalAI to your existing APM (Datadog / New Relic)
Use the generic OTel integration on the SDK side. The SDK can dual-emit to DecimalAI and your existing OTel collector, so you get DecimalAI’s manifest-aware view and your APM’s flame-graph view without instrumenting twice.
When to talk to the engineers page instead
If your job is more about writing agent code (adding traces, configuring regression checks, debugging your own agent’s behavior), read DecimalAI for Engineers first.
What’s next
Multi-agent
Cross-agent trace linking and orchestrator patterns.
Webhooks
Event types, payloads, retry / signature verification.
Teams
Workspaces, RBAC, audit log.
Security
Encryption, data residency, SOC-2.