If something isn’t working, this is the first place to check. Start with the triage table, then open the matching section below for the fix.
Symptom | Likely cause | Section
Traces aren’t appearing in the dashboard | Missing key, wrong base URL, no framework flag, unflushed buffer, wrong workspace | Traces aren’t appearing
401 Unauthorized on every request | Key missing, malformed, or expired | 401 Unauthorized
402 Payment Required | Plan quota exhausted | 402 Payment Required
409 Conflict on skill sync or registry install | Resource already exists (idempotent); safe to ignore | 409 Conflict
429 Too Many Requests | Per-plan rate limit | 429 Too Many Requests
Traces appear but no evaluation / verdict | No eval policy, silent check failure, or worker lag | No evaluation / verdict
regression-check says “no manifest” | Manifest not registered for this agent + ref | Manifest not registered
SDK is blocking my agent / requests are slow | Client re-created in hot path, or stale SDK | SDK is blocking my agent
Multi-agent traces aren’t linking | parent_trace_id not propagated | Multi-agent traces
Webhooks aren’t firing | Bad URL, event disabled, slow handler | Webhooks aren’t firing
Traces aren’t appearing

Most often the cause is one of:
  1. DECIMAL_API_KEY is missing or wrong. Confirm with echo $DECIMAL_API_KEY — it should start with dai_sk_. If you’re using .env files, make sure they’re loaded before decimalai.init().
  2. Wrong base URL. If you’re self-hosting or pointing at staging, set DECIMAL_BASE_URL explicitly:
    export DECIMAL_BASE_URL=https://api.decimal.ai
    
  3. Framework flag missing. decimalai.init(api_key="...") alone doesn’t capture traces — you also need a framework flag:
    decimalai.init(api_key="...", langchain=True)   # or openai_agents, llamaindex, crewai, autogen, otel
    
    See Tracing for the full matrix.
  4. Process exited before flush. The SDK buffers traces and flushes on a background timer. An atexit handler drains the buffer on normal interpreter shutdown, so most short-lived scripts are covered automatically. The gap is processes that are hard-killed (SIGKILL, os._exit(), a crash) before atexit runs. For those, flush explicitly before exit:
    import decimalai
    
    try:
        run_agent()
    finally:
        decimalai.flush()  # block until the buffer drains
    
  5. Wrong workspace. SDK traces are routed by the API key you send them with — and, if set, the project you pass to init() (sent as the X-Decimal-Project header). There is no team= parameter or DECIMAL_TEAM env var. Confirm the dashboard’s active workspace matches the key, and that any decimalai.init(project="...") value is the one you expect. See Teams & Workspaces for how routing resolves.
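The first two checks in the list above can be automated before the agent starts. The sketch below is a minimal preflight, assuming only what the list states: the key lives in DECIMAL_API_KEY and starts with dai_sk_, and DECIMAL_BASE_URL is optional. The function name preflight and the whitespace check are illustrative assumptions, not part of the SDK.

```python
import os

KEY_PREFIX = "dai_sk_"  # prefix stated in the checklist above


def preflight(env=None):
    """Return a list of human-readable problems found in the environment.

    `preflight` is a hypothetical helper, not an SDK function.
    """
    env = os.environ if env is None else env
    problems = []

    key = env.get("DECIMAL_API_KEY", "")
    if not key:
        problems.append("DECIMAL_API_KEY is not set")
    elif key != key.strip():
        # Assumption: stray whitespace from a .env file is a common mistake.
        problems.append("DECIMAL_API_KEY has leading or trailing whitespace")
    elif not key.startswith(KEY_PREFIX):
        problems.append("DECIMAL_API_KEY does not start with dai_sk_")

    base_url = env.get("DECIMAL_BASE_URL")
    if base_url is not None and not base_url.startswith(("http://", "https://")):
        problems.append("DECIMAL_BASE_URL is not an http(s) URL")

    return problems
```

Calling preflight() with no arguments reads the real process environment, so it should run after any .env loading and before decimalai.init(). An empty list means these two checks passed; it says nothing about the framework flag, flushing, or workspace routing.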
The API returns four status codes you’ll see most often. This table is the at-a-glance summary; the per-code detail follows in the accordions below.
Code | Meaning | First thing to try
401 | Key missing, malformed, or expired | Re-issue the key; confirm the Bearer dai_sk_… header
402 | Plan quota exhausted | Read detail for the metric; wait for reset or upgrade
409 | Resource already exists (idempotent) | Safe to ignore; no action needed
429 | Rate limit hit | Honor Retry-After; batch your requests
401 Unauthorized

Your API key is missing, malformed, or expired.
# Confirm header format
curl https://api.decimal.ai/api/v1/agents \
  -H "Authorization: Bearer dai_sk_YOUR_KEY"
Re-issue the key in Settings → API Keys. Old keys are revoked when you create a new one with the same label.

If you’re in Clerk dashboard mode and getting 401 on export endpoints specifically, this is a known issue (getApiKey() legacy fallback). Workaround: use a workspace API key explicitly.
402 Payment Required

You’ve hit your plan’s quota, usually traces ingested per month or SFT rows generated.

The detail field names the exhausted metric:
{ "detail": "Plan limit reached: traces_ingested (5000 / 5000 for plan=free)" }
Either:
  • Wait for the next billing period (resets at month boundary).
  • Upgrade in Settings → Billing.
The dashboard banner shows your current usage. If usage feels too high, look for runaway tests or accidental dev traffic hitting prod keys.
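If you handle 402 responses in code, the detail string can be parsed to find out which metric ran out. The sketch below assumes every 402 detail follows the shape of the single example shown above; the real format may vary, so the parser returns None rather than raising when it doesn’t match. The name parse_plan_limit is hypothetical.

```python
import re

# Pattern inferred from the one example in this section (an assumption).
_LIMIT_RE = re.compile(
    r"Plan limit reached: (?P<metric>\w+) "
    r"\((?P<used>\d+) / (?P<limit>\d+) for plan=(?P<plan>[\w-]+)\)"
)


def parse_plan_limit(detail):
    """Return metric, used, limit, and plan from a 402 detail string, or None."""
    match = _LIMIT_RE.search(detail)
    if match is None:
        return None
    return {
        "metric": match["metric"],
        "used": int(match["used"]),
        "limit": int(match["limit"]),
        "plan": match["plan"],
    }
```

Logging the parsed metric next to the key in use makes it easier to spot dev traffic that is consuming a prod quota.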
409 Conflict

This is safe to ignore: it means the resource already exists in an equivalent state.
  • Skill sync 409: the body hash already matches an existing version. No new version was created.
  • Registry install 409: you’ve already installed this skill in this org.
Both endpoints are idempotent by design.
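In a script that calls these endpoints directly, that means a 409 should be treated as a no-op instead of a failure. A minimal sketch, assuming a 2xx status means the resource was newly created; the function name and outcome labels are illustrative:

```python
def classify_sync_response(status_code):
    """Map an HTTP status from skill sync or registry install to an outcome.

    Hypothetical helper; the outcome labels are not API values.
    """
    if 200 <= status_code < 300:
        return "created"    # assumption: 2xx means a new version or install
    if status_code == 409:
        return "unchanged"  # already exists in an equivalent state
    return "error"
```

A CI step can then fail only on "error" and continue on the other two outcomes.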
429 Too Many Requests

You’re hitting the per-plan rate limit. Responses include Retry-After:
HTTP/1.1 429 Too Many Requests
Retry-After: 12
The SDK respects this automatically. If you’re calling the API directly, sleep for the indicated seconds.

To reduce request count, send traces in batches instead of one at a time. The SDK automatically batches when buffer thresholds are hit. For direct API use, hit POST /api/v1/traces/batch with up to 100 traces per call.

See Errors for the full rate limit table.
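For direct API callers, the two pieces of advice above (honor Retry-After, send at most 100 traces per call) can be sketched as follows. The transport is left abstract: send is a hypothetical callable you supply that posts one batch and returns (status_code, headers). The one-second fallback when the header is absent and the attempt cap are assumptions, not documented behavior.

```python
import time

MAX_BATCH = 100  # per-call limit stated above


def chunked(items, size=MAX_BATCH):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_with_retry(send, batch, max_attempts=5, sleep=time.sleep):
    """Call send(batch) until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers),
    where headers is a dict containing the key "Retry-After" on a 429.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers = send(batch)
        if status != 429 or attempt == max_attempts:
            return status
        # Retry-After is read as whole seconds, as in the example above.
        sleep(int(headers.get("Retry-After", 1)))
```

A caller would loop over chunked(traces) and pass each chunk to send_with_retry, stopping on the first status that is neither a success nor a 429.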
No evaluation / verdict

Evaluations run asynchronously by default. After ingest:
  1. Background eval worker scores each trace against the active policy.
  2. Decision engine computes a unified verdict (pass / fail / review).
Common causes of missing verdicts:
  • No evaluators configured for the agent. Attach one from the Evaluate dashboard’s Auto-Scoring panel, or register evaluators via /api/v1/evaluators.
  • Custom eval check failed silently. Check the trace detail page → “Eval Errors” section.
  • Background worker hasn’t caught up. New traces typically score within 30s. Refresh the dashboard.
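Because scoring is asynchronous, an automated check should poll for a verdict for a while before reporting it missing. The helper below is a generic sketch: fetch_verdict is a hypothetical callable you supply that returns the verdict or None, since this page doesn’t name an endpoint for reading verdicts. The 60-second default is an assumption chosen to sit comfortably above the typical 30s.

```python
import time


def wait_for_verdict(fetch_verdict, timeout=60.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_verdict() until it returns non-None or the timeout passes.

    `fetch_verdict` is a hypothetical zero-argument callable.
    Returns the verdict, or None if none arrived in time.
    """
    deadline = clock() + timeout
    while True:
        verdict = fetch_verdict()
        if verdict is not None:
            return verdict
        if clock() + interval > deadline:
            return None  # another wait would overshoot the deadline
        sleep(interval)
```

If this returns None, work through the causes listed above instead of increasing the timeout.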
Manifest not registered

The GitHub Action looks up the manifest by agent_name plus the git ref it’s running against. If the manifest hasn’t been registered, the action skips the check.

Fix: run scripts/init_for_decimal.py (or your equivalent) with DECIMALAI_MODE=manifest_only as a step before decimal-labs/regression-check@v1:
- name: Register manifest
  env:
    DECIMALAI_MODE: manifest_only
    DECIMAL_API_KEY: ${{ secrets.DECIMAL_API_KEY }}
  run: python scripts/init_for_decimal.py

- uses: decimal-labs/regression-check@v1
  with:
    api-key: ${{ secrets.DECIMAL_API_KEY }}
    agent-name: support-agent
manifest_only mode runs the manifest-extraction code path without actually invoking your agent.
SDK is blocking my agent

The SDK ingests traces in a background thread by default, so the request path should never block on network I/O. If you’re seeing latency added to your agent:
  1. Make sure you call decimalai.init() once at startup and reuse it — don’t construct new clients in hot paths.
  2. Confirm you’re on a current SDK (pip install -U decimalai); background flush has been the default for a long time.
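Where there is no obvious startup hook (for example, a handler function that a framework calls per request), a once-only guard keeps initialization out of the hot path. This is a generic sketch, not an SDK feature: ensure_initialized is a hypothetical helper that takes the init call as an argument.

```python
import threading

_init_lock = threading.Lock()
_initialized = False


def ensure_initialized(init_fn):
    """Run init_fn once per process, even under concurrent calls.

    Returns True if this call performed the initialization, False otherwise.
    """
    global _initialized
    if _initialized:          # fast path: no lock once initialized
        return False
    with _init_lock:
        if _initialized:      # another thread won the race
            return False
        init_fn()
        _initialized = True
        return True
```

Usage would look like ensure_initialized(lambda: decimalai.init(api_key="...", langchain=True)) at the top of the handler; after the first call it costs one boolean check.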
Multi-agent traces

For parent-child agent calls to show up as a tree, the parent must propagate its trace ID to the child.

If you’re using LangGraph, CrewAI, or the OpenAI Agents SDK, parent-child linkage happens automatically; just verify the framework flag is set on decimalai.init(). For custom orchestrators, capture the parent’s trace ID and pass it to the child as parent_trace_id:
import decimalai

# Parent (orchestrator) — capture its trace ID, then hand it to the child
with decimalai.start_trace(agent_name="orchestrator") as parent:
    parent.log_llm_call(model="gpt-4o", input=msgs, output=resp)
    parent_id = parent.get_trace_id()

    # Child (sub-agent) — link it by passing parent_trace_id
    with decimalai.start_trace(
        agent_name="researcher",
        parent_trace_id=parent_id,
    ) as child:
        child.log_llm_call(model="gpt-4o", input=sub_msgs, output=sub_resp)
The platform displays the tree as long as parent_trace_id is set on the child trace.
Webhooks aren’t firing

  1. Confirm the URL. Run curl -X POST <your-url> from your terminal. Does the endpoint accept the request?
  2. Confirm the event is enabled. Settings → Notifications → Enabled events.
  3. 5-second per-attempt timeout. A handler that takes longer than 5 seconds counts as a failed attempt. Acknowledge fast (return 200), then process asynchronously.
  4. Failed deliveries are retried with backoff, so a transient outage self-heals — but make your handler idempotent (deduplicate on the X-Decimal-Event-Id header) so a redelivered event isn’t processed twice. See Webhooks for signing and retry detail.
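Items 3 and 4 above combine into one handler shape: acknowledge immediately, deduplicate on X-Decimal-Event-Id, and do the real work elsewhere. The sketch below is framework-agnostic and makes several assumptions: headers is a dict keyed exactly as shown, the seen-ID set lives in process memory (a shared store would be needed across workers or restarts), and a separate worker drains work_queue. All names are hypothetical.

```python
import queue
import threading

_seen_ids = set()             # in-memory only; lost on restart (assumption)
_seen_lock = threading.Lock()
work_queue = queue.Queue()    # drained by a separate worker thread


def handle_webhook(headers, body):
    """Return an HTTP status quickly; enqueue the event only if it is new."""
    event_id = headers.get("X-Decimal-Event-Id")
    if event_id is not None:
        with _seen_lock:
            if event_id in _seen_ids:
                return 200    # redelivery: acknowledge, do nothing
            _seen_ids.add(event_id)
    # No ID means nothing to deduplicate on; enqueue anyway (assumption).
    work_queue.put((event_id, body))
    return 200
```

Because the handler only touches a set and a queue, it returns well inside the 5-second window regardless of how long processing takes. Signature verification is omitted here; see Webhooks for signing.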

Tracing

How auto-detection picks up your framework, how spans are stitched together, and what you can override.

Manifests

The capability matrix per framework — useful when “my tools/prompts aren’t being captured” is the actual problem.

Errors

Full list of error codes the API can return, and what each one means.

Webhooks

Delivery semantics, HMAC signing, and retry — read this first if webhooks aren’t firing.
