Those tools detect regressions by running your eval suite on a new agent version. That works only if you’ve already written eval cases; most teams haven’t, and the ones they have are usually stale.
DecimalAI works differently. Your production traces are tagged with the manifest they ran under. When you propose a manifest change, we identify which traces depended on what’s changing and tell you the structural blast radius, with no eval suite required.
The tools are complementary. Use both if you want behavioral verification on the eval surface; use DecimalAI alone if you don’t have an eval suite yet. See Why DecimalAI? for a detailed comparison.
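To make the idea of a structural blast radius concrete, here is a toy sketch. It is not DecimalAI's data model or algorithm: the `Trace` fields, the manifest-as-dict shape, and both function names are assumptions made purely for illustration. It treats a manifest as a mapping from component name to version and selects the traces that ran under the old manifest and touched a component whose version changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    # All fields are hypothetical, chosen only for this illustration.
    trace_id: str
    manifest_hash: str          # manifest the trace ran under
    components_used: frozenset  # names of manifest components the trace touched


def changed_components(old: dict, new: dict) -> set:
    """Names whose version differs between two manifests (additions and removals included)."""
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


def blast_radius(traces, old_hash: str, old: dict, new: dict) -> list:
    """IDs of traces that ran under the old manifest and used a changed component."""
    changed = changed_components(old, new)
    return [
        t.trace_id
        for t in traces
        if t.manifest_hash == old_hash and t.components_used & changed
    ]


if __name__ == "__main__":
    old = {"system_prompt": "v1", "search_tool": "v1"}
    new = {"system_prompt": "v2", "search_tool": "v1"}
    traces = [
        Trace("t1", "m1", frozenset({"system_prompt"})),
        Trace("t2", "m1", frozenset({"search_tool"})),
    ]
    print(blast_radius(traces, "m1", old, new))
```

The point of the sketch is that the selection needs only recorded traces and two manifests; no agent run or eval case is involved.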
No. Pre-deploy regression checks run entirely against our trace store. We don’t run your agent, so we don’t need your LLM credentials.
The Playground does need an LLM key when you click “Run”, but it’s BYOK (bring your own key): you paste your own OpenAI/Gemini key into Settings and it stays there. We never proxy or store outbound LLM traffic on your behalf.
No, and we don’t recommend that. DecimalAI runs alongside LangSmith, Braintrust, Langfuse, Phoenix, etc. Pipe traces into both; we add the manifest layer underneath. The integrations work via OpenTelemetry, so most tools coexist cleanly.
The exception: if you’re using one of those tools only for regression detection and not for trace search/eval/etc., you may be able to drop it after adopting DecimalAI’s regression check.
For each trace: agent name, input/output text, LLM call messages (prompts and completions), tool calls (name + arguments + results), token counts, latency, cost estimate, the manifest hash, and any eval scores you push.
See Security for retention periods, encryption details, and PII handling.
On the Enterprise plan, yes. The platform is a FastAPI backend + Postgres + a Next.js dashboard, and all components run in your VPC. The regression-check GitHub Action also has a self-hosted runner mode.
For evaluation, we publish container images for the worker components (LLM-judge orchestrator, dataset builder). Contact sales@decimal.ai for the deployment guide.
The SDK requires Python 3.10+ (requires-python = ">=3.10").
On Python 3.9 or older, pip install decimalai silently resolves to an outdated release that predates the demo command and several framework integrations. If decimalai demo reports an unknown command, check python --version first.
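If you want to catch this before installing, a small standard-library check along these lines can run in CI or a setup script. The function name and messages are illustrative and not part of the SDK; only the 3.10 minimum comes from the requirement stated above.

```python
import sys

# Minimum interpreter version stated in the SDK's requires-python metadata.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {found} meets the 3.10+ requirement")
    else:
        print(f"Python {found} is older than 3.10; pip may resolve to an outdated decimalai release")
```

Run it with the same interpreter that pip uses (for example `python check.py` followed by `python -m pip install decimalai`) so that the check and the install agree on the version.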
Yes — that’s the recommended first step. With just an API key:
pip install decimalai
decimalai demo regression   # impact report on a seeded v1→v2 agent change
decimalai demo skills       # registry ranked by seeded effectiveness data
Each command seeds realistic demo data into your workspace (prefixed [Demo]) and prints a link straight to the result. decimalai demo reset removes it all. No agent code, no framework setup, no LLM keys.
First-class integration:
  • LangChain / LangGraph
  • OpenAI Agents SDK
  • LlamaIndex (v0.10.20+)
  • CrewAI (via OTel)
  • AutoGen / AG2 (via OTel)
Any framework that emits OpenTelemetry GenAI spans works through decimalai.init(otel=True). For custom Python code without a framework, use @decimalai.trace() directly. See Tracing for setup details.
Traces from any provider are accepted: the SDK stores model name + provider as free-form strings. Cost estimation works out of the box for OpenAI, Anthropic, Google, Mistral, and Cohere model families.
The Playground supports OpenAI, Gemini, and Anthropic. LLM-judge evaluators run on Gemini with an automatic OpenAI fallback.
Not yet. The Python SDK is the only first-party client today. For TypeScript projects, you can hit the REST API directly; see Authentication.
If TypeScript support is critical for you, please open an issue on decimal-labs/decimalai-python. Usage signals shape priority.
Workspace admins can delete a workspace from Settings → Workspaces. This purges all traces, manifests, datasets, and skills owned by that workspace.
For per-user data deletion (GDPR), the dedicated endpoint is on the roadmap. For now, email support@decimal.ai with the user IDs to purge and we’ll run it manually.
Production runs on AWS us-east-1. Enterprise customers can request eu-west-1 or other regions; see Security for the current list.
For 5xx errors or unexpected behavior, include the X-Request-ID response header — it lets us look up the failing request in logs.
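A client can capture that header whenever a call fails so it is ready to paste into a support request. The sketch below assumes only that response headers are available as a name-to-value mapping, which most Python HTTP clients provide; the helper names are hypothetical and not part of the SDK.

```python
from typing import Mapping, Optional


def extract_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Request-ID value, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-request-id":
            return value
    return None


def format_support_note(status_code: int, headers: Mapping[str, str]) -> str:
    """Build a one-line summary of a failed response for a support ticket."""
    request_id = extract_request_id(headers) or "<missing>"
    return f"HTTP {status_code}, X-Request-ID: {request_id}"


if __name__ == "__main__":
    # Example headers; the ID value here is made up.
    print(format_support_note(502, {"X-Request-ID": "req-123"}))
```

Logging this line for every 5xx response means the ID is already on hand when you contact support.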