Changelog - DecimalAI

This page tracks notable changes to the platform API, Python SDK, and decimal-labs/regression-check GitHub Action. Dates use ISO 8601. The platform follows a rolling release model: changes ship to api.decimal.ai continuously. The SDK and Action follow Semantic Versioning and are tagged on GitHub. For breaking-change notices, subscribe to the GitHub release feeds.

For what’s coming next, see the Roadmap.

2026-06-24

SDK 0.6.0 — experiments API removed

SDK (decimalai-python 0.6.0)

🗑️ Removed the experiments API. The agent/dataset experiment runner (experiment(), run_experiment(), compare_experiments(), the offline Eval() helper) and the matching client methods were backed by /api/v1/experiments, an endpoint that was never shipped and always returned 404. That endpoint is now formally retired. Use regression-check for pre-deploy A/B comparison (POST /api/v1/regression-check), the regression timeline for post-deploy comparison, and skill version compare (/api/v1/skills/analytics/compare) for skill diffs.

2026-06-18

Webhooks, billing, and reliability

Platform

✨ HMAC webhook signing — outbound webhooks now carry an X-Decimal-Signature header for verifiable delivery.
✨ Webhook retry + delivery log — failed deliveries retry with exponential backoff, with a per-webhook delivery log in the UI.
✨ Stripe billing end-to-end — checkout + customer portal wired through for self-serve plan upgrades.
✨ Anthropic in Playground — the Claude provider now sits alongside OpenAI and Gemini in the prompt-testing playground.
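
On the receiving side, the X-Decimal-Signature header can be checked before a payload is trusted. The sketch below is one plausible way to do that, assuming the header carries a hex-encoded HMAC-SHA256 of the raw request body keyed with the webhook's shared secret. This entry does not state the hash algorithm, the encoding, or exactly which bytes are signed, so treat all three as assumptions to confirm against the webhook documentation. The name verify_signature is illustrative, not part of the SDK.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Return True if header_value matches the HMAC of raw_body.

    Assumption: hex-encoded HMAC-SHA256 over the raw body. The changelog
    entry only confirms the header name, not the signing scheme.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = header_value.strip().lower()
    # compare_digest runs in constant time, so the comparison does not
    # reveal how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing the payload can change whitespace or key order and invalidate the signature.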

SDK (decimalai-python)

✨ atexit flush handler — buffered traces flush on script exit, so short-lived scripts no longer lose traces silently.
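
The underlying pattern is the standard library's atexit hook. The following is a minimal sketch of that pattern, not the SDK's actual implementation; TraceBuffer and its send callback are hypothetical stand-ins for the SDK's internal buffer and transport.

```python
import atexit
from typing import Callable


class TraceBuffer:
    """Hypothetical in-memory trace buffer that flushes on interpreter exit."""

    def __init__(self, send: Callable[[list[dict]], None]) -> None:
        self._send = send
        self._pending: list[dict] = []
        # Runs flush() during normal interpreter shutdown.
        atexit.register(self.flush)

    def add(self, trace: dict) -> None:
        self._pending.append(trace)

    def flush(self) -> None:
        if not self._pending:
            return
        # Swap the list out first so a second flush never resends a batch.
        batch, self._pending = self._pending, []
        self._send(batch)
```

Handlers registered with atexit do not run when the process exits through os._exit(), is killed by a signal Python does not handle, or hits a fatal interpreter error, so an explicit flush is still worthwhile in those environments.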

2026-06-08

Demo sandbox + SkillScore v2

SDK 0.4.0 (pip install decimalai, requires Python 3.10+)

✨ One-command demo sandbox — see both demos on seeded data in ~2 minutes, before instrumenting anything:
- decimalai demo regression: seeds a v1→v2 agent change + trace corpus, runs the regression check, links straight to the impact report.
- decimalai demo skills: seeds three skills with varied effectiveness, links to the ranked registry.
- decimalai demo reset: removes all [Demo]-prefixed data; your own agents and skills are never touched.
✨ decimalai init now surfaces the demo commands in its next-steps output.

Platform

✨ SkillScore v2 — the registry score is now a quality-only composite (0–100): live eval pass rate + AI-judge quality, gated on sample size. Popularity and maintenance no longer affect the score. Skills under 10 activations/30d are relegated below scored skills in the default sort instead of hidden.
✨ Leaderboard axes: Highest SkillScore (default) · Biggest Improvement (measured lift vs no-skill baseline) · Most Efficient (token savings) · Top live rating.
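
The default-sort behavior described for SkillScore v2 can be sketched as a two-part sort key. This is an illustration only: the Skill fields are hypothetical names, and ordering low-sample skills among themselves by score is an assumption, since the entry only says they sit below scored skills.

```python
from dataclasses import dataclass

MIN_ACTIVATIONS_30D = 10  # threshold stated in the SkillScore v2 entry


@dataclass
class Skill:
    slug: str
    skill_score: float      # 0-100 quality composite
    activations_30d: int


def default_sort(skills: list[Skill]) -> list[Skill]:
    """Scored skills first (highest SkillScore first), low-sample skills after."""

    def key(skill: Skill) -> tuple[int, float]:
        below_gate = skill.activations_30d < MIN_ACTIVATIONS_30D
        # Tuples sort element by element: gate flag first, then descending score.
        return (1 if below_gate else 0, -skill.skill_score)

    return sorted(skills, key=key)
```

With this key a skill scoring 95 on three activations still sorts after a skill scoring 60 on fifty, which matches "relegated below scored skills" rather than hidden.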

GitHub Action (decimal-labs/regression-check)

✨ Honest behavioral nudge — when a PR’s diff contains a model change and behavioral-check is off, the impact comment now shows how many recorded calls can be verified and how (behavioral-check: real or post-deploy bisect). No fabricated counts.
🔧 behavioral-check: mock no longer renders a meaningless equivalent/changed split (the mock stub always read ~100% changed); it now reports the eligible-call count and points at real.

2026-05-20

Skills wedge release

The skill registry layer that knows what works: registry, router, and observability shipped as one product.

Registry

✨ Per-model effectiveness on every registry skill — see the pass rate a skill gets on GPT-5 vs Claude Opus vs Gemini Flash, computed from production traces. “Best with” badge marks the highest-passing model.
✨ Real “Most Effective” sort ranks by SkillScore (with a minimum-activations gate so cold-start skills don’t dominate). New separate sort=popular for raw activation count.
✨ Activation sparkline on every public skill page — 30-day daily trend, server-rendered SVG, zero JS.
✨ Version diff viewer lets unauthenticated visitors compare any two published versions side-by-side.
✨ Popular forks surfaced on detail pages so consumers can find community-iterated variants.
✨ Integration snippets (Python SDK · pull · curl · agent-runtime paths) on every detail page with copy-to-clipboard.
✨ 25 new flagship official skills authored — code review, API design, data/SQL, prompt engineering, agent design, ops, docs, security. All Apache-2.0.
🔧 Default browse view hides bulk-imported skills with under 10 activations so the registry feels curated. Use the Imported tab or search to see all 3,000+ imports.

Router (new docs)

✨ The SkillRouter is now a first-class product surface with its own page in the API reference. Documents the three strategies (full menu / smart route / on-demand body), response shape, telemetry, policy controls, and smart-routing internals.

Observability

✨ Weekly skill degradation digest — opt-in email per workspace when a skill’s pass rate drops ≥15% week-over-week with at least 20 baseline activations. Thresholds tunable via env vars.
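
The digest trigger can be read as a simple predicate. The sketch below assumes the 15% figure is a relative drop against the previous week's pass rate; the entry does not say whether it is relative or in absolute percentage points. The function and constant names are hypothetical, and the real thresholds are configured through env vars not named here.

```python
DROP_THRESHOLD = 0.15           # "≥15%" from the entry above
MIN_BASELINE_ACTIVATIONS = 20   # "at least 20 baseline activations"


def is_degraded(baseline_pass_rate: float, current_pass_rate: float,
                baseline_activations: int) -> bool:
    """Return True when a skill would qualify for the weekly digest.

    Assumption: the drop is measured relative to the baseline pass rate.
    """
    if baseline_activations < MIN_BASELINE_ACTIVATIONS or baseline_pass_rate <= 0:
        # Too little baseline data (or nothing to drop from): never alert.
        return False
    drop = (baseline_pass_rate - current_pass_rate) / baseline_pass_rate
    return drop >= DROP_THRESHOLD
```

The activation gate keeps a skill with a handful of calls from triggering an email on ordinary week-to-week noise.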

Share & embed

✨ /skills/<slug> is the new canonical public URL for a registry skill. Legacy /skills/<slug> continues to work; both share the same OG image.
✨ OpenGraph cards dynamically rendered per skill — name, SkillScore, per-model row, activation count. Twitter, LinkedIn, and Slack unfurls show the effectiveness data on every share.
✨ Embed widget at /embed/skills/<slug> — drop a 380×180px iframe into a README or blog post showing live effectiveness. Light + dark theme via ?theme=.

CLI

✨ decimalai skills pull <slug> — pull any public registry skill to disk with no signup. Writes ./<slug>/SKILL.md. Read-only (no fork, no telemetry); signup is only required to install + activate tracking.

New public registry endpoints

GET /api/v1/registry/skills/{id}/activations — daily activation series for the sparkline.
GET /api/v1/registry/skills/{id}/versions/{version_number} — body markdown for any published version (powers the public diff viewer).
GET /api/v1/registry/skills/{id}/lineage already existed; now surfaced on the public detail page as “Popular community forks”.
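
A short sketch of calling these endpoints from Python with only the standard library. The host comes from the top of this page; the https scheme, unauthenticated access, and a JSON response body are assumptions, and the response shape is not documented in this entry. The helper names are illustrative.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://api.decimal.ai"  # host named on this page; scheme assumed


def activations_url(skill_id: str) -> str:
    """URL for the daily activation series of one registry skill."""
    return f"{BASE_URL}/api/v1/registry/skills/{quote(skill_id, safe='')}/activations"


def version_url(skill_id: str, version_number: int) -> str:
    """URL for the body of one published version."""
    return (f"{BASE_URL}/api/v1/registry/skills/{quote(skill_id, safe='')}"
            f"/versions/{int(version_number)}")


def fetch_json(url: str, timeout: float = 10.0):
    """GET a public endpoint and decode the body as JSON (assumed format)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, fetch_json(activations_url(skill_id)) would return the decoded series if the endpoint responds with JSON as assumed.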

2026-05

Phase 2 release

Platform

✨ Skills lifecycle is generally available: create, version, fork, subscribe, publish to registry, analytics.
✨ Public skills registry (/skills) with SkillScore effectiveness ranking (Quality / Popularity / Maintenance).
✨ Prompt Testing playground promoted from internal tool to first-class feature (/playground), with BYOK support for OpenAI and Gemini.
✨ Multi-agent topology graph + per-sub-agent compatibility dashboard.
✨ Workspace CRUD + RBAC role model in place (enforcement landing in next release; see rollout plan D-4).
🔧 Manifest registration is idempotent by hash — repeated POST /manifests returns existing IDs.
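
Idempotency by hash generally means the server derives a key from the manifest content and returns the existing record when that key is already known. The sketch below illustrates the idea with a hypothetical in-memory ManifestStore; the canonical-JSON step, the SHA-256 choice, and the mf_ ID format are all assumptions, not the platform's documented behavior.

```python
import hashlib
import json


class ManifestStore:
    """Hypothetical in-memory store illustrating idempotent registration."""

    def __init__(self) -> None:
        self._id_by_hash: dict[str, str] = {}

    def register(self, manifest: dict) -> tuple[str, bool]:
        """Return (manifest_id, created); identical content yields the same ID."""
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        if digest in self._id_by_hash:
            return self._id_by_hash[digest], False
        manifest_id = f"mf_{digest[:12]}"  # illustrative ID format
        self._id_by_hash[digest] = manifest_id
        return manifest_id, True
```

This is what makes it safe for CI jobs to re-post the same manifest on every run without creating duplicates.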

SDK (decimalai-python)

✨ decimalai.init(langchain=True | openai_agents=True | llamaindex=True | crewai=True | autogen=True | otel=True) covers 6+ frameworks.
✨ Skill auto-discovery from .claude/skills/, .agents/skills/.
✨ Bidirectional skill sync (POST /skills/sync + SkillRouter.pull_missing()).
✨ @decimalai.trace() decorator for any Python function.

GitHub Action (decimal-labs/regression-check)

✨ Initial release. Computes structural diff between PR manifest and production manifest; posts impact report as a PR comment.
✨ manifest_only SDK mode for CI: runs manifest extraction without invoking the agent.
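
As a rough illustration of what a structural diff between two manifests involves, the sketch below compares top-level keys of two dictionaries. The manifest shape is hypothetical and the Action's real diff is likely richer (nested fields, tool schemas); this only shows the added / removed / changed classification.

```python
def structural_diff(production: dict, pr: dict) -> dict[str, list[str]]:
    """Classify top-level manifest keys as added, removed, or changed."""
    prod_keys, pr_keys = set(production), set(pr)
    return {
        "added": sorted(pr_keys - prod_keys),
        "removed": sorted(prod_keys - pr_keys),
        # Present on both sides but with a different value.
        "changed": sorted(k for k in prod_keys & pr_keys if production[k] != pr[k]),
    }
```

Because the comparison needs only the two manifests, it fits the manifest_only mode: nothing here requires invoking the agent.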

2026-04

Phase 1 release

Platform

✨ Hero workflow: manifest change → batch compatibility re-score → Impact Report banner → Auto-Repair + Build Dataset stepper → JSONL export.
✨ Training Data Health dashboard at / (health ring, category bars).
✨ Drift detection toast + sidebar compat badges.

SDK

✨ First public version. Manifest capture, trace ingest, framework adapters.