| Term | Definition | Learn More |
|---|---|---|
| Agent | The system being versioned. Its configuration is captured as a manifest; traces, manifests, and datasets are all scoped to an agent. | Execution Model |
| agentversion | The open spec (and PyPI package) for the manifest, diff, and compatibility-decision format DecimalAI builds on. | Versioning & Compatibility |
| Baseline | The manifest a regression check diffs against — typically the last manifest seen in production. | Regression Check |
| Compat Status | Convenience field on a trace: keep, repair, replay, or drop. The legacy aliases compatible and incompatible are still accepted and fold to keep and drop, respectively. The authoritative source is the TraceCompat table. | Versioning & Compatibility |
| Compatibility Report | Analysis generated when a new manifest is registered, classifying every existing trace as keep/repair/replay/drop. | Versioning & Compatibility |
| Compatibility Score | 0.0–1.0 metric measuring how well a trace matches a manifest version. Category = compatibility on the EvalScore record. | Evaluation |
| Compatibility Verdict | Per-trace outcome: keep, repair, replay, or drop. | Versioning & Compatibility |
| Component | A single versioned piece of a manifest: tool, model, prompt, skill, subagent, or output_schema. | Versioning & Compatibility |
| Component Verdict | Per-component outcome when diffing old vs new manifests: COMPATIBLE, REPAIRABLE, INCOMPATIBLE, or MISSING. | Versioning & Compatibility |
| Content Hash | Per-component SHA-256 fingerprint. Changes when the component’s definition changes. | Versioning & Compatibility |
| Dataset | Curated training data built from filtered, scored production traces. | Skills & Data Pipeline |
| Decision Engine | System that combines quality + compatibility scores into a single keep/repair/replay/drop verdict per trace. | Evaluation |
| Degraded | Parent trace status when some (not all) child traces errored. | Multi-Agent Systems |
| Delegation | Orchestrator → sub-agent task assignment. Control returns to orchestrator after sub-agent completes. | Multi-Agent Systems |
| Detection Source | How a manifest was created: auto (from traces) or manual (via SDK/API). | Versioning & Compatibility |
| DPO | Direct Preference Optimization. Dataset format with chosen/rejected pairs for preference training. | Skills & Data Pipeline |
| Drift | When a sub-agent’s actual config diverges from what the orchestrator’s manifest expects. | Multi-Agent Systems |
| Episode | Compatibility engine synonym for “trace” — the same RunTrace record, just referred to in a compat context. | Execution Model |
| Eval Score | Single evaluation result: name, score (0.0–1.0), passed (bool), source, category. | Evaluation |
| Eval Verdict | Aggregate trace-level outcome: pass, fail, or review. Computed from all eval scores. | Evaluation |
| Evaluator | A configured quality check — deterministic, LLM-as-judge, or custom. | Evaluation |
| Handoff | Lateral transfer of control between peer agents. Unlike delegation, the original agent may not regain control. | Multi-Agent Systems |
| Impact Report | The per-PR output of a regression check: each production trace is marked HIGH, MEDIUM, or LOW impact according to the manifest diff. It covers the structural axis and is distinct from the Compatibility Report, which covers the data-lifecycle side. | Regression Check |
| Impact Severity | HIGH / MEDIUM / LOW: how strongly a manifest change structurally affects a trace (“was this trace affected?”). Orthogonal to the keep/repair/replay/drop Compatibility Verdict, and not the same scale as component Severity (none/minor/moderate/major). | Compatibility Policies |
| LLM Call | Single model invocation — rendered prompt, completion, tokens, latency, cost, tool calls. | Execution Model |
| Manifest | Snapshot of an agent’s full configuration (tools, models, prompts, skills, sub-agents, output schema) at a point in time. | Versioning & Compatibility |
| Manifest Hash | SHA-256 fingerprint of manifest structure. Registering the same hash again for the same agent is idempotent. | Versioning & Compatibility |
| Manifest Status | Lifecycle state: active, superseded, or draft. | Versioning & Compatibility |
| Orchestrator | An agent that delegates to sub-agents. Inferred from manifest components or trace linkage — not explicitly declared. | Multi-Agent Systems |
| Parent Trace | An orchestrator’s trace record that child traces link back to via parent_trace_id. | Multi-Agent Systems |
| Quality Score | Eval score measuring output quality (relevance, helpfulness, safety). Category = quality on EvalScore. | Evaluation |
| Regression Check | The pre-deploy GitHub Action that diffs a candidate manifest against the baseline and posts an Impact Report comment on the PR. | Regression Check |
| Repair | Deterministic fix of a trace to match a new manifest. Zero LLM cost. | Skills & Data Pipeline |
| Replay | Re-running a historical trace against the current agent to compare outputs. | Skills & Data Pipeline |
| Revert | Occurs when a previously seen manifest hash reappears: the old manifest is reactivated and the current one is marked superseded. | Versioning & Compatibility |
| Session | Group of traces forming a multi-turn conversation. Linked by session_id. | Execution Model |
| Severity | How impactful a component change is: none, minor, moderate, or major. | Versioning & Compatibility |
| SFT | Supervised Fine-Tuning. Dataset format with input→output pairs for imitation learning. | Skills & Data Pipeline |
| Skill | Reusable instruction file (SKILL.md) that modifies agent behavior. Not a tool. | Skills & Data Pipeline |
| Skill Activation | Record of which skills were loaded during a trace. | Skills & Data Pipeline |
| skillevaluation | The open spec (and PyPI package) for A/B benchmarking a skill — runs each test case with and without the skill, then reports the measured lift. | skillevaluation |
| SkillScore | The 0–100 quality composite that ranks skills in the registry — from benchmark lift, live eval pass rates, and AI-judge quality, not install counts. | SkillScore |
| Source Type | Where a trace came from: production, playground, test, or replay. | Execution Model |
| Span | Timed segment within a trace (llm, tool, retriever, other). Nests via parent_span_id. | Execution Model |
| Sub-agent | Agent receiving delegated work. Identified by parent_trace_id on its trace. | Multi-Agent Systems |
| Surface | Policy grouping for compatibility rules: tool_registry, model_runtime, prompt_stack, skill_registry, subagents, output_contract. | Versioning & Compatibility |
| Trace | A single, complete agent execution from input to output. The atomic unit of the platform. | Execution Model |
| Trajectory | ML/RL term for a full (state, action, reward) sequence. DecimalAI traces are similar but use eval scores instead of explicit rewards. | Execution Model |
| Turn | A single interaction within a session. Each turn produces one trace. | Execution Model |
| Version Label | Human-readable manifest identifier (v1, v2, v3). Auto-incremented — not semantic versioning. | Versioning & Compatibility |
Glossary
Quick-reference definitions for every term used in DecimalAI.
Alphabetical one-line definitions. For full explanations with diagrams, follow the Learn More links to the relevant concept page.
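Two of the definitions in the table are mechanical enough to sketch in code: the folding of legacy Compat Status aliases, and the per-component Content Hash. The Python below is a minimal illustration, not DecimalAI's implementation. The function names `normalize_compat_status` and `content_hash` are hypothetical, and the sorted-key JSON canonicalization is an assumption, since the glossary says only that the hash is SHA-256 and changes when the component's definition changes.

```python
import hashlib
import json

# Canonical Compat Status values named in the glossary.
CANONICAL_STATUSES = {"keep", "repair", "replay", "drop"}

# Legacy aliases and the canonical value each one folds to.
LEGACY_ALIASES = {"compatible": "keep", "incompatible": "drop"}


def normalize_compat_status(raw: str) -> str:
    """Fold a status string to keep/repair/replay/drop (hypothetical helper)."""
    value = LEGACY_ALIASES.get(raw, raw)
    if value not in CANONICAL_STATUSES:
        raise ValueError(f"unknown compat status: {raw!r}")
    return value


def content_hash(component: dict) -> str:
    """SHA-256 over a canonical JSON encoding of a component definition.

    Assumption: sorted-key, compact JSON stands in for whatever
    canonicalization the real spec uses.
    """
    canonical = json.dumps(component, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative component definitions (made-up field names).
tool_v1 = {"type": "tool", "name": "search", "parameters": {"query": "string"}}
tool_v2 = {"type": "tool", "name": "search",
           "parameters": {"query": "string", "limit": "integer"}}

print(normalize_compat_status("compatible"))          # legacy alias folds to keep
print(content_hash(tool_v1) == content_hash(tool_v2))  # definitions differ
```

Because the keys are sorted before hashing, two dictionaries with the same content in a different key order produce the same fingerprint, while any change to the definition produces a different one.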