YAML-first agent specs: run with `oa run` or generate a full Python project with `oa init`.
Open Agent Spec (OA)
Define AI agents as contracts, not scattered prompts.
Open Agent Spec lets you define an agent once in YAML, validate inputs and outputs against a schema, and either run it directly with oa run or generate a Python scaffold with oa init.
Why This Exists
Most agent systems are hard to reason about:
- outputs are not strictly typed
- behaviour is buried in prompts
- logic is split across Python, Markdown, and framework abstractions
- swapping models often breaks things in subtle ways
The Idea
Open Agent Spec treats an agent like infrastructure.
Think OpenAPI or Terraform, but for AI agents.
You define:
- input schema
- output schema
- prompts
- model configuration
Then OA enforces the boundary:
input -> LLM -> validated output
If the output does not match schema, the task fails fast with a validation error.
For example, this shape mismatch can silently break downstream systems:
{"msg":"hello"}
instead of:
{"response":"hello"}
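The fail-fast behaviour can be sketched with a minimal required-fields check (illustrative only, not OA internals; OA validates against the full JSON Schema):

```python
# Minimal sketch (not OA's actual validator): enforce that an output
# object carries the fields the task schema marks as required.
def check_required(output: dict, required: list[str]) -> list[str]:
    """Return the list of required fields missing from the output."""
    return [field for field in required if field not in output]

assert check_required({"response": "hello"}, ["response"]) == []
assert check_required({"msg": "hello"}, ["response"]) == ["response"]
```

A non-empty result means the task raises a validation error instead of handing the malformed object downstream.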
Super Quick Start
Install (Python 3.10+):
```shell
pipx install open-agent-spec
oa init aac
oa validate aac
export OPENAI_API_KEY=your_key_here
oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet
```
With OA you can:
- define tasks, prompts, model config, and expected I/O in YAML
- run a spec directly without generating code first
- keep `.agents/*.yaml` in your repo and call them from CI
- generate a Python project scaffold when you want to customize the implementation
First Run
Shortest path from install to a working agent:
1. Create the agents-as-code layout (aac = repo-native .agents/ directory):
oa init aac
This creates:
```
.agents/
├── example.yaml   # minimal hello-world spec
├── review.yaml    # code-review agent that accepts a diff file
├── change.diff    # sample diff for immediate review-agent testing
└── README.md      # quick usage notes
```
2. Validate the generated specs:
oa validate aac
3. Set an API key for the engine in your spec (OpenAI by default):
export OPENAI_API_KEY=your_key_here
4. Run the example agent:
oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet
--quiet prints only the task output JSON, which makes it easy to pipe to jq or use in scripts:
```json
{
  "response": "Hello Alice!"
}
```
Omit --quiet for the full execution envelope with Rich formatting.
5. Run the review agent with the bundled sample diff:
oa run --spec .agents/review.yaml --task review --input .agents/change.diff --quiet
Or review your own change:
```shell
git diff > change.diff
oa run --spec .agents/review.yaml --task review --input change.diff --quiet
```
Write Your Own Spec
Start from this shape:
```yaml
open_agent_spec: "1.5.0"

agent:
  name: hello-world-agent
  role: chat

intelligence:
  type: llm
  engine: openai
  model: gpt-4o

tasks:
  greet:
    description: Say hello to someone
    input:
      type: object
      properties:
        name:
          type: string
      required: [name]
    output:
      type: object
      properties:
        response:
          type: string
      required: [response]

prompts:
  system: >
    You greet people by name.
  user: "{{ name }}"
```
Validate first, then run:
```shell
oa validate --spec agent.yaml
oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet
```
Features
Multi-task pipelines with depends_on
Chain tasks declaratively. OA merges upstream outputs into downstream inputs automatically — no glue code required.
```yaml
tasks:
  extract:
    description: Pull key facts from raw text.
    # ... input / output / prompts
  summarise:
    description: Summarise the extracted facts.
    depends_on: [extract]  # extract's output is merged into summarise's input
    # ... prompts
```
depends_on is a data contract, not execution control. OA has no branching, loops, or conditionals by design. See examples/multi-task/.
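The merge semantics can be sketched as a plain dictionary merge (assumed semantics for illustration, not OA source; the precedence on key clashes is an assumption):

```python
# Sketch of the data contract depends_on implies: each upstream task's
# validated output is shallow-merged into the downstream task's input
# before the prompt template is rendered.
def build_task_input(caller_input: dict, upstream_outputs: list[dict]) -> dict:
    merged: dict = {}
    for output in upstream_outputs:  # e.g. extract's validated output
        merged.update(output)
    merged.update(caller_input)      # caller-supplied input wins on clashes (assumption)
    return merged

extract_output = {"facts": ["OA is YAML-first"]}
summarise_input = build_task_input({"style": "brief"}, [extract_output])
assert summarise_input == {"facts": ["OA is YAML-first"], "style": "brief"}
```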
Tools — native, MCP, and custom
Let the model call tools declared in the spec. Three backends, zero SDK dependencies.
```yaml
tools:
  reader:
    type: native
    native: file.read  # built-in: file.read/write, http.get/post, env.read
  search:
    type: mcp
    endpoint: http://localhost:3000  # any MCP server (JSON-RPC 2.0 over HTTP)
  classifier:
    type: custom
    module: my_pkg.tools:ClassifierTool  # your own Python class

tasks:
  analyse:
    tools: [reader, search, classifier]
    # ...
```
See examples/file-reader/ and examples/mcp-search/.
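For the MCP backend, the wire format is a JSON-RPC 2.0 envelope over HTTP. A minimal sketch of building one (the `tools/call` method name follows MCP; the `params` shape here is an assumption for illustration):

```python
import json

def jsonrpc_request(method: str, params: dict, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body as defined by the spec:
    a "jsonrpc": "2.0" envelope with id, method, and params."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params,
    })

payload = json.loads(jsonrpc_request("tools/call", {"name": "search"}))
assert payload["jsonrpc"] == "2.0" and payload["method"] == "tools/call"
```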
Spec composition — delegate tasks to other specs
A task can hand off its implementation to another spec entirely. Great for building shared specialist agents that many pipelines reuse.
```yaml
tasks:
  sentiment_of_summary:
    description: Delegate to the shared sentiment specialist.
    spec: ./shared/sentiment.yaml  # local path or oa:// registry URL
    task: analyse_sentiment
    depends_on: [summarise]  # upstream outputs merged in automatically
```
See examples/spec-composition/.
Spec Registry — share specs via oa://
Publish and consume specs from the hosted registry at openagentspec.dev/registry/. Reference them with the oa:// shorthand — the runner resolves and fetches them automatically.
```yaml
tasks:
  review:
    spec: oa://prime-vector/code-reviewer  # resolves to latest hosted spec
    task: review
```
Browse the registry at openagentspec.dev/registry. Available specs: summariser, classifier, sentiment, code-reviewer, keyword-extractor, memory-retriever.
History threading — stateless multi-turn chat
Pass prior conversation turns as a history input field. OA injects them into the LLM message list between system and user turns. OA never stores history — your application manages the list.
```yaml
tasks:
  chat:
    input:
      type: object
      properties:
        message: {type: string}
        history:
          type: array
          description: Prior turns injected by the caller. OA never writes to this field.
```
```shell
oa run --spec spec.yaml --task chat \
  --input '{"message":"What did I just say?","history":[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi there!"}]}'
```
See examples/chat-agent/.
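The injection order described above can be sketched as follows (assumed message shape, matching the Chat Completions role/content convention; not OA source):

```python
# Stateless history threading: system turn first, then the
# caller-supplied history, then the current user turn.
def build_messages(system: str, history: list[dict], user: str) -> list[dict]:
    return (
        [{"role": "system", "content": system}]
        + history                               # caller-managed prior turns
        + [{"role": "user", "content": user}]
    )

msgs = build_messages(
    "You answer briefly.",
    [{"role": "user", "content": "Hello"},
     {"role": "assistant", "content": "Hi there!"}],
    "What did I just say?",
)
assert [m["role"] for m in msgs] == ["system", "user", "assistant", "user"]
```

Since the runner never stores this list, your application appends each new turn and passes the whole list on the next call.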
Memory retriever — LLM re-ranker for long-term memory
Your application fetches candidate turns from an external store. The memory-retriever registry spec uses an LLM to select the most relevant ones and returns them as a history array ready to inject into any chat task.
```yaml
tasks:
  recall:
    spec: oa://prime-vector/memory-retriever
    task: retrieve  # input: query + candidates → output: history + memory_count
  respond:
    depends_on: [recall]
    spec: ./chat-agent/spec.yaml
    task: chat
```
Immutable Inference Sandboxing (IIS)
Declare hard execution constraints in the spec. The runner enforces them before any tool call reaches the I/O layer — no network connection opened, no file handle created, no exception to catch.
```yaml
sandbox:
  tools:
    allow: [file.read, http.get]  # SANDBOX_TOOL_VIOLATION if anything else is called
  http:
    allow_domains: [api.example.com]  # SANDBOX_DOMAIN_VIOLATION for other hosts
  file:
    allow_paths: [./data/]  # SANDBOX_PATH_VIOLATION for paths outside this prefix

tasks:
  restricted:
    sandbox:  # per-task override tightens the root sandbox
      tools:
        allow: [file.read]
```
See examples/sandboxed-agent/.
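The `allow_paths` rule can be illustrated with a small prefix check (an illustrative sketch with a hypothetical `path_allowed` helper, not OA's enforcement code):

```python
from pathlib import Path

def path_allowed(candidate: str, allow_paths: list[str]) -> bool:
    """True only if the candidate resolves under an allowed prefix,
    so ../ traversal cannot escape the sandboxed directory."""
    resolved = Path(candidate).resolve()
    return any(
        resolved.is_relative_to(Path(prefix).resolve())
        for prefix in allow_paths
    )

assert path_allowed("./data/report.txt", ["./data/"])
assert not path_allowed("/etc/passwd", ["./data/"])
assert not path_allowed("./data/../secrets.txt", ["./data/"])  # traversal blocked
```

The key point mirrors the spec semantics: the check runs on the resolved path before any file handle is created.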
Behavioural contracts
Declare what the model output must contain. The behavioural-contracts library enforces the contract after parsing, before the result is returned.
```yaml
behavioural_contract:
  version: "1.0"
  response_contract:
    output_format:
      required_fields: [confidence]  # CONTRACT_VIOLATION if missing

tasks:
  classify:
    behavioural_contract:
      response_contract:
        output_format:
          required_fields: [label]  # effective required_fields: [confidence, label]
```
Install: pip install 'open-agent-spec[contracts]'
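The layering shown above can be sketched as a list merge (assumed merge rule for illustration, not the behavioural-contracts library): task-level `required_fields` extend the root contract rather than replacing it.

```python
# Task-level required fields are appended to the root contract's,
# deduplicated, preserving order.
def effective_required(root: list[str], task: list[str]) -> list[str]:
    return root + [field for field in task if field not in root]

assert effective_required(["confidence"], ["label"]) == ["confidence", "label"]
```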
Multiple engines
Switch models by changing one line. All engines except Anthropic and Codex speak the OpenAI Chat Completions API over raw HTTP — no SDK required.
```yaml
intelligence:
  type: llm
  engine: openai  # openai | anthropic | grok | xai | cortex | local | codex | custom
  model: gpt-4o-mini
```
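The raw-HTTP claim can be sketched as a request builder (a hypothetical helper, not OA code; the path and payload shape follow the OpenAI Chat Completions API):

```python
import json

def build_chat_request(model: str, messages: list[dict]) -> tuple[str, bytes, dict]:
    """Assemble the pieces of a Chat Completions call: URL path, JSON
    body, and headers (an Authorization: Bearer header is added at send time)."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    headers = {"Content-Type": "application/json"}
    return "/v1/chat/completions", body, headers

path, body, headers = build_chat_request(
    "gpt-4o-mini", [{"role": "user", "content": "hi"}]
)
assert path == "/v1/chat/completions"
assert json.loads(body)["model"] == "gpt-4o-mini"
```

Because OpenAI-compatible engines accept the same path and body, switching engines changes only the base URL, key, and model name.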
npm / Node.js CLI
Run OA specs from Node.js without Python.
```shell
npm install -g @prime-vector/open-agent-spec
oa-run --spec agent.yaml --task greet --input '{"name":"Alice"}'
```
Supports OpenAI and Anthropic, depends_on chains, and history threading.
Generate a Python Scaffold
If you want editable generated code instead of running the YAML directly:
oa init --spec agent.yaml --output ./agent
Generated structure:
```
agent/
├── agent.py
├── models.py
├── prompts/
├── requirements.txt
├── .env.example
└── README.md
```
Core Idea
Most agent projects end up hand-rolling the same pieces:
- prompt templates
- model configuration
- task definitions
- routing glue
- runtime wrappers
OA moves those concerns into a declarative spec so they can be reviewed, versioned, and reused.
The intended model is:
- spec defines the agent contract
- oa run executes the spec directly
- oa init generates a starting implementation when you need code
- external systems can orchestrate multiple specs however they want
OA deliberately does not prescribe:
- orchestration
- evaluation
- governance
- long-running runtime architecture
Common Commands
| Command | Purpose |
|---|---|
| `oa init aac` | Create `.agents/` with starter specs |
| `oa validate aac` | Validate all specs in `.agents/` |
| `oa validate --spec agent.yaml` | Validate one spec |
| `oa test agent.test.yaml` | Run YAML eval cases (model + assertions on task output); `--quiet` for CI JSON |
| `oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet` | Run one task directly from YAML |
| `oa init --spec agent.yaml --output ./agent` | Generate a Python scaffold |
| `oa update --spec agent.yaml --output ./agent` | Regenerate an existing scaffold |
Specification
The formal specification defines what a conforming OA runtime must do, independent of any specific implementation.
| Resource | Contents |
|---|---|
| spec/open-agent-spec-1.5.md | Formal specification — normative MUST/SHOULD/MAY requirements for OA 1.5.0 |
| spec/schema/oas-schema-1.5.json | Canonical JSON Schema for validating spec documents |
| spec/conformance/README.md | Conformance test structure and contribution guide |
An independent implementor can build a conforming runtime from spec/open-agent-spec-1.5.md alone.
More Detail
| Resource | Contents |
|---|---|
| openagentspec.dev | Project website |
| docs/REFERENCE.md | Spec structure, engines, templates, .agents/ usage |
| examples/multi-agent | Multi-agent orchestration example — manager, workers, task board, dashboard |
| Repository | Source, issues, workflows |
Notes
- The CLI command is `oa` (not `oas`).
- Python 3.10+ is required.
- `oa run` requires the relevant provider API key for the engine in your spec.
About
- Open Agent Spec (OA) was conceived by Andrew Whitehouse in late 2024, out of a desire to bring structure and standardisation to early agent systems.
- In early 2025, Prime Vector was formed and took over the public-facing project.
License
MIT | see LICENSE.