Agent Brain
How the OpenTool runtime stack works: > tool_os as the workspace, Autopilot as the governed runtime, and Arbiter as the admin brain.
The human-agent contract
The human should not need to babysit the runtime. The human gives the task, scope, and authority. The runtime carries the work inside policy and keeps the work visible.
Hierarchy: OpenTool defines the platform and policy model. > tool_os is the workspace where that model becomes visible. Autopilot handles bounded autonomous work. Arbiter watches and routes at admin scope.
- Human: define the task, define the authority, and watch the runtime.
- Runtime: act only through declared tools, workflows, and guardrails.
- Trust: comes from visible state, visible policy decisions, and visible exceptions.
- Arbiter: the standing admin runtime that watches the system and routes work into bounded lanes.
The agentic loop
Every agent interaction in > tool_os follows a loop. The agent receives your message, reasons about what to do, requests tool calls, and the system evaluates those requests before executing them. This loop repeats until the task is complete.
The critical difference from other agent systems: step 4, the policy evaluation, always happens. Every tool call passes through the policy engine before execution. There is no bypass. The agent cannot skip this step, and neither can the developer. This is Law 1.
Law 1: Every tool invocation passes through the policy engine. No exceptions. No bypass. No admin override that skips evaluation. The policy engine is gravity — it applies to everything.
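The loop can be sketched in a few lines. This is a minimal illustration, not the actual runtime: `run_loop`, `ToolCall`, `scripted_reason`, and the tool names are hypothetical, and the sketch treats every outcome other than ALLOW as "not executed" for brevity. What matters is that `evaluate` is called on every request, with no code path that reaches `execute` without it.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    params: dict


def run_loop(message, reason, evaluate, execute, max_turns=10):
    """Run the agentic loop until reason() returns None (task complete)."""
    history = [("user", message)]              # step 1: receive the message
    for _ in range(max_turns):
        call = reason(history)                 # steps 2-3: reason, request a tool call
        if call is None:
            break
        decision = evaluate(call)              # step 4: policy evaluation, always runs
        if decision == "ALLOW":
            result = execute(call)             # only reached after evaluation
        else:
            result = f"{decision}: {call.name} was not executed"
        history.append((call.name, result))
    return history


def scripted_reason(history):
    """Hypothetical stand-in for the model: one read, one delete, then done."""
    plan = [
        ToolCall("read_file", {"path": "README.md"}),
        ToolCall("delete_db", {"name": "prod"}),
    ]
    turns = len(history) - 1
    return plan[turns] if turns < len(plan) else None
```

Calling `run_loop("clean up", scripted_reason, evaluate, execute)` with an `evaluate` that blocks `delete_db` leaves the read in the history with its result and the delete with a BLOCK message.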
What tools the agent has
The agent's tool set is determined by the intersection of what the manifest declares and what the entity's permissions allow. Available tools include:
Additional tools come from installed plugins and MCP servers. Each one ships with an opentool.json manifest that declares its capabilities and required permissions.
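As a rough sketch of the intersection rule, assume each manifest entry maps a tool name to the permissions it requires, and each entity holds a set of granted permissions. The function name, the tool names, and the `fs:*` permission strings are illustrative assumptions; the real opentool.json schema is not shown here.

```python
def available_tools(manifest_tools: dict, granted: set) -> set:
    """Return the tools whose required permissions are all granted.

    manifest_tools maps tool name -> set of required permissions
    (a hypothetical shape, not the real manifest schema).
    """
    return {
        name
        for name, required in manifest_tools.items()
        if required <= granted  # subset check: every required permission is held
    }


# Hypothetical manifest contents and entity permissions.
declared = {
    "read_file": {"fs:read"},
    "write_file": {"fs:read", "fs:write"},
    "drop_database": {"infra:destructive"},
}
entity_permissions = {"fs:read"}
visible = available_tools(declared, entity_permissions)
```

A tool that is declared but not permitted never appears in the agent's tool set, and a permission with no declaring manifest grants nothing.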
How tool calls work
When the agent decides to use a tool, it emits a structured tool_use event with the tool name and input parameters. This is not free-form text — it is a typed request that the system can parse, evaluate, and audit.
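To illustrate why a typed request matters, here is a small parser sketch. The JSON field names (`type`, `name`, `input`) and the `ToolUse` class are assumptions made for the example, not the documented wire format. Free-form text fails to parse, so nothing that is not a well-formed request reaches the policy engine.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolUse:
    name: str
    input: dict


def parse_tool_use(raw: str) -> ToolUse:
    """Parse a tool_use event; raise ValueError for anything malformed."""
    event = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(event, dict) or event.get("type") != "tool_use":
        raise ValueError("not a tool_use event")
    name, params = event.get("name"), event.get("input")
    if not isinstance(name, str) or not isinstance(params, dict):
        raise ValueError("tool_use needs a string name and an object input")
    return ToolUse(name=name, input=params)
```

Because the result is a plain structured value, the same object can be evaluated by policy and written to the audit trail without reinterpreting text.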
The policy engine
Every tool call is intercepted by the PolicyEngine before execution. The policy engine evaluates the request against the current entity's role, the tool's manifest, and any org-wide guardrails. It returns one of four outcomes:
- ALLOW: The tool call is permitted. Execution proceeds. The most common outcome for well-configured entities.
- BLOCK: The tool call is denied. The agent receives an error message explaining why. Example: an agent trying to delete a production database without the infra:destructive permission.
- QUARANTINE: The tool call is held for human review. It appears in the Quarantine app surface where an admin can approve or reject it. Used for high-risk operations that need a human in the loop.
- REDIRECT: The tool call is rerouted. The policy engine substitutes a safer alternative, for example redirecting a write to a sandbox environment instead of production.
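The four outcomes can be modeled as a small decision function. This is a sketch under stated assumptions: the `guardrails` dictionary shape, the rule ordering, and the tool names are hypothetical, and the real PolicyEngine evaluates roles and manifests in ways not shown here. Only the outcome names and the `infra:destructive` example come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    QUARANTINE = "QUARANTINE"
    REDIRECT = "REDIRECT"


@dataclass
class Decision:
    outcome: Outcome
    reason: str = ""
    redirect_target: Optional[str] = None


def evaluate(tool: str, required: set, granted: set, guardrails: dict) -> Decision:
    """Evaluate one tool call against permissions and org-wide guardrails."""
    missing = required - granted
    if missing:  # manifest requires a permission the entity lacks
        return Decision(Outcome.BLOCK, "missing permission: " + ", ".join(sorted(missing)))
    if tool in guardrails.get("quarantine", set()):  # high-risk: hold for a human
        return Decision(Outcome.QUARANTINE, "held for human review")
    target = guardrails.get("redirect", {}).get(tool)
    if target:  # substitute a safer alternative
        return Decision(Outcome.REDIRECT, "rerouted to a safer alternative", target)
    return Decision(Outcome.ALLOW)
```

Returning a `Decision` with a reason, not a bare boolean, is what lets a blocked agent receive an explanation and lets the audit trail record why each call was handled the way it was.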
Law 1: Policy engine is gravity
The governance model is built on three laws. The first law is the foundation:
Law 1 — Gravity: The policy engine is physics, not policy. It applies to every entity, every tool call, every time. There is no admin escape hatch. There is no "trusted" mode that skips evaluation. If you can bypass the policy engine, the entire governance model collapses.
This is an intentional design constraint. Many governance tools offer admin overrides or "break glass" mechanisms. We don't. If a tool call is blocked, you change the policy — you don't skip the evaluation. The audit trail is complete because there are no exceptions.
The circuit breaker
The circuit breaker exists because agents can enter loops, misinterpret instructions, or attempt operations at a scale the user didn't intend. The stop button is always available, always works, and always takes precedence over whatever the agent is doing.
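One common way to build a stop control like this is a shared flag that the runtime checks before every step. The sketch below assumes that design; `CircuitBreaker` and `run_steps` are hypothetical names, and this page does not describe how the actual stop button is implemented. Note the limitation of a cooperative check: it halts the run between steps and does not interrupt a step already in flight.

```python
import threading


class CircuitBreaker:
    """A stop flag that is safe to set from another thread (e.g. a UI handler)."""

    def __init__(self):
        self._stopped = threading.Event()

    def stop(self):
        self._stopped.set()

    @property
    def tripped(self) -> bool:
        return self._stopped.is_set()


def run_steps(steps, breaker: CircuitBreaker) -> list:
    """Run callables in order, checking the breaker before each one."""
    completed = []
    for step in steps:
        if breaker.tripped:  # the stop signal takes precedence over remaining work
            break
        completed.append(step())
    return completed
```

Once `stop()` has been called, no later step starts, no matter how many remain queued.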
Audit trail
Every step of the agentic loop is instrumented with OpenTelemetry. Every agent turn is a span. Every tool call is a child span. Every policy decision is recorded. This means:
- You can reconstruct any agent session after the fact
- You can see exactly which tools were called, with what parameters, and what the policy engine decided
- You can correlate agent actions with infrastructure events
- The Replay app surface provides a timeline view of any session
- Audit data exports to your existing observability stack via OTLP
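The span structure described above can be pictured with a standard-library stand-in. This is not the OpenTelemetry SDK and not the runtime's actual instrumentation; `Span`, `SessionLog`, and the attribute names are hypothetical, chosen only to show how a turn span parents its tool-call spans and how a session can be reconstructed from the records.

```python
import itertools
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    span_id: int
    name: str
    parent_id: Optional[int]
    attributes: dict = field(default_factory=dict)
    start: float = field(default_factory=time.time)


class SessionLog:
    """In-memory record of spans for one agent session."""

    def __init__(self):
        self.spans = []
        self._ids = itertools.count(1)

    def record(self, name, parent=None, **attributes) -> Span:
        parent_id = parent.span_id if parent else None
        span = Span(next(self._ids), name, parent_id, attributes)
        self.spans.append(span)
        return span

    def children(self, parent: Span) -> list:
        """Return the spans recorded directly under the given span."""
        return [s for s in self.spans if s.parent_id == parent.span_id]


log = SessionLog()
turn = log.record("agent.turn", session="s1")
call = log.record("tool.call", parent=turn, tool="read_file", decision="ALLOW")
```

Walking `log.children(turn)` yields each tool call in that turn along with its parameters and policy decision, which is the information a timeline view needs.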