---
url: /getting-started.md
description: >-
  Five-minute walkthrough to install abtree, hand a behaviour tree to your
  agent, and watch it drive a workflow end-to-end.
---

# Getting started

A five-minute walkthrough: install abtree, hand a tree to your agent, and watch it drive.

## Install

::: code-group

```sh [macOS / Linux]
curl -fsSL https://github.com/flying-dice/abtree/releases/latest/download/install.sh | sh
```

```powershell [Windows]
irm https://github.com/flying-dice/abtree/releases/latest/download/install.ps1 | iex
```

:::

Verify:

```sh
abtree --version
```

You'll see a version number. If you don't, restart your terminal so the new `PATH` takes effect.

## Concepts in 60 seconds

Three words worth knowing:

* **Tree** — a YAML file describing a workflow. Lives in `.abtree/trees/`.
* **Execution** — one run of a tree, bound to a piece of work. Persists as JSON in `.abtree/executions/`.
* **Step** — the smallest unit. Either an `evaluate` (a precondition the agent confirms) or an `instruct` (work the agent performs).

abtree is a CLI **for agents**. You don't drive executions yourself — you hand a brief to your agent and it runs the loop. Three commands carry the whole protocol: `abtree next` to ask "what now?", `abtree eval` to answer a precondition, `abtree submit` to report an outcome. JSON in, JSON out.

## 1. Set up a workspace

```sh
mkdir my-abtree-demo && cd my-abtree-demo
mkdir -p .abtree/trees/hello-world
curl -fsSL https://raw.githubusercontent.com/flying-dice/abtree/main/.abtree/trees/hello-world/TREE.yaml \
  -o .abtree/trees/hello-world/TREE.yaml
```

`hello-world` is a small tree: classify the time of day, then pick the matching greeting from a four-way selector. It exercises three of the four behaviour-tree primitives — `sequence`, `selector`, and `action` — in a few dozen lines.

## 2. Hand it off to your agent

In Claude Code, ChatGPT, or any agent that can run shell commands, send:

```text
Run the abtree hello-world tree end-to-end. Start by running 'abtree --help' to
learn the execution protocol, then create an execution with
'abtree execution create hello-world "first run"' and drive it through every
step until you see status: done.
```

That is the entire human-side interaction. The agent reads the protocol from `--help`, creates an execution, and runs the loop autonomously.

## 3. What the agent does under the hood

Each turn, the agent calls one command and reads its JSON response. The very first `abtree next` on any execution is a runtime-level gate that hands the agent the execution protocol — every execution starts here, regardless of which tree it's running:

```json
{
  "type": "instruct",
  "name": "Acknowledge_Protocol",
  "instruction": "Read the runtime protocol below in full..."
}
```

The agent reads the protocol and acknowledges:

```sh
abtree submit first-run__hello-world__1 success
```

After the gate, `abtree next` returns the tree's first real step:

```json
{
  "type": "instruct",
  "name": "Determine_Time",
  "instruction": "Check the system clock to get the current hour..."
}
```

The agent does the work — checks the clock, classifies the hour as `morning` — then writes the result and submits:

```sh
abtree local write first-run__hello-world__1 time_of_day "morning"
abtree submit first-run__hello-world__1 success
```

The next call returns an `evaluate`:

```json
{
  "type": "evaluate",
  "name": "Morning_Greeting",
  "expression": "$LOCAL.time_of_day is \"morning\""
}
```

The agent reads the expression, decides it holds, and answers:

```sh
abtree eval first-run__hello-world__1 true
```

The loop repeats — `next` → do the work or judge the precondition → `submit` or `eval` — until:

```json
{ "status": "done" }
```

The agent never sees the rest of the tree. Just the next request.

## 4. The execution diagram

abtree regenerates a Mermaid diagram at `.abtree/executions/first-run__hello-world__1.mermaid` after every state change.
Here's what a completed `hello-world` run looks like — green nodes succeeded, uncoloured ones were skipped.

```mermaid
---
title: "hello-world (complete)"
---
flowchart TD
    Hello_World{{"Hello World\n[sequence]"}}
    0_Determine_Time["Determine Time\n[action]"]
    Hello_World --> 0_Determine_Time
    style 0_Determine_Time fill:#4ade80,stroke:#16a34a,color:#052e16
    0_Choose_Greeting{{"Choose Greeting\n[selector]"}}
    Hello_World --> 0_Choose_Greeting
    style 0_Choose_Greeting fill:#4ade80,stroke:#16a34a,color:#052e16
    0_1_Morning_Greeting["Morning Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Morning_Greeting
    style 0_1_Morning_Greeting fill:#4ade80,stroke:#16a34a,color:#052e16
    0_1_Afternoon_Greeting["Afternoon Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Afternoon_Greeting
    0_1_Evening_Greeting["Evening Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Evening_Greeting
    0_1_Default_Greeting["Default Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Default_Greeting
```

The cursor advanced through the sequence. The selector chose Morning Greeting after its `evaluate` precondition held — the afternoon, evening, and default branches were never entered.

## What just happened

Your agent drove a structured workflow without you writing a system prompt, without a JSON schema in its context, without chain-of-thought. The tree handed it exactly one task at a time, and only let it advance when it proved the task was complete.

That's the core idea: **deterministic structure for non-deterministic agents.**

## Next

* [Why behaviour trees?](/concepts/) — the problem they solve
* [State, branches, and actions](/concepts/state) — how the building blocks fit together
* [Writing your own trees](/guide/writing-trees) — YAML structure walkthrough
* [CLI reference](/guide/cli) — every command, every flag

---

---
url: /concepts.md
description: >-
  Why use behaviour trees for AI agents — the same hierarchical decision
  structure used by game AI and robotics, applied to LLM workflows.
---

# Why behaviour trees?

A behaviour tree is a hierarchical structure for organising decisions. It was invented for video-game AI — the kind of NPCs that have to choose between *patrol*, *attack*, *flee*, or *call for backup* without breaking immersion. From there it spread to robotics, where reliability matters more than cleverness.

abtree brings the same idea to LLM agents.

## The problem

You can describe almost any workflow to a modern LLM in a single Markdown document. It will mostly work. Then it won't.

The two failure modes:

### 1. Instruction fatigue

A long system prompt is supposed to tell the agent everything: the format of the answer, the order of operations, the failure cases, the edge cases. But model attention is finite. As prompts grow past a few hundred lines, agents start to:

* Skip steps they "remember" from earlier.
* Confuse the order of operations.
* Forget invariants stated up front.
* Hallucinate fields you defined.

The usual fix is to repeat yourself. The prompt grows. The problem worsens.

### 2. Non-determinism

Even when the agent reads every word, decisions are made probabilistically. Run the same task twice and you might get different choices. For exploratory work, that's fine. For workflows where reproducibility matters — code review, deployments, structured data extraction — it's a liability.

## The fix: a formal logic layer

A behaviour tree separates **what to do** from **when to do it**. The tree defines the structure: what runs first, what runs in parallel, what counts as success, when to fall back. The agent only sees the current step.

Three things change:

1. **The agent's working set shrinks** to one instruction at a time. No more 2,000-line prompt to skim.
2. **Decisions become explicit.** A `selector` says "try the morning branch first; if that fails, try afternoon." The agent doesn't choose — the tree does.
3. **Progress is verifiable.** Every action ends only after an `evaluate` invariant has been satisfied.
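
To make point 2 concrete, here is a rough sketch of the kind of selector the quote describes, modelled on the bundled `hello-world` tree. The instruct text is abbreviated and the exact wording is an assumption; the schema itself is covered in the pages that follow.

```yaml
# Sketch only: the tree, not the agent, decides which branch is tried first.
type: selector
name: Choose_Greeting
children:
  - type: action
    name: Morning_Greeting
    steps:
      - evaluate: $LOCAL.time_of_day is "morning"    # tried first
      - instruct: Compose a morning greeting...
  - type: action
    name: Afternoon_Greeting
    steps:
      - evaluate: $LOCAL.time_of_day is "afternoon"  # tried only if morning fails
      - instruct: Compose an afternoon greeting...
```

The agent is only ever asked to judge one `evaluate` or perform one `instruct`; the ordering of the branches lives in the file.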

The next pages walk through the building blocks: [state](/concepts/state), then [branches and actions](/concepts/branches-and-actions). Take them in order — each builds on the previous.

---

---
url: /concepts/state.md
description: >-
  Two state scopes in abtree — $LOCAL is a per-execution blackboard agents read
  and write; $GLOBAL is a read-only world model.
---

# State

Behaviour trees are stateful by design. abtree separates state into **two scopes**, written explicitly, never implicit.

## $LOCAL — the workflow's blackboard

`$LOCAL` is a key-value store private to one execution. Actions read from it, write to it, and use it to thread data between steps.

Examples:

* `$LOCAL.greeting = "Good morning, Alice!"` — output of one step, input to the next.
* `$LOCAL.confidence_score = 0.92` — a number computed during the run.
* `$LOCAL.error_log = [...]` — an accumulating list.

`$LOCAL` is initialised when the execution is created. Every state change persists immediately to the execution's JSON document — kill the process and resume tomorrow.

## $GLOBAL — the world model

`$GLOBAL` describes the **environment** the agent is operating in. You don't *set* `$GLOBAL` values from inside the execution — you *observe* them.

Examples:

```yaml
state:
  global:
    user_name: retrieve by running the shell command "whoami"
    current_branch: the output of git rev-parse --abbrev-ref HEAD
    api_endpoint: https://api.example.com
    tone: friendly
```

Notice the first two values aren't literals — they're **instructions** for how to fetch them. The agent reads `$GLOBAL.user_name`, sees a sentence, and runs `whoami`. The third is a literal that never changes during the execution. The fourth is a configuration knob.

## Why two scopes?

The distinction matters. `$LOCAL` is something **your tree creates**. `$GLOBAL` is something **the world tells you**. Putting them in different scopes makes the contract explicit:

* An action that reads `$GLOBAL.user_name` knows the value comes from the environment.
* An action that reads `$LOCAL.greeting` knows the value was computed earlier in the execution.

Mixing them — like a single "context" object — hides where data came from. That's the bug surface that bites agentic systems hardest: was this value something I produced, or something I read? abtree makes you answer up front.

## Reading and writing

```sh
# Read all of $LOCAL
abtree local read

# Read a specific path (dot-notation)
abtree local read greeting

# Write a value
abtree local write greeting "Good morning, Alice!"

# Read $GLOBAL
abtree global read
abtree global read user_name
```

Values are JSON-parsed when possible, so `abtree local write ready true` stores a boolean, not the string `"true"`.

`$GLOBAL` is read-only via the CLI — values come from the tree's `state.global` block at execution creation.

## Next

* [Branches and actions](/concepts/branches-and-actions) — the four primitives that drive the tree.

---

---
url: /concepts/branches-and-actions.md
description: >-
  The four primitives of an abtree behaviour tree — sequence, selector,
  parallel, and action — and how each one defines control flow.
---

# Branches and actions

Three branch types. One leaf type. That's the whole language.

## Branches

Branches define the **flow of control**. They have children. Their job is to coordinate which children run, in what order, and what counts as success.

### Sequence

Run children in order. **All must succeed.** If any child fails, the sequence fails. Use it for linear workflows where each step depends on the previous one.

```yaml
type: sequence
name: Deploy_Service
children:
  - type: action
    name: Run_Tests
  - type: action
    name: Build_Image
  - type: action
    name: Push_To_Registry
```

If `Run_Tests` fails, the sequence aborts. The build never happens. The push never happens.

### Selector

Run children in order until one **succeeds**. If all fail, the selector fails. This is your decision-making primitive — the equivalent of an if/else chain.

```yaml
type: selector
name: Choose_Greeting
children:
  - type: action
    name: Morning_Greeting
    steps:
      - evaluate: $LOCAL.time_of_day is "morning"
      - instruct: ...
  - type: action
    name: Afternoon_Greeting
    steps:
      - evaluate: $LOCAL.time_of_day is "afternoon"
      - instruct: ...
```

The selector tries `Morning_Greeting`'s evaluate first. If it passes, the morning instruct runs and the selector finishes. Otherwise it falls through to `Afternoon_Greeting`.

### Parallel

Run all children. **All must succeed.** Use it when steps are independent and can be done in any order.

```yaml
type: parallel
name: Gather_Context
children:
  - type: action
    name: Check_Weather
  - type: action
    name: Check_News
```

The agent gets both `instruct` requests and can satisfy them in any order. If either fails, the parallel fails.

## Actions

Actions are the **leaves** of the tree. Each is a small, focused unit of work made of two kinds of step:

```yaml
type: action
name: Determine_Time
steps:
  - evaluate: $LOCAL.now is set
  - instruct: |
      Get the current hour from the system clock.
      Classify as "morning", "afternoon", or "evening".
      Store at $LOCAL.time_of_day.
```

### `evaluate`

A precondition. A semantic boolean expression checked against `$LOCAL` and `$GLOBAL`. The agent reads it, decides if it's true, and submits the answer with `abtree eval true|false`.

If `false`, the action fails immediately. The runtime advances by branch rules: a sequence aborts; a selector tries the next child.

### `instruct`

The work. Free-form prose telling the agent what to do. The agent does it, writes results to `$LOCAL`, and calls `abtree submit success` to advance.

An action can have multiple steps — alternating evaluates and instructs — to handle multi-stage logic in a single leaf.

## Putting it together

A real tree:

```yaml
type: sequence                      # do these in order
children:
  - type: action                    # step 1: figure out the time
    name: Determine_Time
    steps:
      - instruct: ...
  - type: selector                  # step 2: pick a branch by time of day
    name: Choose_Greeting
    children:
      - { Morning_Greeting }
      - { Afternoon_Greeting }
      - { Evening_Greeting }
      - { Default_Greeting }
  - type: parallel                  # step 3: gather context concurrently
    name: Gather_Context
    children:
      - { Check_Weather }
      - { Check_News }
  - type: action                    # step 4: compose the final response
    name: Compose_Response
    steps:
      - evaluate: $LOCAL.weather is set and $LOCAL.news is set
      - instruct: ...
```

Four primitives. Sixteen lines of structure. Reproducible execution. The bundled `hello-world` covers the first three (sequence, selector, action); `improve-codebase` is a real-world parallel.

## How the loop runs

When you call `abtree next `, the runtime walks the tree from the root, looking for the next pending step:

1. It descends into the first incomplete child of a sequence, or the first untried child of a selector, or all children of a parallel.
2. It returns the first pending `evaluate` or `instruct` it finds.
3. You answer with `abtree eval` or `abtree submit`.
4. The runtime updates state, recomputes the cursor, and waits for the next `abtree next`.

You never need to track "where am I" yourself. The cursor lives in the JSON document. Restart your terminal, restart your agent — the next `abtree next` picks up exactly where you left off.

## Next

* [Writing your own tree](/guide/writing-trees) — turn this into YAML.
* [CLI reference](/guide/cli) — every command, every flag.

---

---
url: /guide/writing-trees.md
description: >-
  Walkthrough of the abtree YAML schema — name, version, state, and tree —
  using the bundled hello-world example as the reference.
---

# Writing your own tree

This page walks through the YAML structure of a tree using the bundled `hello-world` example as the reference.

## File layout

Trees live in `.abtree/trees//TREE.yaml`. The slug (the folder name) becomes the tree name shown in `abtree tree list`.
The folder gives the tree somewhere to keep its own fragments and playbooks alongside the definition.

```
.abtree/
  trees/
    hello-world/
      TREE.yaml
    refine-plan/
      TREE.yaml
    my-big-workflow/
      TREE.yaml
      fragments/
        auth.yaml
  executions/                        # populated as you create executions
    first-run__hello-world__1.json
    first-run__hello-world__1.mermaid
```

### Project-local vs user-global

`abtree tree list` searches two directories:

1. `.abtree/trees/` in the **current working directory** — project-local, committed with the code.
2. `~/.abtree/trees/` in your **home directory** — user-global, available in every project.

The project-local copy wins if both define the same slug. Drop a tree in `~/.abtree/trees/` to make it your default everywhere; commit a same-named file under `.abtree/trees/` to override it for one project.

Executions always go into the cwd's `.abtree/executions/` regardless of where the tree was sourced from.

### Splitting a tree across files

Large trees can be split using JSON-Schema-style `$ref` references. abtree resolves them at execution-creation time via [`@apidevtools/json-schema-ref-parser`](https://github.com/APIDevTools/json-schema-ref-parser), so the rest of the runtime sees one fully-resolved snapshot.

```yaml
# .abtree/trees/big-workflow/TREE.yaml
name: big-workflow
version: 1.0.0
description: Composed of separately-authored fragments.
tree:
  type: sequence
  name: Big_Workflow
  children:
    - $ref: "./fragments/auth.yaml"
    - $ref: "./fragments/work.yaml"
    - $ref: "./fragments/cleanup.yaml"
```

```yaml
# .abtree/trees/big-workflow/fragments/auth.yaml
type: sequence
name: Auth_Sequence
children:
  - { type: action, name: Login, steps: [...] }
  - { type: action, name: Verify, steps: [...] }
```

`$ref` accepts:

* **Relative paths** (`./fragments/auth.yaml`) — resolved against the file containing the `$ref`.
* **Absolute paths** (`/home/.../shared/tree.yaml`).
* **URLs** (`https://example.com/shared-trees/auth.yaml`).
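
Because relative paths resolve against the file that contains the `$ref`, a fragment can plausibly pull in sibling fragments of its own. The sketch below assumes nested references are resolved the same way as top-level ones; the file names `login.yaml` and `verify.yaml` are hypothetical and not part of any bundled tree.

```yaml
# .abtree/trees/big-workflow/fragments/auth.yaml  (hypothetical variant)
type: sequence
name: Auth_Sequence
children:
  # Both paths are relative to fragments/auth.yaml, so they would point at
  # .abtree/trees/big-workflow/fragments/login.yaml and verify.yaml.
  - $ref: "./login.yaml"
  - $ref: "./verify.yaml"
```

Keeping each `$ref` relative to its own file means a fragments folder can be moved or shared as a unit without rewriting paths.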

A fragment file is just a node — it does NOT carry the top-level `name`, `version`, `description`, `state` keys. Those live on the root tree only. Each fragment is the value for the position it's referenced from (a single composite or action node).

The merged tree is written into the execution's `snapshot` field at execution-creation time, so editing fragments after creation does not affect existing executions — only new executions pick up the change.

## Top-level structure

```yaml
name: hello-world
version: 1.0.0
description: Greet a user based on time of day.
state:
  local:
    time_of_day: null
    greeting: null
  global:
    user_name: retrieve by running the shell command "whoami"
tree:
  type: sequence
  name: Hello_World
  children: [...]
```

| Field | Purpose |
|---|---|
| `name` | Slug. Must match the folder name. |
| `version` | Free-form. Bump when you change the tree. |
| `description` | One-line description shown in `abtree tree list`. |
| `state.local` | Initial `$LOCAL` keys. `null` is fine — they get filled in by actions. |
| `state.global` | `$GLOBAL` values. Strings are interpreted as instructions for how to fetch them. |
| `tree` | The root node. Always a single node — usually a `sequence`. |

## State

The `state.local` block defines the *shape* of `$LOCAL` at execution creation. Use `null` for slots that get populated by actions during the run. Use literal values for defaults.

```yaml
state:
  local:
    time_of_day: null     # filled in by Determine_Time
    greeting: null        # filled in by Choose_Greeting branch
  global:
    user_name: retrieve by running the shell command "whoami"
    tone: friendly
    language: english
```

`$GLOBAL` values that look like sentences ("retrieve by running...") are interpreted by the agent at runtime — they're prompts, not data. Literal strings and numbers are constants the agent reads as-is.

## Tree

The `tree:` block is the root. It's a single node. In practice you'll almost always start with a `sequence` so steps run in order.

### Composite nodes

`sequence`, `selector`, and `parallel` all share the same shape:

```yaml
type: sequence | selector | parallel
name: Friendly_Name        # used in mermaid diagrams
children:
  - { ... node 1 ... }
  - { ... node 2 ... }
```

### Action nodes

```yaml
type: action
name: Friendly_Name
steps:
  - evaluate:
  - instruct:
  - evaluate:
  - instruct:
```

You can have any number of steps in any order. They run sequentially within the action — the agent finishes step 1 before step 2 appears.

### Retries (any node)

Any node — action or composite — can carry a `retries: N` config. When the runtime sees that node fail, it wipes the node's internal bookkeeping (status, step index, descendants), bumps an internal retry counter, and re-ticks the node from a clean slate. After N retries are exhausted, the failure propagates normally.

```yaml
type: sequence
name: Write_And_Review
retries: 2          # one initial attempt + 2 retries = 3 total attempts
children:
  - { type: action, name: Write, steps: [...] }
  - { type: action, name: Review, steps: [...] }
```

User state in `$LOCAL` (drafts, counters, review notes) **persists across retries** — that's the whole feedback channel. Internal state (which actions have run, where the cursor is) is wiped between attempts.

This is the canonical replacement for the older "selector of N hand-written passes" shape — one retry config, one fragment, instead of N near-identical siblings.

## Naming

Use **PascalCase with underscores** for node names: `Choose_Greeting`, `Determine_Time`. Mermaid diagrams render `_` as spaces, so `Choose_Greeting` becomes "Choose Greeting" in the rendered output.

Tree slugs (the folder name) are **kebab-case**: `hello-world`, `improve-codebase`.

## Worked example

The full hello-world tree, annotated:

```yaml
name: hello-world
version: 2.0.0
description: Greet a user based on time of day.
state:
  local:
    time_of_day: null     # filled by Determine_Time
    greeting: null        # filled by the Choose_Greeting branch that wins
  global:
    user_name: retrieve by running the shell command "whoami"
    tone: friendly
    language: english
tree:
  type: sequence
  name: Hello_World
  children:
    # 1. Single action with one instruct step.
    - type: action
      name: Determine_Time
      steps:
        - instruct: >
            Check the system clock. Classify as "morning", "afternoon",
            or "evening". Store at $LOCAL.time_of_day.
    # 2. Selector — only one branch wins, the rest are skipped.
    - type: selector
      name: Choose_Greeting
      children:
        - type: action
          name: Morning_Greeting
          steps:
            - evaluate: $LOCAL.time_of_day is "morning"
            - instruct: Compose a cheerful morning greeting...
        - type: action
          name: Afternoon_Greeting
          steps:
            - evaluate: $LOCAL.time_of_day is "afternoon"
            - instruct: Compose a warm afternoon greeting...
        - type: action
          name: Default_Greeting
          steps:
            - instruct: Compose a neutral greeting...   # no evaluate = always passes
```

`hello-world` covers `sequence`, `selector`, and `action`. The fourth primitive — `parallel` — runs all children at once and succeeds only if every child succeeds. Drop one in when you have two independent reads that don't depend on each other:

```yaml
- type: parallel
  name: Gather_Context
  children:
    - { type: action, name: Read_Schema, steps: [...] }
    - { type: action, name: Read_Conventions, steps: [...] }
```

`improve-codebase` ships a real-world parallel — four metric scorers running concurrently against the same codebase.

## Editing your own

Copy a bundled tree to a new file and tweak. Try:

* Add a new `evening_off_hours` branch to `Choose_Greeting` with an evaluate that fires after 22:00.
* Wrap the selector's chosen greeting and a follow-up `Compose_Closing` action in a final `sequence`, so the closing always runs after the greeting is set.
* Add a `parallel` after `Choose_Greeting` to fan out two enrichment actions before the tree finishes.
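
As a starting point for the first exercise, the new branch might look like the sketch below. It assumes `Determine_Time` is extended to also store the raw hour at a hypothetical `$LOCAL.hour` key (which would need a matching `hour: null` entry under `state.local`); the node name `Evening_Off_Hours` and the instruct wording are illustrative rather than part of the bundled tree.

```yaml
# New child of Choose_Greeting. Place it before Default_Greeting so the
# no-evaluate fallback still runs last.
- type: action
  name: Evening_Off_Hours
  steps:
    # Semantic expression judged by the agent; $LOCAL.hour is hypothetical.
    - evaluate: $LOCAL.hour is 22 or later
    - instruct: >
        Compose a brief late-evening greeting for $GLOBAL.user_name.
        Store at $LOCAL.greeting.
```

Order matters inside a selector: if this branch sits after a child whose evaluate also passes late at night, it will never be reached.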

Every change is reflected the next time you run `abtree execution create `.

## Validation

abtree validates the YAML on load. If a tree is malformed, `abtree tree list` won't include it and `abtree execution create` will print the error.

## Next

* [CLI reference](/guide/cli) — every command, every flag.

---

---
url: /guide/designing-workflows.md
description: >-
  Reference for assistants helping a human design a new abtree behaviour tree.
  Decision rules for the four primitives, the YAML shape, common idioms
  (bounded retries, gates, human approval), and the gotchas that come from
  abtree having no native loops.
---

# Designing workflows

This page is reference material for an LLM helping a human design a new abtree behaviour tree. It assumes you (the assistant) already know the YAML syntax from [Writing trees](/guide/writing-trees) and the primitive semantics from [Branches and actions](/concepts/branches-and-actions). What follows is the layer above: given the syntax, what shapes do you reach for, and what shapes are footguns?

## The four primitives — when to use each

Behaviour trees in abtree are made of one **action** node type and three **composite** node types (`sequence`, `selector`, `parallel`). Pick by the question you're answering:

| Question | Primitive |
|---|---|
| "Do these in order, all must succeed." | `sequence` |
| "Try these in order until one works." | `selector` |
| "Do all of these; the order doesn't matter." | `parallel` |
| "This is a unit of work the agent performs." | `action` |

Every leaf is an `action`. Every non-leaf is a composite. The root is conventionally a `sequence`.

### Sequence

Children run top-to-bottom. **Any failure aborts.** Use for linear pipelines where each step depends on the previous one's success.

```yaml
type: sequence
name: Deploy_Service
children:
  - { type: action, name: Run_Tests, ... }
  - { type: action, name: Build_Image, ... }
  - { type: action, name: Push_Registry, ... }
```

If `Run_Tests` fails, the build never happens. The push never happens. The execution ends with `status: failed`.

### Selector

Children run top-to-bottom **until one succeeds**. If all fail, the selector fails. This is the BT equivalent of an if/elif/else chain. The "decision" is encoded by each child's `evaluate` precondition — the first child whose evaluate passes runs its instruct.

```yaml
type: selector
name: Choose_Greeting
children:
  - type: action
    name: Morning_Greeting
    steps:
      - evaluate: $LOCAL.time_of_day is "morning"
      - instruct: Compose a morning greeting...
  - type: action
    name: Default_Greeting
    steps:
      - instruct: Compose a neutral greeting...   # no evaluate = always passes
```

Always have a no-evaluate fallback as the last child if you need an "else" branch — a selector with no winning child fails the whole branch.

### Parallel

All children run; **all must succeed.** abtree returns each child's request to the agent in turn — the agent satisfies them in any order. Use for genuinely independent fan-out (gathering context from multiple sources, running multiple checks).

```yaml
type: parallel
name: Gather_Context
children:
  - { type: action, name: Check_Weather, ... }
  - { type: action, name: Check_News, ... }
```

If you can't justify the order being arbitrary, use a `sequence` instead.

### Action

The leaf. A unit of work paired with one or more steps. Each step is either an `evaluate` (a precondition the agent confirms `true` / `false`) or an `instruct` (free-form prose describing the work, the agent reports `success` / `failure` / `running`).

```yaml
- type: action
  name: Determine_Time
  steps:
    - evaluate: $LOCAL.now is set
    - instruct: |
        Get the current hour from the system clock.
        Classify as "morning", "afternoon", or "evening".
        Store at $LOCAL.time_of_day.
```

Steps run in order within an action. If any `evaluate` fails or any `instruct` is submitted as `failure`, the action fails immediately and the parent composite handles the consequence.

## The YAML skeleton

Every tree starts the same way:

```yaml
name:
version:
description:
state:
  local:
    : null      # filled by actions during the run
    : null
  global:
    :           # read-only after creation
tree:
  type: sequence      # almost always sequence at the root
  name:
  children:
    - { ... }
```

`$LOCAL` keys default to `null` when unset; actions populate them. `$GLOBAL` values that look like sentences are interpreted by the agent at runtime (e.g. `user_name: retrieve by running the shell command "whoami"`); literal strings or numbers are constants.

For the full field reference see [Writing trees](/guide/writing-trees).

## Common idioms

### Idiom: bounded code-then-test (retries on a sequence)

The canonical "iterate until satisfied" shape. Wrap one `[code → test]` sequence with `retries: N`. The runtime resets the sequence's internal state and re-ticks on failure, up to N times. User state in `$LOCAL` (counters, drafts, notes) persists across retries.

```yaml
tree:
  type: sequence
  name: Reach_Threshold
  retries: 3
  children:
    - $ref: "./fragments/pass.yaml"   # one fragment, retried up to 4× total
```

```yaml
# fragments/pass.yaml
type: sequence
name: Pass
children:
  - { type: action, name: Increment, steps: [...] }
  - type: action
    name: Test
    steps:
      - evaluate: $LOCAL.counter is greater than $LOCAL.threshold
      - instruct: Threshold reached.
```

One fragment, one retry config — replaces N hand-written passes.

**When to reach for this:** the work is meaningful at each iteration — write code, then run tests; revise a draft, then review; gather data, then check completeness. Each pass should be something you'd want to inspect in a Mermaid trace.

**Older alternative — selector of passes:** before runtime retries, the same shape was authored as `selector` with N near-identical children, each a separate `[code → test]` sequence. It still works, but it duplicates structure. Prefer `retries` for new trees.

**An anti-pattern:** modelling iteration as a cycle (`test` `$ref`s back to `increment`).
Cycles are preserved in the snapshot but cannot be ticked — abtree fails fast on a cyclic edge by design. Use `retries` instead.

For reference, the older selector-of-passes shape looks like this, with every child pointing at the same `fragments/pass.yaml` shown above:

```yaml
type: selector
name: Reach_Threshold
children:
  - $ref: "./fragments/pass.yaml"   # pass 1
  - $ref: "./fragments/pass.yaml"   # pass 2
  - $ref: "./fragments/pass.yaml"   # pass 3
  - $ref: "./fragments/pass.yaml"   # pass 4
```

Each pass is a real, observable, resumable step. The bound is explicit in the tree (count the children).

### Idiom: bounded retries (selector of attempts)

The same shape as above, applied to retries against transient failure. Each attempt may also do code+test internally; the selector caps the number of full attempts.

```yaml
type: selector
name: Write_With_Retries
children:
  - type: sequence
    name: First_Pass
    children:
      - { type: action, name: Write, ... }
      - { type: action, name: Review_Pass_1, ... }
  - type: sequence
    name: Second_Pass
    children:
      - { type: action, name: Revise, ... }          # reads notes from Pass 1
      - { type: action, name: Review_Pass_2, ... }
  - type: sequence
    name: Third_Pass
    children:
      - { type: action, name: Final_Revise, ... }
      - { type: action, name: Review_Pass_3, ... }
```

Each pass writes failure notes to a shared `$LOCAL._notes` key before failing, so the next pass has something to act on. Three passes is conventional; pick a number that bounds the cost.

### Idiom: tight inner loop inside one action

When the iteration is **not** meaningful at each step — e.g. polling a value, retrying a flaky API call, or any "cap at N tries internally" pattern — fold the loop into a single `instruct` and let the agent enforce the bound.

```yaml
- type: action
  name: Wait_For_Service
  steps:
    - evaluate: $LOCAL.endpoint is set
    - instruct: |
        Poll $LOCAL.endpoint up to 10 times with a 1s delay between attempts.
        If the service responds 200, set $LOCAL.ready to true and submit success.
        After 10 attempts, submit failure.
```

**When to reach for this:** the inner step is uninteresting on its own — you'd never trace it in Mermaid. The runtime sees one action; the loop is the agent's contract.

**Trade-off:** the bound lives in prose, not the tree. Less observable, less resumable, but tighter. Use selector-of-passes when each iteration is a step worth seeing; use this when it isn't.

### Idiom: instruct-then-evaluate gate

When a gate needs to record *why* it failed, run the check inside an `instruct` (so the agent populates `$LOCAL._notes`), then gate on the result with a final `evaluate`. The plain three-evaluate form ends the action on the first failure with no chance to write notes.

```yaml
- type: action
  name: Review_Gate
  steps:
    - evaluate: $LOCAL.draft is set
    - instruct: |
        Run three checks against $LOCAL.draft. If all pass, set
        $LOCAL.review_notes to "approved". If any fails, write the
        specific failure to $LOCAL.review_notes.
    - evaluate: $LOCAL.review_notes is "approved"
    - instruct: All checks passed. Confirm and store $LOCAL.final_path.
```

### Idiom: human-approval gate

abtree doesn't have a native "wait for human" primitive.
Express the wait as an `evaluate` on a flag the human sets via `abtree local write`, paired with an `instruct` telling the agent to wait. ```yaml - type: action name: Human_Approval_Gate steps: - evaluate: $LOCAL.draft is set - instruct: | Present the draft to the human. Wait for them to confirm by calling `abtree local write approved true`. While waiting, you may submit `running`. Do NOT submit success until they confirm. - evaluate: $LOCAL.approved is true - instruct: Proceed with the approved draft. ``` The agent uses `submit running` to ack-and-pause without advancing the cursor. The human's `local write` is what unblocks the next `evaluate`. ### Idiom: plan-approved gate A common variant of the human gate: a downstream tree (`implement` is the bundled example) refuses to run unless an upstream `refine-plan` execution produced a plan with `reviewed_by` populated. Encode it as an early action whose `instruct` checks the file: ```yaml - type: action name: Check_Plan_Approval steps: - evaluate: $LOCAL.change_request is set - instruct: | Find the plan in plans/ matching $LOCAL.change_request. Read the frontmatter. If reviewed_by is empty, return failure with a note that codeowner approval is needed. Otherwise store the full plan content at $LOCAL.plan_content. ``` The action either succeeds (plan content available) or fails (parent sequence aborts, surfacing the missing approval). ### Idiom: parallel context-gathering with shared dependency When multiple branches need to read a value produced by an earlier step, that step has to be in a parent `sequence`, not the parallel itself. Don't fight this — accept that fan-out happens after fan-in. ```yaml type: sequence children: - { type: action, name: Compute_Common_Input, ... } # writes $LOCAL.x - type: parallel name: Branch_On_X children: - { type: action, name: Use_X_For_Foo, ... } # reads $LOCAL.x - { type: action, name: Use_X_For_Bar, ... 
} # reads $LOCAL.x ``` Each parallel branch can have its own `evaluate: $LOCAL.x is set` precondition for safety. ### Idiom: globals as parameterless retrieval directives When a chunk of work has well-known guidance — code-review checklists, design heuristics, security-review playbooks — don't reproduce it inside an `instruct` and don't store a raw URL or path either. Store the **retrieval directive itself** in `$GLOBAL`. Actions invoke it by name. The natural home for a per-tree playbook is alongside its `TREE.yaml`, e.g. `.abtree/trees//playbooks/.md`: ```yaml state: global: review_playbook: | Read the file at .abtree/trees/my-review/playbooks/review.md (relative to the project root) and return its full body as text. tree: ... - type: action name: Run_Review steps: - evaluate: $LOCAL.target is set - instruct: > Use $GLOBAL.review_playbook to assess $LOCAL.target. Capture findings at $LOCAL.findings. ``` The global is a parameterless directive: "read X, return text." The action composes against the result. Multiple actions in the same tree can invoke the same global without repeating the read boilerplate. **Why this shape:** * **Action prose stays focused.** Each `instruct` says *what to do with the result*, not how to retrieve it. * **Single source of truth.** One place defines where the playbook lives. Swap the path in one spot to repoint every action that uses it. * **Composable.** Multiple actions can invoke the same global (`Use $GLOBAL.review_playbook's pre-flight against …`, `Use $GLOBAL.review_playbook's posting rules to …`) without duplicating retrieval instructions. * **Curated.** Local files let you trim third-party guidance to your project's lens — strip vendor-specific tooling, tighten the bar, add house rules — without forking the upstream document. * **Reproducible.** A playbook checked into the repo is git-tracked; flows created against today's tree run against today's playbook. 
**Variants:** the directive's body can describe any retrieval — read a file, fetch a URL, query an internal docs system. Local file is the default because it's reproducible and curatable; reach for HTTP only when you genuinely need the upstream's evolving copy. ### Idiom: split a large tree across files For trees that exceed a screenful of YAML, factor out reusable subtrees with JSON-Schema-style `$ref`. abtree resolves references at execution-creation time, so the runtime always sees one assembled snapshot. ```yaml tree: type: sequence children: - $ref: "./fragments/auth.yaml" # relative to this file - $ref: "/srv/abtree/shared/cleanup.yaml" # absolute path - $ref: "https://example.com/audit.yaml" # remote URL ``` The fragment file is a single node — same shape as any inline child: ```yaml # fragments/auth.yaml type: sequence name: Auth_Sequence children: - { type: action, name: Login, steps: [...] } ``` Fragments do NOT carry top-level `name` / `version` / `description` / `state`. Those live only on the root tree. ### Idiom: optional pre-step that doesn't block If a step is "do this if you can, otherwise skip", wrap it in a `selector` whose second child is a no-op: ```yaml - type: selector name: Try_Cache_Then_Continue children: - { type: action, name: Read_Cache, ... } # may fail - type: action name: Skip_Cache steps: - instruct: No cache — continue without it. ``` The selector always succeeds: either the cache read worked, or the no-op did. ## Naming and structure rules * **Tree slug** (the YAML `name` and the folder name): kebab-case (`hello-world`, `improve-codebase`). * **Node names**: PascalCase with underscores (`Choose_Greeting`, `Check_Weather`). Mermaid renders `_` as space. * **Composite names** describe the *decision*: `Choose_Greeting`, `Gather_Context`, `Write_With_Retries`. Action names describe the *work*: `Determine_Time`, `Compose_Response`. * **Root sequence name** is usually `_Workflow`. 
* **`$LOCAL` keys** are the variables the tree creates; **`$GLOBAL` keys** are the world the tree reads. Don't mix.

## Gotchas: things that look right but aren't

### No native loops

abtree has no repeater, no while-condition, no "back to step N". Anything that needs to retry must be bounded: a finite series of `selector` children, or a `retries: N` limit on a node. If a workflow needs open-ended iteration, fold the iteration into a single `instruct` and let the agent handle it internally, but cap it ("at most 3 attempts, then submit failure").

### No unbounded retries

A `selector` with N children gives you N attempts; `retries: N` gives up to N re-attempts. There's no shape that gives unlimited attempts. This is intentional: unbounded retries are a footgun for agents.

### Every action needs an evaluate precondition

Even when "obviously the precondition holds", write the evaluate. It documents the contract, gives the runtime a chance to short-circuit on bad state, and surfaces failures earlier with clearer messages. Pure-instruct actions (no evaluate) are reserved for the last child of a selector that's serving as a fallback.

### `$LOCAL` keys are scoped to one execution

`$LOCAL` is per-execution, not per-tree. Two executions of the same tree have isolated `$LOCAL`. Don't design as if state persists across runs. If you need cross-run state, the agent has to explicitly read/write external files via the instruct text.

### Internal bookkeeping keys are reserved

abtree tracks cursor state across resumption in bookkeeping keys it owns: `node_status`, `step_index`, and `retry_count` under the execution document's `runtime` field. Older executions kept them in `$LOCAL` under `_node_status__*` and `_step__*` prefixes. Don't write to these keys; don't expect to read them in actions. They're documented in [Inspecting executions](/guide/inspecting-executions) for diagnostics, not for use.

### A selector with all evaluate-gated children needs a default

If every child has an `evaluate` precondition that might fail, the selector fails when none match. If you want a "none of the above" branch, add a no-evaluate action as the last child.
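To see why the default matters, here is a toy Python model of selector semantics (success as soon as any child succeeds). It is a sketch for illustration only, not abtree's engine: `tick_selector` and the lambda preconditions are hypothetical stand-ins, and the branch names are borrowed from `hello-world`.

```python
from typing import Callable, List, Optional, Tuple

# A child is modelled as a zero-argument callable returning True (success)
# or False (failure). The selector stops at the first success.
Child = Callable[[], bool]


def tick_selector(children: List[Tuple[str, Child]]) -> Optional[str]:
    """Return the name of the first child that succeeds, or None if all fail."""
    for name, child in children:
        if child():
            return name
    return None


# A value that none of the gated branches expects.
local = {"time_of_day": "night"}

gated = [
    ("Morning_Greeting", lambda: local["time_of_day"] == "morning"),
    ("Afternoon_Greeting", lambda: local["time_of_day"] == "afternoon"),
    ("Evening_Greeting", lambda: local["time_of_day"] == "evening"),
]

# Hypothetical fallback with no precondition, so it always succeeds.
with_default = gated + [("Default_Greeting", lambda: True)]

without = tick_selector(gated)               # no branch matches: selector fails
with_fallback = tick_selector(with_default)  # the fallback catches the gap
```

Without the last child, an unexpected value fails the whole selector, and a parent `sequence` fails with it.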
### Ordering inside a `parallel` Don't depend on parallel children running in YAML order. The agent receives requests for each child in turn, but is free to satisfy them in any sequence. If you need ordering, use `sequence`. ### `submit running` keeps the cursor put Use `submit running` only when waiting on something external (a human approval, a long-running tool). The execution stays in `performing` phase; `abtree next` returns the same instruct. Don't use it to "skip" an instruct. ## Worked design — the "review with retries" pattern Putting the idioms together: a Write → Review → Retry workflow. ```yaml - type: selector name: Write_And_Review children: - type: sequence name: First_Pass children: - type: action name: Write steps: - evaluate: $LOCAL.brief is set - instruct: Write the artefact. Store at $LOCAL.draft. - type: action name: Review_Pass_1 steps: - evaluate: $LOCAL.draft is set - instruct: | Run the review checks against $LOCAL.draft. Set $LOCAL.review_notes to "approved" on success or concrete failure notes otherwise. - evaluate: $LOCAL.review_notes is "approved" - instruct: Approved. Store $LOCAL.final_path. - type: sequence name: Second_Pass children: - type: action name: Revise steps: - evaluate: $LOCAL.review_notes is set and not "approved" - instruct: Revise $LOCAL.draft per the notes. - type: action name: Review_Pass_2 steps: # ... same shape as Review_Pass_1 ... - type: sequence name: Third_Pass children: # ... final attempt before the selector exhausts ... ``` This combines: bounded retries via selector-of-attempts, instruct-then-evaluate gates that populate notes-on-failure, and a clear failure mode (selector exhausts → execution fails with the latest review\_notes preserved for the human to read). ## Process for designing a new tree When a human asks "help me design a tree for ``", work in this order: 1. **Name the success state.** What single sentence describes "the workflow finished correctly"? 
That's the post-condition the root sequence must establish. 2. **List the discrete tasks.** Each task → one action with an `instruct`. Each task's *precondition* → that action's `evaluate`. 3. **Group dependent tasks into sequences.** "Do A before B" → `sequence: [A, B]`. 4. **Identify decisions.** Each "if X then Y else Z" → `selector` with evaluate-gated children. 5. **Identify fan-out.** Each "do these in any order" → `parallel`. 6. **Identify gates.** Each "the human / a downstream system must approve" → an `evaluate` on a flag they set. 7. **Identify retries.** Each "we should try this a few times before giving up" → `selector` of N attempts, each carrying notes from the previous failure. 8. **State the input contract.** What `$LOCAL` keys must be set before the first action evaluates? Document them in `state.local`. 9. **Sketch the tree top-down**, then walk the failure modes — what happens if action N fails? Does the parent composite handle it the way the design intended? 10. **Save as `.abtree/trees//TREE.yaml`** and run `abtree tree list` to validate the YAML. ## Next * [Writing trees](/guide/writing-trees) — full YAML field reference. * [Inspecting executions](/guide/inspecting-executions) — what the runtime writes back as an execution runs. * [Branches and actions](/concepts/branches-and-actions) — primitive semantics in detail. * [Examples](/examples) — six ready-to-use trees that exercise every idiom on this page. --- --- url: /guide/inspecting-executions.md description: >- How to inspect an abtree execution — the JSON document, the Mermaid diagram, what each field means, and how to debug a stuck cursor. --- # Inspecting executions You drove an execution. abtree wrote two files to disk. This page explains what's in them, where to find them, and what to look for when something doesn't go as expected. 
## File layout Every execution produces two files in `.abtree/executions/`: ``` .abtree/ executions/ first-run__hello-world__1.json ← the full execution document first-run__hello-world__1.mermaid ← a live execution diagram ``` The basename is the **execution ID** — kebab-cased summary, two underscores, tree slug, two underscores, an incrementing counter. abtree generates it for you when you run `abtree execution create`; it's stable for the life of the execution. Both files are regenerated atomically on every state change (every `eval`, `submit`, or `local write`). Open them in any editor, `cat` them, commit them, ship them as artefacts — they're plain UTF-8 text. ## The JSON document The JSON file is the source of truth for one execution. Every command — `next`, `eval`, `submit`, `local read` — reads from this document. There is no in-memory state the file doesn't contain; kill the process and the next `abtree next` resumes exactly where you left off. Top-level shape: ```json { "id": "first-run__hello-world__1", "tree": "hello-world", "summary": "first run", "status": "running", "snapshot": "", "cursor": "", "phase": "performing", "created_at": "2026-05-09T11:59:22.076Z", "updated_at": "2026-05-09T11:59:28.256Z", "local": { ... }, "global": { ... } } ``` ### Field reference | Field | Meaning | |---|---| | `id` | The execution ID. Matches the filename. | | `tree` | Slug of the tree this execution was created from. | | `summary` | The human label you passed to `execution create`. | | `status` | `running`, `complete`, or `failed`. The terminal state of the workflow. | | `snapshot` | A JSON-encoded copy of the tree definition at execution-creation time. The execution runs against this snapshot, not the live YAML — editing `.abtree/trees//TREE.yaml` after creation does not affect existing executions. | | `cursor` | A JSON-encoded position inside the tree. 
`null` means "no step in flight"; otherwise an object like `{"path":[1,0],"step":1}` pointing at a node and a step within it. | | `phase` | `idle` (no current request), `performing` (an `instruct` is in flight, awaiting `submit`), or `evaluating` (an `evaluate` is in flight, awaiting `eval`). | | `created_at` / `updated_at` | ISO 8601 timestamps. `updated_at` advances on every mutation. | | `local` | The `$LOCAL` blackboard — per-execution key/value state your tree reads and writes. | | `global` | The `$GLOBAL` world model — read-only environment values defined in the tree's `state.global` block. | > The term **blackboard** comes from the BT and game-AI literature. It's just a key/value store scoped to one execution, used to pass data between steps. ### Runtime bookkeeping Beside `local` and `global`, every execution document has a `runtime` field. This is **internal state owned by the tick engine** and is never exposed by `abtree local read` / mutated by `abtree local write` — the CLI's local commands only ever touch `doc.local`. ```json { "runtime": { "node_status": { "0": "success", "1.0": "failure", ... }, "step_index": { "1.0": 1, ... }, "retry_count": { "1": 2, ... } } } ``` | Subfield | Meaning | |---|---| | `node_status` | `success` or `failure` for every node the runtime has settled. Keys are dot-joined positions (e.g. `1.0` is the first child of the second top-level node). | | `step_index` | Current step within an action — used to resume a multi-step action without losing your place. | | `retry_count` | Times the runtime has consumed a retry on a node with `retries:` config. Compared against the node's configured limit on each failure. | Older executions (created before the runtime/local split) had these keys mixed in with `local` under prefixes like `_node_status__*` and `_step__*`. abtree migrates them lazily on the next read — the legacy keys disappear from `local` and reappear under `runtime`. 
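For diagnostics, the bookkeeping can be read straight from the JSON file. The sketch below lists the nodes settled as `failure`, turning each dot-joined key into a list of child indexes. The sample document is invented for illustration and `failed_paths` is a hypothetical helper; only the `runtime.node_status` shape is taken from the description above.

```python
import json


def failed_paths(doc: dict) -> list:
    """Return positions of nodes marked `failure`, as lists of child indexes."""
    node_status = doc.get("runtime", {}).get("node_status", {})
    failed = []
    for key, status in node_status.items():
        if status == "failure":
            # Keys are dot-joined positions, e.g. "1.0" -> [1, 0].
            failed.append([int(part) for part in key.split(".") if part])
    return sorted(failed)


# Invented sample; a real document would be loaded from .abtree/executions/.
sample = json.loads("""
{
  "id": "first-run__hello-world__1",
  "runtime": {
    "node_status": {"0": "success", "1.0": "failure", "1.1": "success"},
    "step_index": {"1.1": 1},
    "retry_count": {}
  }
}
""")

print(failed_paths(sample))  # prints [[1, 0]]
```

Treat this as read-only inspection: the `runtime` field belongs to the tick engine.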
## The Mermaid diagram

The `.mermaid` file is a live tree-shaped trace of what the runtime has done so far. Open it in any Mermaid renderer: GitHub previews them inline, VS Code has a preview extension, and the `mermaid-cli` tool exports PNG/SVG.

Three colour states tell you everything:

| Node colour | Meaning |
|---|---|
| **Green** (`#4ade80`) | The node ran and succeeded. |
| **Red** (`#f87171`) | The node ran and failed. |
| **Uncoloured** (default fill) | The runtime never reached this node, usually because a sibling selector branch won or a parent already failed. |

Two diagram shapes carry meaning too:

* **`{{hexagon}}`**: a composite node (`sequence`, `selector`, or `parallel`). The label includes `[sequence]`, `[selector]`, or `[parallel]` so you know which.
* **`["rectangle"]`**: an action (a leaf: work the agent performs).

A completed `hello-world` run looks like this:

```mermaid
---
title: "hello-world (complete)"
---
flowchart TD
    Hello_World{{"Hello World\n[sequence]"}}
    0_Determine_Time["Determine Time\n[action]"]
    Hello_World --> 0_Determine_Time
    style 0_Determine_Time fill:#4ade80,stroke:#16a34a,color:#052e16
    0_Choose_Greeting{{"Choose Greeting\n[selector]"}}
    Hello_World --> 0_Choose_Greeting
    style 0_Choose_Greeting fill:#4ade80,stroke:#16a34a,color:#052e16
    0_1_Morning_Greeting["Morning Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Morning_Greeting
    style 0_1_Morning_Greeting fill:#4ade80,stroke:#16a34a,color:#052e16
    0_1_Afternoon_Greeting["Afternoon Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Afternoon_Greeting
    0_1_Evening_Greeting["Evening Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Evening_Greeting
    0_1_Default_Greeting["Default Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Default_Greeting
```

Every node the runtime ticked is green. The selector picked Morning Greeting; the afternoon, evening, and default branches stayed uncoloured because a sibling already won. The sequence advanced through every direct child top to bottom.
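If you want the colour states without rendering the diagram, the `style` lines can be read as text. The following is a rough sketch, assuming the file keeps the layout shown above (one node declaration or `style` line per line, fills exactly `#4ade80` or `#f87171`); `node_states` and its regexes are hypothetical helpers, not part of abtree.

```python
import re

GREEN = "#4ade80"  # success fill, per the colour table
RED = "#f87171"    # failure fill, per the colour table

# Matches lines such as: style 0_Determine_Time fill:#4ade80,stroke:...
STYLE_RE = re.compile(r"^\s*style\s+(\S+)\s+fill:(#[0-9a-fA-F]{6})", re.MULTILINE)
# Matches node declarations: an id followed by [" ... "] or {{" ... "}}
NODE_RE = re.compile(r'^\s*(\w+)(?:\[|\{\{)"', re.MULTILINE)


def node_states(mermaid_text: str) -> dict:
    """Map each declared node id to 'succeeded', 'failed', or 'uncoloured'."""
    fills = {node: fill.lower() for node, fill in STYLE_RE.findall(mermaid_text)}
    labels = {GREEN: "succeeded", RED: "failed"}
    return {
        node: labels.get(fills.get(node), "uncoloured")
        for node in NODE_RE.findall(mermaid_text)
    }


# A shortened diagram in the same layout as the hello-world trace.
sample = r'''
flowchart TD
    0_Choose_Greeting{{"Choose Greeting\n[selector]"}}
    style 0_Choose_Greeting fill:#4ade80,stroke:#16a34a,color:#052e16
    0_1_Morning_Greeting["Morning Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Morning_Greeting
    style 0_1_Morning_Greeting fill:#4ade80,stroke:#16a34a,color:#052e16
    0_1_Default_Greeting["Default Greeting\n[action]"]
    0_Choose_Greeting --> 0_1_Default_Greeting
'''

states = node_states(sample)
```

For anything beyond a quick look, the JSON document remains the source of truth; the diagram is derived from it.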
## Debugging a stuck execution

Three fields in the JSON document locate the execution; together they tell you what the runtime is waiting on:

| Field | Tells you |
|---|---|
| `status` | `running` if the execution is still in flight; `complete` or `failed` if it terminated. |
| `phase` | `evaluating` if `abtree next` will return an `evaluate`; `performing` if it will return an `instruct`; `idle` if `abtree next` will tick the tree and pick the next request. |
| `cursor` | The path-and-step pointer into the tree. `{"path":[2,1],"step":0}` means "the second child of the third top-level node, step zero". |

Common situations:

* **`status: running`, `phase: idle`, `cursor: null`.** Healthy mid-execution state between requests. Call `abtree next` to advance.
* **`phase: performing` for hours.** The agent picked up an `instruct` and never reported back. The execution is waiting for `abtree submit success | failure`. Resume it by submitting, or call `abtree execution reset ` to start over.
* **`status: failed`.** A `selector` exhausted all its children, or an action in a `sequence` failed. Look at `runtime.node_status` in the execution document (in older executions, the `_node_status__*` keys in `$LOCAL`) to see which node was the immediate cause; look at the leaf's `evaluate` expression in the `snapshot` to see why it didn't pass.
* **The mermaid diagram has red nodes but `status: running`.** A failure was recorded but a parent (selector) is still trying alternatives. The execution is fine. Read the next `abtree next` to see what's coming.

To dump the document without opening the file, `abtree execution get ` prints the same JSON as the on-disk file to stdout. Useful for piping into `jq` or `python -m json.tool`.

## Next

* [CLI reference](/guide/cli): every command that mutates these files.
* [Writing your own tree](/guide/writing-trees): the YAML the `snapshot` field captures.
* [Branches and actions](/concepts/branches-and-actions): the four primitives you'll see in the diagram.
--- --- url: /guide/cli.md description: >- Complete CLI reference for abtree — every command outputs JSON, designed to be driven by another agent. Trees, executions, state, install. --- # CLI reference Every command outputs JSON. That's deliberate — abtree is meant to be driven by another agent, and JSON is its native input. ## Trees ### `abtree tree list` Lists every available tree as an array of slugs. Trees live one per folder, with the definition at `/TREE.yaml`. The folder gives the tree somewhere to keep its own fragments and playbooks alongside the definition. Trees are loaded from two locations: | Location | Purpose | |---|---| | `.abtree/trees//TREE.yaml` (cwd) | Project-local trees, committed alongside the code they apply to. | | `~/.abtree/trees//TREE.yaml` | User-global trees, available in every project. | Project-local wins on duplicate slugs — drop `~/.abtree/trees/refine-plan/TREE.yaml` for a default refine-plan tree, override it per-project by committing a `.abtree/trees/refine-plan/TREE.yaml` to the repo. ```sh $ abtree tree list [ "hello-world", "refine-plan", "deploy" ] ``` ## Executions ### `abtree execution create ` Create a new execution from a tree. The summary is a human label — kebab-cased, it becomes part of the execution ID. ```sh $ abtree execution create hello-world "first run" { "id": "first-run__hello-world__1", "tree": "hello-world", "summary": "first run", "local": { ... }, "global": { ... } } ``` ### `abtree execution list` List every execution with status and phase. ```sh $ abtree execution list [ { "id": "first-run__hello-world__1", "tree": "hello-world", "summary": "first run", "status": "running", "phase": "performing" } ] ``` ### `abtree execution get ` Full execution document: metadata, snapshot, cursor, `$LOCAL`, `$GLOBAL`. ### `abtree execution reset ` Reset an execution to its initial state. Status returns to `running`, all `$LOCAL` keys revert to their tree defaults. Useful for re-running an execution after fixing a tree. 
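The execution ID these commands take is composed of the kebab-cased summary, two underscores, the tree slug, two underscores, and a counter. A small sketch of building and splitting such an ID follows. The kebab-casing rule used here (lowercase, runs of non-alphanumerics collapsed to one hyphen) is an assumption, and abtree's actual rule may differ; the function names are hypothetical.

```python
import re


def kebab(text: str) -> str:
    # Assumed rule: lowercase, collapse non-alphanumeric runs into "-".
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def execution_id(summary: str, tree_slug: str, counter: int) -> str:
    """Compose <kebab-summary>__<tree-slug>__<counter>."""
    return f"{kebab(summary)}__{tree_slug}__{counter}"


def parse_execution_id(value: str) -> tuple:
    """Split an ID back into (summary, tree slug, counter)."""
    # Split from the right so the counter and slug are found first.
    summary, tree_slug, counter = value.rsplit("__", 2)
    return summary, tree_slug, int(counter)


example = execution_id("first run", "hello-world", 1)
print(example)
```

In practice abtree generates the ID for you; parsing it is only useful for scripts that group executions by tree.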
## Execution loop ### `abtree next ` Get the next step. Returns one of: ```json { "type": "evaluate", "name": "...", "expression": "..." } { "type": "instruct", "name": "...", "instruction": "..." } { "status": "done" } { "status": "failure" } ``` ### `abtree eval ` Submit the result of an `evaluate` request. The agent reads the expression, decides whether it holds against current state, and reports back. ### `abtree submit ` Submit the result of an `instruct` request. * `success` advances the cursor. * `failure` marks the action failed; the runtime backs out by branch rules. * `running` keeps the execution in performing state — useful when the work takes time and you want to ack-and-continue later. ## State ### `abtree local read [path]` Read from `$LOCAL`. With no path, returns the whole scope. With a dot-notation path, returns one value. ```sh $ abtree local read first-run__hello-world__1 greeting { "path": "greeting", "value": "Good morning, Alice!" } ``` ### `abtree local write ` Write a value at the given path. Values are JSON-parsed when possible — `true`, `42`, `"hello"`, `[1,2,3]` all work. ### `abtree global read [path]` Read from `$GLOBAL`. Read-only via the CLI. ## Help ### `abtree --help` Prints the full execution protocol — the same content an LLM driving abtree needs to know. Designed for an agent that runs `--help` first to learn the loop. ## Environment variables | Variable | Effect | |---|---| | `ABTREE_EXECUTIONS_DIR` | Overrides the executions directory. Default: `.abtree/executions/` in the cwd. Accepts absolute paths, relative paths (resolved against cwd), or `~/`-prefixed paths. | Use `ABTREE_EXECUTIONS_DIR` to keep execution state outside the repo (e.g. 
on a shared volume), or to point multiple repos at the same execution store: ```sh export ABTREE_EXECUTIONS_DIR=~/.local/state/abtree-executions abtree execution list # all executions across every project, in one place ``` Trees are still loaded from `.abtree/trees/` (cwd) and `~/.abtree/trees/` (global) — only the executions directory is overridable. ## Exit codes | Code | Meaning | |---|---| | `0` | Success. | | `1` | User error (missing execution, invalid input, bad arguments). | The JSON output is always written to stdout. Errors go to stderr. --- --- url: /agents/execute.md --- # Execution Protocol abtree is a durable behaviour tree engine. Executions bind a tree to a piece of work and persist as JSON documents in `.abtree/executions/`, with two state scopes: * `$LOCAL` — per-execution blackboard (read/write) * `$GLOBAL` — world model (read-only) Internal bookkeeping (cursor, retry counts, per-node status) lives in a `runtime` field on the execution document — invisible to `local read` and not mutable via `local write`. You don't manage it; the engine does. ::: warning STRICT Never read tree files directly. All interaction goes through this CLI. ::: ## Routing ```text No arguments → execution list; resume an existing execution or pick a tree → resume that execution → create a new execution (remaining args = summary) list → show all executions ``` ## Create protocol ```text abtree execution create abtree local write change_request "" abtree next ← begin execution loop ``` ## Execution loop Call `abtree next ` to get the next request. Repeat until done. ### Response: `evaluate` ```json { "type": "evaluate", "name": "...", "expression": "..." } ``` Procedure — **DO NOT** skip steps: 1. Parse the expression. Identify every `$LOCAL.` and `$GLOBAL.` referenced. 2. For EACH referenced path, call: ```text abtree local read (for $LOCAL refs) abtree global read (for $GLOBAL refs) ``` Record the actual returned value. 
Do not skip this step even if you wrote the value yourself one command ago. 3. Apply the expression's truth condition against those actual values and ONLY those values. No inference from context, memory, or "obvious" assumptions. 4. Call: `abtree eval true|false` ::: warning STRICT Skipping step 2 corrupts the gate. The store is the source of truth, not your context. Even when the answer "feels obvious", read it. ::: ### Response: `instruct` ```json { "type": "instruct", "name": "...", "instruction": "..." } ``` Procedure: 1. Read the instruction in full. 2. Perform the work named. Use real tools — file I/O, web search, shell commands, sub-agents — as the instruction directs. 3. Write any produced values to `$LOCAL` via `abtree local write`. 4. Call: `abtree submit success|failure|running`. Use `running` only when waiting on something external (e.g. a human approval). Do NOT use `running` to skip an instruct. ::: warning STRICT Every value written to `$LOCAL` must come from an explicit source named in the instruction (tool, command, `$LOCAL`/`$GLOBAL` path, or a literal fallback). If the source is ambiguous, call `submit failure`. Do not infer, guess, or invent. ::: ### Response: `done` / `failure` ```json { "status": "done" } { "status": "failure" } ``` Tree terminated. Report the outcome to the human. ## Available trees Run `abtree tree list` for the live set. Bundled trees include: `hello-world`, `refine-plan`, `implement`, `technical-writer`, `improve-codebase`. ## State commands ```text abtree local read [path] Read from $LOCAL abtree local write Write to $LOCAL abtree global read [path] Read from $GLOBAL ``` ## Reporting (per action) ```text [execution-id] ✓ Action_Name → success|failure ``` --- --- url: /agents/author.md --- # Tree Authoring Guide Authoring an abtree tree means writing a YAML file that an agent can drive deterministically through `abtree next`, `eval`, and `submit`. 
Trees live in `.abtree/trees//TREE.yaml` (project-local) or `~/.abtree/trees//TREE.yaml` (user-global). The folder name is the slug. Project-local shadows global on slug collision. ::: tip Run `abtree docs schema` to print the JSON Schema, or reference the published copy via the YAML language-server comment: ```yaml # yaml-language-server: $schema=https://abtree.dev/schemas/tree.schema.json ``` ::: ## File shape ```yaml name: my-tree # slug, lowercase, hyphenated. Required. version: 1.0.0 # semver string. Pure label; not parsed. Required. description: short text # optional. state: # optional. local: {...} # initial $LOCAL keys for every execution. global: {...} # initial $GLOBAL keys; read-only at runtime. tree: # the root node. Required. ... ``` ## Node primitives There are four. Three composites and one leaf. | Type | Behaviour | Result | |------------|--------------------------------------------------------------|---------------------------------------| | `sequence` | Tick children left-to-right. Stops on first failure. | success iff all children succeeded. | | `selector` | Tick children left-to-right. Stops on first success. | success iff any child succeeded. | | `parallel` | Tick all children. No short-circuit. | success iff all children succeeded. | | `action` | Leaf. Carries a list of `steps`, each `evaluate` or `instruct`. | success iff every step succeeded. | Every node carries a `name` (used in `abtree next` output and the mermaid render). Composites carry `children: [...]`. Actions carry `steps: [...]`. ## Step kinds (action only) ### `evaluate` ```yaml - evaluate: "$LOCAL.foo == 'bar'" ``` The agent reads `$LOCAL.foo`, applies the expression, and calls `abtree eval true|false`. Expressions are opaque strings — abtree does not parse them. Phrasing is the contract between the tree author and the agent. 
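Because abtree does not parse expressions, the agent has to find the referenced paths itself before reading them from the store. A minimal sketch of that step, assuming references always take the `$LOCAL.path` or `$GLOBAL.path` form used in this guide. The helper names are hypothetical, and the `global read` argument order is assumed to mirror `local read`.

```python
import re

# $LOCAL.foo, $GLOBAL.a.b; a trailing sentence period is not part of the path.
REF_RE = re.compile(r"\$(LOCAL|GLOBAL)\.([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")


def referenced_paths(expression: str) -> list:
    """List (scope, path) pairs in order of first appearance, without duplicates."""
    seen = []
    for scope, path in REF_RE.findall(expression):
        if (scope, path) not in seen:
            seen.append((scope, path))
    return seen


def read_commands(execution_id: str, expression: str) -> list:
    """Build one read command per referenced path (argument order assumed)."""
    return [
        f"abtree {scope.lower()} read {execution_id} {path}"
        for scope, path in referenced_paths(expression)
    ]


refs = referenced_paths("$LOCAL.counter is greater than $LOCAL.threshold")
commands = read_commands("first-run__hello-world__1", "$LOCAL.foo == 'bar'")
```

The truth condition itself still has to be judged by the agent against the values those reads return; only the path extraction is mechanical.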
### `instruct` ```yaml - instruct: "do the thing, write the result to $LOCAL.bar" ``` The agent performs the work, writes any produced values via `abtree local write`, and calls `abtree submit success|failure|running`. ## Retries Any node can carry `retries: N` (positive integer). On failure, the runtime wipes the node's runtime subtree (its own `node_status`/`step_index` and all descendants') and re-attempts from a clean slate, up to N times. User-written `$LOCAL` data is preserved across retries — that is the whole point of the feedback loop. ## `$ref` fragments Split a tree across multiple YAML files using JSON-Schema-style `$ref`. Relative paths, absolute paths, and URLs are dereferenced at load time: ```yaml tree: type: sequence name: Top children: - $ref: "./fragments/auth.yaml" - $ref: "./fragments/work.yaml" ``` The dereferenced object must itself be a valid node (composite or action). Cyclic refs are not expanded — they are preserved literally as `$ref` nodes that surface a clean failure if the runtime ever ticks them. ## Worked example ```yaml # yaml-language-server: $schema=https://abtree.dev/schemas/tree.schema.json name: my-tree version: 1.0.0 description: short summary state: local: target: null result: null tree: type: sequence name: Top children: - type: action name: Set_Target steps: - instruct: "decide a target. write to $LOCAL.target" - type: selector name: Try_Strategies retries: 2 children: - type: action name: Fast_Path steps: - evaluate: "$LOCAL.target is small" - instruct: "do the fast thing. write to $LOCAL.result" - type: action name: Slow_Path steps: - instruct: "do the slow thing. write to $LOCAL.result" ``` ## Validation | Mechanism | What it covers | |----------------|-----------------------------------------------------------------------------------------------| | Schema check | `tests/trees-schema.test.ts` parses every tree in `.abtree/trees/` through `TreeFileSchema`. 
| | CLI errors | Malformed trees fail `abtree execution create` with a path-prefixed message: `tree.steps: Too small: expected array to have >=1 items`. | | Editor LSP | The `# yaml-language-server: $schema=...` comment enables completions, tooltips, and inline error highlights in any YAML LSP client. | ## Reporting (per tree authored) ```text [tree-slug] ✓ valid → run `abtree tree list` to confirm it loads ``` --- --- url: /agents/schema.md --- # JSON Schema abtree publishes a [JSON Schema](https://json-schema.org/) for tree YAML files so editors and validators can verify a tree before it ever touches the CLI. ## Sources * **CLI:** `abtree docs schema` prints the schema to stdout. Byte-identical to the committed file. * **Repo:** [`tree.schema.json`](https://github.com/flying-dice/abtree/blob/main/tree.schema.json) on `main`. * **Release:** every GitHub release ships `tree.schema.json` as an asset. * **Stable URL:** `https://abtree.dev/schemas/tree.schema.json`. ## Editor integration Add a YAML language-server comment at the top of every tree file: ```yaml # yaml-language-server: $schema=https://abtree.dev/schemas/tree.schema.json name: my-tree version: 1.0.0 tree: type: action name: Greet steps: - instruct: say hello ``` VS Code with the Red Hat YAML extension, Neovim with `yaml-language-server`, and any other LSP client that speaks the same protocol will then surface completions, type tooltips, and inline error highlights as you author the tree. The `$schema` keyword as a top-level YAML field is also accepted by the parser if you prefer to embed it inline rather than as a comment. ## CI validation The repository's test suite parses every YAML in `.abtree/trees/` through `TreeFileSchema` (`tests/trees-schema.test.ts`), and a separate CI job (`schema` in `.gitlab-ci.yml`) regenerates the JSON Schema from the zod source and fails the build if the committed file has drifted. Both run on every push. 
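The drift check amounts to "regenerate, then compare with what is committed". The real job regenerates the schema from the zod source in TypeScript; the Python below is only a stand-in to show the comparison, with an assumed serialisation (two-space indent, trailing newline) and hypothetical function names.

```python
import json


def canonical(schema: dict) -> str:
    # Assumed on-disk format: 2-space indent plus a trailing newline.
    return json.dumps(schema, indent=2) + "\n"


def has_drifted(generated: dict, committed_text: str) -> bool:
    """True when the committed file no longer matches the regenerated schema."""
    return canonical(generated) != committed_text


# Illustrative schema fragments, not the real tree.schema.json.
generated = {"type": "object", "required": ["name", "version", "tree"]}
committed_ok = canonical(generated)
committed_stale = canonical({"type": "object", "required": ["name", "tree"]})

fresh = has_drifted(generated, committed_ok)     # False: build passes
stale = has_drifted(generated, committed_stale)  # True: build should fail
```

A CI wrapper would exit non-zero when `has_drifted` returns true, which is what keeps the committed file trustworthy as the published schema.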
## Source of truth The schema is generated from `src/schemas.ts` via `src/schemas.ts:buildJsonSchema()`, which is the single function called by both `scripts/generate-schema.ts` (build time) and `cmdDocsSchema` (runtime). The committed `tree.schema.json` is the build output, kept fresh by CI. --- --- url: /examples.md description: >- Ready-to-use abtree behaviour trees, installable in one command — hello-world, refine-plan, implement, technical-writer, improve-codebase. --- # Examples registry Ready-to-use behaviour trees. Each entry includes the YAML files, a one-liner to copy them into your local `.abtree/trees//`, and a Claude handover command that briefs Claude to drive the execution with abtree. Trees live in `.abtree/trees//TREE.yaml`. The folder gives the tree somewhere to keep its own fragments and playbooks alongside the definition. Every install command is idempotent — safe to re-run. Existing files are overwritten with the latest version from `main`. *** ## Hello World A small workflow that greets a user based on time of day. The selector picks one of four greetings (morning / afternoon / evening / default) using time-of-day evaluates as preconditions. Demonstrates `sequence`, `selector`, and `action` — the three primitives most workflows lean on. Use this first if you're learning abtree. **Files** * `hello-world/TREE.yaml` — main **Install** ```sh mkdir -p .abtree/trees/hello-world \ && curl -fsSL https://raw.githubusercontent.com/flying-dice/abtree/main/.abtree/trees/hello-world/TREE.yaml \ -o .abtree/trees/hello-world/TREE.yaml ``` **Run with Claude** ```sh claude "Run the abtree hello-world tree end-to-end. Start by running 'abtree --help' to learn the execution protocol, then create an execution with 'abtree execution create hello-world \"smoke test\"' and drive it through every step until you see status: done." ``` *** ## Refine a plan Turn a one-line change request into a hardened, codeowner-reviewable plan. 
The execution analyses intent, drafts a structured plan (frontmatter + summary + requirements + technical approach + acceptance criteria + risks), critiques it as a Staff Engineer, and saves the result to `plans/.md`. The `reviewed_by` field stays empty until a codeowner approves it. **Files** * `refine-plan/TREE.yaml` — main **Install** ```sh mkdir -p .abtree/trees/refine-plan \ && curl -fsSL https://raw.githubusercontent.com/flying-dice/abtree/main/.abtree/trees/refine-plan/TREE.yaml \ -o .abtree/trees/refine-plan/TREE.yaml ``` **Run with Claude** ```sh claude "Use the abtree refine-plan tree to turn this change request into a plan: . Run 'abtree --help' first to learn the protocol, then create the execution, write the change_request to LOCAL, and drive it to completion. Show me the final plan path." ``` *** ## Implementation workflow A two-stage pipeline for shipping changes. **refine-plan** produces an approved plan under `plans/`. **implement** reads it back, scores complexity, optionally escalates to an architect on high-complexity work, and writes the code. implement refuses to start on an un-reviewed plan — `reviewed_by` must be set. **Files** * `implement/TREE.yaml` — main * `refine-plan/TREE.yaml` — sub-workflow (run first to produce the plan) **Install** ```sh for t in implement refine-plan; do mkdir -p ".abtree/trees/${t}" \ && curl -fsSL "https://raw.githubusercontent.com/flying-dice/abtree/main/.abtree/trees/${t}/TREE.yaml" \ -o ".abtree/trees/${t}/TREE.yaml" done ``` **Run with Claude** ```sh claude "I want to . First run the abtree refine-plan tree to produce a plan at plans/.md, then pause for me to add my name to reviewed_by. Once I confirm approval, run the abtree implement tree against the plan and write the code. Use 'abtree --help' to learn the protocol." 
``` *** ## Technical writer Take a documentation goal, ground it in the repo's `STYLEGUIDE.md` (or draft one and gate on human approval if none exists), find or build a home in the docs tree, write to it, and gate-check the result against three rules — does it fit the structure, does the narrative flow, is it one concept? Up to three write/review passes before the workflow surfaces persistent failures to the human. Standalone workflow; no upstream spec required. **Files** * `technical-writer/TREE.yaml` — main **Install** ```sh mkdir -p .abtree/trees/technical-writer \ && curl -fsSL https://raw.githubusercontent.com/flying-dice/abtree/main/.abtree/trees/technical-writer/TREE.yaml \ -o .abtree/trees/technical-writer/TREE.yaml ``` **Run with Claude** ```sh claude "Use the abtree technical-writer tree to document . Run 'abtree --help' first to learn the protocol, then create the execution, write the goal to LOCAL, and drive it to completion. If a styleguide doesn't exist yet, draft one and pause for me to approve before continuing." ``` *** ## Improve codebase A continuous code-quality improvement cycle. The agent confirms intent and a green test baseline, then runs a parallel scoring pass on four metrics (DRY, SRP, coupling, cohesion) — each scorer records observations, severity, risk, and a cost/benefit estimate. A Senior-Principal critique hardens the findings, an online lookup gathers best-practice patterns, and the human approves the triaged queue. The refactor stage then iterates through each item: high-risk items get a blast-radius critique first; every item gets up to **three attempts** to (implement → full regression test → focused re-score) before halting. After the queue drains, a final parallel reassessment compares against the snapshotted baseline and emits a pass / partial verdict. 
**Files** * `improve-codebase/TREE.yaml` — main **Install** ```sh mkdir -p .abtree/trees/improve-codebase \ && curl -fsSL https://raw.githubusercontent.com/flying-dice/abtree/main/.abtree/trees/improve-codebase/TREE.yaml \ -o .abtree/trees/improve-codebase/TREE.yaml ``` **Run with Claude** ```sh claude "Run the abtree improve-codebase tree on this repo. Use 'abtree --help' to learn the protocol. Set \$LOCAL.change_request to a one-line scope ('full repo' / 'just the auth module' / 'DRY only'); \$GLOBAL.test_command to the project's regression test command. Drive Check_Intent through to Cycle_Verdict, pausing for my approval at the triage gate. Surface the baseline-vs-final delta and any items that hit the per-item attempt cap." ``` *** ## Submitting your own Trees are just YAML — see [Writing trees](/guide/writing-trees) for the format. Open a PR against [`flying-dice/abtree`](https://github.com/flying-dice/abtree) adding your tree to `.abtree/trees//TREE.yaml` and an entry on this page, and it'll ship in the next release.
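Before opening the PR, a new tree can be scaffolded with a few shell lines. This is a minimal sketch: the slug `my-tree` and the single `Greet` action are placeholders, and the file contents mirror the minimal tree shown in the editor-integration example rather than any required template.

```sh
# Hypothetical slug; replace it with your tree's name.
slug="my-tree"
mkdir -p ".abtree/trees/${slug}"

# Write a minimal single-action tree. The first line is the language-server
# comment; "\$" keeps the shell from expanding $schema inside the heredoc.
cat > ".abtree/trees/${slug}/TREE.yaml" <<EOF
# yaml-language-server: \$schema=https://abtree.dev/schemas/tree.schema.json
name: ${slug}
version: 1.0.0
tree:
  type: action
  name: Greet
  steps:
    - instruct: say hello
EOF
```

Running `abtree tree list` afterwards should confirm that the tree loads.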