Definition

Full YAML source for Test Tree.

Source: .abtree/trees/test-tree/TREE.yaml

# yaml-language-server: $schema=https://abtree.dev/schemas/tree.schema.json
name: test-tree
version: 1.2.0
description: |
  Run a BDD test spec (TEST__<scenario>.yaml) against its target tree,
  capture the mermaid trace, compare the run's final state against the
  spec's `then` assertions, and write a markdown test report that embeds
  the mermaid diagram.

  The caller seeds $LOCAL.test_path with the path to the
  TEST__<scenario>.yaml file (e.g. ".abtree/trees/refine-plan/TEST__happy-path-in-session.yaml").
  The runner does the rest.

state:
  local:
    test_path: null            # caller-seeded: path to TEST__<scenario>.yaml
    test_spec: null            # parsed spec: { feature, tree, background, scenario }
    target_execution_id: null  # execution id of the run-under-test
    final_local: null          # full $LOCAL of the target execution at termination
    final_status: null         # "done" | "failure"
    mermaid_diagram: null      # raw contents of the .mermaid trace file
    assertions: null           # [{name, expected, actual, pass}, …]
    overall_result: null       # "pass" | "fail"
    report_path: null          # path to the rendered markdown report (lives next to the TEST spec)

tree:
  type: sequence
  name: Test_Runner
  children:
    - type: action
      name: Load_Spec
      steps:
        - evaluate: $LOCAL.test_path is set
        - instruct: >
            Read the YAML file at $LOCAL.test_path. Parse it and store the
            parsed object at $LOCAL.test_spec. The parsed object MUST contain
            at minimum: `tree` (slug of the tree under test), `scenario.name`,
            `scenario.given`, `scenario.when`, `scenario.then`. If any of
            those are missing or unreadable, submit failure — a malformed
            spec is not a test the runner can drive.

    - type: action
      name: Drive_Workflow
      steps:
        - evaluate: $LOCAL.test_spec is set
        - instruct: >
            Create a fresh execution of the tree named in $LOCAL.test_spec.tree
            with a summary derived from the scenario name. Store the new
            execution id at $LOCAL.target_execution_id.

            Seed the target execution's $LOCAL from
            $LOCAL.test_spec.background.initial.local (if present) via
            `abtree local write <target> <path> <json-value>` for each entry.

            Acknowledge the protocol gate on the target execution
            (first `abtree next` returns Acknowledge_Protocol — submit success).

            Drive the target execution to completion by walking each `when`
            step in scenario order. For evaluate steps the agent MUST read
            every $LOCAL/$GLOBAL path the expression references via
            `abtree local/global read <target>` before calling
            `abtree eval <target> true|false`. For instruct steps the agent
            MUST perform the work named (file I/O, shell, sub-agent — whatever
            the instruction directs) and write any produced values to $LOCAL
            via `abtree local write <target> …` before calling
            `abtree submit <target> success|failure|running`.

            FIXTURE-DRIVEN SIDE EFFECTS (VCR semantics). The runner does
            NOT invent values for side effects. Instead, the TEST spec
            cements the simulated outcomes in a `fixtures` block:

              fixtures:
                side_effects:
                  <key>:               # e.g. mr_open, git_push, http_post
                    <field>: <value>   # e.g. url, branch, commit_sha

            When an instruction in the target tree directs a side effect
            whose execution would normally require external authorisation
            (real git push, real MR/PR open, network calls, destructive
            shared-state ops), the runner:

              1. Looks up the matching key under $LOCAL.test_spec.fixtures.side_effects.
              2. If a fixture exists, writes its prescribed fields to the
                 target's $LOCAL exactly as the instruction directs (e.g.
                 `mr_open.url` → `$LOCAL.mr_url`), then submits success.
              3. If NO fixture exists for that side effect, the runner
                 must either (a) perform the side effect for real
                 (only when the operator has explicitly authorised it), or
                 (b) submit failure on that instruct. The runner never
                 fabricates a stand-in.

            This authority extends only to *external* side effects. Values
            that come from a real local tool/read named in the instruction
            ($LOCAL/$GLOBAL reads, file frontmatter scans, etc.) must
            still come from the actual source, never from a fixture.

            The driver loop terminates when `abtree next <target>` returns
            `{status: done}` or `{status: failure}`. If the runner hits an
            instruct that has no fixture and the runner is not authorised
            to perform it live, submit failure on the *target* execution
            and continue — the comparison step will then record the failure
            mode against the spec's expected status.

    - type: action
      name: Capture_Trace
      steps:
        - evaluate: $LOCAL.target_execution_id is set
        - instruct: >
            Read the mermaid trace at
            `.abtree/executions/<$LOCAL.target_execution_id>.mermaid` and
            store its full contents (verbatim) at $LOCAL.mermaid_diagram.
            If the file does not exist, submit failure.
        - instruct: >
            Read the target execution's full $LOCAL via
            `abtree local read <$LOCAL.target_execution_id>` and store the
            returned object at $LOCAL.final_local.

            Read the target execution's terminal status. Call
            `abtree next <$LOCAL.target_execution_id>` one more time — the
            runtime returns `{status: done}` or `{status: failure}` on a
            completed execution — and store that status string at
            $LOCAL.final_status.

    - type: action
      name: Compare_Assertions
      steps:
        - evaluate: $LOCAL.final_local is set and $LOCAL.final_status is set
        - instruct: >
            For each assertion in $LOCAL.test_spec.scenario.then, compute
            a record { name, expected, actual, pass }:

              - then.status → name "status", expected = scenario.then.status,
                actual = $LOCAL.final_status, pass = expected == actual.

              - then.local.<key> → name "local.<key>", expected = the spec
                value (treat literal scalars as strict equality; treat a
                string beginning with "matches " as a regex predicate, a
                string beginning with "starts with " as a prefix predicate,
                and the exact string "non-empty" as a non-emptiness check),
                actual = $LOCAL.final_local[key], pass = predicate(actual).

              - then.files.<key> → name "files.<key>", expected = the spec
                value, actual = the on-disk reality (file existence,
                frontmatter scan), pass = predicate(actual). If a
                files-assertion needs information the runner did not
                capture, record pass=false with actual="not captured" — do
                not invent.

              - then.git.* / then.mr.* / other external-side-effect
                assertions → if Drive_Workflow served the side effect from
                a `fixtures.side_effects` entry in the spec, record the
                fixture value as `actual` (prefixed "(fixture) ") and set
                pass=true provided the fixture satisfies the spec's
                pattern. If the side effect was performed live and
                observed, record the observed value (prefixed "(live) ")
                and pass=true/false per the predicate. If the side effect
                was neither fixture-served nor live-observed, record
                pass=false with actual="not captured".

            Store the array at $LOCAL.assertions. Do not skip assertions —
            an unverifiable, un-fixture-served assertion is a failed
            assertion, not an absent one.
        - instruct: >
            Compute $LOCAL.overall_result: "pass" if every record in
            $LOCAL.assertions has pass=true, else "fail".

    - type: action
      name: Write_Report
      steps:
        - evaluate: $LOCAL.assertions is set and $LOCAL.overall_result is set
        - instruct: >
            Compose a markdown test report with exactly these sections:

              # Test report — <scenario.name>

              **Tree:** <test_spec.tree>
              **Spec:** <test_path>
              **Target execution:** <target_execution_id>
              **Overall:** <overall_result> (uppercased: PASS / FAIL)

              ## Final $LOCAL
              (table: key | value — rendered from $LOCAL.final_local)

              ## Assertions
              (table: Name | Expected | Actual | Pass — rendered from $LOCAL.assertions)

              ## Trace
              ```mermaid
              <verbatim contents of $LOCAL.mermaid_diagram>
              ```

            Write it next to the spec, in the same directory as
            $LOCAL.test_path. The filename is
            `REPORT__<scenario-stem>__<YYYYMMDDTHHMMSSZ>.md`, where
            `<scenario-stem>` is the basename of $LOCAL.test_path with
            the `TEST__` prefix and `.yaml` extension stripped (so
            `.abtree/trees/hello-world/TEST__morning.yaml` produces
            `.abtree/trees/hello-world/REPORT__morning__<ts>.md`).
            Store the resulting path at $LOCAL.report_path.
        - evaluate: $LOCAL.report_path is set
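The `abtree local write <target> <path> <json-value>` seeding that Drive_Workflow performs for each `background.initial.local` entry can be sketched as follows. This is a minimal sketch: `seed_commands` is an illustrative helper, not part of the abtree CLI; only the command shape comes from the spec above.

```python
import json
import shlex

def seed_commands(target, initial_local):
    """Render one `abtree local write` invocation per background entry.

    `target` is the execution id; `initial_local` mirrors
    test_spec.background.initial.local. Values are JSON-encoded and
    shell-quoted so strings and nested objects survive the command line.
    """
    return [
        f"abtree local write {target} {path} {shlex.quote(json.dumps(value))}"
        for path, value in initial_local.items()
    ]

cmds = seed_commands("exec-123", {"plan_path": "docs/plan.md", "retries": 2})
```

JSON-encoding before quoting keeps the CLI's `<json-value>` contract: a string arrives as `"docs/plan.md"` (with quotes), a number as a bare `2`.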
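The fixture lookup that gives Drive_Workflow its VCR semantics reduces to a three-way decision. A sketch, assuming the TEST spec has been parsed into a dict; `resolve_side_effect` is a hypothetical helper name, not an abtree API.

```python
def resolve_side_effect(test_spec, key, authorised=False):
    """Decide how to serve one external side effect (e.g. key="mr_open").

    Returns ("fixture", fields) when the spec cements the outcome,
    ("live", None) when the operator has explicitly authorised the real
    thing, and ("failure", None) otherwise -- never a fabricated stand-in.
    """
    fixtures = (test_spec.get("fixtures") or {}).get("side_effects") or {}
    if key in fixtures:
        return ("fixture", fixtures[key])
    if authorised:
        return ("live", None)
    return ("failure", None)

spec = {"fixtures": {"side_effects": {"mr_open": {"url": "https://git.example/mr/7"}}}}
```

With that spec, `mr_open` is fixture-served, while an unfixtured `git_push` yields `("failure", None)` unless the operator passes `authorised=True`.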
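The small predicate language Compare_Assertions applies to `then.local.<key>` values ("matches ", "starts with ", "non-empty", plain equality) can be sketched like this; `check` and `build_records` are illustrative names introduced here, not part of the runner.

```python
import re

def check(expected, actual):
    """Apply the spec's predicate strings; anything else is plain equality."""
    if isinstance(expected, str):
        if expected == "non-empty":
            return actual not in (None, "", [], {})
        if expected.startswith("matches "):
            # regex search against the stringified actual value
            return actual is not None and re.search(expected[8:], str(actual)) is not None
        if expected.startswith("starts with "):
            return isinstance(actual, str) and actual.startswith(expected[12:])
    return expected == actual

def build_records(then_local, final_local):
    """One {name, expected, actual, pass} record per then.local entry."""
    return [
        {"name": f"local.{k}", "expected": e,
         "actual": final_local.get(k), "pass": check(e, final_local.get(k))}
        for k, e in then_local.items()
    ]

records = build_records(
    {"report_path": "starts with .abtree/", "overall_result": "pass"},
    {"report_path": ".abtree/trees/x/REPORT__y.md", "overall_result": "pass"},
)
overall = "pass" if all(r["pass"] for r in records) else "fail"
```

Note that a missing key falls through to `final_local.get(k)` returning `None`, so an absent value fails its predicate rather than being skipped, matching the "do not skip assertions" rule.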
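The `REPORT__` filename derivation in Write_Report is pure string work. A sketch under the naming rules above (`report_path` is a hypothetical helper; `removeprefix`/`removesuffix` need Python 3.9+):

```python
from datetime import datetime, timezone
from pathlib import Path

def report_path(test_path, now=None):
    """REPORT__<scenario-stem>__<YYYYMMDDTHHMMSSZ>.md, next to the spec."""
    p = Path(test_path)
    # scenario-stem: basename with the TEST__ prefix and .yaml extension stripped
    stem = p.name.removeprefix("TEST__").removesuffix(".yaml")
    ts = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    return str(p.with_name(f"REPORT__{stem}__{ts}.md"))

fixed = datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
out = report_path(".abtree/trees/hello-world/TEST__morning.yaml", fixed)
# out == ".abtree/trees/hello-world/REPORT__morning__20250102T030405Z.md"
```

`Path.with_name` keeps the report in the same directory as the spec, which is the "write it next to the spec" requirement.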

MIT licensed