Workflow Systems

Layered technical illustration of agent infrastructure beyond the protocol layer, including orchestration, policy, durability, observability, and operator controls

What Comes After MCP: The Next Layer of Agent Infrastructure

The live demo repo for this series is 67ailab/harness-engineering. For this final post, I did not change the repo before publishing; the codebase discussed here is the current public state at commit 7d01dae, the same commit introduced in the previous post when the repo gained a real blueprint export. That matters because this article is not about an imaginary next step. It is about what the current repo already makes obvious once you stop looking at MCP as the finish line. ...

Layered technical diagram of an agent harness with CLI, runner, policy, tools, tracing, memory, workflow, approval gate, and persisted artifacts

A Reference Blueprint for a Production Agent Harness

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did change the repo before publishing. The new repo commit is 7d01dae, which adds a real blueprint export to the demo so the architecture in this article is not just a hand-drawn diagram in prose. You can now run:

    PYTHONPATH=src python3 -m harness_engineering.cli blueprint --pretty
    PYTHONPATH=src python3 -m harness_engineering.cli blueprint --format markdown
    PYTHONPATH=src python3 -m harness_engineering.cli blueprint --format mermaid

That feature lives mainly in: ...

Technical illustration of planner, executor, and reviewer components connected by explicit handoffs and an approval gate before a final file write

Multi-Agent Systems Without the Theater

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did change the repo before publishing. The new capability shipped in commit dadf203, which adds a small but real multi-agent mode to the demo: the harness can now run with explicit planner, executor, and reviewer roles, persist role activity, record handoffs, and expose those artifacts through the CLI and saved run files. The core changes are in: ...

Technical illustration of an agent workflow paused at an approval gate while a human reviewer decides whether to continue

Human-in-the-Loop Done Properly

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did change the repo before publishing. The new capability shipped in commit 352fba2, which adds a first-class pending-approval inspection surface to the existing approval-gated harness. The key changes are in src/harness_engineering/runner.py, src/harness_engineering/cli.py, and src/harness_engineering/store.py. That matters because most writing about “human in the loop” in agent systems is still weirdly sloppy. A model says “should I proceed?”, a human types “yes”, and the demo declares the governance problem solved. It is not solved. In production, approval is not a vibe, not a chat convention, and not a magical hidden boolean inside the runtime. It is a workflow boundary with state, context, inspection, and recovery semantics. ...
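To make "a workflow boundary with state, context, inspection, and recovery semantics" concrete, here is a minimal sketch. It is not the repo's code: ApprovalRequest, ApprovalState, and pending are hypothetical names, and the fields are assumptions chosen only to show that an approval is a record you can inspect and decide exactly once.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class ApprovalState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ApprovalRequest:
    """Hypothetical record of a paused step awaiting a human decision."""
    run_id: str
    step: str
    context: Dict[str, Any]  # what the reviewer needs to see before deciding
    state: ApprovalState = ApprovalState.PENDING
    reviewer: Optional[str] = None
    note: str = ""

    def decide(self, reviewer: str, approve: bool, note: str = "") -> None:
        # A decision is valid only once; deciding again is an error, not a no-op.
        if self.state is not ApprovalState.PENDING:
            raise ValueError(f"request already {self.state.value}")
        self.state = ApprovalState.APPROVED if approve else ApprovalState.REJECTED
        self.reviewer = reviewer
        self.note = note


def pending(requests: List[ApprovalRequest]) -> List[ApprovalRequest]:
    """Inspection surface: everything still waiting on a human."""
    return [r for r in requests if r.state is ApprovalState.PENDING]
```

The useful property is that "who approved what, with which context" survives as data instead of living in a chat transcript.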

Layered agent memory diagram showing working context, session state, and retrieval memory around a checkpointed workflow

Memory Architecture for Agents: Context, Sessions, and State

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did change the repo before publishing. The new capability shipped in commit d20e352, which adds an explicit memory-layer model to the demo instead of treating every stored value as one blurry thing called “memory.” The core addition is src/harness_engineering/memory.py, plus wiring in src/harness_engineering/store.py and src/harness_engineering/cli.py so every run now emits a memory.json snapshot and the CLI exposes a memory command. ...
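As a rough illustration of what separating memory layers can look like, the sketch below models the three layers named in the diagram (working context, session state, retrieval memory) as distinct fields and serializes them to JSON. The class name, field names, and JSON shape are assumptions for illustration; the actual memory.json format in the repo may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class MemorySnapshot:
    """Hypothetical three-layer split; field names are illustrative only."""
    # What the model sees on the current turn.
    working_context: List[str] = field(default_factory=list)
    # Durable facts about this particular run.
    session_state: Dict[str, Any] = field(default_factory=dict)
    # Items that could be looked up again across runs.
    retrieval_memory: List[Dict[str, Any]] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys keep snapshots diff-friendly between runs.
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Keeping the layers as separate fields makes it obvious which values are safe to drop when context is trimmed and which must persist with the run.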

Engineering workflow diagram with checkpoints, event history, approval gate, and pause-resume arrows

Durable Execution Is the Difference Between a Demo and a System

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did change the repo before publishing. The new capability shipped in commit 9612b58, which adds persisted run summaries plus replay-oriented history inspection to the existing approval-gated harness. The key changes are in src/harness_engineering/store.py and src/harness_engineering/cli.py. That addition matters because durable execution is where most agent demos quietly stop being honest. It is easy to show a model calling tools in one uninterrupted run. It is much harder to explain what happens when execution pauses for approval, the process dies, the machine reboots, the reviewer returns malformed output, or an operator needs to understand what state the run is actually in. ...
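The pause-for-approval and process-death cases can be sketched with a checkpoint file that is rewritten after every step. This is a simplified illustration under stated assumptions, not the repo's implementation: the step names, the checkpoint shape, and the run function are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical step names for a single linear run.
STEPS = ["plan", "execute", "await_approval", "write_output"]


def load_checkpoint(path: Path) -> dict:
    """Return persisted state, or a fresh state if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": [], "status": "running"}


def save_checkpoint(path: Path, state: dict) -> None:
    # Write to a temp file, then rename, so a crash mid-write
    # does not leave a half-written checkpoint behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(path)


def run(path: Path, approved: bool = False) -> dict:
    state = load_checkpoint(path)
    for step in STEPS:
        if step in state["completed"]:
            continue  # finished by an earlier process; do not repeat side effects
        if step == "await_approval" and not approved:
            state["status"] = "paused_for_approval"
            save_checkpoint(path, state)
            return state
        state["completed"].append(step)
        save_checkpoint(path, state)
    state["status"] = "done"
    save_checkpoint(path, state)
    return state
```

Calling run(path) once pauses at the gate; calling run(path, approved=True) later, even from a new process, skips the completed steps and finishes the rest.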

Systems diagram showing an agent harness with workflow nodes, approval gates, manager-worker branches, and handoff arrows

Orchestration Patterns: Loops, Graphs, Managers, and Handoffs

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did add a real repo capability before publishing. The repo now includes a workflow export layer in src/harness_engineering/workflow.py, plus a workflow CLI command in src/harness_engineering/cli.py that renders the current harness orchestration as structured JSON or Mermaid. That change shipped in commit a007c08. That may sound like a documentation flourish. It is not. The point of an orchestration post is not to wave vaguely at boxes and arrows. It is to make the runtime’s control structure explicit enough that you can inspect it, reason about it, and argue about whether it is the right one. ...
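For a sense of what rendering orchestration "as structured JSON or Mermaid" involves, here is a small sketch. The WORKFLOW dictionary, its node and edge names, and the to_mermaid and to_json helpers are hypothetical and do not reflect the contents of workflow.py; only the Mermaid flowchart syntax itself is standard.

```python
import json

# Hypothetical workflow description; node and edge names are illustrative.
WORKFLOW = {
    "nodes": ["plan", "execute", "approval_gate", "write_output"],
    "edges": [
        ("plan", "execute", ""),
        ("execute", "approval_gate", ""),
        ("approval_gate", "write_output", "approved"),
        ("approval_gate", "plan", "rejected"),
    ],
}


def to_mermaid(workflow: dict) -> str:
    """Render nodes and labeled edges as a Mermaid top-down flowchart."""
    lines = ["flowchart TD"]
    for node in workflow["nodes"]:
        lines.append(f"    {node}[{node}]")
    for src, dst, label in workflow["edges"]:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    return "\n".join(lines)


def to_json(workflow: dict) -> str:
    """Render the same structure as JSON (tuples become arrays)."""
    return json.dumps(workflow, indent=2)
```

Because both outputs come from one data structure, the diagram cannot drift from the machine-readable description.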

Blueprint-style diagram of an agent runtime surrounded by tools, state, traces, approvals, and outputs

Anatomy of an Agent Harness

The live demo repo for this series is 67ailab/harness-engineering, and this post stays anchored to the code that exists there today. I did not add a new repo capability for this article. The point of this installment is to dissect the current harness as it actually stands: what lives in src/harness_engineering/, how the pieces fit together, and which parts are carrying the reliability burden. That matters because “agent” is now a dangerously overloaded word. Many teams still mean either a model that can call functions or a prompt loop with some memory and tool wrappers. Those are ingredients, not a runtime anatomy. ...