Technical illustration of an agent workflow feeding event traces into a compact observability panel and evaluation checklist

Tracing, Observability, and Evals for Agent Systems

The live demo repo for this series is 67ailab/harness-engineering, and for this post I did change the repo before publishing. The new capability shipped in commit 85c762c, which adds two concrete things the repo was missing:

- a persisted trace-summary surface for every run
- a lightweight eval runner with trace-aware fixtures

The key changes are in src/harness_engineering/tracing.py, src/harness_engineering/store.py, and src/harness_engineering/cli.py, plus the new src/harness_engineering/evals.py module and starter fixtures in sample_data/evals/basic.json. That matters because a lot of agent writing still treats observability as an afterthought and evals as a benchmark spreadsheet. In practice, most production pain shows up somewhere else: ...
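To make the two capabilities concrete, here is a minimal sketch of a per-run trace summary persisted to disk plus a fixture-driven eval runner that reads it back. All names here (`TraceSummary`, `persist_summary`, `run_evals`, and the fixture schema) are illustrative assumptions, not the actual API of the harness-engineering modules listed above.

```python
# Hypothetical sketch: persisted trace summaries + a trace-aware eval runner.
# Names and the fixture schema are assumptions, not the repo's real API.
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class TraceSummary:
    """Compact per-run rollup derived from the full event trace."""
    run_id: str
    steps: int
    tool_calls: int
    errors: int


def persist_summary(summary: TraceSummary, out_dir: Path) -> Path:
    """Write one JSON summary per run so later evals can read it back."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{summary.run_id}.json"
    path.write_text(json.dumps(asdict(summary), indent=2))
    return path


def run_evals(summary_path: Path, fixtures: list[dict]) -> list[dict]:
    """Check each fixture's bound against a persisted trace summary."""
    summary = json.loads(summary_path.read_text())
    results = []
    for fx in fixtures:
        observed = summary[fx["field"]]
        results.append({
            "name": fx["name"],
            "passed": observed <= fx["max"],
            "observed": observed,
        })
    return results


if __name__ == "__main__":
    s = TraceSummary(run_id="run-001", steps=12, tool_calls=5, errors=0)
    path = persist_summary(s, Path("trace_summaries"))
    fixtures = [
        {"name": "no-errors", "field": "errors", "max": 0},
        {"name": "bounded-steps", "field": "steps", "max": 20},
    ]
    for r in run_evals(path, fixtures):
        print(r)
```

The design choice worth noting: the eval runner consumes the *persisted* summary rather than the live trace, so the same fixtures can be replayed against any historical run.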

May 7, 2026 · 67 AI Lab
A digital isometric map of a futuristic infrastructure city with data pathways and autonomous agents.

The SRE Landscape: A Map of the Territory

If you ask five engineers to define Site Reliability Engineering (SRE), you will get five different answers. For some, it is simply “operations with a software mindset.” For others, it is strictly about error budgets and Service Level Objectives (SLOs). And for a growing number in 2026, it is the discipline of managing the AI agents that manage the systems. But before we can discuss Agentic SRE—the automation of reliability work by autonomous AI—we must agree on what work is actually being done. You cannot automate what you do not understand. ...

February 14, 2026 · 67 AI Lab