The live demo repo for this series is 67ailab/harness-engineering, and I changed the repo before publishing this post. The new repo commit is 3f2ec5d, which adds a checked-in baseline policy file at policy/default.json and tightens PolicyEngine so relative policy paths resolve from the policy file location rather than from the caller's current working directory.
That sounds like a small change. It is small in lines of code. It is not small in meaning.
Security failures in agent systems usually do not come from one dramatic exploit. They come from vague boundaries:
- a model can write somewhere you did not mean it to write
- a provider key leaks from local config into a public repo
- a policy file behaves differently depending on where the operator runs the command
- a “safe” tool becomes unsafe because nobody defined its action category clearly
- a local demo quietly depends on a model endpoint that was never actually verified
Those are harness problems.
That is why Post 11 is not about benchmark red-team theater or generic zero-trust slogans. It is about the boring but decisive mechanics around an agent loop: policy config, write boundaries, provider auth handling, secret hygiene, and verification.
In this repo, the relevant code now lives in:
- `src/harness_engineering/policy.py`
- `src/harness_engineering/tools.py`
- `src/harness_engineering/runner.py`
- `src/harness_engineering/provider.py`
- `src/harness_engineering/cli.py`
- `scripts/secret_scan.py`
- `policy/default.json`
- `tests/test_harness.py`
And I verified the repo state before writing this post with:
```
cd /home/james/.openclaw/workspace/harness-engineering
make check
PYTHONPATH=src python3 -m harness_engineering.cli doctor
```
For this run, make check passed, the secret scan passed, and doctor successfully reached the repo-local OpenAI-compatible endpoint and returned MODEL_OK for gemma4.
What changed in the repo since the previous post
Post 10 added explicit planner, executor, and reviewer handoffs. That made the harness more inspectable. Post 11 needed a different kind of inspectability: not who handed work to whom, but what the runtime is allowed to do at all.
The repo already had a meaningful security foundation before this run:
- `Tool.action_category` in `src/harness_engineering/tools.py`
- `PolicyEngine.evaluate()` in `src/harness_engineering/policy.py`
- pre-execution policy checks in `HarnessRunner._execute()` in `src/harness_engineering/runner.py`
- repo-local provider loading via `load_dotenv()` and `load_model_config()` in `src/harness_engineering/provider.py`
- `doctor_check()` in `src/harness_engineering/provider.py`
- `.env` ignore rules in `.gitignore`
- secret scanning in `scripts/secret_scan.py`
But there was still an honest gap between “security logic exists” and “this repo has a policy configuration story that an operator could actually use.”
The missing piece was a first-class policy artifact.
So I added policy/default.json as a checked-in baseline policy, documented it in README.md, and fixed PolicyEngine so relative roots in policy files are resolved against the policy file directory. That last detail matters more than it sounds. If the meaning of a policy file changes depending on the shell cwd, then the policy is not really stable configuration. It is ambient behavior.
Security in an agent harness is mostly boundary work
I think people often compress “agent security” into one bucket, but there are really at least four different concerns in this demo:
- Policy — what actions are allowed at all?
- Approval — what actions require a human before execution?
- Auth and provider config — how does the harness obtain model access safely?
- Secret hygiene — how do we avoid leaking credentials into git and artifacts?
The repo addresses each one separately, and that separation is healthy.
Approval is not the same as policy. A risky action can be policy-allowed and still require explicit approval. Auth is not the same as policy either. You can have a perfectly configured API key and still have a badly bounded runtime. And secret hygiene is not the same as auth. A key can work fine technically while still being handled recklessly.
This is exactly the sort of separation that harness engineering is supposed to force.
The policy layer: simple, explicit, and local
The core policy abstraction is PolicyEngine in src/harness_engineering/policy.py.
The critical methods and types are:
- `PolicyEngine.from_file()`
- `PolicyEngine.describe()`
- `PolicyEngine.evaluate()`
- `default_policy_config()`
- `load_policy_file()`
- `PolicyDecision`
The model here is intentionally small.
The harness does not attempt OS sandboxing. It does not manage syscall profiles. It does not isolate network egress. Instead, it does something narrower and more demonstrable:
- each tool has an explicit `action_category`
- risky write behavior is classified as `filesystem_write`
- write targets are checked against allowed roots
- denials are persisted in trace and summary artifacts
That categorization starts in src/harness_engineering/tools.py, where default_registry() registers the tools and marks finalize_report as both risky and filesystem_write:
- `search_mock` → `read_only`
- `extract_facts` → `transform`
- `draft_report` → `model_generation`
- `finalize_report` → `filesystem_write`
I like this design because the category is attached to the tool definition itself, not inferred from vague runtime guesses.
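To make that concrete, here is a minimal sketch of what a registry with declared action categories can look like. The names mirror the repo's tools, but the field layout and the stub implementations are my assumption, not the repo's actual code:

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    # e.g. "read_only", "transform", "model_generation", "filesystem_write"
    action_category: str
    risky: bool
    fn: Callable[..., object]


def default_registry() -> Dict[str, Tool]:
    # Categories are declared with the tool definition itself,
    # never inferred from runtime behavior.
    return {
        "search_mock": Tool("search_mock", "read_only", False, lambda query: []),
        "extract_facts": Tool("extract_facts", "transform", False, lambda text: []),
        "draft_report": Tool("draft_report", "model_generation", False, lambda facts: ""),
        "finalize_report": Tool("finalize_report", "filesystem_write", True,
                                lambda path, body: None),
    }
```

The payoff is that a policy engine can make decisions by looking at declared data, without ever executing the tool.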
Then HarnessRunner._execute() in src/harness_engineering/runner.py calls self.policy.evaluate(tool_name, kwargs) before tool execution. If the action is denied, the run fails cleanly, the denial is traced, and the decision is persisted. That means policy is not a README promise. It is part of execution state.
Why policy/default.json matters
Before this run, the repo had sample_data/policy/restrictive.json, which was useful as a denial example. But a denial example is not the same thing as a real baseline configuration.
Now the repo also has:
policy/default.json
That file makes the policy surface concrete and operator-visible. It says, in data rather than just code:
- which tools are enabled
- which action category each tool belongs to
- which output roots are allowed for writes
That sounds mundane, but mundane is exactly what you want from a policy artifact. Policy should be readable, boring, reviewable, and diffable.
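For illustration, a baseline policy file in this style might look something like the following. The schema here is my sketch of the idea, not a verbatim copy of `policy/default.json`:

```json
{
  "tools": {
    "search_mock": {"enabled": true, "action_category": "read_only"},
    "extract_facts": {"enabled": true, "action_category": "transform"},
    "draft_report": {"enabled": true, "action_category": "model_generation"},
    "finalize_report": {"enabled": true, "action_category": "filesystem_write"}
  },
  "allowed_write_roots": [".runs"]
}
```

Everything in it is plain data, which is what makes it reviewable in a pull request.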
The more interesting improvement is the relative-path fix in PolicyEngine._resolve_policy_path().
Previously, a relative path such as .runs would resolve from the current working directory. Now it resolves from self.config_base_dir, which is derived from the policy file path. PolicyEngine.describe() also exposes both configured roots and resolved roots, making it easier to inspect what the runtime actually thinks the policy means.
That is a subtle but important hardening move. Configuration files should be portable artifacts, not traps.
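The resolution rule itself is tiny. Here is a minimal sketch of the idea, assuming a helper that approximates what `_resolve_policy_path()` does (the function name and signature are mine):

```python
from pathlib import Path


def resolve_policy_root(root: str, config_base_dir: Path) -> Path:
    """Resolve a policy root relative to the policy file, not the shell cwd."""
    p = Path(root)
    if p.is_absolute():
        return p
    # Anchor relative roots to the directory containing the policy file,
    # so the policy means the same thing no matter where it is invoked from.
    return (config_base_dir / p).resolve()
```

With this rule, a root of `.runs` in `/repo/policy/default.json` always means `/repo/policy/.runs`, regardless of where the operator happens to run the CLI.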
tests/test_harness.py now includes test_policy_file_relative_roots_resolve_from_file_location, which verifies exactly that behavior.
Approval is adjacent to policy, not a substitute for it
This repo continues to model approval separately from policy, and I think that is correct.
In HarnessRunner.run_until_pause_or_complete(), the harness evaluates the would-be finalize_report write before the approval gate. If the write target falls outside allowed roots, the run fails immediately. If the write is policy-allowed, the harness creates a structured pending action and pauses in waiting_approval.
So the sequence is:
- policy check says whether the action is allowed at all
- approval gate says whether the human authorizes this particular execution
That is a better model than treating approval as the only guardrail. If something is disallowed by policy, there should not even be an approval UI for it.
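The two-stage ordering can be sketched in a few lines. This is a simplified illustration of the control flow, not the repo's actual runner code; `gate_finalize`, its callbacks, and the status strings are my invention:

```python
from dataclasses import dataclass


@dataclass
class PolicyDecision:
    allowed: bool
    reason: str


def gate_finalize(policy_evaluate, request_approval, tool_name: str, kwargs: dict) -> str:
    # Stage 1: policy decides whether the action is allowed at all.
    decision: PolicyDecision = policy_evaluate(tool_name, kwargs)
    if not decision.allowed:
        return f"failed: policy denied ({decision.reason})"
    # Stage 2: only policy-allowed actions ever reach the human approval gate.
    if not request_approval(tool_name, kwargs):
        return "waiting_approval"
    return "approved"
```

A policy-denied write never reaches the approval stage, which is exactly the invariant the post is arguing for.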
This separation also matches MCP’s security guidance reasonably well. The MCP tools specification explicitly says tools are model-controlled, but also says there should be a human in the loop with the ability to deny invocations and clear UI around tool exposure and confirmations. Protocols can standardize tool shape. They do not replace harness policy.
Provider auth handling: local, explicit, and testable
The provider/auth side of the repo lives in src/harness_engineering/provider.py.
The important functions are:
- `load_dotenv()`
- `load_model_config()`
- `create_client_from_env()`
- `doctor_check()`
- `OpenAICompatibleClient.list_models()`
- `OpenAICompatibleClient.chat()`
What I like here is not sophistication. It is legibility.
load_model_config() prefers repo-local HARNESS_* variables such as:
- `HARNESS_MODEL_PROVIDER`
- `HARNESS_MODEL_NAME`
- `HARNESS_OPENAI_BASE_URL`
- `HARNESS_OPENAI_API_KEY`
and falls back to broader environment names like MODEL_PROVIDER and OPENAI_API_KEY.
That preference order is a practical choice. It reduces accidental collisions with other shell-level credentials on a workstation or CI environment.
Then doctor_check() validates two things before you trust local-model behavior:
- can the endpoint answer `GET /models`?
- can it answer a minimal `/chat/completions` request correctly?
That is exactly what the repo’s article-writing workflow asked for, and it is the right level of proof for this demo. In this run, doctor returned:
- provider: `openai_compatible`
- model: `gemma4`
- status: `ok`
- message: `MODEL_OK`
That matters because too much agent writing quietly assumes a model backend exists and works. If a post depends on local-model behavior, the harness should show a real connectivity check.
For external reference, this shape is aligned with the common OpenAI-style pattern of a model listing endpoint plus chat completions over a list of messages. The repo is not claiming full vendor compatibility across every edge case. It is implementing the small slice it actually uses.
Secret hygiene: simple controls beat wishful thinking
The repo’s secret posture is intentionally basic, but real:
- `.gitignore` excludes `.env` and `.env.*` while allowing `.env.example`
- `.env.example` contains placeholders only
- `scripts/secret_scan.py` scans tracked files for obvious API key patterns
- `make check` runs both tests and secret scanning before a push
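A pattern-based scan of that kind is only a few lines. Here is a minimal sketch; the patterns are illustrative, not the repo's exact list:

```python
import re

# Obvious credential shapes. These are tripwires, not guarantees.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                       # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                          # AWS access key IDs
    re.compile(r"(?i)api[_-]?key\s*=\s*['\"][^'\"]{16,}['\"]"),  # hard-coded assignments
]


def scan_text(path: str, text: str) -> list[str]:
    """Return a finding line for every pattern hit, with file and line number."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in SECRET_PATTERNS:
            if pattern.search(line):
                findings.append(f"{path}:{lineno}: matches {pattern.pattern}")
    return findings
```

In a pre-push hook, a non-empty findings list fails the check, which is exactly the cheap tripwire behavior the repo wants.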
This is not a vault. It is not a centralized secrets platform. But it does reflect sound practice.
OWASP’s Secrets Management Cheat Sheet stresses centralization, standardization, access control, and automation of secret handling. This repo does not solve all of that, but it does at least avoid the most embarrassing failure mode for a public demo project: committing real credentials by accident.
That is the right standard for a teaching repo. Not complete enterprise secret governance, but evidence that the maintainers are not sleepwalking into leakage.
What the demo proves
This demo now proves a few things that are worth taking seriously.
1. Policy can be a real runtime primitive in a small harness
You do not need a giant platform to get value from policy. A tiny harness can still:
- classify actions explicitly
- deny unsafe writes before execution
- persist policy decisions for later inspection
- distinguish policy denial from approval withholding
2. Configuration details are part of security, not documentation garnish
The policy/default.json addition and relative-path fix are a good example. Security often depends on boring interpretation rules. If configuration is ambiguous, operators will eventually deploy the wrong thing.
3. Local-model auth should be verifiable, not assumed
The combination of repo-local HARNESS_* variables and doctor_check() gives the demo a credible local-provider story. That is much better than claiming “supports local models” without a health check.
4. Public demo repos can practice decent secret hygiene without overengineering
Ignoring .env, keeping .env.example clean, and running a secret scan before push is not glamorous. It is still real engineering discipline.
What it still does not solve
This is the important section.
The repo is better now, but it is still a small harness demo. It does not solve:
- network egress control
- subprocess sandboxing
- container or VM isolation
- per-user identity and authorization policy
- scoped credential minting
- secret rotation or revocation workflows
- audit-grade policy administration
- supply-chain controls on tool implementations
- SSRF protection for arbitrary network tools
- host-level enforcement outside the Python process
It also does not model the nastier parts of real agent security, such as prompt injection across untrusted tools, exfiltration attempts through secondary channels, or capability partitioning across multiple trust zones.
So no, this repo is not a secure agent platform. It is a practical demonstration that harness-level policy and auth hygiene are concrete engineering work, not vibes.
Honest limitations
I have three main reservations about the current design.
First, the policy model is still tied mostly to filesystem writes. That is useful, but incomplete. As soon as the harness gains networked tools, subprocess tools, or external side effects, the category model needs to widen.
Second, load_dotenv() is intentionally tiny. That keeps the repo dependency-light, but it also means the parser is not trying to be a full-featured dotenv implementation. For a demo, fine. For a broader system, I would want stricter config handling and clearer validation errors.
Third, the secret scan is pattern-based. Pattern scans are good tripwires, not guarantees. They reduce obvious mistakes; they do not prove the absence of secrets.
Still, I prefer this repo with these modest controls over a more impressive-looking repo with none of them.
The broader lesson
The broader lesson of Post 11 is that security in agent systems is mostly about making boundaries explicit enough that operators can reason about them.
That means:
- explicit tool categories
- explicit risky actions
- explicit policy files
- explicit provider config precedence
- explicit health checks
- explicit secret hygiene steps
The industry loves to talk about agent autonomy. I think most teams would be better served by talking about agent controllability.
If your harness cannot answer simple questions like these, it is not ready:
- What can this tool write, exactly?
- Which config file defines that boundary?
- How is that config resolved?
- Which environment variables actually win?
- Did we verify the model endpoint before relying on it?
- What prevents a credential from ending up in git?
This repo can now answer those questions more cleanly than it could yesterday. That is the kind of progress I trust.
References
- Live repo: https://github.com/67ailab/harness-engineering
- MCP tools specification: https://modelcontextprotocol.io/specification/2025-06-18/server/tools
- OWASP Secrets Management Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/Secrets_Management_Cheat_Sheet.html
- OpenAI API docs, models overview: https://developers.openai.com/api/docs/models
- OpenAI API docs, chat completions overview: https://developers.openai.com/api/reference/chat-completions/overview