Writing: Field Notes
Field Notes. Long-form essays on shipping software with AI agents: plan-of-record, evidence layers, agent runtimes, and the anti-patterns engineering teams ship when a model is in the loop.
Start here
Three reader-paths through the 14 posts. Pick the one closest to your job today.
Eng leader adopting AI agents
Engineer building agent tooling
Security / platform / TPM at scale
-
21 patterns I keep seeing in teams shipping with AI coding agents.
A partial, in-progress list of the recurring dysfunctions, anti-patterns, and quiet wins from teams that ship software with AI coding agents day-to-day. Working notes, not a framework.
-
Deploy is not release. Release is not launch.
Three different verbs, three different owners, three different days. Decoupling them is the only honest way to ship software in 2026.
-
Cloud coding agents read your repo. Yours doesn't have to.
Why local-first is the right default for AI coding agents — and what choosing it actually costs.
-
Agents forget two ways. Most memory tools only fix one.
Plan-of-record and evidence layer are different primitives, solving different forms of forgetting. Most agent runtimes collapse them into one — or skip both — and pay the cost in the wrong direction.
-
Your AI subscription has a rate limit. Your agent doesn't know about it.
Flat-rate AI plans hide a cliff. The agent burns through your weekly Claude Max window, gets throttled mid-task, loses an hour of context. The fix isn't a different model — it's giving the agent visibility into the cap before it hits.
-
Agents forget. The plan should remember.
AI coding agents lose context every session. The work doesn't. What survives the reset is the plan-of-record — and most teams don't have one.
-
Your AI's memory is someone else's database.
Hosted memory layers and vector DBs hand the user's preferences, conversation history, and inferred claims to a vendor's infrastructure. The feature ships; the ownership doesn't.
-
LLM calls broke your resilience playbook. Add these back.
Retry, circuit breaker, timeout — the patterns that kept services up for two decades don't cover the failure modes of services that call LLMs and tools. Here's what's missing.
-
Three MCP tools in, you've started rebuilding Gin. Stop.
Hand-rolling MCP servers on raw SDKs means re-solving input decoding, validation, schema generation, errors, middleware, and transports — five times, in five repos, each slightly wrong. The protocol is the easy part.
-
Your security scanner can't see your AI code.
SAST tools stopped evolving at HTTP request validation. AI features live on top of HTTP, but the failure modes — prompt injection, embedding leakage, agent over-privilege, MCP hardening — happen at the LLM call site, in places traditional scanners don't look.
-
Switch statements are a state machine in denial.
Every team modeling order lifecycles, payment sagas, incident workflows — and now AI agent runtimes — ends up with the same 400-line switch block, three boolean flags, and a Slack channel where the bugs surface. The fix is to make the FSM a first-class artifact.
-
The Logger's Trilemma was a sampling artifact.
The Logger's Trilemma says you can have only two of speed, developer experience, and observability. The trilemma was a sampling problem; modern slog handlers can ship all three.
-
An agent runtime is five primitives. Most fake at least three.
Memory, time perception, commitment tracking, typed action, control plane — the missing infrastructure beneath the model. Most agent frameworks glue substitutes together and call it done.
-
Every 200-team rollout ends in a Google Sheet. Fix it in Jira.
Security patches, migrations, compliance audits — large-scale initiatives are fan-out problems Jira's defaults don't solve. Teams reach for spreadsheets and lose every benefit of the workflow they left.