Notes on building with AI.
Occasional writing on agent infrastructure, practical automation, and the gap between what AI demos promise and what production systems require.
-
Agent Infrastructure · 2026
The boring layer: why agent orchestration is mostly plumbing
Everyone wants to talk about the model. The hard part is the infrastructure around it — routing, retries, observability, graceful failure. What it actually takes to make a multi-agent workflow hold up in a real environment.
-
Voice · 2026
On-device STT in 2026: what local models are actually good at
A practical look at building voice transcription on Apple Silicon — where local models beat cloud APIs, where they don't, and what the latency/accuracy tradeoffs look like with real usage data.
-
Research · 2025
Using LLMs for systematic review screening: an honest benchmark
Not a theoretical exploration — a direct comparison of model output against a human-annotated gold set of 1,000+ papers. Recall, precision, where the models fail, and how to structure prompts for this class of task.
-
Automation · 2025
Prompt vs. hook: when to encode behavior in config and when in code
In agent systems there's a recurring decision: should this behavior live in a system prompt, a config rule, or a hard runtime hook? The answer matters more than it seems. A framework for thinking about enforcement layers.
More on the way
These pieces are in draft. More will appear here as projects wrap up and ideas solidify. If something above interests you, feel free to reach out; I'm happy to share notes early.
Get in touch →