Production engineering.
The unglamorous engineering disciplines that separate AI products that ship from AI products that get cancelled. Telemetry, infra, ops, and operating models.
Opinion
How much does it cost to build an AI agent in 2026?
Pricing for AI work is opaque. Here's the honest breakdown: what a prototype costs, what production costs, what operations cost, and what makes the numbers move.
Field notes
LangSmith vs Langfuse vs Arize vs Braintrust: comparing AI observability platforms.
Four platforms, four philosophies. We've shipped on all of them. Here's the honest comparison — what each does well, what each doesn't, and how to pick.
Engineering
How to write your first AI eval suite without a framework.
You don't need LangSmith, Braintrust, or any platform to ship your first eval suite. Most production-grade evals start as 100 prompts in a JSON file and a script. Here's the playbook.
Operations
AI safety in production: a checklist that actually ships.
Safety isn't a content filter you add at the end. It's an architecture. These six layers are non-negotiable before any AI product touches real users.
Engineering
Evals that actually catch regressions before users do.
The eval suite most teams ship with is a confidence-builder, not a regression detector. Here's the structure we use to catch real failures earlier.
Operations
Guardrails that survive contact with real users.
Why bolt-on safety layers fail and what production-grade guardrail architecture actually looks like in 2026.