Blog

12 posts

Long-form articles and short thoughts on AI, product, and engineering.

AI & LLM · 5 min read

The Context Window Trap

1M-token context windows don't mean you should use them. When to chunk, summarize, or restructure instead of stuffing everything in.

AI & LLM · 6 min read

LLM Evaluation Is Hard

Building LLM-as-a-Judge for AssessAI taught me why automated evaluation needs human calibration, and where it breaks down.

AI & LLM · 5 min read

Voice AI in Production

What it takes to deploy conversational Voice AI that handles real phone calls. Latency budgets, failure modes, and lessons from Avoca.

AI & LLM · 4 min read

Embedding Models Compared: Cost, Quality, Latency

OpenAI ada-002 vs text-embedding-3 vs Cohere vs local models. Real benchmarks from production retrieval systems.

AI & LLM · 5 min read

AI Coding Tools Ranked: What I Actually Use

Claude Code vs Cursor vs Copilot vs Windsurf. Ranked by someone who uses them all for real work, not demos.

AI & LLM · 4 min read

Fine-Tuning Is Usually Wrong

A decision framework for when to fine-tune, when to prompt engineer, and when RAG is the right call. Most teams pick fine-tuning too early.

AI & LLM · 5 min read

Claude vs GPT for Code Generation

An honest comparison from daily use. Where each model wins, where each fails, and what I actually reach for.

AI & LLM · 5 min read

Multi-Agent Orchestration Lessons

What I learned building a 23-agent system. When to split agents, when to merge them, and why most multi-agent architectures are over-engineered.

AI & LLM · 5 min read

Building AI Agents That Don't Hallucinate

Grounding, tool use, and verification loops. What it took to ship agents that users actually trust.

AI & LLM · 4 min read

Prompt Engineering Is Dead

Structured outputs, function calling, and system-level constraints killed the artisanal prompt. Good riddance.

AI & LLM · 5 min read

Local LLMs Are Good Enough

Running Llama 3, Mistral, and Phi-3 on a MacBook. When cloud APIs are overkill and local inference makes more sense.

AI & LLM · 4 min read

RAG Patterns That Actually Work

Vector search is the easy part. Here's what matters after you've embedded your documents.