Categories
- Agents — Building, debugging and shipping AI agents that actually work.
- LLM Apps — End-to-end LLM application engineering: streaming, latency, cost, edge cases.
- RAG — Retrieval pipelines that don't hallucinate. Chunking, retrieval, reranking, eval.
- Vector Databases — Qdrant, pgvector, Weaviate, Pinecone — head-to-head with real workloads.
- Fine-tuning — LoRA, QLoRA, full SFT. When fine-tuning beats prompting and when it doesn't.
- Prompt Engineering — System prompts that survive edge cases. Patterns that scaled in production.
- Tool Comparisons — Honest, hands-on comparisons of dev tools, model APIs and inference services.
- Model Selection — Picking the right model for the task. Coding, reasoning, voice, vision.
- Cost Optimization — Cutting your inference bill by 5-10x. Real numbers from real apps.
- Production Ops — Monitoring, evals, rate-limit handling, retries — running LLMs at scale.