Categories
- Agents — Building, debugging and shipping AI agents that actually work.
- LLM Apps — End-to-end LLM application engineering: streaming, latency, cost, edge cases.
- RAG — Retrieval pipelines that don't hallucinate. Chunking, retrieval, reranking, eval.
- Vector Databases — Qdrant, pgvector, Weaviate, Pinecone — head-to-head with real workloads.
- Fine-tuning — LoRA, QLoRA, full SFT. When fine-tuning beats prompting and when it doesn't.
- Prompt Engineering — System prompts that survive edge cases. Patterns that scaled in production.
- Tool Comparisons — Honest, hands-on comparisons of dev tools, model APIs and inference services.
- Model Selection — Picking the right model for the task. Coding, reasoning, voice, vision.
- Cost Optimization — Cutting your inference bill by 5-10x. Real numbers from real apps.
- Production Ops — Monitoring, evals, rate-limit handling, retries — running LLMs at scale.