Creative Genius
Lesson 2 of 5 · 18 min read

Chunking Strategies

Bad chunking is the #1 reason RAG retrieval fails. Get this layer right and everything downstream gets easier.

Chunking means splitting documents into small pieces that get embedded and stored. There's no universal best chunk size, but there are universal mistakes.

Three chunking strategies

  1. Fixed-size. Cut every N characters or tokens. Easy, but often wrong: it splits sentences mid-thought.
  2. Recursive structural. Split on paragraphs first, then sentences, then characters. Default winner for most prose.
  3. Semantic. Use an embedding model to detect topic shifts and split there. Best quality, highest cost, since the text has to be embedded just to decide where to cut.
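
To make strategy 2 concrete, here is a minimal sketch of a recursive structural splitter. The function name `recursive_split`, the `max_chars` budget, and the separator list are illustrative assumptions rather than any particular library's API; real splitters typically count tokens instead of characters and handle punctuation more carefully.

```python
def recursive_split(text, max_chars=200, separators=("\n\n", ". ", " ")):
    """Split on the coarsest separator first; recurse to finer ones only
    for pieces that are still too large. Illustrative sketch only."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # Nothing left to split on: fall back to a hard character cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        # Greedily pack neighbouring pieces into one chunk while they fit.
        candidate = current + sep + piece if current else piece
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # This piece alone is too big: try the next, finer separator.
            chunks.extend(recursive_split(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


doc = "Para one is short.\n\nPara two is longer. " + "It has many sentences. " * 20
for chunk in recursive_split(doc, max_chars=80):
    print(len(chunk), repr(chunk))
```

Note that the separator is dropped wherever a chunk boundary falls, so a chunk that ends at a sentence break loses its trailing period in this sketch. The short first paragraph survives as its own chunk, while the long one is broken at sentence boundaries rather than mid-sentence.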

Add 10–20% overlap between adjacent chunks to avoid losing context at boundaries.
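
As a rough illustration of the overlap rule, the sketch below slides a fixed-size window over a token list so that each chunk repeats the tail of the previous one. The name `chunk_with_overlap` and its parameters are hypothetical, and the 15% default is simply a value inside the 10–20% range; with 500-token chunks, that range works out to 50–100 shared tokens.

```python
def chunk_with_overlap(tokens, size=200, overlap_ratio=0.15):
    """Fixed-size windows where each chunk shares its first tokens with the
    end of the previous chunk. Illustrative sketch only."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(size * overlap_ratio)  # tokens shared by neighbours
    step = size - overlap                # how far the window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


tokens = list(range(10))  # stand-in for real tokens
print(chunk_with_overlap(tokens, size=4, overlap_ratio=0.25))
# [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

The same windowing can wrap the output of a structural splitter instead of raw tokens; the point is only that a sentence straddling a boundary appears whole in at least one chunk.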
