Why Chunking Strategy Matters
Simple chunking works for flat text, but it breaks down on structured documents: fixed-size splits cut across headings, tables, and nested sections, separating answers from the context needed to retrieve them.
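To make the failure mode concrete, here is a minimal sketch (a hypothetical toy example, not DocSlicer's actual algorithm): a fixed-size splitter severs headings from their bodies, while even a simple header-aware splitter keeps each section intact.

```python
import re

# Toy structured document: two sections under markdown headers.
DOC = """# Risk Factors
Supply chain disruption could materially affect operating results.

# Financial Statements
Consolidated net sales grew year over year."""

def fixed_size_chunks(text: str, size: int = 60) -> list[str]:
    """Naive fixed-size chunking: split every `size` characters,
    ignoring document structure entirely."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def header_chunks(text: str) -> list[str]:
    """Simplified structure-aware chunking: split at markdown
    headers so each heading stays attached to its own body."""
    parts = re.split(r"(?m)^(?=# )", text)
    return [p.strip() for p in parts if p.strip()]

# The fixed-size splitter cuts mid-sentence and detaches the
# "Financial Statements" heading from its content; the header
# splitter yields one coherent chunk per section.
for chunk in fixed_size_chunks(DOC):
    print(repr(chunk))
for chunk in header_chunks(DOC):
    print(repr(chunk))
```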
Industry-Leading Retrieval Performance
Evaluated on three public structured-document benchmarks spanning legal, academic, and technical standards: 697 documents and 22,000+ gold-standard questions in total.
Results are macro-averaged across all three benchmarks.
| Method | Recall@1 | Recall@5 | MRR@5 | nDCG@5 | Chunks/Doc | Tokens/Chunk | Efficiency |
|---|---|---|---|---|---|---|---|
| Fixed Token (500) | 0.34 | 0.67 | 0.46 | 0.48 | 32 | 492 | 1.37 |
| LangChain RecursiveCharacterTextSplitter | 0.31 | 0.66 | 0.44 | 0.46 | 38 | 398 | 1.67 |
| Docling HierarchicalChunker | 0.28 | 0.59 | 0.39 | 0.41 | 171 | 376 | 1.56 |
| Flat Header Splitter | 0.47 | 0.84 | 0.61 | 0.66 | 11 | 1083 | 0.78 |
| DocSlicer | 0.58 | 0.86 | 0.69 | 0.70 | 38 | 374 | 2.30 |
Evaluation Methodology
Recall@k
The percentage of questions for which the answer chunk appears in the top k retrieved results.
MRR (Mean Reciprocal Rank)
How early the answer appears: each question scores the reciprocal of the answer's rank (rank 1: 1.0, rank 2: 0.5, rank 3: 0.33), averaged over all questions; answers beyond rank 5 score 0.
nDCG (Normalized DCG)
Overall ranking quality on a 0-1 scale. Higher means relevant results appear earlier and more consistently.
Context Efficiency
Quality per token cost: Recall@5 divided by average tokens per chunk, in thousands. Higher means more recall for every token of context sent to the LLM.
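For concreteness, here is a minimal sketch of how these metrics can be computed from retrieval results. It is our own illustration of the standard definitions, assuming binary relevance and one gold chunk per question, not the benchmark's actual evaluation harness.

```python
import math

def recall_at_k(gold_ranks: list[int | None], k: int) -> float:
    """Fraction of questions whose answer chunk appears in the top k.
    gold_ranks[i] is the 1-based rank of question i's answer,
    or None if it was not retrieved at all."""
    hits = sum(1 for r in gold_ranks if r is not None and r <= k)
    return hits / len(gold_ranks)

def mrr_at_k(gold_ranks: list[int | None], k: int = 5) -> float:
    """Mean reciprocal rank, truncated at rank k."""
    return sum(1.0 / r for r in gold_ranks
               if r is not None and r <= k) / len(gold_ranks)

def ndcg_at_k(gold_ranks: list[int | None], k: int = 5) -> float:
    """nDCG with binary relevance and a single gold chunk: the ideal
    DCG is 1/log2(2) = 1, so no further normalization is needed."""
    return sum(1.0 / math.log2(r + 1)
               for r in gold_ranks if r is not None and r <= k) / len(gold_ranks)

def context_efficiency(recall_at_5: float, tokens_per_chunk: float) -> float:
    """Recall@5 per thousand tokens of average chunk size,
    matching the table above (e.g. DocSlicer: 0.86 / 0.374 ~= 2.30)."""
    return recall_at_5 / (tokens_per_chunk / 1000.0)

# Three questions: answer retrieved at rank 1, at rank 3, and not at all.
ranks = [1, 3, None]
print(recall_at_k(ranks, 5))          # 2/3 ~= 0.667
print(mrr_at_k(ranks))                # (1.0 + 1/3) / 3 ~= 0.444
print(ndcg_at_k(ranks))               # (1.0 + 0.5) / 3 = 0.5
print(context_efficiency(0.86, 374))  # ~= 2.30, as in the table
```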
Better RAG Outcomes with DocSlicer
See how DocSlicer's structure-aware chunking delivers more accurate answers than simple chunking methods. The interactive demo compares the two pipelines side by side on an Apple 10-K filing, an SEC document containing financial statements and risk factors.