Lookahead RAG
Speculative retrieval planning for low-latency multi-hop QA.
🎯The Problem
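Agentic RAG (iterative tool calling) improves multi-hop QA accuracy, but each retrieval step waits on the previous one. The result is sequential latency, high cost, and context rot as intermediate tool output piles up in the prompt.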
💡The Solution
Use a small planner to predict a retrieval dependency graph from the question alone, retrieve all predicted evidence in parallel, and then make a single final synthesis call.
✨Key Highlights
- Retrieval dependency graph planning
- Parallel retrieval execution
- One-shot synthesis without tool loops
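
The three stages above (plan, parallel retrieval, one-shot synthesis) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: `PlanNode`, `plan`, `retrieve`, `retrieve_all`, `synthesize`, and `answer` are hypothetical names, the planner and retriever are stubs standing in for a model call and a search call, and the plan schema (an id, a query, and a list of dependencies) is only one plausible way to encode a retrieval dependency graph.

```python
import asyncio
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class PlanNode:
    """One predicted retrieval step (hypothetical schema)."""
    id: str
    query: str
    depends_on: list[str] = field(default_factory=list)


def plan(question: str) -> list[PlanNode]:
    """Stub for the small planner: returns a fixed two-hop plan."""
    return [
        PlanNode("a", f"first-hop evidence for: {question}"),
        PlanNode("b", f"second-hop evidence for: {question}", depends_on=["a"]),
    ]


async def retrieve(query: str) -> str:
    """Stub for a search call; a real retriever would do I/O here."""
    await asyncio.sleep(0.01)
    return f"[evidence for: {query}]"


async def retrieve_all(nodes: list[PlanNode]) -> dict[str, str]:
    """Fetch evidence for every planned node concurrently."""
    by_id = {n.id: n for n in nodes}
    # static_order() raises graphlib.CycleError if the plan is not a DAG,
    # and lists each node after the nodes it depends on.
    graph = {n.id: n.depends_on for n in nodes}
    order = list(TopologicalSorter(graph).static_order())
    # All queries were predicted up front, so none has to wait for
    # another node's result: issue them all at once.
    results = await asyncio.gather(*(retrieve(by_id[i].query) for i in order))
    return dict(zip(order, results))


def synthesize(question: str, evidence: dict[str, str]) -> str:
    """Stub for the single final model call: only builds the prompt."""
    lines = [f"{node_id}: {text}" for node_id, text in evidence.items()]
    return f"Question: {question}\nEvidence:\n" + "\n".join(lines)


def answer(question: str) -> str:
    nodes = plan(question)                       # 1. plan from the question alone
    evidence = asyncio.run(retrieve_all(nodes))  # 2. parallel retrieval
    return synthesize(question, evidence)        # 3. one synthesis call, no tool loop
```

In this sketch the dependency edges are used only to validate the plan and to present evidence in hop order. A planner whose later queries need results from earlier hops would instead have to run the graph level by level, which gives up part of the latency benefit.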