# Small Language Models Can Match LLMs as Search Agents With Proper Training, Study Shows
Research shows that while small language models (SLMs) struggle as search agents out of the box, a lightweight fine-tuning approach can bring them to LLM-level performance on complex multi-hop reasoning tasks.
## The Problem
SLMs equipped with search tools exhibit surprising behavior:
- Hold less parametric knowledge than LLMs (expected)
- Search less often than their knowledge gaps require (unexpected)
- Hallucinate more readily than LLMs (concerning)
Simply distilling agentic behaviors from LLMs doesn't fully address these issues.
## The Solution: Explicit Search Training
The researchers propose a lightweight fine-tuning approach that explicitly trains SLMs to:
- Reliably retrieve — Know when and how to search
- Ground answers — Generate responses based on retrieved evidence
- Avoid adaptive search — A fixed, consistent search pattern outperforms letting the model decide when to search
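The training objectives above can be sketched as supervised fine-tuning data: each trajectory pairs a question with an explicit search call and an answer grounded in the retrieved evidence. This is a minimal illustration only — the tag names, record layout, and example content are assumptions, not the paper's actual training format.

```python
# Hypothetical sketch of one fine-tuning trajectory. The agent is trained
# to emit an explicit search step and to answer only from retrieved
# evidence, never from parametric recall alone.

def build_training_example(question: str, search_query: str,
                           evidence: str, answer: str) -> str:
    """Serialize one trajectory: question -> search -> evidence -> answer."""
    return (
        f"<question>{question}</question>\n"
        f"<search>{search_query}</search>\n"    # teaches when/how to search
        f"<evidence>{evidence}</evidence>\n"    # retrieved passage, not recalled
        f"<answer>{answer}</answer>"            # must be supported by evidence
    )

example = build_training_example(
    question="Who directed the film that won Best Picture in 1994?",
    search_query="Best Picture winner 1994 director",
    evidence=("Schindler's List, directed by Steven Spielberg, "
              "won Best Picture at the 1994 ceremony."),
    answer="Steven Spielberg",
)
print(example)
```

Serializing the search call and evidence as explicit segments lets ordinary next-token fine-tuning teach both behaviors at once: when to search, and that the answer must follow the evidence.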
## Results
| Benchmark | Improvement | Result |
|---|---|---|
| Bamboogle | +17.3 points | LLM-level |
| HotpotQA | +15.3 points | LLM-level |
## Counterintuitive Finding
"Adaptive search strategies in SLMs often degrade performance, highlighting the necessity of consistent search behavior for reliable reasoning."
Complex, adaptive search strategies actually hurt SLMs. Simple, consistent search patterns work better.
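A minimal sketch of what a "consistent" policy means here: the agent searches unconditionally at every reasoning hop instead of letting the model decide whether to search. The `retrieve` function, hop queries, and corpus below are toy stand-ins for a real retriever and model-generated queries, not anything from the paper.

```python
# Toy keyword retriever: returns corpus entries that share a word with
# the query. A real agent would call a search engine or dense retriever.
def retrieve(query: str, corpus: list[str]) -> list[str]:
    words = set(query.lower().split())
    return [doc for doc in corpus if words & set(doc.lower().split())]

def consistent_search_agent(hop_queries: list[str],
                            corpus: list[str]) -> list[str]:
    """Fixed policy: issue one search per hop, no adaptive skipping."""
    evidence = []
    for query in hop_queries:      # search every hop, unconditionally
        evidence.extend(retrieve(query, corpus))
    return evidence

corpus = [
    "Paris is the capital of France",
    "France borders Spain",
]
hops = ["what is the capital city", "which country borders spain"]
print(consistent_search_agent(hops, corpus))
```

An adaptive variant would insert a learned "should I search?" decision before each `retrieve` call; the finding quoted above is that for SLMs this extra decision point tends to degrade reliability rather than improve it.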
## Why It Matters
- Cost efficiency — SLMs are orders of magnitude cheaper to run than LLMs
- Edge deployment — SLMs can run on-device, enabling offline search agents
- Latency — Smaller models respond faster
- Democratization — Makes capable search agents available without depending on proprietary LLM APIs
## Implications for Agentica
For content platforms and agent services, this research suggests that effective search agents don't necessarily need frontier models — a well-trained SLM with search tools can deliver comparable results at a fraction of the cost.