
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Zorik Gekhman, Roee Aharoni, Eran Ofek, Mor Geva, Roi Reichart, Jonathan Herzig
Published: March 10, 2026
Authors: 6
Word count: 8,674
Code: included

Reasoning unlocks hidden factual knowledge in LLMs through computational buffering and semantic fact priming.

Abstract

While reasoning in LLMs plays a natural role in math, code generation, and multi-hop factual questions, its effect on simple, single-hop factual questions remains unclear. Such questions do not require step-by-step logical decomposition, making the utility of reasoning highly counterintuitive. Nevertheless, we find that enabling reasoning substantially expands the capability boundary of the model's parametric knowledge recall, unlocking correct answers that are otherwise effectively unreachable. Why does reasoning aid parametric knowledge recall when there are no complex reasoning steps to be done? To answer this, we design a series of hypothesis-driven controlled experiments, and identify two key driving mechanisms: (1) a computational buffer effect, where the model uses the generated reasoning tokens to perform latent computation independent of their semantic content; and (2) factual priming, where generating topically related facts acts as a semantic bridge that facilitates correct answer retrieval. Importantly, this latter generative self-retrieval mechanism carries inherent risks: we demonstrate that hallucinating intermediate facts during reasoning increases the likelihood of hallucinations in the final answer. Finally, we show that our insights can be harnessed to directly improve model accuracy by prioritizing reasoning trajectories that contain hallucination-free factual statements.
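The capability-boundary claim is measured with pass@k (the paper reports pass@100). A standard way to estimate this without bias is the combinatorial estimator over n sampled generations, of which c are correct; the function name here is illustrative, not from the paper:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples, drawn without replacement from n generations containing c
    correct answers, is correct."""
    if n - c < k:
        # Fewer incorrect generations than k: every k-subset contains a hit.
        return 1.0
    # 1 - P(all k sampled generations are incorrect)
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 5 correct answers out of 10 generations, `pass_at_k(10, 5, 1)` gives 0.5, while `pass_at_k(10, 5, 5)` approaches 1.0 as k grows.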

Key Takeaways

  1. Reasoning substantially expands LLM parametric knowledge boundaries for simple factual questions, nearly doubling pass@100 performance.

  2. Two mechanisms drive reasoning benefits: computational buffer effects and factual priming through generative self-retrieval of related facts.

  3. Hallucinated intermediate facts during reasoning increase the likelihood of final-answer hallucinations, creating inherent reliability risks.
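
The paper's accuracy improvement comes from prioritizing reasoning trajectories whose intermediate factual statements are hallucination-free. A minimal sketch of that selection step, assuming caller-supplied `extract_facts` and `is_supported` hooks (both hypothetical; the paper uses search-enabled verification):

```python
from typing import Callable, List, Optional

def select_trajectory(
    trajectories: List[str],
    extract_facts: Callable[[str], List[str]],
    is_supported: Callable[[str], bool],
) -> Optional[str]:
    """Return the first sampled reasoning trace whose extracted intermediate
    facts all pass verification; return None when every trace contains at
    least one unsupported (hallucinated) statement."""
    for trace in trajectories:
        if all(is_supported(fact) for fact in extract_facts(trace)):
            return trace
    return None
```

In practice the hooks would be a claim extractor and a retrieval-backed fact checker; here they are left abstract so the prioritization logic stands on its own.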

Limitations

  • Evaluation limited to two closed-book QA datasets; generalization to other knowledge-intensive tasks remains unclear.

  • Hallucination audit relies on search-enabled verification, which may not catch all factual errors in reasoning traces.

Keywords

large language models · parametric knowledge recall · reasoning · computational buffer effect · factual priming · generative self-retrieval · hallucination · reasoning trajectories
