Language models can learn implicit multi-hop reasoning, but only if they have lots of training data
- URL: http://arxiv.org/abs/2505.17923v1
- Date: Fri, 23 May 2025 14:01:56 GMT
- Title: Language models can learn implicit multi-hop reasoning, but only if they have lots of training data
- Authors: Yuekun Yao, Yupei Du, Dawei Zhu, Michael Hahn, Alexander Koller
- Abstract summary: Implicit reasoning is the ability of a language model to solve multi-hop reasoning tasks in a single forward pass. We show that while such models can indeed learn implicit $k$-hop reasoning, the required training data grows exponentially in $k$.
- Score: 51.92147944576878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implicit reasoning is the ability of a language model to solve multi-hop reasoning tasks in a single forward pass, without chain of thought. We investigate this capability using GPT2-style language models trained from scratch on controlled $k$-hop reasoning datasets ($k = 2, 3, 4$). We show that while such models can indeed learn implicit $k$-hop reasoning, the required training data grows exponentially in $k$, and the required number of transformer layers grows linearly in $k$. We offer a theoretical explanation for why this depth growth is necessary. We further find that the data requirement can be mitigated, but not eliminated, through curriculum learning.
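To make the setup concrete, below is a minimal sketch of how one controlled $k$-hop example could be generated: a chain of single-hop facts plus a composed query whose answer requires all $k$ hops in a single forward pass. The entity and relation naming and the `make_khop_example` helper are illustrative assumptions, not the authors' released data pipeline.

```python
import random

def make_khop_example(k, n_entities=100, rng=None):
    """Build one synthetic k-hop example: k single-hop facts plus a query
    whose answer requires composing all k hops. Names are hypothetical
    placeholders, not the paper's actual dataset format."""
    rng = rng or random.Random()
    # Sample a chain of k+1 distinct entities: e0 -r1-> e1 -r2-> ... -rk-> ek
    chain = rng.sample(range(n_entities), k + 1)
    facts = [f"r{i+1}(e{chain[i]}) = e{chain[i+1]}" for i in range(k)]
    rng.shuffle(facts)  # present the single-hop facts in random order
    # The query composes all k relations, e.g. r3(r2(r1(e0))) for k = 3;
    # the label is the entity at the end of the chain.
    query = "".join(f"r{i}(" for i in range(k, 0, -1)) + f"e{chain[0]}" + ")" * k
    return {"context": " ; ".join(facts), "query": query, "answer": f"e{chain[-1]}"}

if __name__ == "__main__":
    ex = make_khop_example(k=3, rng=random.Random(0))
    print(ex["context"])
    print(ex["query"], "->", ex["answer"])
```

Under this kind of setup, the number of generated training examples would be scaled separately for each $k$ to probe the exponential data requirement the abstract reports.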
Related papers
- Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning [89.17086632436363]
We introduce a synthetic multihop reasoning environment designed to replicate the structure and distribution of real-world large-scale knowledge graphs. Our reasoning task involves completing missing edges in the graph, which requires advanced multi-hop reasoning and mimics real-world reasoning scenarios. To predict the optimal model size for a specific knowledge graph, we find an empirical scaling law that linearly maps the knowledge graph search entropy to the optimal model size.
arXiv Detail & Related papers (2025-04-04T17:57:22Z)
- Implicit Reasoning in Transformers is Reasoning through Shortcuts [10.351525484558376]
Test-time compute is emerging as a new paradigm for enhancing language models' complex multi-step reasoning capabilities. We investigate how language models perform implicit reasoning in multi-step tasks.
arXiv Detail & Related papers (2025-03-10T17:58:31Z)
- Reasoning with Latent Thoughts: On the Power of Looped Transformers [52.84192961524481]
We show that for many synthetic reasoning problems, a $k$-layer transformer looped $L$ times nearly matches the performance of a $kL$-layer non-looped model. Our empirical analysis reveals an intriguing phenomenon: looped and non-looped models exhibit scaling behavior that depends on their effective depth.
arXiv Detail & Related papers (2025-02-24T18:49:05Z)
- Transformers in the Service of Description Logic-based Contexts [2.8210912543324658]
We construct the natural language dataset, DELTA$_D$, using the description logic language $\mathcal{ALCQ}$.
We investigate the reasoning ability of a supervised fine-tuned DeBERTa-based model and of two large language models (GPT-3.5, GPT-4) with few-shot prompting.
Our results demonstrate that the DeBERTa-based model can master the reasoning task and that the performance of GPTs can improve significantly even when a small number of samples is provided.
arXiv Detail & Related papers (2023-11-15T13:23:24Z)
- Physics of Language Models: Part 3.1, Knowledge Storage and Extraction [51.68385617116854]
Large language models (LLMs) can store a vast amount of world knowledge, often extractable via question-answering.
We find a strong correlation between the model's ability to extract knowledge and various diversity measures of the training data.
arXiv Detail & Related papers (2023-09-25T17:37:20Z)
- APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning [73.3035118224719]
We propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities.
APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.
arXiv Detail & Related papers (2022-12-19T07:40:02Z)
- ALERT: Adapting Language Models to Reasoning Tasks [43.8679673685468]
ALERT is a benchmark and suite of analyses for assessing language models' reasoning ability.
ALERT provides a test bed to assess any language model on fine-grained reasoning skills.
We find that language models learn more reasoning skills during the finetuning stage than during the pretraining stage.
arXiv Detail & Related papers (2022-12-16T05:15:41Z)
- Discovering Latent Knowledge in Language Models Without Supervision [72.95136739040676]
Existing techniques for training language models can be misaligned with the truth.
We propose directly finding latent knowledge inside the internal activations of a language model in a purely unsupervised way.
We show that despite using no supervision and no model outputs, our method can recover diverse knowledge represented in large language models.
arXiv Detail & Related papers (2022-12-07T18:17:56Z)
- Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)