Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks
- URL: http://arxiv.org/abs/2403.09479v1
- Date: Thu, 14 Mar 2024 15:20:54 GMT
- Title: Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks
- Authors: Yuncheng Huang, Qianyu He, Yipei Xu, Jiaqing Liang, Yanghua Xiao
- Abstract summary: We propose a probing framework to investigate whether atomic skills can spontaneously generalize to complex reasoning tasks.
We then introduce a hierarchical curriculum learning training strategy to achieve better skill generalization.
By leveraging hierarchical curriculum learning, we successfully induce generalization, significantly improving the performance of open-source LMs on complex reasoning tasks.
- Score: 40.7766635942194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current language models have demonstrated their capability to develop basic reasoning, but they struggle with more complicated reasoning tasks that require a combination of atomic skills, such as math word problems requiring skills like arithmetic and unit conversion. Previous methods either do not improve the inherent atomic skills of models or do not attempt to generalize the atomic skills to complex reasoning tasks. In this paper, we first propose a probing framework to investigate whether atomic skills can spontaneously generalize to complex reasoning tasks. Then, we introduce a hierarchical curriculum learning training strategy to achieve better skill generalization. In our experiments, we find that atomic skills cannot spontaneously generalize to compositional tasks. By leveraging hierarchical curriculum learning, we successfully induce generalization, significantly improving the performance of open-source LMs on complex reasoning tasks. Promisingly, skill generalization proves effective in cross-dataset and cross-domain scenarios. Complex reasoning can also help enhance atomic skills. Our findings offer valuable guidance for designing better training strategies for complex reasoning tasks.
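As a rough illustration of what the hierarchical curriculum strategy described above could look like in practice, the sketch below fine-tunes a model in stages ordered from atomic skills to compositional tasks. The `Stage` structure, the example stage contents, and the `fine_tune` callback are hypothetical placeholders, not the authors' implementation; the paper's actual curriculum design is described in the full text.

```python
# A minimal sketch of a hierarchical curriculum: fine-tune in stages,
# from atomic skills to their compositions, instead of training on the
# complex task alone. Everything here is an illustrative placeholder.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Stage:
    name: str
    examples: List[dict]  # supervised (prompt, target) pairs

def run_curriculum(model, stages: List[Stage],
                   fine_tune: Callable[[object, List[dict]], object]):
    """Fine-tune on each stage in order; earlier stages lay the foundation."""
    for stage in stages:
        print(f"Stage '{stage.name}': {len(stage.examples)} examples")
        model = fine_tune(model, stage.examples)
    return model

# Hypothetical stage ordering for a math-word-problem setting:
curriculum = [
    Stage("atomic: arithmetic", [{"prompt": "3 * 14 =", "target": "42"}]),
    Stage("atomic: unit conversion", [{"prompt": "2 km in m:", "target": "2000"}]),
    Stage("compositional: word problems", [{
        "prompt": "A runner covers 2 km at 3 min per km. How many minutes?",
        "target": "2 km at 3 min per km; 2 * 3 = 6. Answer: 6",
    }]),
]
```

The key design choice, per the abstract, is the ordering: atomic-skill stages precede compositional ones, since the paper finds that atomic skills do not generalize to compositions spontaneously.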
Related papers
- AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning [38.736190591684]
AtomR is a novel heterogeneous knowledge reasoning framework.
It decomposes complex questions into combinations of three atomic knowledge operators.
AtomR significantly outperforms state-of-the-art baselines across three single-source and two multi-source reasoning benchmarks.
arXiv Detail & Related papers (2024-11-25T15:35:51Z)
- Agentic Skill Discovery [19.5703917813767]
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control.
A remaining challenge is to acquire a diverse set of fundamental skills.
We introduce a novel framework for skill discovery that is entirely driven by LLMs.
arXiv Detail & Related papers (2024-05-23T19:44:03Z)
- Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models [68.18370230899102]
We investigate how to elicit compositional generalization capabilities in large language models (LLMs).
We find that demonstrating both foundational skills and compositional examples grounded in these skills within the same prompt context is crucial; a minimal sketch of such a prompt appears after this list.
We show that fine-tuning LLMs with SKiC-style data can elicit zero-shot weak-to-strong generalization.
arXiv Detail & Related papers (2023-08-01T05:54:12Z)
- A Theory for Emergence of Complex Skills in Language Models [56.947273387302616]
A major driver of AI products today is the fact that new skills emerge in language models when their parameter set and training corpora are scaled up.
This paper takes a different approach, analysing emergence using the famous (and empirical) Scaling Laws of LLMs and a simple statistical framework.
arXiv Detail & Related papers (2023-07-29T09:22:54Z)
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Complex Skill Acquisition Through Simple Skill Imitation Learning [0.0]
We propose a new algorithm that trains neural network policies on simple, easy-to-learn skills.
We focus on the case in which the complex task comprises a concurrent (and possibly sequential) combination of the simpler subtasks.
Our algorithm consistently outperforms a state-of-the-art baseline in training speed and overall performance.
arXiv Detail & Related papers (2020-07-20T17:06:26Z)
- Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
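To make the Skills-in-Context idea from the list above concrete, here is a minimal sketch of how such a prompt could be assembled, following the structure described in that entry's summary: foundational skill demonstrations first, then a compositional example that explicitly reuses them, then the query. The skill names and demonstrations are hypothetical placeholders, not taken from the SKiC paper.

```python
# Hypothetical sketch of a Skills-in-Context (SKiC) style prompt:
# foundational skills are demonstrated first, then a compositional
# example that explicitly reuses them, then the actual question.
# All skill names and demonstrations below are illustrative only.
def build_skic_prompt(skills, composed_example, question):
    parts = ["Basic skills:"]
    for name, demo in skills:
        parts.append(f"[{name}] {demo}")
    parts.append("Composed example (uses the skills above):")
    parts.append(composed_example)
    parts.append(f"Question: {question}")
    return "\n".join(parts)

prompt = build_skic_prompt(
    skills=[
        ("arithmetic", "3 * 4 = 12"),
        ("unit conversion", "2 km = 2000 m"),
    ],
    composed_example=(
        "Q: A path is 2 km long, marked every 500 m. How many marks after "
        "the start? A: 2 km = 2000 m (unit conversion); 2000 / 500 = 4 "
        "(arithmetic). Answer: 4"
    ),
    question="A rope is 3 km long and is cut into 750 m pieces. How many pieces?",
)
print(prompt)
```

Placing both the atomic demonstrations and a grounded composition in one context is the crucial ingredient the SKiC summary highlights; the query then follows the same compositional pattern.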
This list is automatically generated from the titles and abstracts of the papers on this site.