Can Language Models Compose Skills In-Context?
- URL: http://arxiv.org/abs/2510.22993v1
- Date: Mon, 27 Oct 2025 04:18:59 GMT
- Title: Can Language Models Compose Skills In-Context?
- Authors: Zidong Liu, Zhuoyan Xu, Zhenmei Shi, Yingyu Liang
- Abstract summary: We investigate the in-context composition ability of language models to perform composite tasks. Simple task examples can have a surprising negative impact on performance. It is crucial to align examples with the corresponding steps in the composition.
- Score: 18.46964936867475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Composing basic skills from simple tasks to accomplish composite tasks is crucial for modern intelligent systems. We investigate the in-context composition ability of language models to perform composite tasks that combine basic skills demonstrated in in-context examples. This is more challenging than the standard setting, where skills and their composition can be learned in training. We conduct systematic experiments on various representative open-source language models, utilizing linguistic and logical tasks designed to probe composition abilities. The results reveal that simple task examples can have a surprising negative impact on performance, because the models generally struggle to recognize and assemble the skills correctly, even with Chain-of-Thought examples. Theoretical analysis further shows that it is crucial to align examples with the corresponding steps in the composition. These insights inspire a method for the probing tasks, whose improved performance provides positive support for our analysis.
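To make the probing setup concrete, below is a minimal sketch. The two basic skills (word reversal and uppercasing) and the prompt layout are illustrative assumptions standing in for the paper's linguistic tasks, not the authors' actual tasks or code; it contrasts a prompt built from simple-task examples with one built from composite examples aligned with the composition steps.

```python
# A minimal sketch (not the paper's code) of the in-context composition
# probe described in the abstract. The two basic skills below, word
# reversal and uppercasing, are hypothetical stand-ins for the paper's
# linguistic tasks; the prompt layout is an illustrative assumption.

def reverse(word: str) -> str:
    """Basic skill 1: reverse the letters of a word."""
    return word[::-1]

def uppercase(word: str) -> str:
    """Basic skill 2: rewrite a word in capital letters."""
    return word.upper()

def composite(word: str) -> str:
    """Composite task: apply skill 1, then skill 2."""
    return uppercase(reverse(word))

def build_prompt(query: str, demos: list[tuple[str, str]]) -> str:
    """Format (input, output) demonstrations followed by the test query."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

# Condition 1: simple-task examples only, one list per basic skill.
simple_demos = (
    [(w, reverse(w)) for w in ["apple", "stone"]]
    + [(w, uppercase(w)) for w in ["apple", "stone"]]
)

# Condition 2: composite examples whose demonstrations are aligned with
# the steps of the composition, the setting the paper's analysis favors.
composite_demos = [(w, composite(w)) for w in ["river", "cloud"]]

print(build_prompt("table", simple_demos))
print("---")
print(build_prompt("table", composite_demos))
```

Under the paper's findings, prompts like the first condition can hurt: the model must recognize which skills to chain, whereas the second condition aligns each demonstration with the composition itself.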
Related papers
- Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following [50.377287115281476]
We show that learning to associate the representations of current and future states with a temporal loss can improve compositional generalization. We evaluate our approach across diverse robotic manipulation tasks as well as in simulation, showing substantial improvements for tasks specified with either language or goal images.
arXiv Detail & Related papers (2025-02-08T05:26:29Z)
- In-context Learning in Presence of Spurious Correlations [8.055478206164105]
We study the possibility of training an in-context learner for classification tasks involving spurious features.
We find that the conventional approach of training in-context learners is susceptible to spurious features.
We propose a novel technique to train such a learner for a given classification task.
arXiv Detail & Related papers (2024-10-04T04:26:36Z)
- Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability [12.349247962800813]
Large language models (LLMs) have emerged as powerful tools for many AI problems.
They exhibit remarkable in-context learning (ICL) capabilities.
How they approach composite tasks remains an open and largely underexplored question.
arXiv Detail & Related papers (2024-07-22T15:22:34Z)
- Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models [68.18370230899102]
We investigate how to elicit compositional generalization capabilities in large language models (LLMs).
We find that demonstrating both foundational skills and compositional examples grounded in these skills within the same prompt context is crucial (a minimal sketch of this prompt layout appears after this list).
We show that fine-tuning LLMs with SKiC-style data can elicit zero-shot weak-to-strong generalization.
arXiv Detail & Related papers (2023-08-01T05:54:12Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model learns from self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm: the bottom level is shared across all tasks, the mid level is divided into different groups, and the top level is assigned to each task (a minimal sketch of this structure also appears after this list).
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact of irrelevant tasks.
arXiv Detail & Related papers (2022-08-19T02:46:20Z)
- Comparison and Analysis of New Curriculum Criteria for End-to-End ASR [10.698093106994804]
Curriculum Learning is built on the observation that organized and structured assimilation of knowledge enables faster training and better comprehension.
We employ Curriculum Learning in the context of Automatic Speech Recognition.
To impose structure on the training set, we explore multiple scoring functions that either use feedback from an external neural network or incorporate feedback from the model itself.
arXiv Detail & Related papers (2022-08-10T06:56:58Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding [101.24748444126982]
Decomposable tasks are complex and comprise a hierarchy of sub-tasks.
Existing benchmarks, however, typically hold out examples for only the surface-level sub-task.
We propose a framework to construct robust test sets using coordinate ascent over sub-task specific utility functions.
arXiv Detail & Related papers (2021-06-29T02:53:59Z)
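As referenced in the Skills-in-Context Prompting entry above, here is a minimal sketch of the prompt layout that entry describes: foundational skills and a composed example grounded in those skills share one prompt context. The skill names, wording, and formatting are assumptions for illustration, not the SKiC authors' prompts.

```python
# A minimal sketch (an assumption, not the SKiC authors' code) of the
# Skills-in-Context layout: skill descriptions, a composed demonstration
# grounded in those skills, and the query, all in one prompt context.

SKILLS = """\
Skill 1 (reverse): write the letters of a word in reverse order.
Example: Input: cat -> Output: tac

Skill 2 (uppercase): rewrite a word in capital letters.
Example: Input: cat -> Output: CAT
"""

COMPOSED_DEMO = """\
Composite task (reverse, then uppercase), using Skill 1 and Skill 2:
Input: dog -> reverse gives god -> uppercase gives GOD -> Output: GOD
"""

def skic_style_prompt(query: str) -> str:
    """Concatenate skill descriptions, a grounded composed demo, and the query."""
    return f"{SKILLS}\n{COMPOSED_DEMO}\nInput: {query} -> Output:"

print(skic_style_prompt("sun"))
```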
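And for the Coarse-to-Fine entry, a minimal sketch of the three-level structure it describes: a bottom encoder shared by all tasks, mid-level modules shared within task groups, and top-level heads owned by individual tasks. The tasks, group assignments, module types, and dimensions below are hypothetical, not the authors' architecture.

```python
# A minimal sketch (an assumed design, not the authors' code) of a
# coarse-to-fine hierarchy: shared bottom, per-group middle, per-task top.
import torch
import torch.nn as nn

class CoarseToFineModel(nn.Module):
    def __init__(self, d_model: int, task_to_group: dict[str, str],
                 n_classes: dict[str, int]):
        super().__init__()
        # Bottom level: one encoder shared across all tasks.
        self.shared = nn.Linear(d_model, d_model)
        # Mid level: one module per task group.
        self.group_layers = nn.ModuleDict({
            g: nn.Linear(d_model, d_model) for g in set(task_to_group.values())
        })
        # Top level: one output head per task.
        self.heads = nn.ModuleDict({
            t: nn.Linear(d_model, n) for t, n in n_classes.items()
        })
        self.task_to_group = task_to_group

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        h = torch.relu(self.shared(x))                                   # all tasks
        h = torch.relu(self.group_layers[self.task_to_group[task]](h))   # task's group
        return self.heads[task](h)                                       # this task only

model = CoarseToFineModel(
    d_model=16,
    task_to_group={"nli": "pair_cls", "qnli": "pair_cls", "sts": "regression"},
    n_classes={"nli": 3, "qnli": 2, "sts": 1},
)
print(model(torch.randn(2, 16), task="nli").shape)  # torch.Size([2, 3])
```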