A Survey to Recent Progress Towards Understanding In-Context Learning
- URL: http://arxiv.org/abs/2402.02212v3
- Date: Fri, 24 Jan 2025 19:04:04 GMT
- Title: A Survey to Recent Progress Towards Understanding In-Context Learning
- Authors: Haitao Mao, Guangliang Liu, Yao Ma, Rongrong Wang, Kristen Johnson, Jiliang Tang
- Abstract summary: In-Context Learning (ICL) empowers Large Language Models (LLMs) with the ability to learn from a few examples provided in the prompt. Despite encouraging empirical success, the underlying mechanism of ICL remains unclear.
- Score: 37.933016939520684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-Context Learning (ICL) empowers Large Language Models (LLMs) with the ability to learn from a few examples provided in the prompt, enabling downstream generalization without requiring gradient updates. Despite encouraging empirical success, the underlying mechanism of ICL remains unclear. Existing research offers various, often ambiguous viewpoints, relying on intuition-driven and ad-hoc technical solutions to interpret ICL. In this paper, we leverage a data generation perspective to reinterpret recent efforts from a systematic angle, demonstrating the potential broader applicability of these popular technical solutions. For a conceptual definition, we rigorously adopt the terms skill recognition and skill learning. Skill recognition selects one data generation function previously seen during pre-training, while skill learning can learn new data generation functions from in-context data. Furthermore, we provide insights into the strengths and weaknesses of both abilities, emphasizing their commonalities through the perspective of data generation. This analysis suggests potential directions for future research.
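To make the skill recognition vs. skill learning distinction concrete, here is a minimal, illustrative Python sketch of few-shot ICL prompting. `query_llm` is a hypothetical stand-in for any LLM completion API, and both toy tasks are invented for illustration.

```python
# Few-shot ICL: demonstrations are placed in the prompt; no gradient updates occur.

def build_icl_prompt(demonstrations, query):
    """Concatenate (input, label) demonstrations followed by the unlabeled query."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demonstrations]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)

# Skill recognition: demonstrations point to a data generation function the
# model already saw during pre-training (ordinary sentiment classification).
recognition_demos = [
    ("The movie was wonderful.", "positive"),
    ("A tedious, joyless film.", "negative"),
]

# Skill learning: the mapping is novel and must be inferred from the
# in-context data itself (labels deliberately inverted).
learning_demos = [
    ("The movie was wonderful.", "negative"),
    ("A tedious, joyless film.", "positive"),
]

prompt = build_icl_prompt(recognition_demos, "An instant classic.")
# completion = query_llm(prompt)  # hypothetical API; model weights never change
```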
Related papers
- Language Guided Concept Bottleneck Models for Interpretable Continual Learning [62.09201360376577]
Continual learning aims to enable learning systems to acquire new knowledge constantly without forgetting previously learned information.
Most existing CL methods focus primarily on preserving learned knowledge to improve model performance.
We introduce a novel framework that integrates language-guided Concept Bottleneck Models to address both knowledge retention and interpretability.
arXiv Detail & Related papers (2025-03-30T02:41:55Z)
- In-Context Learning with Topological Information for Knowledge Graph Completion [3.035601871864059]
We develop a novel method that incorporates topological information through in-context learning to enhance knowledge graph completion performance.
Our approach achieves strong performance in the transductive setting, i.e., where nodes in the test graph are also present in the training graph.
Our method demonstrates superior performance compared to baselines on the ILPC-small and ILPC-large datasets.
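As a rough illustration (not the authors' exact method), the sketch below serializes one-hop neighborhood triples into an in-context prompt for a completion query; all entities and relations are invented.

```python
from collections import defaultdict

# Toy knowledge graph; triples are invented for illustration.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
]

# Index one-hop neighborhoods so each query is grounded in local topology.
neighbors = defaultdict(list)
for h, r, t in triples:
    neighbors[h].append((r, t))

def kgc_prompt(head, relation):
    """Build an in-context prompt exposing the query entity's neighborhood."""
    context = "\n".join(f"({head}, {r}, {t})" for r, t in neighbors[head])
    return (
        "Known triples around the query entity:\n"
        f"{context}\n"
        f"Complete the triple: ({head}, {relation}, ?)"
    )

print(kgc_prompt("Berlin", "located_in"))
```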
arXiv Detail & Related papers (2024-12-11T19:29:36Z)
- Explainability in AI Based Applications: A Framework for Comparing Different Techniques [2.5874041837241304]
In business applications, the challenge lies in selecting an appropriate explainability method that balances comprehensibility with accuracy.
This paper proposes a novel method for assessing the agreement of different explainability techniques.
By providing a practical framework for understanding the agreement of diverse explainability techniques, our research aims to facilitate the broader integration of interpretable AI systems in business applications.
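One concrete (hypothetical) way to quantify such agreement is to rank-correlate the feature attributions two explainers produce for the same prediction, as sketched below; the attribution values are invented and this is not the paper's specific framework.

```python
from scipy.stats import spearmanr

# Feature attributions for one prediction from two hypothetical explainers
# (e.g., a SHAP-style and a LIME-style method); values are invented.
attributions_a = [0.42, 0.10, 0.31, 0.05, 0.12]
attributions_b = [0.38, 0.15, 0.25, 0.02, 0.20]

# High rank correlation means the two techniques agree on feature importance.
rho, p_value = spearmanr(attributions_a, attributions_b)
print(f"Rank agreement (Spearman rho): {rho:.2f}")
```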
arXiv Detail & Related papers (2024-10-28T09:45:34Z)
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, while intelligence centers on model learning and prediction.
Recent trends demonstrate the potential homogeneity of these two fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
- Federated Learning driven Large Language Models for Swarm Intelligence: A Survey [2.769238399659845]
Federated learning (FL) offers a compelling framework for training large language models (LLMs).
We focus on machine unlearning, a crucial aspect for complying with privacy regulations like the Right to be Forgotten.
We explore various strategies that enable effective unlearning, such as perturbation techniques, model decomposition, and incremental learning.
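Of the strategies listed above, the perturbation-based one is the simplest to sketch: take gradient-ascent steps on the data to be forgotten so the model's fit to it degrades. The snippet below is a deliberately simplified illustration (real systems add retention constraints on the remaining data and verification of forgetting); the model and data are stand-ins.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                   # stand-in for a trained model
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

forget_x = torch.randn(8, 4)              # records subject to a deletion request
forget_y = torch.randint(0, 2, (8,))

for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(forget_x), forget_y)
    (-loss).backward()                    # ascent: *maximize* loss on the forget set
    optimizer.step()
```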
arXiv Detail & Related papers (2024-06-14T08:40:58Z)
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge (for instance, combining two separately learned facts to derive a third).
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
- Développement automatique de lexiques pour les concepts émergents : une exploration méthodologique (Automatic Lexicon Development for Emerging Concepts: A Methodological Exploration) [0.0]
This paper presents the development of a lexicon centered on emerging concepts, focusing on non-technological innovation.
It introduces a four-step methodology that combines human expertise, statistical analysis, and machine learning techniques to establish a model that can be generalized across multiple domains.
arXiv Detail & Related papers (2024-06-10T12:58:56Z)
- Exploring the landscape of large language models: Foundations, techniques, and challenges [8.042562891309414]
The article sheds light on the mechanics of in-context learning and a spectrum of fine-tuning approaches.
It explores how LLMs can be more closely aligned with human preferences through innovative reinforcement learning frameworks.
The ethical dimensions of LLM deployment are discussed, underscoring the need for mindful and responsible application.
arXiv Detail & Related papers (2024-04-18T08:01:20Z)
- A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
This research aims to bridge the gap between these disparate approaches by introducing a comprehensive and overarching framework that encompasses and reconciles the existing methodologies.
arXiv Detail & Related papers (2024-03-20T02:21:44Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
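The sketch below illustrates the general idea with an invented prompt format (the paper's actual construction may differ): each demonstration pairs a correct extraction with an incorrect one so the model sees what to avoid.

```python
# Contrastive demonstrations for relation extraction; examples are invented.
demos = [
    {
        "text": "Marie Curie was born in Warsaw.",
        "correct": "(Marie Curie, born_in, Warsaw)",
        "incorrect": "(Warsaw, born_in, Marie Curie)",  # reversed arguments
    },
]

def contrastive_prompt(demos, query_text):
    """Interleave correct and incorrect extractions, then append the query."""
    parts = [
        f"Text: {d['text']}\n"
        f"Correct extraction: {d['correct']}\n"
        f"Incorrect extraction (avoid): {d['incorrect']}"
        for d in demos
    ]
    parts.append(f"Text: {query_text}\nCorrect extraction:")
    return "\n\n".join(parts)

print(contrastive_prompt(demos, "Alan Turing was born in London."))
```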
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges [0.0]
Current datasets used for visiolinguistic (VL) pre-training only contain a limited amount of visual and linguistic knowledge.
External knowledge sources such as knowledge graphs (KGs) and Large Language Models (LLMs) are able to cover such generalization gaps.
arXiv Detail & Related papers (2023-03-04T13:12:18Z)
- Vision+X: A Survey on Multimodal Learning in the Light of Data [64.03266872103835]
Multimodal machine learning that incorporates data from various sources has become an increasingly popular research area.
We analyze the commonalities and unique characteristics of each data format, ranging across vision, audio, text, and motion.
We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels.
arXiv Detail & Related papers (2022-10-05T13:14:57Z)
- Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
We review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
arXiv Detail & Related papers (2022-08-24T04:26:21Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
A lack of interpretability, robustness, and out-of-distribution generalization is an emerging challenge for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and highlight the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Semantics of the Black-Box: Can knowledge graphs help make deep learning systems more interpretable and explainable? [4.2111286819721485]
Recent innovations in deep learning (DL) have shown enormous potential to impact individuals and society.
The black-box nature of DL models and their over-reliance on massive amounts of data pose challenges for the interpretability and explainability of these systems.
This article demonstrates how knowledge, provided as a knowledge graph, is incorporated into DL methods using knowledge-infused learning.
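One simple infusion pattern, sketched below, concatenates a KG-derived entity embedding with a learned text representation before the task head; this shallow-fusion variant and all dimensions are illustrative assumptions, not the article's sole scheme.

```python
import torch
import torch.nn as nn

class KnowledgeInfusedClassifier(nn.Module):
    """Shallow fusion: concatenate text features with a KG entity embedding."""

    def __init__(self, text_dim=64, kg_dim=32, n_entities=100, n_classes=2):
        super().__init__()
        # In practice these would be pre-trained KG embeddings (e.g., TransE).
        self.kg_emb = nn.Embedding(n_entities, kg_dim)
        self.head = nn.Linear(text_dim + kg_dim, n_classes)

    def forward(self, text_feat, entity_id):
        fused = torch.cat([text_feat, self.kg_emb(entity_id)], dim=-1)
        return self.head(fused)

model = KnowledgeInfusedClassifier()
logits = model(torch.randn(3, 64), torch.tensor([5, 17, 42]))  # batch of 3
```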
arXiv Detail & Related papers (2020-10-16T22:55:23Z)