Can Input Attributions Interpret the Inductive Reasoning Process in In-Context Learning?
- URL: http://arxiv.org/abs/2412.15628v3
- Date: Tue, 18 Feb 2025 06:17:50 GMT
- Title: Can Input Attributions Interpret the Inductive Reasoning Process in In-Context Learning?
- Authors: Mengyu Ye, Tatsuki Kuribayashi, Goro Kobayashi, Jun Suzuki,
- Abstract summary: In this paper, we design synthetic diagnostic tasks of inductive reasoning, inspired by the generalization tests in linguistics.<n>The question is whether conventional input attribution (IA) methods can track such a reasoning process, identify the influential example, in in ICL.
- Score: 19.268703560816498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpreting the internal process of neural models has long been a challenge. This challenge remains relevant in the era of large language models (LLMs) and in-context learning (ICL); for example, ICL poses a new issue of interpreting which example in the few-shot examples contributed to identifying/solving the task. To this end, in this paper, we design synthetic diagnostic tasks of inductive reasoning, inspired by the generalization tests in linguistics; here, most in-context examples are ambiguous w.r.t. their underlying rule, and one critical example disambiguates the task demonstrated. The question is whether conventional input attribution (IA) methods can track such a reasoning process, i.e., identify the influential example, in ICL. Our experiments provide several practical findings; for example, a certain simple IA method works the best, and the larger the model, the generally harder it is to interpret the ICL with gradient-based IA methods.
Related papers
- Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B [46.99314622487279]
In-Context Learning (ICL) is an intriguing ability of large language models (LLMs)
We use causal interventions to identify information flow in Gemma-2 2B for five naturalistic ICL tasks.
We find that the model infers task information using a two-step strategy we call contextualize-then-aggregate.
arXiv Detail & Related papers (2025-03-31T18:33:55Z) - Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning [20.09682048610498]
In this study, we propose a curriculum ICL strategy guided by problem-solving logic.
We select demonstration examples by analyzing the problem-solving logic and order them based on curriculum learning.
Our method outperforms previous ICL approaches in terms of performance and efficiency.
arXiv Detail & Related papers (2025-02-21T12:00:10Z) - MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models [19.81485079689837]
We evaluate large language models' capabilities in inductive and deductive stages.
We find that the models tend to consistently conduct correct deduction without correct inductive rules.
In the inductive reasoning process, the model tends to focus on observed facts that are close to the current test example in feature space.
arXiv Detail & Related papers (2024-10-12T14:12:36Z) - ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL)
arXiv Detail & Related papers (2024-03-31T05:56:15Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning [61.7853049843921]
Chain-of-thought (CoT) prompting is a popular in-context learning approach for large language models (LLMs)
This paper introduces a new approach named Latent Reasoning Skills (LaRS) that employs unsupervised learning to create a latent space representation of rationales.
arXiv Detail & Related papers (2023-12-07T20:36:10Z) - In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax [36.98247762224868]
In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks.
Do models infer the underlying structure of the task defined by the context, or do they rely on superficial generalizations that only generalize to identically distributed examples?
In experiments with models from the GPT, PaLM, and Llama 2 families, we find large variance across LMs.
The variance is explained more by the composition of the pre-training corpus and supervision methods than by model size.
arXiv Detail & Related papers (2023-11-13T23:52:43Z) - LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction [12.673710691468264]
We introduce the Heuristic-Driven Link-of- Analogy (HD-LoA) prompting to address the challenge of example selection.
Inspired by the analogical reasoning of human, we propose the link-of-analogy prompting, which enables LLMs to process new situations.
Experiments show that our method outperforms existing prompting methods and few-shot supervised learning methods on document-level EAE datasets.
arXiv Detail & Related papers (2023-11-11T12:05:01Z) - Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement [92.61557711360652]
Language models (LMs) often fall short on inductive reasoning, despite achieving impressive success on research benchmarks.
We conduct a systematic study of the inductive reasoning capabilities of LMs through iterative hypothesis refinement.
We reveal several discrepancies between the inductive reasoning processes of LMs and humans, shedding light on both the potentials and limitations of using LMs in inductive reasoning tasks.
arXiv Detail & Related papers (2023-10-12T17:51:10Z) - Large Language Models as Analogical Reasoners [155.9617224350088]
Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks.
We introduce a new prompting approach, analogical prompting, designed to automatically guide the reasoning process of large language models.
arXiv Detail & Related papers (2023-10-03T00:57:26Z) - RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches and learnable baselines.
arXiv Detail & Related papers (2023-05-23T20:15:56Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.