Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
- URL: http://arxiv.org/abs/2405.11880v1
- Date: Mon, 20 May 2024 08:51:03 GMT
- Title: Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
- Authors: Siyu Lou, Yuntian Chen, Xiaodan Liang, Liang Lin, Quanshi Zhang
- Abstract summary: We propose an axiomatic system to define and quantify the precise memorization and in-context reasoning effects used by the large language model (LLM) for language generation.
Specifically, the axiomatic system enables us to categorize the memorization effects into foundational memorization effects and chaotic memorization effects.
Experiments show that the clear disentanglement of memorization effects and in-context reasoning effects enables a straightforward examination of detailed inference patterns encoded by LLMs.
- Score: 101.51435599249234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we propose an axiomatic system to define and quantify the precise memorization and in-context reasoning effects used by the large language model (LLM) for language generation. These effects are formulated as non-linear interactions between tokens/words encoded by the LLM. Specifically, the axiomatic system enables us to categorize the memorization effects into foundational memorization effects and chaotic memorization effects, and further classify in-context reasoning effects into enhanced inference patterns, eliminated inference patterns, and reversed inference patterns. Moreover, the decomposed effects satisfy the sparsity property and the universal matching property, which mathematically guarantee that the LLM's confidence score can be faithfully decomposed into the memorization effects and in-context reasoning effects. Experiments show that the clear disentanglement of memorization effects and in-context reasoning effects enables a straightforward examination of detailed inference patterns encoded by LLMs.
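To make the decomposition concrete, below is a minimal sketch that assumes the effects take the standard Harsanyi-dividend (inclusion-exclusion) form used in interaction-based explanation work: it enumerates interaction effects over subsets of input tokens from a masked confidence score and numerically checks the universal matching property (the full-input confidence equals the sum of all effects). The masking scheme, the toy token set, and the toy_confidence function are illustrative assumptions, not the paper's released implementation.

```python
import itertools
from typing import Callable, Dict, FrozenSet, Sequence

def interaction_effects(
    tokens: Sequence[str],
    confidence: Callable[[FrozenSet[int]], float],
) -> Dict[FrozenSet[int], float]:
    """Harsanyi-style interaction effects I(S) for every subset S of token indices.

    I(S) = sum_{T subset of S} (-1)^(|S|-|T|) * v(T), where v(T) is the model's
    confidence when only the tokens indexed by T are kept (others masked).
    Exponential in len(tokens); intended for toy-sized inputs only.
    """
    n = len(tokens)
    indices = range(n)
    # Cache the confidence score v(T) for every subset T of kept tokens.
    v = {frozenset(t): confidence(frozenset(t))
         for r in range(n + 1) for t in itertools.combinations(indices, r)}
    effects: Dict[FrozenSet[int], float] = {}
    for r in range(n + 1):
        for subset in itertools.combinations(indices, r):
            s = frozenset(subset)
            # Inclusion-exclusion over all sub-subsets T of S.
            effects[s] = sum((-1) ** (len(s) - k) * v[frozenset(t)]
                             for k in range(len(s) + 1)
                             for t in itertools.combinations(sorted(s), k))
    return effects

if __name__ == "__main__":
    toks = ["The", "capital", "of", "France"]

    def toy_confidence(kept: FrozenSet[int]) -> float:
        # Stand-in for an LLM confidence score on a masked prompt; a real run
        # would mask the complementary tokens and read off a (transformed)
        # log-probability of the generated token.
        return 0.1 * len(kept) + (1.5 if {1, 3} <= kept else 0.0)

    effects = interaction_effects(toks, toy_confidence)
    full = frozenset(range(len(toks)))
    # Universal matching property: v(full input) == sum of all interaction effects.
    assert abs(toy_confidence(full) - sum(effects.values())) < 1e-9
    top = sorted(effects.items(), key=lambda kv: -abs(kv[1]))[:3]
    print([(sorted(toks[i] for i in s), round(e, 3)) for s, e in top])
```

In a real run, confidence(kept) would query the LLM on the prompt with the complementary tokens masked. The toy example also illustrates the sparsity property, since only a handful of subsets carry non-negligible effects; the paper's further split of these effects into memorization and in-context reasoning components is not reproduced here.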
Related papers
- Language Agents Meet Causality -- Bridging LLMs and Causal World Models [50.79984529172807]
We propose a framework that integrates causal representation learning with large language models.
This framework learns a causal world model, with causal variables linked to natural language expressions.
We evaluate the framework on causal inference and planning tasks across temporal scales and environmental complexities.
arXiv Detail & Related papers (2024-10-25T18:36:37Z) - Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications [14.818820873377303]
We study whether and to what extent large language models (LLMs) have memorised training data.
We create an experimental framework that is based on repeatedly exposing LLMs to random strings.
We identify factors that make some strings easier to memorise than others, and we identify the role of local prefixes and global context in memorisation.
arXiv Detail & Related papers (2024-07-27T14:00:21Z) - Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models [0.0]
Large Language Models (LLMs) have shown exceptional performance in text processing.
This paper proposes a novel approach to training LLMs using knowledge transfer from a random forest (RF) ensemble.
We generate outputs for fine-tuning, enhancing the model's ability to classify and explain its decisions.
arXiv Detail & Related papers (2024-06-07T13:31:51Z) - Large Language Models are Biased Reinforcement Learners [0.0]
We show that large language models (LLMs) exhibit behavioral signatures of a relative value bias.
Computational cognitive modeling reveals that LLM behavior is well-described by a simple RL algorithm.
arXiv Detail & Related papers (2024-05-19T01:43:52Z) - CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs [27.362012903540492]
The ability to understand causality significantly impacts the competence of large language models (LLMs) in output explanation and counterfactual reasoning.
arXiv Detail & Related papers (2024-04-09T14:40:08Z) - Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) can estimate causal effects under interventions on different parts of a system.
We conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning.
arXiv Detail & Related papers (2024-04-08T14:15:56Z) - The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLMs).
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z) - Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment [32.12998469814097]
A novel causal prompting method based on front-door adjustment is proposed to effectively mitigate the biases of large language models (LLMs); the general front-door adjustment formula is sketched after this list.
Experimental results show that the proposed causal prompting approach achieves excellent performance across seven natural language processing datasets.
arXiv Detail & Related papers (2024-03-05T07:47:34Z) - Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs.
We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge.
We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
arXiv Detail & Related papers (2024-01-19T15:39:49Z) - On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers [52.88268942796418]
Internal language model (ILM) subtraction has been widely applied to improve the performance of the RNN-Transducer.
We show that sequence discriminative training has a strong correlation with ILM subtraction from both theoretical and empirical points of view.
arXiv Detail & Related papers (2023-09-25T13:35:28Z)
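Two short, hedged sketches for the technique-oriented entries above. First, for the internal language model (ILM) entry: a commonly used decoding criterion with ILM subtraction under shallow fusion with an external LM, where the interpolation weights λ are tuning parameters; the exact criterion analyzed in that paper may differ.

```latex
% Shallow-fusion decoding with ILM subtraction: the transducer's implicitly
% learned internal LM score is discounted before adding the external LM score.
\hat{y} \;=\; \operatorname*{argmax}_{y}\;
    \log P_{\mathrm{RNNT}}(y \mid x)
    \;+\; \lambda_{\mathrm{ext}} \log P_{\mathrm{ExtLM}}(y)
    \;-\; \lambda_{\mathrm{ILM}} \log P_{\mathrm{ILM}}(y)
```

Second, the general front-door adjustment referenced in the causal prompting entry: with a mediator M that satisfies the standard front-door criterion, the interventional distribution is identified from observational quantities alone (the paper's prompting-specific estimator may differ).

```latex
% Front-door adjustment: identify P(y | do(x)) through a mediator M.
P\bigl(y \mid \mathrm{do}(x)\bigr)
  \;=\; \sum_{m} P(m \mid x) \sum_{x'} P\bigl(y \mid x', m\bigr)\, P(x')
```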
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.