Understanding In-Context Learning from Repetitions
- URL: http://arxiv.org/abs/2310.00297v3
- Date: Wed, 21 Feb 2024 09:21:52 GMT
- Title: Understanding In-Context Learning from Repetitions
- Authors: Jianhao Yan, Jin Xu, Chiyu Song, Chenming Wu, Yafu Li, Yue Zhang
- Abstract summary: This paper explores the elusive mechanism underpinning in-context learning in Large Language Models (LLMs).
We quantitatively investigate the role of surface features in text generation, and empirically establish the existence of *token co-occurrence reinforcement*.
By investigating the dual impacts of these features, our research illuminates the internal workings of in-context learning and expounds on the reasons for its failures.
- Score: 21.28694573253979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the elusive mechanism underpinning in-context learning in Large Language Models (LLMs). Our work provides a novel perspective by examining in-context learning through the lens of surface repetitions. We quantitatively investigate the role of surface features in text generation, and empirically establish the existence of *token co-occurrence reinforcement*, a principle that strengthens the relationship between two tokens based on their contextual co-occurrences. By investigating the dual impacts of these features, our research illuminates the internal workings of in-context learning and expounds on the reasons for its failures. This paper makes an essential contribution to the understanding of in-context learning and its potential limitations, offering a fresh perspective on this capability.
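To make the reinforcement effect concrete, here is a minimal probe sketch. It assumes access to a Hugging Face causal LM (`gpt2` chosen purely for illustration) and an arbitrary token pair; this is not the paper's exact measurement protocol.

```python
# Minimal probe of token co-occurrence reinforcement: as a (head, tail) pair
# repeats in the context, does P(tail | ..., head) increase?
# Assumes Hugging Face transformers; gpt2 and the pair are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def tail_probability(context: str, head: str, tail: str) -> float:
    """P(tail) at the position immediately after `context + head`."""
    ids = tokenizer(context + head, return_tensors="pt").input_ids
    with torch.no_grad():
        next_token_logits = model(ids).logits[0, -1]
    tail_id = tokenizer(tail, add_special_tokens=False).input_ids[0]
    return torch.softmax(next_token_logits, dim=-1)[tail_id].item()

head, tail = " lemon", " tree"  # arbitrary single-token pair
for n in [0, 1, 2, 4, 8]:
    context = " I saw a lemon tree." * n  # n prior co-occurrences of the pair
    print(f"{n} co-occurrences -> P(tail | head) = "
          f"{tail_probability(context, head, tail):.4f}")
```

Under the co-occurrence reinforcement principle, the printed probability should grow with the number of in-context repetitions.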
Related papers
- The broader spectrum of in-context learning [13.111927028942329]
We provide a perspective that situates standard supervised few-shot learning within a much broader spectrum of meta-learned in-context learning.
We suggest that any distribution of sequences in which context non-trivially decreases loss on subsequent predictions can elicit in-context learning.
We close by suggesting that research on in-context learning should consider this broader spectrum of in-context capabilities and types of generalization.
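This loss-based framing suggests a direct check: score a target continuation with and without a helpful context and compare. A minimal sketch, assuming `gpt2` and a toy few-shot pattern of our own choosing:

```python
# Sketch of the loss-based criterion above: context "counts" as eliciting
# in-context learning when it lowers loss on subsequent predictions.
# gpt2 and the toy few-shot pattern below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def mean_nll(prefix: str, target: str) -> float:
    """Mean negative log-likelihood of `target` tokens after reading `prefix`."""
    pre = tokenizer(prefix, return_tensors="pt").input_ids
    tgt = tokenizer(target, return_tensors="pt").input_ids
    ids = torch.cat([pre, tgt], dim=1)
    with torch.no_grad():
        logits = model(ids).logits[0]
    # logits at position i predict the token at position i + 1
    pred = torch.log_softmax(logits[pre.shape[1] - 1 : ids.shape[1] - 1], dim=-1)
    return -pred[torch.arange(tgt.shape[1]), tgt[0]].mean().item()

target = " oiseau -> bird"
no_ctx = mean_nll("<|endoftext|>", target)
few_shot = mean_nll("<|endoftext|> chien -> dog, chat -> cat,", target)
print(f"loss without context:      {no_ctx:.3f}")
print(f"loss with few-shot context: {few_shot:.3f}  (lower => context helps)")
```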
arXiv Detail & Related papers (2024-12-05T00:05:11Z) - Mitigating Knowledge Conflicts in Language Model-Driven Question Answering [15.29366851382021]
Two fundamental knowledge sources play crucial roles in document-based question answering and document summarization systems.
Recent studies reveal a significant challenge: when the model's inherent knowledge conflicts with the ground-truth answers in the training data, the system may exhibit problematic behaviors during inference.
Our investigation proposes a strategy to minimize hallucination by building an explicit connection between source inputs and generated outputs.
arXiv Detail & Related papers (2024-11-18T07:33:10Z) - Investigating Expert-in-the-Loop LLM Discourse Patterns for Ancient Intertextual Analysis [0.0]
The study demonstrates that large language models can detect direct quotations, allusions, and echoes between texts.
The model struggles with long query passages and the inclusion of false intertextual dependencies.
The expert-in-the-loop methodology presented offers a scalable approach for intertextual research.
arXiv Detail & Related papers (2024-09-03T13:23:11Z) - Identifying Semantic Induction Heads to Understand In-Context Learning [103.00463655766066]
We investigate whether attention heads encode two types of relationships between tokens present in natural languages.
We find that certain attention heads exhibit a pattern where, when attending to head tokens, they recall tail tokens and increase the output logits of those tail tokens.
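This recall behavior is usually probed with the standard repeated-sequence induction-head test: on a sequence repeated twice, such a head, while reading the second copy of a token, attends to the token that followed the first copy. The sketch below uses that common heuristic with `gpt2`; it is an approximation, not the paper's methodology.

```python
# Rough induction-head probe: on a repeated random sequence, score each head
# by how much attention the second half pays to the "induction target"
# (the token after the earlier copy). gpt2 and the heuristic are assumptions.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2", output_attentions=True).eval()

torch.manual_seed(0)
first_half = torch.randint(1000, 5000, (1, 32))   # random token ids
ids = torch.cat([first_half, first_half], dim=1)  # sequence repeated twice
with torch.no_grad():
    attentions = model(ids).attentions            # per layer: [1, heads, L, L]

L = ids.shape[1]
half = L // 2
scores = []
for layer, attn in enumerate(attentions):
    for head in range(attn.shape[1]):
        # For a query at position t in the second half, the induction target
        # is t - half + 1: the token right after the earlier copy of token t.
        q = torch.arange(half, L - 1)
        scores.append((attn[0, head, q, q - half + 1].mean().item(), layer, head))

for score, layer, head in sorted(scores, reverse=True)[:5]:
    print(f"layer {layer:2d} head {head:2d}: induction score {score:.3f}")
```

Heads with high scores are candidates for the recall pattern the paper describes; its semantic variant additionally looks at whether the recalled tail tokens' output logits rise.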
arXiv Detail & Related papers (2024-02-20T14:43:39Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis [20.142154624977582]
The in-context learning (ICL) capability enables large language models to achieve remarkable proficiency from demonstration examples.
In this paper, we present a thorough survey on the interpretation and analysis of in-context learning.
We believe that our work establishes the basis for further exploration into the interpretation of in-context learning.
arXiv Detail & Related papers (2023-11-01T02:40:42Z) - Negation, Coordination, and Quantifiers in Contextualized Language Models [4.46783454797272]
We explore whether the semantic constraints of function words are learned and how the surrounding context impacts their embeddings.
We create suitable datasets, provide new insights into the inner workings of LMs vis-à-vis function words, and implement a visual web interface to assist qualitative analysis.
arXiv Detail & Related papers (2022-09-16T10:01:11Z) - Learning to Express in Knowledge-Grounded Conversation [62.338124154016825]
We consider two aspects of knowledge expression, namely the structure of the response and style of the content in each part.
We propose a segmentation-based generation model and optimize the model by a variational approach to discover the underlying pattern of knowledge expression in a response.
arXiv Detail & Related papers (2022-04-12T13:43:47Z) - Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition [60.36540008537054]
In this work, we exploit an implicit task, character counting, within traditional text recognition, at no additional annotation cost.
We design a two-branch reciprocal feature learning framework to adequately utilize the features from both tasks.
Experiments on 7 benchmarks show the advantages of the proposed method in both text recognition and the newly introduced character counting task.
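For structure only, a minimal sketch of the two-branch layout described above: a shared backbone feeds an explicit recognition head and an implicit character-counting head. All modules, sizes, and shapes are invented for illustration and are not the paper's actual architecture.

```python
# Illustrative two-branch reciprocal-feature layout: shared backbone,
# recognition branch (per-step character logits) + counting branch (regression).
# Every size and shape here is an assumption made for the sketch.
import torch
import torch.nn as nn

class TwoBranchRecognizer(nn.Module):
    def __init__(self, vocab_size: int = 37, feat_dim: int = 256):
        super().__init__()
        self.backbone = nn.Sequential(               # stand-in for a CNN encoder
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 16)),           # collapse to 16 time steps
        )
        self.recognition_head = nn.Linear(feat_dim, vocab_size)
        self.counting_head = nn.Sequential(          # regress the character count
            nn.Flatten(), nn.Linear(feat_dim * 16, 1),
        )

    def forward(self, images: torch.Tensor):
        feats = self.backbone(images)                # [B, C, 1, 16]
        seq = feats.squeeze(2).transpose(1, 2)       # [B, 16, C]
        char_logits = self.recognition_head(seq)     # explicit recognition branch
        count = self.counting_head(feats)            # implicit counting branch
        return char_logits, count

model = TwoBranchRecognizer()
chars, count = model(torch.randn(2, 3, 32, 128))
print(chars.shape, count.shape)  # torch.Size([2, 16, 37]) torch.Size([2, 1])
```

In the paper's framing, the two heads are trained jointly so that counting supervision (derivable from existing transcripts for free) reinforces the recognition features.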
arXiv Detail & Related papers (2021-05-13T12:27:35Z) - Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches, and how task requirements shape them, across various generation tasks such as storytelling, summarization, and translation.
We present an abstraction of the key techniques with respect to learning paradigms, pretraining, modeling approaches, and decoding, along with the outstanding challenges in each.
arXiv Detail & Related papers (2020-10-14T17:54:42Z) - Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach [89.56158561087209]
We study summarization on arbitrary aspects relevant to a document.
Due to the lack of supervision data, we develop a new weak-supervision construction method and an aspect modeling scheme.
Experiments show our approach achieves performance boosts on summarizing both real and synthetic documents.
arXiv Detail & Related papers (2020-10-14T03:20:46Z)