Repetitions are not all alike: distinct mechanisms sustain repetition in language models
- URL: http://arxiv.org/abs/2504.01100v1
- Date: Tue, 01 Apr 2025 18:16:11 GMT
- Title: Repetitions are not all alike: distinct mechanisms sustain repetition in language models
- Authors: Matéo Mahaut, Francesca Franzon
- Abstract summary: Repetitive sequences emerge under diverse tasks and contexts, raising the possibility that repetition may be driven by multiple underlying factors. We examine the internal workings of language models (LMs) under two conditions that prompt repetition.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text generated by language models (LMs) can degrade into repetitive cycles, where identical word sequences are persistently repeated one after another. Prior research has typically treated repetition as a unitary phenomenon. However, repetitive sequences emerge under diverse tasks and contexts, raising the possibility that repetition may be driven by multiple underlying factors. Here, we experimentally explore the hypothesis that repetition in LMs can result from distinct mechanisms, reflecting different text generation strategies used by the model. We examine the internal workings of LMs under two conditions that prompt repetition: one in which repeated sequences emerge naturally after human-written text, and another where repetition is explicitly induced through an in-context learning (ICL) setup. Our analysis reveals key differences between the two conditions: the model exhibits varying levels of confidence, relies on different attention heads, and shows distinct patterns of change in response to controlled perturbations. These findings suggest that distinct internal mechanisms can interact to drive repetition, with implications for its interpretation and mitigation strategies. More broadly, our results highlight that the same surface behavior in LMs may be sustained by different underlying processes, acting independently or in combination.
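As a rough illustration of the paper's two-condition setup, the sketch below compares a model's confidence in its next-token prediction when repetition follows human-written text versus when it is induced through an in-context learning prompt. The model name and prompts are illustrative assumptions, not the authors' actual materials, and the attention-head and perturbation analyses are omitted.

```python
# Minimal sketch of the two-condition comparison described in the abstract.
# Model choice and prompts are illustrative assumptions, not the authors' setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any small causal LM works for this sketch
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def next_token_confidence(prompt: str) -> float:
    """Return the model's probability for its top next-token prediction."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # logits at the last position
    return torch.softmax(logits, dim=-1).max().item()

# Condition 1: repetition emerging naturally after human-written text.
natural = "The cat sat on the mat. The cat sat on the mat. The cat sat on the"

# Condition 2: repetition explicitly induced via an in-context learning prompt.
icl = "Copy the pattern: A B -> A B\nA B -> A B\nA B ->"

print(f"natural-context confidence: {next_token_confidence(natural):.3f}")
print(f"ICL-induced confidence:     {next_token_confidence(icl):.3f}")
```

Under the paper's account, the two conditions should yield systematically different confidence profiles even when the surface output is equally repetitive.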
Related papers
- Understanding the Repeat Curse in Large Language Models from a Feature Perspective [10.413608338398785]
Large language models (LLMs) often suffer from repetitive text generation.
We propose a novel approach, "Duplicatus Charm", to induce and analyze the Repeat Curse.
arXiv Detail & Related papers (2025-04-19T07:53:37Z) - Deterministic or probabilistic? The psychology of LLMs as random number generators [0.0]
Large Language Models (LLMs) have transformed text generation through inherently probabilistic context-aware mechanisms. Our results reveal that, despite their transformer-based architecture, these models often exhibit deterministic responses when prompted for random numerical outputs.
arXiv Detail & Related papers (2025-02-27T10:45:27Z) - Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing [28.646627695015646]
Repetitive transformations can lead to stable configurations, known as attractors, including fixed points and limit cycles. Applying this perspective to large language models (LLMs), which iteratively map input text to output text, provides a principled approach to characterizing long-term behaviors. Successive paraphrasing serves as a compelling testbed for exploring such dynamics, as paraphrases re-express the same underlying meaning with linguistic variation.
arXiv Detail & Related papers (2025-02-21T04:46:57Z) - Nested replicator dynamics, nested logit choice, and similarity-based learning [56.98352103321524]
We consider a model of learning and evolution in games with action sets endowed with a partition-based similarity structure.
In this model, revising agents have a higher probability of comparing their current strategy with other strategies that they deem similar.
Because of this implicit bias toward similar strategies, the resulting dynamics do not satisfy any of the standard monotonicity postulates for imitative game dynamics.
arXiv Detail & Related papers (2024-07-25T07:09:53Z) - From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty [67.81977289444677]
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We categorize fallback behaviors - sequence repetitions, degenerate text, and hallucinations - and extensively analyze them. Our experiments reveal a clear and consistent ordering of fallback behaviors across all these axes.
arXiv Detail & Related papers (2024-07-08T16:13:42Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective [91.14291142262262]
This work presents a straightforward and fundamental explanation from the data perspective.
Our preliminary investigation reveals a strong correlation between the degeneration issue and the presence of repetitions in training data.
Our experiments reveal that penalizing repetitions in training data remains critical even when considering larger model sizes and instruction tuning (a toy filtering sketch follows this list).
arXiv Detail & Related papers (2023-10-16T09:35:42Z) - Replicable Reinforcement Learning [15.857503103543308]
We provide a provably replicable algorithm for parallel value iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which present different challenges for replication than batch learning settings.
arXiv Detail & Related papers (2023-05-24T16:05:15Z) - Identifiability Results for Multimodal Contrastive Learning [72.15237484019174]
We show that it is possible to recover shared factors in a more general setup than the multi-view setting studied previously.
Our work provides a theoretical basis for multimodal representation learning and explains in which settings multimodal contrastive learning can be effective in practice.
arXiv Detail & Related papers (2023-03-16T09:14:26Z) - Composed Variational Natural Language Generation for Few-shot Intents [118.37774762596123]
We generate training examples for few-shot intents in a realistic imbalanced scenario.
To evaluate the quality of the generated utterances, experiments are conducted on the generalized few-shot intent detection task.
Our proposed model achieves state-of-the-art performance on two real-world intent detection datasets.
arXiv Detail & Related papers (2020-09-21T17:48:43Z)
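In the spirit of the data-perspective finding above ("Repetition In Repetition Out"), here is a minimal sketch of how repetitive documents might be measured and filtered out of a training corpus. The function names, the 4-gram window, and the 20% threshold are hypothetical choices for illustration, not the cited paper's method.

```python
# Hypothetical illustration of filtering repetitive training examples;
# the n-gram size and threshold are assumptions, not the paper's values.
from collections import Counter

def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    """Fraction of n-grams in `text` that occur more than once."""
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)

def filter_corpus(docs, max_repeat: float = 0.2):
    """Drop documents whose repeated 4-gram fraction exceeds the threshold."""
    return [d for d in docs if repeated_ngram_fraction(d) <= max_repeat]

docs = [
    "the quick brown fox jumps over the lazy dog",
    "go go go go go go go go go go go go",
]
print(filter_corpus(docs))  # keeps only the non-repetitive document
```

A real pipeline would tokenize properly and tune n and the threshold against held-out degeneration metrics; the point is only that repetition in training data can be quantified and penalized before training.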