Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing
- URL: http://arxiv.org/abs/2502.15208v1
- Date: Fri, 21 Feb 2025 04:46:57 GMT
- Title: Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing
- Authors: Zhilin Wang, Yafu Li, Jianhao Yan, Yu Cheng, Yue Zhang
- Abstract summary: Repetitive transformations can lead to stable configurations, known as attractors, including fixed points and limit cycles. Applying this perspective to large language models (LLMs), which iteratively map input text to output text, provides a principled approach to characterizing long-term behaviors. Successive paraphrasing serves as a compelling testbed for exploring such dynamics, as paraphrases re-express the same underlying meaning with linguistic variation.
- Score: 28.646627695015646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamical systems theory provides a framework for analyzing iterative processes and evolution over time. Within such systems, repetitive transformations can lead to stable configurations, known as attractors, including fixed points and limit cycles. Applying this perspective to large language models (LLMs), which iteratively map input text to output text, provides a principled approach to characterizing long-term behaviors. Successive paraphrasing serves as a compelling testbed for exploring such dynamics, as paraphrases re-express the same underlying meaning with linguistic variation. Although LLMs are expected to explore a diverse set of paraphrases in the text space, our study reveals that successive paraphrasing converges to stable periodic states, such as 2-period attractor cycles, limiting linguistic diversity. This phenomenon is attributed to the self-reinforcing nature of LLMs, as they iteratively favour and amplify certain textual forms over others. This pattern persists with increasing generation randomness or alternating prompts and LLMs. These findings underscore inherent constraints in LLM generative capability, while offering a novel dynamical systems perspective for studying their expressive potential.
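To make the iterated-map framing concrete, here is a minimal sketch (not the paper's code; the `paraphrase` callable, the burn-in length, and the similarity threshold are illustrative assumptions) of how one might probe for a 2-period attractor cycle by repeatedly paraphrasing a seed text and comparing each state with the state two steps later.

```python
# Minimal sketch: treat an LLM paraphraser as an iterated map T on text space
# and test whether a trajectory settles into a 2-period attractor cycle,
# i.e. whether t_{k+2} stays (near-)identical to t_k after a burn-in.
# `paraphrase` is any LLM wrapper you supply; the similarity metric and
# threshold below are stand-ins, not the paper's measurement protocol.

from difflib import SequenceMatcher
from typing import Callable


def surface_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an embedding metric works too."""
    return SequenceMatcher(None, a, b).ratio()


def has_period_two_attractor(
    seed: str,
    paraphrase: Callable[[str], str],  # e.g. a wrapper around an LLM API call
    steps: int = 20,
    threshold: float = 0.9,
) -> bool:
    """Iterate t_{k+1} = paraphrase(t_k) and check, after discarding the first
    half of the run as burn-in, whether every state nearly matches the state
    two steps later."""
    trajectory = [seed]
    for _ in range(steps):
        trajectory.append(paraphrase(trajectory[-1]))
    tail = trajectory[steps // 2:]
    return all(
        surface_similarity(tail[i], tail[i + 2]) >= threshold
        for i in range(len(tail) - 2)
    )
```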
Related papers
- LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics [56.99021951927683]
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring.
Existing Large Language Models (LLMs) usually perform suboptimally because they neglect the inherent characteristics of time series data.
We propose LLM-PS to empower the LLM for TSF by learning the fundamental Patterns and meaningful Semantics from time series data.
arXiv Detail & Related papers (2025-03-12T11:45:11Z) - LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning [49.58786377307728]
This paper adopts an exploratory approach by introducing a controlled evaluation environment for analogical reasoning.
We analyze the comparative dynamics of inductive, abductive, and deductive inference pipelines.
We investigate advanced paradigms such as hypothesis selection, verification, and refinement, revealing their potential to scale up logical inference.
arXiv Detail & Related papers (2025-02-16T15:54:53Z) - Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z) - LLMTemporalComparator: A Tool for Analysing Differences in Temporal Adaptations of Large Language Models [17.021220773165016]
This study addresses the challenges of analyzing temporal discrepancies in large language models (LLMs) trained on data from different time periods.
We propose a novel system that compares in a systematic way the outputs of two LLM versions based on user-defined queries.
arXiv Detail & Related papers (2024-10-05T15:17:07Z) - Unraveling Text Generation in LLMs: A Stochastic Differential Equation Approach [3.4039202831583903]
This paper explores the application of Stochastic Differential Equations (SDEs) to interpret the text generation process of Large Language Models (LLMs) such as GPT-4.
We represent this generation process using SDEs to capture both deterministic trends and stochastic perturbations.
We fit these functions using neural networks and validate the model on real-world text corpora.
arXiv Detail & Related papers (2024-08-17T15:30:27Z) - Semantic Change Characterization with LLMs using Rhetorics [0.1474723404975345]
We investigate the potential of LLMs in characterizing three types of semantic change: thought, relation, and orientation.
Our results highlight the effectiveness of LLMs in capturing and analyzing semantic changes, providing valuable insights to improve computational linguistic applications.
arXiv Detail & Related papers (2024-07-23T16:32:49Z) - LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z) - When Large Language Models Meet Evolutionary Algorithms: Potential Enhancements and Challenges [50.280704114978384]
Pre-trained large language models (LLMs) exhibit powerful capabilities for generating natural text.
Evolutionary algorithms (EAs) can discover diverse solutions to complex real-world problems.
arXiv Detail & Related papers (2024-01-19T05:58:30Z) - In-Context Learning Dynamics with Random Binary Sequences [16.645695664776433]
We propose a framework that enables us to analyze in-context learning dynamics.
Inspired by the cognitive science of human perception, we use random binary sequences as context.
In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages.
arXiv Detail & Related papers (2023-10-26T17:54:52Z) - Large Language Models can Contrastively Refine their Generation for Better Sentence Representation Learning [57.74233319453229]
Large language models (LLMs) have emerged as a groundbreaking technology and their unparalleled text generation capabilities have sparked interest in their application to the fundamental sentence representation learning task.
We propose MultiCSR, a multi-level contrastive sentence representation learning framework that decomposes the process of prompting LLMs to generate a corpus.
Our experiments reveal that MultiCSR enables a less advanced LLM to surpass the performance of ChatGPT, while applying it to ChatGPT achieves better state-of-the-art results.
arXiv Detail & Related papers (2023-10-17T03:21:43Z) - Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization [52.935150075484074]
We introduce a well-designed visual tokenizer to translate the non-linguistic image into a sequence of discrete tokens like a foreign language.
The resulting visual tokens encompass high-level semantics worthy of a word and also support dynamic sequence length varying from the image.
This unification empowers the proposed model, LaVIT, to serve as an impressive generalist interface to understand and generate multi-modal content simultaneously.
arXiv Detail & Related papers (2023-09-09T03:01:38Z) - Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs).
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z) - APo-VAE: Text Generation in Hyperbolic Space [116.11974607497986]
In this paper, we investigate text generation in a hyperbolic latent space to learn continuous hierarchical representations.
An Adversarial Poincaré Variational Autoencoder (APo-VAE) is presented, where both the prior and variational posterior of latent variables are defined over a Poincaré ball via wrapped normal distributions.
Experiments in language modeling and dialog-response generation tasks demonstrate the effectiveness of the proposed APo-VAE model.
arXiv Detail & Related papers (2020-04-30T19:05:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.