A multiscale analysis of mean-field transformers in the moderate interaction regime
- URL: http://arxiv.org/abs/2509.25040v1
- Date: Mon, 29 Sep 2025 16:57:04 GMT
- Title: A multiscale analysis of mean-field transformers in the moderate interaction regime
- Authors: Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
- Abstract summary: We study the evolution of tokens through the depth of encoder-only transformer models at inference time. We provide a rigorous characterization of the limiting dynamics in each of these phases and prove convergence in the above mentioned limit.
- Score: 7.742297876120561
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the evolution of tokens through the depth of encoder-only transformer models at inference time by modeling them as a system of particles interacting in a mean-field way and studying the corresponding dynamics. More specifically, we consider this problem in the moderate interaction regime, where the number $N$ of tokens is large and the inverse temperature parameter $\beta$ of the model scales together with $N$. In this regime, the dynamics of the system displays a multiscale behavior: a fast phase, where the token empirical measure collapses on a low-dimensional space, an intermediate phase, where the measure further collapses into clusters, and a slow one, where such clusters sequentially merge into a single one. We provide a rigorous characterization of the limiting dynamics in each of these phases and prove convergence in the above mentioned limit, exemplifying our results with some simulations.
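The particle picture in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes the commonly used self-attention particle model on the unit sphere with identity query, key and value matrices, integrates it with an explicit Euler step, and uses a fixed `beta` instead of one that scales with $N$. The function names (`attention_step`, `simulate`, `min_pairwise_inner`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def normalize(v):
    """Rescale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def inner(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def attention_step(tokens, beta, dt):
    """One explicit Euler step of softmax-attention dynamics on the sphere.

    Assumed model (not taken from the paper): each token moves toward the
    softmax-weighted average of all tokens, projected onto its tangent space.
    """
    new_tokens = []
    for x in tokens:
        # Softmax weights over inner products, shifted for numerical stability.
        scores = [beta * inner(x, y) for y in tokens]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        # Attention output: weighted average of all tokens.
        avg = [sum(w * y[k] for w, y in zip(weights, tokens)) / z
               for k in range(len(x))]
        # Remove the radial component, take the step, return to the sphere.
        radial = inner(avg, x)
        moved = [xk + dt * (ak - radial * xk) for xk, ak in zip(x, avg)]
        new_tokens.append(normalize(moved))
    return new_tokens


def min_pairwise_inner(tokens):
    """Smallest inner product between distinct tokens (1.0 means full collapse)."""
    return min(inner(x, y)
               for i, x in enumerate(tokens)
               for y in tokens[i + 1:])


def simulate(n_tokens=16, dim=3, beta=4.0, dt=0.1, steps=300, seed=0):
    """Run the assumed dynamics from random initial tokens on the sphere."""
    rng = random.Random(seed)
    tokens = [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
              for _ in range(n_tokens)]
    history = [min_pairwise_inner(tokens)]
    for _ in range(steps):
        tokens = attention_step(tokens, beta, dt)
        history.append(min_pairwise_inner(tokens))
    return tokens, history


if __name__ == "__main__":
    _, history = simulate()
    # Print the clustering diagnostic at a few depths.
    for step in (0, 50, 150, 300):
        print(step, round(history[step], 4))
```

Tracking `min_pairwise_inner` over depth is one simple way to watch tokens cluster and clusters merge; increasing `beta` together with `n_tokens` would be the natural way to probe the moderate interaction regime, though the appropriate scaling is the paper's result and is not encoded here.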
Related papers
- On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking [49.1352577985191]
We present a comprehensive analysis of how two-layer neural networks learn features to solve the modular addition task. Our work provides a full mechanistic interpretation of the learned model and a theoretical explanation of its training dynamics.
arXiv Detail & Related papers (2026-02-18T20:25:13Z) - The Mean-Field Dynamics of Transformers [6.008788032203683]
By idealizing attention on the sphere, we connect Transformer dynamics to Wasserstein gradient flows (Kuramoto), and mean-shift clustering. Results highlight both the mechanisms that drive representation collapse and the regimes that preserve expressive, multi-cluster structure in deep attention architectures.
arXiv Detail & Related papers (2025-12-01T16:51:00Z) - Higher symmetry breaking and non-reciprocity in a driven-dissipative Dicke model [0.0]
We study a variant of the Dicke model with higher-order discrete symmetry, resulting from complex-valued coupling coefficients between quantum emitters and a bosonic mode. This $n$-phase Dicke model may be equivalently realized in a variety of optomechanical or opto-magnonic settings.
arXiv Detail & Related papers (2025-10-05T16:59:36Z) - Error stabilized logical qubits in qudit generalizations of the monitored Kitaev model [0.0]
We study the monitored dynamics of qudit generalizations of the Kitaev model on the honeycomb and square lattices. Our results reveal a rich interplay between quantum spin liquids and monitored circuit dynamics.
arXiv Detail & Related papers (2025-09-20T17:48:10Z) - Efficiency of Dynamical Decoupling for (Almost) Any Spin-Boson Model [44.99833362998488]
We analytically study the dynamical decoupling of a two-level system coupled with a structured bosonic environment. We find sufficient conditions under which dynamical decoupling works for such systems. Our bounds reproduce the correct scaling in various relevant system parameters.
arXiv Detail & Related papers (2024-09-24T04:58:28Z) - Entanglement dynamics in the many-body Hatano-Nelson model [0.0]
The entanglement dynamics in a non-Hermitian quantum system is studied numerically and analyzed from the viewpoint of the quasiparticle picture.
Contrary to an assertion in previous studies, the entanglement dynamics in this non-Hermitian quantum system is very different from that of its Hermitian counterpart.
arXiv Detail & Related papers (2023-08-06T10:12:41Z) - Modeling the space-time correlation of pulsed twin beams [68.8204255655161]
Entangled twin-beams generated by parametric down-conversion are among the favorite sources for imaging-oriented applications.
We propose a semi-analytic model which aims to bridge the gap between time-consuming numerical simulations and the unrealistic plane-wave pump theory.
arXiv Detail & Related papers (2023-01-18T11:29:49Z) - Photoinduced prethermal order parameter dynamics in the two-dimensional large-$N$ Hubbard-Heisenberg model [77.34726150561087]
We study the microscopic dynamics of competing ordered phases in a two-dimensional correlated electron model.
We simulate the light-induced transition between two competing phases.
arXiv Detail & Related papers (2022-05-13T13:13:31Z) - Phase synchronization in dissipative non-Hermitian coupled quantum systems [0.0]
We study the interplay between non-Hermitian dynamics and phase synchronization in a system of $\mathcal{N}$ bosonic modes coupled to an auxiliary mode.
We provide analytical and numerical solutions for systems ranging from a few modes to the macroscopic limit of large $\mathcal{N}$ in the presence of inhomogeneous frequency broadening.
arXiv Detail & Related papers (2021-11-03T13:12:59Z) - Geometric phase in a dissipative Jaynes-Cummings model: theoretical explanation for resonance robustness [68.8204255655161]
We compute the geometric phases acquired in both unitary and dissipative Jaynes-Cummings models.
In the dissipative model, the non-unitary effects arise from the outflow of photons through the cavity walls.
We show the geometric phase is robust, exhibiting a vanishing correction under a non-unitary evolution.
arXiv Detail & Related papers (2021-10-27T15:27:54Z) - Qubit-photon bound states in topological waveguides with long-range hoppings [62.997667081978825]
Quantum emitters interacting with photonic band-gap materials lead to the appearance of qubit-photon bound states.
We study the features of the qubit-photon bound states when the emitters couple to the bulk modes in the different phases.
We consider the coupling of emitters to the edge modes appearing in the different topological phases.
arXiv Detail & Related papers (2021-05-26T10:57:21Z) - Simulation of complex dynamics of mean-field $p$-spin models using measurement-based quantum feedback control [0.0]
We apply a new method for simulating nonlinear dynamics of many-body spin systems using quantum measurement and feedback.
We study applications including properties of dynamical phase transitions and the emergence of spontaneous symmetry breaking in the adiabatic dynamics of the collective spin.
arXiv Detail & Related papers (2020-04-23T18:22:03Z) - Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.