Related papers: Transfer entropy and O-information to detect grokking in tensor network multi-class classification problems

URL: http://arxiv.org/abs/2507.23346v1
Date: Thu, 31 Jul 2025 08:53:04 GMT
Title: Transfer entropy and O-information to detect grokking in tensor network multi-class classification problems
Authors: Domenico Pomarico, Roberto Cilli, Alfonso Monaco, Loredana Bellantuono, Marianna La Rocca, Tommaso Maggipinto, Giuseppe Magnifico, Marlis Ontivero Ortega, Ester Pantaleo, Sabina Tangaro, Sebastiano Stramaglia, Roberto Bellotti, Nicola Amoroso
Abstract summary: We study the training dynamics of Matrix Product State (MPS) classifiers applied to three-class problems. We investigate the phenomenon of grokking, where generalization emerges suddenly after memorization. Our results show that grokking in the fashion MNIST task coincides with a sharp entanglement transition and a peak in redundant information, whereas the overfitted hyper-spectral model retains synergistic, disordered behavior.
Score: 0.6222849132943892
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Quantum-enhanced machine learning, encompassing both quantum algorithms and quantum-inspired classical methods such as tensor networks, offers promising tools for extracting structure from complex, high-dimensional data. In this work, we study the training dynamics of Matrix Product State (MPS) classifiers applied to three-class problems, using both fashion MNIST and hyper-spectral satellite imagery as representative datasets. We investigate the phenomenon of grokking, where generalization emerges suddenly after memorization, by tracking entanglement entropy, local magnetization, and model performance across training sweeps. Additionally, we employ information theory tools to gain deeper insights: transfer entropy is used to reveal causal dependencies between label-specific quantum masks, while O-information captures the shift from synergistic to redundant correlations among class outputs. Our results show that grokking in the fashion MNIST task coincides with a sharp entanglement transition and a peak in redundant information, whereas the overfitted hyper-spectral model retains synergistic, disordered behavior. These findings highlight the relevance of high-order information dynamics in quantum-inspired learning and emphasize the distinct learning behaviors that emerge in multi-class classification, offering a principled framework to interpret generalization in quantum machine learning architectures.
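As a rough illustration of the O-information mentioned in the abstract, the sketch below uses the standard definition Omega(X) = (n - 2) H(X) + sum_j [H(X_j) - H(X_-j)], where positive values indicate redundancy-dominated and negative values synergy-dominated dependence. The plug-in entropy estimator for discrete samples, the function names, and the toy data are assumptions made for this example; the abstract does not say which estimator the authors use.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (in bits) of a list of hashable outcomes."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def o_information(rows):
    """O-information of discrete samples.

    rows: list of equal-length tuples, one tuple per observation.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    n = len(rows[0])
    omega = (n - 2) * entropy([tuple(r) for r in rows])
    for j in range(n):
        marginal = entropy([r[j] for r in rows])
        # Entropy of all variables except the j-th one.
        rest = entropy([tuple(r[:j]) + tuple(r[j + 1:]) for r in rows])
        omega += marginal - rest
    return omega


# Toy data (assumed): three copies of one fair bit are purely redundant,
# while a fair XOR triplet is purely synergistic.
copies = [(a, a, a) for a in (0, 1)]
xor = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

print(o_information(copies))  # positive: redundancy dominates
print(o_information(xor))     # negative: synergy dominates
```

With three variables, the copied bit gives an O-information of +1 bit and the XOR triplet gives -1 bit, which matches the sign convention behind the abstract's "synergistic to redundant" shift.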

Related papers

Models of Heavy-Tailed Mechanistic Universality [62.107333654304014]
We propose a family of random matrix models to explore attributes that give rise to heavy-tailed behavior in trained neural networks. Under this model, spectral densities with power laws on tails arise through a combination of three independent factors. Implications of our model on other appearances of heavy tails, including neural scaling laws, trajectories, and the five-plus-one phases of neural network training, are discussed.
arXiv Detail & Related papers (2025-06-04T00:55:01Z)
Explaining Anomalies with Tensor Networks [0.0]
We introduce tree tensor networks for the task of explainable anomaly detection. We show adequate predictive performance compared to several baseline models. We thereby extend the application of tensor networks to a broader class of potential problems.
arXiv Detail & Related papers (2025-05-06T18:35:05Z)
Grokking as an entanglement transition in tensor network machine learning [0.608657548424657]
We numerically prove that the grokking phenomenon can be related to an entanglement dynamical transition in the underlying quantum many-body systems. We exploit measurements of qubit magnetization and correlation functions in the Matrix Product State network as a tool to identify meaningful and relevant gene subcommunities.
arXiv Detail & Related papers (2025-03-13T15:51:23Z)
Generalization Performance of Hypergraph Neural Networks [21.483543928698676]
We develop margin-based generalization bounds for four representative classes of hypergraph neural networks. Our results reveal the manner in which hypergraph structure and spectral norms of the learned weights can affect the generalization bounds. Our empirical study examines the relationship between the practical performance and theoretical bounds of the models over synthetic and real-world datasets.
arXiv Detail & Related papers (2025-01-22T00:20:26Z)
Quantum reservoir computing on random regular graphs [0.0]
Quantum reservoir computing (QRC) is a low-complexity learning paradigm that combines input-driven many-body quantum systems with classical learning techniques. We study information localization, dynamical quantum correlations, and the many-body structure of the disordered Hamiltonian. Our findings thus provide guidelines for the optimal design of disordered analog quantum learning platforms.
arXiv Detail & Related papers (2024-09-05T16:18:03Z)
ShadowNet for Data-Centric Quantum System Learning [188.683909185536]
We propose a data-centric learning paradigm combining the strength of neural-network protocols and classical shadows. Capitalizing on the generalization power of neural networks, this paradigm can be trained offline and excel at predicting previously unseen systems. We present the instantiation of our paradigm in quantum state tomography and direct fidelity estimation tasks and conduct numerical analysis up to 60 qubits.
arXiv Detail & Related papers (2023-08-22T09:11:53Z)
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
A didactic approach to quantum machine learning with a single qubit [68.8204255655161]
We focus on the case of learning with a single qubit, using data re-uploading techniques. We implement the different proposed formulations in toy and real-world datasets using the qiskit quantum computing SDK.
arXiv Detail & Related papers (2022-11-23T18:25:32Z)
Generalization despite overfitting in quantum machine learning models [0.0]
We provide a characterization of benign overfitting in quantum models. We show how a class of quantum models exhibits analogous features. We intuitively explain these features according to the ability of the quantum model to interpolate noisy data with locally "spiky" behavior.
arXiv Detail & Related papers (2022-09-12T18:08:45Z)
Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs. We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
arXiv Detail & Related papers (2022-02-02T23:54:26Z)
Tracing Information Flow from Open Quantum Systems [52.77024349608834]
We use photons in a waveguide array to implement a quantum simulation of the coupling of a qubit with a low-dimensional discrete environment. Using the trace distance between quantum states as a measure of information, we analyze different types of information transfer.
arXiv Detail & Related papers (2021-03-22T16:38:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.