The Interaction Bottleneck of Deep Neural Networks: Discovery, Proof, and Modulation
- URL: http://arxiv.org/abs/2512.18607v1
- Date: Sun, 21 Dec 2025 05:55:11 GMT
- Title: The Interaction Bottleneck of Deep Neural Networks: Discovery, Proof, and Modulation
- Authors: Huiqi Deng, Qihan Ren, Zhuofan Chen, Zhenyuan Cui, Wen Shen, Peng Zhang, Hongbin Pei, Quanshi Zhang
- Abstract summary: We study how deep neural networks encode interactions under different levels of contextual complexity. We find that DNNs easily learn low-order and high-order interactions but consistently under-represent mid-order ones. These results establish interaction order as a powerful lens for interpreting and guiding deep representations.
- Score: 30.18664515203697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding what kinds of cooperative structures deep neural networks (DNNs) can represent remains a fundamental yet insufficiently understood problem. In this work, we treat interactions as the fundamental units of such structure and investigate a largely unexplored question: how DNNs encode interactions under different levels of contextual complexity, and how these microscopic interaction patterns shape macroscopic representation capacity. To quantify this complexity, we use multi-order interactions [57], where each order reflects the amount of contextual information required to evaluate the joint interaction utility of a variable pair. This formulation enables a stratified analysis of cooperative patterns learned by DNNs. Building on this formulation, we develop a comprehensive study of interaction structure in DNNs. (i) We empirically discover a universal interaction bottleneck: across architectures and tasks, DNNs easily learn low-order and high-order interactions but consistently under-represent mid-order ones. (ii) We theoretically explain this bottleneck by proving that mid-order interactions incur the highest contextual variability, yielding large gradient variance and making them intrinsically difficult to learn. (iii) We further modulate the bottleneck by introducing losses that steer models toward emphasizing interactions of selected orders. Finally, we connect microscopic interaction structures with macroscopic representational behavior: low-order-emphasized models exhibit stronger generalization and robustness, whereas high-order-emphasized models demonstrate greater structural modeling and fitting capability. Together, these results uncover an inherent representational bias in modern DNNs and establish interaction order as a powerful lens for interpreting and guiding deep representations.
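For concreteness, here is the standard form of the multi-order interaction the abstract builds on, reconstructed from the formulation in [57] and the representation-bottleneck paper listed below (notation is ours):

```latex
\Delta f(i,j,S) = f(S \cup \{i,j\}) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S),
\qquad
I^{(m)}(i,j) = \mathbb{E}_{S \subseteq N \setminus \{i,j\},\; |S| = m}\bigl[\Delta f(i,j,S)\bigr]
```

Here f(S) is the model output when only the variables in S are present (the rest masked by a baseline), so the order m = |S| is exactly the amount of contextual information against which the pair (i, j) is evaluated: low orders probe near-empty contexts, high orders near-complete contexts, and the reported bottleneck sits at intermediate m.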
Related papers
- High-order Interactions Modeling for Interpretable Multi-Agent Q-Learning [22.42637658125405]
Previous efforts to model high-order interactions have been hindered by combinatorial explosion or the opaque nature of their black-box network structures.
We propose a novel value decomposition framework, called Continued Fraction Q-Learning (QCoFr), which can flexibly capture arbitrary-order agent interactions.
arXiv Detail & Related papers (2025-10-23T05:08:32Z)
- Broad Spectrum Structure Discovery in Large-Scale Higher-Order Networks [1.7273380623090848]
We introduce a class of probabilistic models that efficiently represents and discovers a broad spectrum of mesoscale structure in large-scale hypergraphs.
By modeling observed node interactions through latent interactions among classes using low-rank representations, our approach tractably captures rich structural patterns.
Our model improves link prediction over state-of-the-art methods and discovers interpretable structures in diverse real-world systems.
arXiv Detail & Related papers (2025-05-27T20:34:58Z)
- RelGNN: Composite Message Passing for Relational Deep Learning [56.48834369525997]
We introduce RelGNN, a novel GNN framework specifically designed to leverage the unique structural characteristics of the graphs built from relational databases.
RelGNN is evaluated on 30 diverse real-world tasks from RelBench (Fey et al., 2024), and achieves state-of-the-art performance on the vast majority of tasks, with improvements of up to 25%.
arXiv Detail & Related papers (2025-02-10T18:58:40Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem of learning with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover that the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
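Phase dynamics like these are usually read off a normalized per-order strength profile; a sketch of the commonly used quantity, with the exact normalization varying across papers in this line of work:

```latex
J^{(m)} = \frac{\mathbb{E}_{x}\,\mathbb{E}_{i,j}\bigl[\lvert I^{(m)}(i,j \mid x)\rvert\bigr]}{\sum_{m'} \mathbb{E}_{x}\,\mathbb{E}_{i,j}\bigl[\lvert I^{(m')}(i,j \mid x)\rvert\bigr]}
```

Tracking J^(m) over training checkpoints makes the two phases visible as interaction mass first concentrating at low orders and then gradually spreading toward higher ones.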
arXiv Detail & Related papers (2024-05-16T17:13:25Z)
- Neural Relational Inference with Fast Modular Meta-learning [25.313516707169498]
Graph neural networks (GNNs) are effective models for many dynamical systems consisting of entities and relations.
Relational inference is the problem of inferring these interactions and learning the dynamics from observational data.
We frame relational inference as a modular meta-learning problem, where neural modules are trained to be composed in different ways to solve many tasks.
arXiv Detail & Related papers (2023-10-10T21:05:13Z)
- On the Intrinsic Structures of Spiking Neural Networks [66.57589494713515]
Recent years have seen a surge of interest in SNNs owing to their remarkable potential to handle time-dependent and event-driven data.
However, there has been a dearth of comprehensive studies examining the impact of intrinsic structures within spiking computations.
This work delves into the intrinsic structures of SNNs, elucidating their influence on the expressivity of SNNs.
arXiv Detail & Related papers (2022-06-21T09:42:30Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
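As a concrete illustration of why invertibility and a tractable Jacobian hold by design, below is a minimal affine coupling layer (a RealNVP-style block, one standard way to build INNs); this is a generic PyTorch sketch, not the architecture analyzed in the paper:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One affine coupling layer: invertible by construction, with a
    triangular Jacobian whose log-determinant is the sum of log-scales."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.d, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        log_s, t = self.net(x1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)          # keep scales numerically tame
        y2 = x2 * log_s.exp() + t          # transform one half given the other
        log_det = log_s.sum(dim=-1)        # tractable Jacobian log-determinant
        return torch.cat([x1, y2], dim=-1), log_det

    def inverse(self, y):
        y1, y2 = y[:, :self.d], y[:, self.d:]
        log_s, t = self.net(y1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)  # exact inverse of forward
        return torch.cat([y1, x2], dim=-1)
```

Because x1 passes through unchanged and x2 is transformed elementwise given x1, the inverse is exact and the Jacobian is triangular, which is what makes INNs usable for likelihood-based probabilistic modeling.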
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
- Discovering and Explaining the Representation Bottleneck of DNNs [21.121270460158712]
This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of interactions.
We discover that a DNN is more likely to encode overly simple and overly complex interactions, but usually fails to learn interactions of intermediate complexity.
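Since the order-m interaction is an expectation over exponentially many contexts, it is estimated by sampling in practice; a minimal Monte Carlo sketch (the set function f, e.g. a network output on a masked input, and all helper names are our assumptions):

```python
import random

def delta_f(f, i, j, S):
    """Joint pair utility given context S:
    f(S+{i,j}) - f(S+{i}) - f(S+{j}) + f(S)."""
    S = frozenset(S)
    return f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)

def interaction_order_m(f, n, i, j, m, num_samples=200, seed=0):
    """Monte Carlo estimate of I^(m)(i, j): average delta_f over random
    contexts S of size m drawn from the remaining n - 2 variables."""
    rng = random.Random(seed)
    others = [k for k in range(n) if k not in (i, j)]
    return sum(delta_f(f, i, j, rng.sample(others, m))
               for _ in range(num_samples)) / num_samples

# Toy check: for f(S) = (sum of values in S)^2, delta_f is 2*v_i*v_j
# regardless of context, so I^(m)(0, 1) = 2 * 1.0 * 2.0 = 4.0 at every order.
values = [1.0, 2.0, 0.5, 1.5, 1.0, 0.8]
f = lambda S: sum(values[k] for k in S) ** 2
print([round(interaction_order_m(f, 6, 0, 1, m), 2) for m in range(5)])
```

For a real DNN, f(S) would evaluate the network on an input where variables outside S are replaced by baseline values, which is the costly part the sampling amortizes.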
arXiv Detail & Related papers (2021-11-11T14:35:20Z)
- Discrete-Valued Neural Communication [85.3675647398994]
We show that restricting the transmitted information among components to discrete representations is a beneficial bottleneck.
Even though individuals have different understandings of what a "cat" is based on their specific experiences, the shared discrete token makes it possible for communication among individuals to be unimpeded by individual differences in internal representation.
We extend the quantization mechanism from the Vector-Quantized Variational Autoencoder to multi-headed discretization with shared codebooks and use it for discrete-valued neural communication.
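A minimal sketch of that mechanism, multi-headed discretization against one shared codebook with a straight-through gradient, in PyTorch (class and parameter names are ours; this illustrates the idea rather than the paper's exact implementation):

```python
import torch
import torch.nn as nn

class MultiHeadSharedVQ(nn.Module):
    """Split a vector into heads and quantize each head to its nearest
    entry of a single shared codebook (discrete-valued communication)."""
    def __init__(self, dim: int, num_heads: int, codebook_size: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.codebook = nn.Embedding(codebook_size, self.head_dim)

    def forward(self, x):  # x: (batch, dim)
        b = x.shape[0]
        h = x.view(b, self.num_heads, self.head_dim)
        # Distance of every head vector to every shared codebook entry.
        cb = self.codebook.weight.unsqueeze(0).expand(b, -1, -1)
        idx = torch.cdist(h, cb).argmin(dim=-1)       # (batch, heads)
        q = self.codebook(idx)                        # nearest codes
        q = h + (q - h).detach()                      # straight-through gradient
        return q.view(b, -1), idx

vq = MultiHeadSharedVQ(dim=64, num_heads=8, codebook_size=32)
z, codes = vq(torch.randn(4, 64))   # codes: (4, 8) discrete tokens per message
```

Sharing one codebook across heads is what lets different components exchange the same discrete vocabulary, the property the abstract's "cat" example appeals to.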
arXiv Detail & Related papers (2021-07-06T03:09:25Z)