Related papers: Shannon invariants: A scalable approach to information decomposition

Shannon invariants: A scalable approach to information decomposition

URL: http://arxiv.org/abs/2504.15779v1
Date: Tue, 22 Apr 2025 10:41:38 GMT
Title: Shannon invariants: A scalable approach to information decomposition
Authors: Aaron J. Gutknecht, Fernando E. Rosas, David A. Ehrlich, Abdullah Makkeh, Pedro A. M. Mediano, Michael Wibral,
Abstract summary: "Shannon invariants" are quantities that capture essential properties of high-order information processing.<n>Our theoretical results demonstrate how Shannon invariants can be used to resolve long-standing ambiguities.<n>Our results reveal distinctive information-processing signatures of various deep learning architectures.
Score: 41.60443091960594
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Distributed systems, such as biological and artificial neural networks, process information via complex interactions engaging multiple subsystems, resulting in high-order patterns with distinct properties across scales. Investigating how these systems process information remains challenging due to difficulties in defining appropriate multivariate metrics and ensuring their scalability to large systems. To address these challenges, we introduce a novel framework based on what we call "Shannon invariants" -- quantities that capture essential properties of high-order information processing in a way that depends only on the definition of entropy and can be efficiently calculated for large systems. Our theoretical results demonstrate how Shannon invariants can be used to resolve long-standing ambiguities regarding the interpretation of widely used multivariate information-theoretic measures. Moreover, our practical results reveal distinctive information-processing signatures of various deep learning architectures across layers, which lead to new insights into how these systems process information and how this evolves during training. Overall, our framework resolves fundamental limitations in analyzing high-order phenomena and offers broad opportunities for theoretical developments and empirical analyses.

Related papers

Broad Spectrum Structure Discovery in Large-Scale Higher-Order Networks [1.7273380623090848]
We introduce a class of probabilistic models that efficiently represents and discovers a broad spectrum of mesoscale structure in large-scale hypergraphs.<n>By modeling observed node interactions through latent interactions among classes using low-rank representations, our approach tractably captures rich structural patterns.<n>Our model improves link prediction over state-of-the-art methods and discovers interpretable structures in diverse real-world systems.
arXiv Detail & Related papers (2025-05-27T20:34:58Z)
Simultaneous Dimensionality Reduction for Extracting Useful Representations of Large Empirical Multimodal Datasets [0.0]
We focus on the sciences of dimensionality reduction as a means to obtain low-dimensional descriptions from high-dimensional data. We address the challenges posed by real-world data that defy conventional assumptions, such as complex interactions within systems or high-dimensional dynamical systems.
arXiv Detail & Related papers (2024-10-23T21:27:40Z)
On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games [55.2480439325792]
In a sequential decision-making problem, the information structure is the description of how events in the system occurring at different points in time affect each other. By contrast, real-world sequential decision-making problems typically involve a complex and time-varying interdependence of system variables. We formalize a novel reinforcement learning model which explicitly represents the information structure.
arXiv Detail & Related papers (2024-03-01T21:28:19Z)
A spectrum of physics-informed Gaussian processes for regression in engineering [0.0]
Despite the growing availability of sensing and data in general, we remain unable to fully characterise many in-service engineering systems and structures from a purely data-driven approach. This paper pursues the combination of machine learning technology and physics-based reasoning to enhance our ability to make predictive models with limited data.
arXiv Detail & Related papers (2023-09-19T14:39:03Z)
Information decomposition in complex systems via machine learning [4.189643331553922]
We use machine learning to decompose the information contained in a set of measurements by jointly optimizing a lossy compression of each measurement. We focus our analysis on two paradigmatic complex systems: a circuit and an amorphous material undergoing plastic deformation.
arXiv Detail & Related papers (2023-07-10T17:57:32Z)
Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework [89.8609061423685]
We propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy relating input modalities with an output task. To validate PID estimation, we conduct extensive experiments on both synthetic datasets where the PID is known and on large-scale multimodal benchmarks. We demonstrate their usefulness in (1) quantifying interactions within multimodal datasets, (2) quantifying interactions captured by multimodal models, (3) principled approaches for model selection, and (4) three real-world case studies.
arXiv Detail & Related papers (2023-02-23T18:59:05Z)
Synergistic information supports modality integration and flexible learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks. Results show that synergy increases as neural networks learn multiple diverse tasks. randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z)
Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning. Under rigorously theoretical guarantee, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction. We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses. We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously. Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z)
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
Information Theory Measures via Multidimensional Gaussianization [7.788961560607993]
Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems. It has several desirable properties for real world applications. However, obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality.
arXiv Detail & Related papers (2020-10-08T07:22:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.