Fisher information flow in artificial neural networks
- URL: http://arxiv.org/abs/2509.02407v2
- Date: Wed, 24 Sep 2025 10:57:23 GMT
- Title: Fisher information flow in artificial neural networks
- Authors: Maximilian Weimar, Lukas M. Rachbauer, Ilya Starshynov, Daniele Faccio, Linara Adilova, Dorian Bouchet, Stefan Rotter,
- Abstract summary: We present a method to monitor the flow of Fisher information through an ANN performing a parameter estimation task.<n>We show that optimal estimation performance corresponds to the maximal transmission of Fisher information.<n>This provides a model-free stopping criterion for network training-eliminating the need for a separate validation dataset.
- Score: 3.0053910105391264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The estimation of continuous parameters from measured data plays a central role in many fields of physics. A key tool in understanding and improving such estimation processes is the concept of Fisher information, which quantifies how information about unknown parameters propagates through a physical system and determines the ultimate limits of precision. With Artificial Neural Networks (ANNs) gradually becoming an integral part of many measurement systems, it is essential to understand how they process and transmit parameter-relevant information internally. Here, we present a method to monitor the flow of Fisher information through an ANN performing a parameter estimation task, tracking it from the input to the output layer. We show that optimal estimation performance corresponds to the maximal transmission of Fisher information, and that training beyond this point results in information loss due to overfitting. This provides a model-free stopping criterion for network training-eliminating the need for a separate validation dataset. To demonstrate the practical relevance of our approach, we apply it to a network trained on data from an imaging experiment, highlighting its effectiveness in a realistic physical setting.
Related papers
- Preserving Information: How does Topological Data Analysis improve Neural Network performance? [0.0]
We introduce a method for integrating Topological Data Analysis (TDA) with Convolutional Neural Networks (CNN) in the context of image recognition.<n>Our approach, further referred to as Vector Stitching, involves combining raw image data with additional topological information.<n>The results of our experiments highlight the potential of incorporating results of additional data analysis into the network's inference process.
arXiv Detail & Related papers (2024-11-27T14:56:05Z) - Enhancing Neural Network Interpretability Through Conductance-Based Information Plane Analysis [0.0]
The Information Plane is a conceptual framework used to analyze the flow of information in neural networks.
This paper introduces a new approach that uses layer conductance, a measure of sensitivity to input features, to enhance the Information Plane analysis.
arXiv Detail & Related papers (2024-08-26T23:10:42Z) - Assessing Neural Network Representations During Training Using
Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z) - Uncertainty Estimation by Fisher Information-based Evidential Deep
Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($mathcalI$-EDL)
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by metalearning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z) - Neural Capacity Estimators: How Reliable Are They? [14.904387585122851]
We study the performance of mutual information neural estimator (MINE), smoothed mutual information lower-bound estimator (SMILE), and information directed neural estimator (DINE)
We evaluate these algorithms in terms of their ability to learn the input distributions that are capacity approaching for the AWGN channel, the optical intensity channel, and peak power-constrained AWGN channel.
arXiv Detail & Related papers (2021-11-14T18:14:53Z) - Efficient training of lightweight neural networks using Online
Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize k-nn non-parametric density estimation technique for estimating the unknown probability distributions of the data samples in the output feature space.
arXiv Detail & Related papers (2021-08-26T14:01:04Z) - Forgetting Outside the Box: Scrubbing Deep Networks of Information
Accessible from Input-Output Observations [143.3053365553897]
We describe a procedure for removing dependency on a cohort of training data from a trained deep network.
We introduce a new bound on how much information can be extracted per query about the forgotten cohort.
We exploit the connections between the activation and weight dynamics of a DNN inspired by Neural Tangent Kernels to compute the information in the activations.
arXiv Detail & Related papers (2020-03-05T23:17:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.