Related papers: Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging

Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging

URL: http://arxiv.org/abs/2406.11825v1
Date: Mon, 17 Jun 2024 17:58:15 GMT
Title: Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging
Authors: Bradley T. Baker, Vince D. Calhoun, Sergey M. Plis,
Abstract summary: We present a novel introspection framework for Deep Learning on Neuroimaging data. Unlike post-hoc introspection techniques, which require fully-trained models for evaluation, our method allows for the study of training dynamics on the fly.
Score: 16.002859238417223
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neural networks, whice have had a profound effect on how researchers study complex phenomena, do so through a complex, nonlinear mathematical structure which can be difficult for human researchers to interpret. This obstacle can be especially salient when researchers want to better understand the emergence of particular model behaviors such as bias, overfitting, overparametrization, and more. In Neuroimaging, the understanding of how such phenomena emerge is fundamental to preventing and informing users of the potential risks involved in practice. In this work, we present a novel introspection framework for Deep Learning on Neuroimaging data, which exploits the natural structure of gradient computations via the singular value decomposition of gradient components during reverse-mode auto-differentiation. Unlike post-hoc introspection techniques, which require fully-trained models for evaluation, our method allows for the study of training dynamics on the fly, and even more interestingly, allow for the decomposition of gradients based on which samples belong to particular groups of interest. We demonstrate how the gradient spectra for several common deep learning models differ between schizophrenia and control participants from the COBRE study, and illustrate how these trajectories may reveal specific training dynamics helpful for further analysis.

Related papers

NOBLE -- Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models [68.89389652724378]
NOBLE is a neural operator framework that learns a mapping from a continuous frequency-modulated embedding of interpretable neuron features to the somatic voltage response induced by current injection.<n>It predicts distributions of neural dynamics accounting for the intrinsic experimental variability.<n>NOBLE is the first scaled-up deep learning framework validated on real experimental data.
arXiv Detail & Related papers (2025-06-05T01:01:18Z)
Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities. We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities. We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network. Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
A Fuzzy-based Approach to Predict Human Interaction by Functional Near-Infrared Spectroscopy [25.185426359719454]
The paper introduces a Fuzzy-based Attention (Fuzzy Attention Layer) mechanism, a novel computational approach to interpretability and efficacy of neural models in psychological research. By leveraging fuzzy logic, the Fuzzy Attention Layer is capable of learning and identifying interpretable patterns of neural activity.
arXiv Detail & Related papers (2024-09-26T09:20:12Z)
Deep Latent Variable Modeling of Physiological Signals [0.8702432681310401]
We explore high-dimensional problems related to physiological monitoring using latent variable models. First, we present a novel deep state-space model to generate electrical waveforms of the heart using optically obtained signals as inputs. Second, we present a brain signal modeling scheme that combines the strengths of probabilistic graphical models and deep adversarial learning. Third, we propose a framework for the joint modeling of physiological measures and behavior.
arXiv Detail & Related papers (2024-05-29T17:07:33Z)
Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks [59.38765771221084]
We present a physiologically inspired speech recognition architecture compatible and scalable with deep learning frameworks. We show end-to-end gradient descent training leads to the emergence of neural oscillations in the central spiking neural network. Our findings highlight the crucial inhibitory role of feedback mechanisms, such as spike frequency adaptation and recurrent connections, in regulating and synchronising neural activity to improve recognition performance.
arXiv Detail & Related papers (2024-04-22T09:40:07Z)
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations [62.65877150123775]
Causal abstraction is a promising theoretical framework for explainable artificial intelligence. Existing causal abstraction methods require a brute-force search over alignments between the high-level model and the low-level one. We present distributed alignment search (DAS), which overcomes these limitations.
arXiv Detail & Related papers (2023-03-05T00:57:49Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation. In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models. We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
On the Evolution of Neuron Communities in a Deep Learning Architecture [0.7106986689736827]
This paper examines the neuron activation patterns of deep learning-based classification models. We show that both the community quality (modularity) and entropy are closely related to the deep learning models' performances.
arXiv Detail & Related papers (2021-06-08T21:09:55Z)
Information theoretic analysis of computational models as a tool to understand the neural basis of behaviors [0.0]
One of the greatest research challenges of this century is to understand the neural basis for how behavior emerges in brain-body-environment systems. Computational models provide an alternative framework within which one can study model systems. I provide an introduction, a review and discussion to make a case for how information theoretic analysis of computational models is a potent research methodology.
arXiv Detail & Related papers (2021-06-02T02:08:18Z)
Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task. This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.