Hidden Activations Are Not Enough: A General Approach to Neural Network Predictions
- URL: http://arxiv.org/abs/2409.13163v1
- Date: Fri, 20 Sep 2024 02:35:13 GMT
- Title: Hidden Activations Are Not Enough: A General Approach to Neural Network Predictions
- Authors: Samuel Leblanc, Aiky Rasolomanana, Marco Armenta
- Abstract summary: We introduce a novel mathematical framework for analyzing neural networks using tools from quiver representation theory.
By leveraging the induced quiver representation of a data sample, we capture more information than traditional hidden layer outputs.
Results are architecture-agnostic and task-agnostic, making them broadly applicable.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel mathematical framework for analyzing neural networks using tools from quiver representation theory. This framework enables us to quantify the similarity between a new data sample and the training data, as perceived by the neural network. By leveraging the induced quiver representation of a data sample, we capture more information than traditional hidden layer outputs. This quiver representation abstracts away the complexity of the computations of the forward pass into a single matrix, allowing us to employ simple geometric and statistical arguments in a matrix space to study neural network predictions. Our mathematical results are architecture-agnostic and task-agnostic, making them broadly applicable. As proof of concept experiments, we apply our results for the MNIST and FashionMNIST datasets on the problem of detecting adversarial examples on different MLP architectures and several adversarial attack methods. Our experiments can be reproduced with our \href{https://github.com/MarcoArmenta/Hidden-Activations-are-not-Enough}{publicly available repository}.
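The core idea described in the abstract, collapsing the data-dependent forward pass into a single matrix and then reasoning geometrically and statistically in matrix space, can be illustrated with a simplified sketch. The snippet below is not the paper's induced quiver-representation construction: it assumes a bias-free ReLU MLP, uses the standard fact that such a network acts as a single linear map on the activation region containing a given input, and swaps in a diagonal Mahalanobis-style score for the paper's statistical arguments. All function names are illustrative.

```python
# Minimal sketch (assumptions: bias-free ReLU MLP; diagonal Mahalanobis-style
# score; NOT the paper's exact induced quiver-representation construction).
import numpy as np

def induced_matrix(weights, x):
    """Collapse the forward pass on input x into a single matrix.

    For a bias-free ReLU MLP, the network is linear on the activation region
    containing x, so the whole forward pass equals one matrix: the product of
    the weight matrices with rows masked by the ReLU activation pattern.
    """
    M = np.eye(x.shape[0])
    h = x
    for i, W in enumerate(weights):
        pre = W @ h
        # ReLU mask on hidden layers; identity (all ones) on the output layer
        mask = (pre > 0).astype(pre.dtype) if i < len(weights) - 1 else np.ones_like(pre)
        M = (W * mask[:, None]) @ M
        h = pre * mask
    return M  # the network's prediction on this region equals M @ x

def fit_class_statistics(matrices_per_class):
    """Per-class mean and (diagonal) variance of flattened induced matrices."""
    stats = {}
    for c, mats in matrices_per_class.items():
        V = np.stack([m.ravel() for m in mats])
        stats[c] = (V.mean(axis=0), V.var(axis=0) + 1e-8)
    return stats

def anomaly_score(M, stats, predicted_class):
    """Distance of a new sample's matrix from the training statistics of its
    predicted class; large values flag suspicious (e.g. adversarial) inputs."""
    mu, var = stats[predicted_class]
    v = M.ravel()
    return float(np.sqrt(((v - mu) ** 2 / var).mean()))
```

Under these assumptions, one would compute the matrix for every training sample, fit per-class statistics, and flag test inputs whose matrices lie far from the statistics of their predicted class, such as candidate adversarial examples.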
Related papers
- Steinmetz Neural Networks for Complex-Valued Data [23.80312814400945]
We introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs.
Our proposed class of architectures, referred to as Steinmetz Neural Networks, leverage multi-view learning to construct more interpretable representations within the latent space.
Our numerical experiments demonstrate the improved performance and robustness to additive noise afforded by these networks on benchmark datasets and synthetic examples.
arXiv Detail & Related papers (2024-09-16T08:26:06Z) - Relational Composition in Neural Networks: A Survey and Call to Action [54.47858085003077]
Many neural nets appear to represent data as linear combinations of "feature vectors"
We argue that this success is incomplete without an understanding of relational composition.
arXiv Detail & Related papers (2024-07-19T20:50:57Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Tabular Data: Is Attention All You Need? [23.787352248749382]
We introduce a large-scale empirical study comparing neural networks against gradient-boosted decision trees on structured data.
In contrast to prior work, our empirical findings indicate that neural networks are competitive against decision trees.
arXiv Detail & Related papers (2024-02-06T12:59:02Z) - Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - Representation Learning via Manifold Flattening and Reconstruction [10.823557517341964]
This work proposes an algorithm for explicitly constructing a pair of neural networks that linearize and reconstruct an embedded submanifold.
The resulting neural networks, called Flattening Networks (FlatNet), are theoretically interpretable, computationally feasible at scale, and generalize well to test data.
arXiv Detail & Related papers (2023-05-02T20:36:34Z) - An Information-Theoretic Framework for Supervised Learning [22.280001450122175]
We propose a novel information-theoretic framework with its own notions of regret and sample complexity.
We study the sample complexity of learning from data generated by deep neural networks with ReLU activation units.
We conclude by corroborating our theoretical results with experimental analysis of random single-hidden-layer neural networks.
arXiv Detail & Related papers (2022-03-01T05:58:28Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z) - Analyzing the Noise Robustness of Deep Neural Networks [43.63911131982369]
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) into making incorrect predictions (a minimal sketch of one such attack follows this list).
We present a visual analysis method to explain why adversarial examples are misclassified.
arXiv Detail & Related papers (2020-01-26T03:39:10Z)
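For context on the attack side referenced in the last entry above, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), a canonical way to generate adversarial examples by perturbing an input along the sign of the loss gradient. It assumes a bias-free linear softmax classifier so the gradient can be written in closed form; the attacks evaluated in the papers above target trained MLPs and may differ.

```python
# Minimal FGSM sketch on a bias-free linear softmax classifier (illustrative only).
import numpy as np

def softmax(z):
    z = z - z.max()  # numerical stabilization
    e = np.exp(z)
    return e / e.sum()

def fgsm(W, x, y, eps=0.1):
    """Fast Gradient Sign Method for a bias-free softmax classifier.

    The gradient of the cross-entropy loss w.r.t. the input x is
    W^T (softmax(Wx) - onehot(y)); the attack takes a small step of size eps
    in the direction of its sign.
    """
    p = softmax(W @ x)
    p[y] -= 1.0            # softmax(Wx) - onehot(y)
    grad_x = W.T @ p
    return x + eps * np.sign(grad_x)
```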
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.