Neural Anisotropy Directions
- URL: http://arxiv.org/abs/2006.09717v2
- Date: Wed, 14 Oct 2020 10:21:58 GMT
- Title: Neural Anisotropy Directions
- Authors: Guillermo Ortiz-Jimenez, Apostolos Modas, Seyed-Mohsen
Moosavi-Dezfooli, Pascal Frossard
- Abstract summary: We define neural anisotropy directions (NADs) as the vectors that encapsulate the directional inductive bias of an architecture.
We show that for the CIFAR-10 dataset, NADs characterize the features used by CNNs to discriminate between different classes.
- Score: 63.627760598441796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we analyze the role of the network architecture in shaping the
inductive bias of deep classifiers. To that end, we start by focusing on a very
simple problem, i.e., classifying a class of linearly separable distributions,
and show that, depending on the direction of the discriminative feature of the
distribution, many state-of-the-art deep convolutional neural networks (CNNs)
have a surprisingly hard time solving this simple task. We then define as
neural anisotropy directions (NADs) the vectors that encapsulate the
directional inductive bias of an architecture. These vectors, which are
specific for each architecture and hence act as a signature, encode the
preference of a network to separate the input data based on some particular
features. We provide an efficient method to identify NADs for several CNN
architectures and thus reveal their directional inductive biases. Furthermore,
we show that, for the CIFAR-10 dataset, NADs characterize the features used by
CNNs to discriminate between different classes.
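The abstract mentions an efficient method to identify NADs but does not spell it out in this summary. As a hedged sketch, assuming a gradient-covariance proxy rather than the paper's exact procedure, the code below estimates candidate NADs by eigendecomposing the covariance of input gradients over fresh random initializations; the toy CNN and sample count are illustrative assumptions.

```python
# Hedged sketch: estimate directional inductive bias by eigendecomposing the
# covariance of input gradients over random initializations. The gradient-
# covariance proxy, toy CNN, and sample count are assumptions, not the
# paper's exact procedure.
import torch
import torch.nn as nn

def estimate_nads(make_model, input_shape=(3, 32, 32), n_samples=64):
    d = int(torch.tensor(input_shape).prod())
    cov = torch.zeros(d, d)
    for _ in range(n_samples):
        model = make_model()                           # fresh random init
        x = torch.randn(1, *input_shape, requires_grad=True)
        (g,) = torch.autograd.grad(model(x).sum(), x)  # input gradient
        g = g.flatten()
        cov += torch.outer(g, g) / n_samples
    eigvals, eigvecs = torch.linalg.eigh(cov)          # ascending order
    order = eigvals.argsort(descending=True)
    return eigvecs[:, order].T, eigvals[order]         # rows: candidate NADs

def make_cnn():                                        # hypothetical toy CNN
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
        nn.Flatten(), nn.Linear(16 * 16 * 16, 1))

nads, spectrum = estimate_nads(make_cnn)               # CIFAR-10-sized inputs
```

A sharply decaying spectrum would indicate a strongly anisotropic architecture: a few input directions dominate the network's sensitivity at initialization.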
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- Exploring Learned Representations of Neural Networks with Principal Component Analysis [1.0923877073891446]
In certain layers, as little as 20% of the intermediate feature-space variance is necessary for high-accuracy classification (see the sketch after this entry).
We relate our findings to neural collapse and provide partial evidence for the related phenomenon of intermediate neural collapse.
arXiv Detail & Related papers (2023-09-27T00:18:25Z)
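A minimal sketch of the variance measurement referenced in the entry above, assuming hypothetical pre-extracted feature matrices and a logistic-regression probe (the paper's exact probing setup is not reproduced here):

```python
# Hedged sketch: how much intermediate feature-space variance does a linear
# probe need? Feature matrices and labels are assumed to be pre-extracted
# from some layer of a trained network.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def probe_on_top_variance(feats_tr, y_tr, feats_te, y_te, variance_kept=0.2):
    # PCA with 0 < n_components < 1 keeps just enough components to
    # explain `variance_kept` of the training-set variance.
    pca = PCA(n_components=variance_kept).fit(feats_tr)
    probe = LogisticRegression(max_iter=1000).fit(pca.transform(feats_tr), y_tr)
    return pca.n_components_, probe.score(pca.transform(feats_te), y_te)
```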
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of large-kernel convolutional neural network (LKCNN) models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Predictions Based on Pixel Data: Insights from PDEs and Finite Differences [0.0]
This paper deals with the approximation of time sequences in which each observation is a matrix.
We show that relatively small networks can exactly represent a class of numerical discretizations of PDEs based on the method of lines (see the sketch after this entry).
Our network architecture is inspired by those typically adopted for the approximation of time sequences.
arXiv Detail & Related papers (2023-05-01T08:54:45Z)
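The method-of-lines claim above has a classical concrete instance: a fixed-weight convolution can encode a finite-difference spatial discretization, with one layer per explicit time step. A minimal sketch for the 2D heat equation u_t = Δu, where the stencil and step size are illustrative assumptions rather than the paper's construction:

```python
# Hedged sketch: a conv layer exactly encodes a method-of-lines step for the
# 2D heat equation u_t = Δu (unit grid spacing, explicit Euler in time).
import torch
import torch.nn.functional as F

# 3x3 finite-difference stencil for the Laplacian.
laplacian = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def heat_step(u, dt=0.1):           # dt <= 0.25 keeps explicit Euler stable
    return u + dt * F.conv2d(u, laplacian, padding=1)

u = torch.rand(1, 1, 32, 32)        # initial condition on a 32x32 grid
for _ in range(100):
    u = heat_step(u)                # stacking steps = stacking layers
```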
- Towards Rigorous Understanding of Neural Networks via Semantics-preserving Transformations [0.0]
We present an approach to the precise and global verification and explanation of Rectifier Neural Networks.
Key to our approach is the symbolic execution of these networks, which allows the construction of semantically equivalent Typed Affine Decision Structures.
arXiv Detail & Related papers (2023-01-19T11:35:07Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics, and exploit higher-order statistics only later in training (see the sketch after this entry).
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
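One plausible way to read "lower-order input statistics", sketched below under assumptions: build a Gaussian clone of a class (matched mean and covariance) and compare a classifier's behaviour on clones versus originals early in training. The protocol is an illustration, not necessarily the paper's exact experiment.

```python
# Hedged sketch: Gaussian clone of a data class, matching its first- and
# second-order statistics only. If an early-training model scores similarly
# on clones and originals, it is so far using only lower-order statistics.
import numpy as np

def gaussian_clone(X, seed=0):
    """Sample len(X) points from a Gaussian with X's mean and covariance."""
    rng = np.random.default_rng(seed)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    return rng.multivariate_normal(mu, cov, size=len(X))
```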
- Dynamical systems' based neural networks [0.7874708385247353]
We build neural networks using a suitable, structure-preserving, numerical time-discretisation.
The structure of the neural network is then inferred from the properties of the ODE vector field.
We present two universal approximation results and demonstrate how to impose particular properties on the neural networks (see the sketch after this entry).
arXiv Detail & Related papers (2022-10-05T16:30:35Z)
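A minimal sketch of the ODE-to-network idea behind the entry above: a residual block read as one explicit Euler step of x' = f(x). Plain Euler is an illustrative assumption; the paper's point is to use structure-preserving discretizations, which would replace this update rule.

```python
# Hedged sketch: residual block as one explicit Euler step,
# x_{k+1} = x_k + h * f(x_k), with a learned vector field f.
import torch
import torch.nn as nn

class EulerResidualBlock(nn.Module):
    def __init__(self, dim, h=0.1):
        super().__init__()
        self.h = h                              # discretization time-step
        self.f = nn.Sequential(                 # learned vector field f(x)
            nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.h * self.f(x)

net = nn.Sequential(*[EulerResidualBlock(16) for _ in range(8)])  # 8 steps
y = net(torch.randn(4, 16))
```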
- Discretization Invariant Networks for Learning Maps between Neural Fields [3.09125960098955]
We present a new framework for understanding and designing discretization invariant neural networks (DI-Nets).
Our analysis establishes upper bounds on the deviation in model outputs under different finite discretizations.
We prove by construction that DI-Nets universally approximate a large class of maps between integrable function spaces.
arXiv Detail & Related papers (2022-06-02T17:44:03Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.