Neural networks learn to magnify areas near decision boundaries
- URL: http://arxiv.org/abs/2301.11375v3
- Date: Sat, 14 Oct 2023 22:52:45 GMT
- Title: Neural networks learn to magnify areas near decision boundaries
- Authors: Jacob A. Zavatone-Veth and Sheng Yang and Julian A. Rubinfien and
Cengiz Pehlevan
- Abstract summary: We study how training shapes the geometry induced by unconstrained neural network feature maps.
We first show that at infinite width, neural networks with random parameters induce highly symmetric metrics on input space.
This symmetry is broken by feature learning: networks trained to perform classification tasks learn to magnify local areas along decision boundaries.
- Score: 32.84188052937496
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In machine learning, there is a long history of trying to build neural
networks that can learn from fewer example data by baking in strong geometric
priors. However, it is not always clear a priori what geometric constraints are
appropriate for a given task. Here, we consider the possibility that one can
uncover useful geometric inductive biases by studying how training molds the
Riemannian geometry induced by unconstrained neural network feature maps. We
first show that at infinite width, neural networks with random parameters
induce highly symmetric metrics on input space. This symmetry is broken by
feature learning: networks trained to perform classification tasks learn to
magnify local areas along decision boundaries. This holds in deep networks
trained on high-dimensional image classification tasks, and even in
self-supervised representation learning. These results begin to elucidate how
training shapes the geometry induced by unconstrained neural network feature
maps, laying the groundwork for an understanding of this richly nonlinear form
of feature learning.
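The geometry in question is the pullback metric of a feature map: for a map phi from input space to feature space with Jacobian J(x), the induced metric is g(x) = J(x)^T J(x), and sqrt(det g) gives the local area magnification the abstract refers to. A minimal sketch of that computation, using a toy random tanh network as a stand-in for a trained feature map (the network, its sizes, and the finite-difference Jacobian are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature map phi: R^2 -> R^3, a one-hidden-layer tanh network with
# random weights (a stand-in for a trained network's feature map).
W1 = rng.standard_normal((16, 2)) / np.sqrt(2)
b1 = 0.1 * rng.standard_normal(16)
W2 = rng.standard_normal((3, 16)) / np.sqrt(16)

def phi(x):
    return W2 @ np.tanh(W1 @ x + b1)

def jacobian(x, eps=1e-6):
    # Central finite differences; column j holds d phi / d x_j.
    J = np.zeros((3, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = eps
        J[:, j] = (phi(x + e) - phi(x - e)) / (2 * eps)
    return J

def magnification(x):
    # Pullback metric g(x) = J(x)^T J(x); sqrt(det g) is the factor by
    # which phi stretches infinitesimal areas at x.
    J = jacobian(x)
    g = J.T @ J
    return np.sqrt(np.linalg.det(g))

x = np.array([0.3, -0.5])
print(magnification(x))
```

For a trained classifier one would evaluate `magnification` on a grid of inputs and compare values near and far from the decision boundary; the paper's claim is that training inflates this quantity along the boundary.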
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - Asymptotics of Learning with Deep Structured (Random) Features [9.366617422860543]
For a large class of feature maps we provide a tight characterisation of the test error associated with learning the readout layer.
In some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
arXiv Detail & Related papers (2024-02-21T18:35:27Z) - Task structure and nonlinearity jointly determine learned representational geometry [0.0]
We show that Tanh networks tend to learn representations that reflect the structure of the target outputs, while ReLU networks retain more information about the structure of the raw inputs.
Our findings shed light on the interplay between input-output geometry, nonlinearity, and learned representations in neural networks.
arXiv Detail & Related papers (2024-01-24T16:14:38Z) - Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend residual neural networks (ResNets) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z) - A singular Riemannian geometry approach to Deep Neural Networks II. Reconstruction of 1-D equivalence classes [78.120734120667]
We build the preimage of a point in the output manifold in the input space.
We focus for simplicity on the case of neural network maps from n-dimensional real spaces to (n - 1)-dimensional real spaces.
arXiv Detail & Related papers (2021-12-17T11:47:45Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.