Neural Networks as Paths through the Space of Representations
- URL: http://arxiv.org/abs/2206.10999v1
- Date: Wed, 22 Jun 2022 11:59:10 GMT
- Title: Neural Networks as Paths through the Space of Representations
- Authors: Richard D. Lange, Jordan Matelsky, Xinyue Wang, Devin Kwok, David S. Rolnick, Konrad P. Kording
- Abstract summary: We develop a simple idea for interpreting the layer-by-layer construction of useful representations.
We formalize this intuitive idea of "distance" by leveraging recent work on metric representational similarity.
With this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path in a high-dimensional representation space.
- Score: 5.165741406553346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks implement a sequence of layer-by-layer operations that
are each relatively easy to understand, but the resulting overall computation
is generally difficult to understand. We develop a simple idea for interpreting
the layer-by-layer construction of useful representations: the role of each
layer is to reformat information to reduce the "distance" to the target
outputs. We formalize this intuitive idea of "distance" by leveraging recent
work on metric representational similarity, and show how it leads to a rich
space of geometric concepts. With this framework, the layer-wise computation
implemented by a deep neural network can be viewed as a path in a
high-dimensional representation space. We develop tools to characterize the
geometry of these paths in terms of distances, angles, and geodesics. We then ask
three sets of questions of residual networks trained on CIFAR-10: (1) how
straight are paths, and how does each layer contribute towards the target? (2)
how do these properties emerge over training? and (3) how similar are the paths
taken by wider versus deeper networks? We conclude by sketching additional ways
that this kind of representational geometry can be used to understand and
interpret network training, or to prescriptively improve network architectures
to suit a task.
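Below is a minimal, illustrative sketch of the core idea in the abstract, not the authors' code: each layer's activations are treated as a point in representation space, consecutive layers are compared with an angular CKA-style dissimilarity (a stand-in for the metric representational similarity measures the paper builds on), and the total path length is compared to the direct input-to-target distance as a crude straightness proxy. The toy MLP, random data, and choice of dissimilarity are assumptions made for illustration; the paper itself studies residual networks on CIFAR-10.

```python
# Hedged sketch: layer-wise activations as a "path" through representation
# space. The angular CKA-style dissimilarity below is an illustrative
# stand-in, not necessarily the exact metric used in the paper.
import numpy as np
import torch
import torch.nn as nn


def centered_gram(x: np.ndarray) -> np.ndarray:
    """Centered Gram matrix of an (n_samples, n_features) activation matrix."""
    x = x - x.mean(axis=0, keepdims=True)
    return x @ x.T


def angular_cka_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Arccos of linear CKA between two activation matrices (same samples,
    possibly different feature dimensions)."""
    gx, gy = centered_gram(x), centered_gram(y)
    cka = (gx * gy).sum() / (np.linalg.norm(gx) * np.linalg.norm(gy) + 1e-12)
    return float(np.arccos(np.clip(cka, -1.0, 1.0)))


class TinyMLP(nn.Module):
    """Toy stand-in network; the paper studies residual networks on CIFAR-10."""

    def __init__(self, d_in=32, width=64, depth=4, n_classes=10):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(d_in if i == 0 else width, width) for i in range(depth)]
        )
        self.head = nn.Linear(width, n_classes)

    def forward_with_activations(self, x):
        acts = [x]
        for layer in self.layers:
            x = torch.relu(layer(x))
            acts.append(x)
        acts.append(self.head(x))
        return acts


if __name__ == "__main__":
    torch.manual_seed(0)
    n, d, n_classes = 256, 32, 10
    inputs = torch.randn(n, d)                       # toy data, not CIFAR-10
    labels = torch.randint(0, n_classes, (n,))
    target_rep = nn.functional.one_hot(labels, n_classes).float().numpy()

    net = TinyMLP(d_in=d, n_classes=n_classes)
    acts = [a.detach().numpy() for a in net.forward_with_activations(inputs)]

    # Step sizes along the path: input -> layer 1 -> ... -> logits -> targets.
    path = acts + [target_rep]
    steps = [angular_cka_distance(a, b) for a, b in zip(path[:-1], path[1:])]
    # How far each intermediate representation still is from the target.
    to_target = [angular_cka_distance(a, target_rep) for a in acts]

    direct = angular_cka_distance(acts[0], target_rep)
    print("per-layer step sizes:", np.round(steps, 3))
    print("distance to target by layer:", np.round(to_target, 3))
    print("straightness proxy (direct / path length):", round(direct / sum(steps), 3))
```

In this sketch, a step size shows how much each layer moves the representation, the distance-to-target curve shows whether each layer makes progress toward the outputs, and a straightness proxy near 1 would indicate a nearly geodesic path under the chosen dissimilarity.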
Related papers
- Half-Space Feature Learning in Neural Networks [2.3249139042158853]
There currently exist two extreme viewpoints for neural network feature learning.
We argue, based on a novel viewpoint, that neither interpretation is likely to be correct.
We use this alternate interpretation to motivate a model called the Deep Linearly Gated Network (DLGN).
arXiv Detail & Related papers (2024-04-05T12:03:19Z) - Task structure and nonlinearity jointly determine learned representational geometry [0.0]
We show that Tanh networks tend to learn representations that reflect the structure of the target outputs, while ReLU networks retain more information about the structure of the raw inputs.
Our findings shed light on the interplay between input-output geometry, nonlinearity, and learned representations in neural networks.
arXiv Detail & Related papers (2024-01-24T16:14:38Z) - Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z) - GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when only a few propagation steps are used.
arXiv Detail & Related papers (2022-10-19T17:56:03Z) - Origami in N dimensions: How feed-forward networks manufacture linear separability [1.7404865362620803]
We show that a feed-forward architecture has one primary tool at hand to achieve separability: progressive folding of the data manifold in unoccupied higher dimensions.
We argue that an alternative method based on shear, requiring very deep architectures, plays only a small role in real-world networks.
Based on the mechanistic insight, we predict that the progressive generation of separability is necessarily accompanied by neurons showing mixed selectivity and bimodal tuning curves.
arXiv Detail & Related papers (2022-03-21T21:33:55Z) - Neural Network Layer Algebra: A Framework to Measure Capacity and Compression in Deep Learning [0.0]
We present a new framework to measure the intrinsic properties of (deep) neural networks.
While we focus on convolutional networks, our framework can be extrapolated to any network architecture.
arXiv Detail & Related papers (2021-07-02T13:43:53Z) - Solving hybrid machine learning tasks by traversing weight space geodesics [6.09170287691728]
Machine learning problems have an intrinsic geometric structure, with central objects such as a neural network's weight space.
We introduce a geometric framework that unifies a range of machine learning objectives and can be applied to multiple classes of neural network architectures.
arXiv Detail & Related papers (2021-06-05T04:37:03Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we show that dynamically adapting the network architecture to each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z) - Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces [68.12457459590921]
Reconstructing continuous surfaces from 3D point clouds is a fundamental operation in 3D geometry processing.
We introduce Neural-Pull, a new approach that is simple and leads to high-quality SDFs (see the sketch after this list).
arXiv Detail & Related papers (2020-11-26T23:18:10Z) - Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of routing every input along the same path, DG-Net aggregates features dynamically at each node, which gives the network greater representational capacity.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
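As referenced in the Neural-Pull entry above, here is a hedged sketch of the "pull" operation that the paper's title describes, as commonly understood: a network predicts a signed distance at a query point, the query is pulled onto the surface along the normalized gradient of that prediction, and the pulled point is trained to land on a nearby point of the input cloud. The network size, sampling scheme, and training loop are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of a Neural-Pull-style training step: pull noisy query
# points onto the surface predicted by a small SDF network.
import torch
import torch.nn as nn


class SDFNet(nn.Module):
    """Tiny MLP mapping a 3D point to a scalar signed distance (illustrative)."""

    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, q):
        return self.net(q).squeeze(-1)


def pull_loss(model, queries, surface_targets):
    """Pull each query along the SDF gradient by the predicted distance and
    penalize its squared distance to a nearby surface point."""
    queries = queries.clone().requires_grad_(True)
    sdf = model(queries)                                        # (m,)
    grad = torch.autograd.grad(sdf.sum(), queries, create_graph=True)[0]
    direction = grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)
    pulled = queries - sdf.unsqueeze(-1) * direction            # (m, 3)
    return ((pulled - surface_targets) ** 2).sum(dim=-1).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    cloud = torch.randn(512, 3)
    cloud = cloud / cloud.norm(dim=-1, keepdim=True)            # toy unit sphere
    model = SDFNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(200):
        # Sample queries near each surface point; the paper pairs a query
        # with a nearby surface point, which here is (almost always) the
        # point it was sampled around.
        queries = cloud + 0.05 * torch.randn_like(cloud)
        loss = pull_loss(model, queries, cloud)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print("final pull loss:", float(loss))
```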
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.