Solving hybrid machine learning tasks by traversing weight space
geodesics
- URL: http://arxiv.org/abs/2106.02793v1
- Date: Sat, 5 Jun 2021 04:37:03 GMT
- Title: Solving hybrid machine learning tasks by traversing weight space
geodesics
- Authors: Guruprasad Raghavan, Matt Thomson
- Abstract summary: Machine learning problems have an intrinsic geometric structure, with central objects such as a neural network's weight space encoding that geometry.
We introduce a geometric framework that unifies a range of machine learning objectives and can be applied to multiple classes of neural network architectures.
- Score: 6.09170287691728
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning problems have an intrinsic geometric structure, as central
objects, including a neural network's weight space and the loss function
associated with a particular task, can be viewed as encoding the intrinsic
geometry of a given machine learning problem. Therefore, geometric concepts can
be applied to analyze and understand theoretical properties of machine learning
strategies as well as to develop new algorithms. In this paper, we address
three seemingly unrelated open questions in machine learning by viewing them
through a unified framework grounded in differential geometry. Specifically, we
view the weight space of a neural network as a manifold endowed with a
Riemannian metric that encodes performance on specific tasks. By defining a
metric, we can construct geodesic, minimum length, paths in weight space that
represent sets of networks of equivalent or near equivalent functional
performance on a specific task. We, then, traverse geodesic paths while
identifying networks that satisfy a second objective. Inspired by the geometric
insight, we apply our geodesic framework to three major applications: (i) network
sparsification; (ii) mitigating catastrophic forgetting by constructing networks
with high performance on a series of objectives; and (iii) finding high-accuracy
paths connecting distinct local optima of deep networks in the non-convex loss
landscape. Our results are obtained on a wide range of network architectures
(MLP, VGG11/16) trained on MNIST, CIFAR-10/100. Broadly, we introduce a
geometric framework that unifies a range of machine learning objectives and
that can be applied to multiple classes of neural network architectures.
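To make the geodesic-traversal idea concrete, the sketch below is a minimal, hypothetical illustration in PyTorch (not the authors' code): it learns the control point of a quadratic Bezier curve joining two trained weight vectors so that randomly sampled points along the curve keep the task loss low, in the spirit of application (iii). The Bezier parametrization and names such as `train_path_midpoint`, `theta_a`, and `theta_b` are assumptions made for illustration; the paper's geodesics are defined by its task-performance metric and may be constructed differently.

```python
import torch
import torch.nn as nn


def flatten_params(model: nn.Module) -> torch.Tensor:
    """Concatenate all parameters of a model into a single flat weight vector."""
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])


def load_params(model: nn.Module, flat: torch.Tensor) -> None:
    """Write a flat weight vector back into the model's parameters (in place)."""
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat[offset:offset + n].view_as(p))
        offset += n


def bezier_point(theta_a, theta_mid, theta_b, t):
    """Quadratic Bezier curve in weight space joining theta_a and theta_b."""
    return (1 - t) ** 2 * theta_a + 2 * t * (1 - t) * theta_mid + t ** 2 * theta_b


def train_path_midpoint(model, theta_a, theta_b, loss_fn, loader,
                        steps=1000, lr=1e-2):
    """Learn a Bezier control point so that points along the path keep task loss low."""
    theta_mid = nn.Parameter(0.5 * (theta_a + theta_b))  # start on the straight chord
    opt = torch.optim.Adam([theta_mid], lr=lr)
    batches = iter(loader)
    for _ in range(steps):
        try:
            x, y = next(batches)
        except StopIteration:
            batches = iter(loader)
            x, y = next(batches)
        t = torch.rand(()).item()                     # sample a random point on the path
        load_params(model, bezier_point(theta_a, theta_mid.detach(), theta_b, t))
        model.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()                               # gradient w.r.t. the curve point
        flat_grad = torch.cat([p.grad.reshape(-1) for p in model.parameters()])
        # Chain rule: d(curve point)/d(theta_mid) = 2 t (1 - t), applied elementwise.
        opt.zero_grad()
        theta_mid.grad = 2 * t * (1 - t) * flat_grad
        opt.step()
    return theta_mid.detach()
```

Setting `theta_b` to the flattened weights of a pruned copy of the dense network turns the same traversal into a rough analogue of application (i); evaluating the model at several values of t along the learned curve then checks whether performance stays high along the entire path.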
Related papers
- Unsupervised Graph Attention Autoencoder for Attributed Networks using
K-means Loss [0.0]
We introduce a simple, efficient, and clustering-oriented model based on an unsupervised Graph Attention AutoEncoder for community detection in attributed networks.
The proposed model adeptly learns representations from both the network's topology and attribute information, simultaneously addressing dual objectives: reconstruction and community discovery.
arXiv Detail & Related papers (2023-11-21T20:45:55Z)
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural
Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- Neural Networks as Paths through the Space of Representations [5.165741406553346]
We develop a simple idea for interpreting the layer-by-layer construction of useful representations.
We formalize this intuitive idea of "distance" by leveraging recent work on metric representational similarity.
With this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path in a high-dimensional representation space.
arXiv Detail & Related papers (2022-06-22T11:59:10Z)
- Quasi-orthogonality and intrinsic dimensions as measures of learning and
generalisation [55.80128181112308]
We show that dimensionality and quasi-orthogonality of neural networks' feature space may jointly serve as discriminants of a network's performance.
Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges [50.22269760171131]
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods.
This text is concerned with exposing pre-defined regularities through unified geometric principles.
It provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers.
arXiv Detail & Related papers (2021-04-27T21:09:51Z)
- Sparsifying networks by traversing Geodesics [6.09170287691728]
In this paper, we attempt to solve certain open questions in ML, by viewing them through the lens of geometry.
We propose a mathematical framework to evaluate geodesics in the functional space, to find high-performance paths from a dense network to its sparser counterpart.
arXiv Detail & Related papers (2020-12-12T21:39:19Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and offers adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup that allows a neural network to learn both its size and topology during gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)