Related papers: Geodesic Mode Connectivity

URL: http://arxiv.org/abs/2308.12666v1
Date: Thu, 24 Aug 2023 09:18:43 GMT
Title: Geodesic Mode Connectivity
Authors: Charlie Tan, Theodore Long, Sarah Zhao and Rudolf Laine
Abstract summary: Mode connectivity is a phenomenon where trained models are connected by a path of low loss. We propose an algorithm to approximate geodesics and demonstrate that they achieve mode connectivity.
Score: 4.096453902709292
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mode connectivity is a phenomenon where trained models are connected by a path of low loss. We reframe this in the context of Information Geometry, where neural networks are studied as spaces of parameterized distributions with curved geometry. We hypothesize that shortest paths in these spaces, known as geodesics, correspond to mode-connecting paths in the loss landscape. We propose an algorithm to approximate geodesics and demonstrate that they achieve mode connectivity.

Related papers

Understanding Mode Connectivity via Parameter Space Symmetry [33.150665036826624]
Neural network minima are often connected by curves along which train and test loss remain nearly constant. We propose a new approach to exploring the connectedness of minima using parameter space symmetry.
arXiv Detail & Related papers (2025-05-29T17:20:54Z)
Input Space Mode Connectivity in Deep Neural Networks [5.8470747480006695]
We extend the concept of loss landscape mode connectivity to the input space of deep neural networks. We present theoretical and empirical evidence of its presence in the input space of deep networks. We exploit mode connectivity to obtain new insights about adversarial examples and demonstrate its potential for adversarial detection.
arXiv Detail & Related papers (2024-09-09T17:03:43Z)
Landscaping Linear Mode Connectivity [76.39694196535996]
Linear mode connectivity (LMC) has garnered interest from both theoretical and practical fronts. We take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC.
arXiv Detail & Related papers (2024-06-24T03:53:30Z)
Quiver neural networks [5.076419064097734]
We develop a uniform theoretical approach towards the analysis of various neural network connectivity architectures. Inspired by quiver representation theory in mathematics, this approach gives a compact way to capture elaborate data flows.
arXiv Detail & Related papers (2022-07-26T09:42:45Z)
Linear Connectivity Reveals Generalization Strategies [54.947772002394736]
Some pairs of finetuned models have large barriers of increasing loss on the linear paths between them. We find distinct clusters of models which are linearly connected on the test loss surface, but are disconnected from models outside the cluster. Our work demonstrates how the geometry of the loss surface can guide models towards different functions.
arXiv Detail & Related papers (2022-05-24T23:43:02Z)
Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry [3.712728573432119]
We develop a standardized parameterization in which all symmetries are removed, resulting in a toroidal topology. We derive a meaningful notion of the flatness of minimizers and of the geodesic paths connecting them. We also find that minimizers found by variants of gradient descent can be connected by zero-error paths with a single bend.
arXiv Detail & Related papers (2022-02-07T09:57:54Z)
Geodesic Models with Convexity Shape Prior [8.932981695464761]
In this paper, we take into account a more complicated problem: finding curvature-penalized geodesic paths with a convexity shape prior. We establish new geodesic models relying on the strategy of orientation-lifting. The convexity shape prior serves as a constraint for the construction of local geodesic metrics encoding a curvature constraint.
arXiv Detail & Related papers (2021-11-01T09:41:54Z)
Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty [91.0564497403256]
We present a novel framework that involves probabilistic fusion between the two families of predictions during network training. Our network features a self-attention graph neural network, which drives the learning by enforcing strong interactions between different correspondences. We propose motion parameterizations suitable for learning and show that our method achieves state-of-the-art performance on the challenging DeMoN and ScanNet datasets.
arXiv Detail & Related papers (2021-04-16T17:59:06Z)
GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods. In this work, we demonstrate the benefit of combining the two in a latent variational model. Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z)
Optimizing Mode Connectivity via Neuron Alignment [84.26606622400423]
Empirically, the local minima of loss functions can be connected by a learned curve in model space along which the loss remains nearly constant. We propose a more general framework to investigate the effect of symmetry on landscape connectivity by accounting for the weight permutations of the networks being connected.
arXiv Detail & Related papers (2020-09-05T02:25:23Z)
An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d). This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
Uniform Interpolation Constrained Geodesic Learning on Data Manifold [28.509561636926414]
Along the learned geodesic, our method can generate high-quality interpolations between two given data samples. We provide a theoretical analysis of our model and use image translation as an example to demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2020-02-12T07:47:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.