Landscaping Linear Mode Connectivity
- URL: http://arxiv.org/abs/2406.16300v1
- Date: Mon, 24 Jun 2024 03:53:30 GMT
- Title: Landscaping Linear Mode Connectivity
- Authors: Sidak Pal Singh, Linara Adilova, Michael Kamp, Asja Fischer, Bernhard Schölkopf, Thomas Hofmann
- Abstract summary: Linear mode connectivity (LMC) has garnered interest from both theoretical and practical fronts.
We take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC (or the lack thereof) to manifest.
- Score: 76.39694196535996
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The presence of linear paths in parameter space between two different network solutions in certain cases, i.e., linear mode connectivity (LMC), has garnered interest from both theoretical and practical fronts. There has been significant research that either practically designs algorithms for connecting networks by adjusting for their permutation symmetries, or more theoretically constructs paths through which networks can be connected. Yet, the core reasons for the occurrence of LMC, when it does in fact occur, in the highly non-convex loss landscapes of neural networks are far from clear. In this work, we take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC (or the lack thereof) to manifest. Concretely, we present a 'mountainside and ridge' perspective that helps to neatly tie together the different geometric features that can be spotted in the loss landscape along training runs. We complement this perspective with a theoretical analysis of the barrier height, for which we provide empirical support and which additionally serves as a faithful predictor of layer-wise LMC. We close with a toy example that offers further intuition on how barriers arise in the first place, all in all showcasing the larger aim of the work: to provide a working model of the landscape and its topography for the occurrence of LMC.
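To make the central quantity concrete, below is a minimal sketch (our illustration, not the authors' code) of how the loss barrier along the linear path between two solutions is commonly estimated. The `model`, `loss_fn`, and `loader` arguments are assumed placeholders for a shared architecture, a mean-reducing loss, and an evaluation data loader.

```python
import torch

@torch.no_grad()
def loss_barrier(state_a, state_b, model, loss_fn, loader, num_points=11):
    """Estimate the loss barrier along the linear path between two solutions.

    Assumes both state dicts fit the same `model` and that `loader` yields
    (inputs, targets) batches. Non-float buffers (e.g. BatchNorm counters)
    may need special handling; this sketch interpolates everything.
    """
    def avg_loss(state):
        model.load_state_dict(state)
        model.eval()
        total, count = 0.0, 0
        for x, y in loader:
            total += loss_fn(model(x), y).item() * len(y)
            count += len(y)
        return total / count

    ts = torch.linspace(0.0, 1.0, num_points).tolist()
    path = [avg_loss({k: (1 - t) * state_a[k] + t * state_b[k] for k in state_a})
            for t in ts]
    # The barrier is the largest excess of the path loss over the linear
    # interpolation of the endpoint losses; a value near zero indicates LMC.
    return max(p - ((1 - t) * path[0] + t * path[-1]) for p, t in zip(path, ts))
```

A barrier near zero along the whole path is what is meant by LMC; in practice, normalization-layer statistics are often recomputed after interpolation, a detail the sketch omits.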
Related papers
- Exploring Neural Network Landscapes: Star-Shaped and Geodesic Connectivity [4.516746821973374]
We show that for two typical global minima, there exists a path connecting them without a barrier.
For a finite number of typical minima, there exists a center on the minima manifold that connects all of them simultaneously.
Results are provably valid for linear networks and two-layer ReLU networks under a teacher-student setup.
arXiv Detail & Related papers (2024-04-09T15:35:02Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity [62.11981948274508]
The study of LLFC transcends and advances our understanding of LMC by adopting a feature-learning perspective.
We provide comprehensive empirical evidence for LLFC across a wide range of settings, demonstrating that whenever two trained networks satisfy LMC, they also satisfy LLFC in nearly all the layers.
arXiv Detail & Related papers (2023-07-17T07:16:28Z)
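To illustrate the LLFC notion above, here is a hedged sketch (ours, not the paper's code) of a per-layer check: capture a layer's features in both endpoint networks and in the interpolated network, then compare the latter against the interpolation of the former. The model arguments and `layer_name` are assumptions.

```python
import torch

@torch.no_grad()
def llfc_deviation(model_a, model_b, model_interp, x, alpha, layer_name):
    """Check Layerwise Linear Feature Connectivity at one layer (a sketch).

    LLFC at layer l: f_l((1-a)*theta_a + a*theta_b) is approximately
    (1-a)*f_l(theta_a) + a*f_l(theta_b). The paper allows a positive
    rescaling of the features, which this simplified check omits.
    """
    feats = {}

    def capture(tag):
        def hook(module, inputs, output):
            feats[tag] = output.detach()
        return hook

    handles = [
        dict(m.named_modules())[layer_name].register_forward_hook(capture(tag))
        for tag, m in (("a", model_a), ("b", model_b), ("interp", model_interp))
    ]
    model_a(x)
    model_b(x)
    model_interp(x)
    for h in handles:
        h.remove()

    # Relative deviation: small values indicate LLFC holds at this layer.
    target = (1 - alpha) * feats["a"] + alpha * feats["b"]
    return ((feats["interp"] - target).norm() / target.norm()).item()
```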
- On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective [0.0]
Saliency Maps generated by traditional neural networks are often noisy and provide limited insights.
In this paper, we demonstrate that, on the contrary, the Saliency Maps of 1-Lipschitz neural networks exhibit desirable XAI properties.
We also prove that these maps align unprecedentedly well with human explanations on ImageNet.
arXiv Detail & Related papers (2022-06-14T13:49:08Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry [3.712728573432119]
We develop a standardized parameterization in which all symmetries are removed, resulting in a toroidal topology.
We derive a meaningful notion of the flatness of minimizers and of the geodesic paths connecting them.
We also find that minimizers found by variants of gradient descent can be connected by zero-error paths with a single bend.
arXiv Detail & Related papers (2022-02-07T09:57:54Z)
- Sparsifying networks by traversing Geodesics [6.09170287691728]
In this paper, we attempt to solve certain open questions in ML by viewing them through the lens of geometry.
We propose a mathematical framework to evaluate geodesics in the functional space, in order to find high-performance paths from a dense network to its sparser counterpart.
arXiv Detail & Related papers (2020-12-12T21:39:19Z)
- Optimizing Mode Connectivity via Neuron Alignment [84.26606622400423]
Empirically, the local minima of loss functions can be connected by a learned curve in model space along which the loss remains nearly constant.
We propose a more general framework to investigate the effect of symmetry on landscape connectivity by accounting for the weight permutations of the networks being connected.
arXiv Detail & Related papers (2020-09-05T02:25:23Z)
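As a generic illustration of the permutation adjustment this framework accounts for (a sketch under our assumptions, not the paper's alignment objective), one can permute the hidden units of one two-layer network to best match another before interpolating:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_two_layer(W1_a, W1_b, W2_b):
    """Permute the hidden units of network B to best match network A.

    W1_*: (hidden, in) first-layer weights; W2_b: (out, hidden) second-layer
    weights. A generic weight-matching sketch; alignment can also be done
    on activations. Biases, if present, follow the same row permutation.
    """
    # Cost of matching hidden unit i of A with unit j of B:
    # negative inner product of their incoming weight vectors.
    cost = -W1_a @ W1_b.T
    _, perm = linear_sum_assignment(cost)  # perm[i] = B's unit matched to A's unit i
    # Apply the permutation consistently to both layers of B, so the
    # network's input-output function is unchanged.
    return W1_b[perm], W2_b[:, perm]
```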
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning to the edges learnable parameters that reflect the magnitude of the connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
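A rough sketch of the differentiable edge-weighting idea described above (our simplified reading, not the paper's code): every edge of the complete graph over node operations carries a learnable logit whose sigmoid gates how strongly earlier node outputs feed into later nodes, so connectivity is trained jointly with the weights.

```python
import torch
import torch.nn as nn

class GatedDAGBlock(nn.Module):
    """Complete-graph block with learnable edge magnitudes (illustrative).

    Node j aggregates the outputs of all earlier nodes i < j (including the
    block input), each scaled by a learnable gate sigmoid(w_ij), making the
    connectivity pattern itself differentiable.
    """
    def __init__(self, num_nodes, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_nodes)
        ])
        # One learnable logit per ordered pair (i, j) with i < j.
        self.edge_logits = nn.Parameter(torch.zeros(num_nodes, num_nodes))

    def forward(self, x):
        outputs = [x]
        for j, op in enumerate(self.ops):
            gates = torch.sigmoid(self.edge_logits[: j + 1, j])
            agg = sum(g * h for g, h in zip(gates, outputs))
            outputs.append(op(agg))
        return outputs[-1]
```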