Extracting Global Dynamics of Loss Landscape in Deep Learning Models
- URL: http://arxiv.org/abs/2106.07683v1
- Date: Mon, 14 Jun 2021 18:07:05 GMT
- Title: Extracting Global Dynamics of Loss Landscape in Deep Learning Models
- Authors: Mohammed Eslami, Hamed Eramian, Marcio Gameiro, William Kalies,
Konstantin Mischaikow
- Abstract summary: We present a toolkit for the Dynamical Organization Of Deep Learning Loss Landscapes, or DOODL3.
DOODL3 formulates the training of neural networks as a dynamical system, analyzes the learning process, and presents an interpretable global view of trajectories in the loss landscape.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models evolve through training to learn the manifold in which
the data exists to satisfy an objective. It is well known that this evolution
leads to different final states, which produce inconsistent predictions on the
same test data points. This calls for techniques that can empirically quantify
the differences between trajectories and highlight problematic regions. While
much focus is placed on discovering what models learn, the question of how a
model learns is less studied beyond theoretical landscape characterizations and
local geometric approximations near optimal conditions. Here, we present a
toolkit for the Dynamical Organization Of Deep Learning Loss Landscapes, or
DOODL3. DOODL3 formulates the training of neural networks as a dynamical
system, analyzes the learning process, and presents an interpretable global
view of trajectories in the loss landscape. Our approach uses the coarseness of
topology to capture the granularity of the geometry, mitigating states of
instability or prolonged training. Overall, our analysis presents an empirical
framework to extract the global dynamics of a model and to use that information
to guide the training of neural networks.
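As a rough illustration of the dynamical-system framing described above (a minimal sketch, not the authors' DOODL3 toolkit), the snippet below treats gradient descent on a toy two-dimensional loss surface as a discrete-time dynamical system, coarsens parameter space into grid cells, and summarizes several training runs as a directed transition graph whose recurrent components give a coarse global picture of where training settles. The toy surface, the grid width, the step size, and the use of networkx are illustrative assumptions.

```python
# Illustrative sketch (not the DOODL3 implementation): view training as a
# dynamical system and coarsen its trajectories into a transition graph.
import numpy as np
import networkx as nx

def loss(w):
    # Toy non-convex surface with two basins (assumed for illustration).
    x, y = w
    return (x**2 - 1.0)**2 + 0.5 * y**2

def grad(w):
    x, y = w
    return np.array([4.0 * x * (x**2 - 1.0), y])

def cell(w, width=0.25):
    # Coarse "box": map a parameter vector to a grid cell id.
    return tuple(np.floor(w / width).astype(int))

def transition_graph(w0, lr=0.05, steps=500):
    G = nx.DiGraph()
    w = np.array(w0, dtype=float)
    prev = cell(w)
    for _ in range(steps):
        w = w - lr * grad(w)      # one step of the training dynamics
        cur = cell(w)
        G.add_edge(prev, cur)     # record the coarse transition
        prev = cur
    return G

if __name__ == "__main__":
    # Aggregate several runs from different initializations into one graph.
    G = nx.DiGraph()
    for w0 in [(-2.0, 1.5), (2.0, -1.0), (0.1, 2.0)]:
        G = nx.compose(G, transition_graph(w0))
    # Recurrent components (cells the dynamics never leaves) serve as a
    # coarse global summary of the landscape's attracting regions.
    recurrent = [c for c in nx.strongly_connected_components(G)
                 if len(c) > 1 or any(G.has_edge(n, n) for n in c)]
    print("coarse attracting regions:", recurrent)
```

In this sketch the grid cells play the role of the coarse topological boxes mentioned in the abstract: the width of the discretization trades geometric detail for robustness of the extracted global picture.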
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Pre-training Contextualized World Models with In-the-wild Videos for
Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
arXiv Detail & Related papers (2023-05-29T14:29:12Z) - Explaining Deep Models through Forgettable Learning Dynamics [12.653673008542155]
We visualize the learning behaviour during training by tracking how often samples are learned and forgotten in subsequent training epochs.
Inspired by this phenomenon, we present a novel segmentation method that actively uses this information to alter the data representation within the model.
arXiv Detail & Related papers (2023-01-10T21:59:20Z) - Taxonomizing local versus global structure in neural network loss
landscapes [60.206524503782006]
We show that the best test accuracy is obtained when the loss landscape is globally well-connected.
We also show that globally poorly-connected landscapes can arise when models are small or when they are trained on lower-quality data.
arXiv Detail & Related papers (2021-07-23T13:37:14Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - A Visual Analytics Framework for Explaining and Diagnosing Transfer
Learning Processes [42.57604833160855]
We present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks.
Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks.
arXiv Detail & Related papers (2020-09-15T05:59:00Z) - Deep learning of contagion dynamics on complex networks [0.0]
We propose a complementary approach based on deep learning to build effective models of contagion dynamics on networks.
By allowing simulations on arbitrary network structures, our approach makes it possible to explore the properties of the learned dynamics beyond the training data.
Our results demonstrate how deep learning offers a new and complementary perspective to build effective models of contagion dynamics on networks.
arXiv Detail & Related papers (2020-06-09T17:18:34Z) - Gradients as Features for Deep Representation Learning [26.996104074384263]
We address the problem of deep representation learning: the efficient adaptation of a pre-trained deep network to different tasks.
Our key innovation is the design of a linear model that incorporates both gradient and activation of the pre-trained network.
We present an efficient algorithm for the training and inference of our model without computing the actual gradient.
arXiv Detail & Related papers (2020-04-12T02:57:28Z) - The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.