Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs
- URL: http://arxiv.org/abs/2102.00485v1
- Date: Sun, 31 Jan 2021 16:30:50 GMT
- Title: Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs
- Authors: Stefan Horoi, Jessie Huang, Guy Wolf, Smita Krishnaswamy
- Abstract summary: Training artificial neural networks requires the optimization of highly non-convex loss functions.
Visualization tools have played a key role in uncovering key geometric characteristics of the loss-landscape of ANNs.
We propose the modern dimensionality reduction method PHATE, which represents the SOTA in capturing both local and global structures.
- Score: 15.689418447376587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training artificial neural networks requires the optimization of highly
non-convex loss functions. Throughout the years, the scientific community has
developed an extensive set of tools and architectures that render this
optimization task tractable and a general intuition has been developed for
choosing hyperparameters that help the models reach minima that generalize
well to unseen data. However, for the most part, the differences in trainability
between architectures and tasks, as well as the gap in network generalization
abilities, remain unexplained. Visualization tools have played a key role
in uncovering key geometric characteristics of the loss-landscape of ANNs and
how they impact trainability and generalization capabilities. However, most
visualization methods proposed so far have been relatively limited in their
capabilities, since they are linear in nature and only capture features in a
limited number of dimensions. We propose the use of the modern dimensionality
reduction method PHATE which represents the SOTA in terms of capturing both
global and local structures of high-dimensional data. We apply this method to
visualize the loss landscape during and after training. Our visualizations
reveal differences in training trajectories and generalization capabilities
when used to make comparisons between optimization methods, initializations,
architectures, and datasets. Given this success, we anticipate this method will
be used to make informed choices about these aspects of neural networks.
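In practice, the visualization described above amounts to sampling the network's weights along the optimization trajectory and embedding those high-dimensional samples with PHATE. Below is a minimal sketch of that workflow, assuming the open-source `phate` Python package and a toy PyTorch model; the checkpoint frequency and PHATE settings are illustrative choices, not taken from the paper.

```python
# Sketch: embed a training trajectory in 2D with PHATE (illustrative settings,
# not the authors' configuration).
import numpy as np
import torch
import torch.nn as nn
import phate                      # open-source PHATE implementation
import matplotlib.pyplot as plt

torch.manual_seed(0)

# Toy data and model; the paper applies the same idea to full-scale ANNs.
X = torch.randn(256, 20)
y = (X.sum(dim=1) > 0).long()
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

snapshots, losses = [], []
for step in range(300):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    if step % 5 == 0:  # sample the trajectory every few steps
        flat = torch.cat([p.detach().flatten() for p in model.parameters()])
        snapshots.append(flat.numpy())
        losses.append(loss.item())

# Each row is one point in weight space; PHATE embeds the trajectory in 2D
# while aiming to preserve both local and global structure.
W = np.stack(snapshots)
embedding = phate.PHATE(n_components=2, knn=5, random_state=0).fit_transform(W)

plt.scatter(embedding[:, 0], embedding[:, 1], c=losses, cmap="viridis", s=15)
plt.colorbar(label="training loss")
plt.title("PHATE embedding of an optimization trajectory (sketch)")
plt.show()
```

Comparing such embeddings across optimizers, initializations, architectures, or datasets is how the paper exposes differences in training trajectories and generalization.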
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z) - Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally [2.645067871482715]
In machine learning tasks, one searches for an optimal function within a certain functional space.
This constrains the evolution of the function during training to lie within what is expressible with the chosen architecture.
We show that information about desirable architectural changes due to expressivity bottlenecks can be extracted from backpropagation.
arXiv Detail & Related papers (2024-05-30T08:23:56Z) - Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification [3.0398616939692777]
Techniques like adversarial learning, contrastive learning, diffusion denoising learning, and ordinary reconstruction learning have become standard.
The study aims to elucidate the advantages of pre-training techniques and fine-tuning strategies to enhance the learning process of neural networks.
arXiv Detail & Related papers (2024-05-29T15:44:51Z) - Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications [0.0]
The field of neuroscience and the development of artificial neural networks (ANNs) have mutually influenced each other.
In the first part of this chapter, we provide an overview of the principles, models, and applications of ANNs.
The second part of this chapter focuses on quantifying geometric properties and visualizing loss functions associated with deep ANNs.
arXiv Detail & Related papers (2024-04-05T13:54:58Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach is featured by fully-explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.
Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z) - What can linear interpolation of neural network loss landscapes tell us? [11.753360538833139]
Loss landscapes are notoriously difficult to visualize in a human-comprehensible fashion.
One common way to address this problem is to plot linear slices of the landscape (a minimal sketch of this baseline appears after this list).
arXiv Detail & Related papers (2021-06-30T11:54:04Z) - On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves robustness to distributional shift.
arXiv Detail & Related papers (2020-07-16T18:39:04Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
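For contrast with the nonlinear PHATE view above, the linear-slice baseline mentioned in the "What can linear interpolation of neural network loss landscapes tell us?" entry can be sketched as follows; the toy model, data, and probe settings are assumptions for illustration, not code from that paper.

```python
# Sketch: evaluate the loss along the straight line
#   theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final,
# i.e. a one-dimensional linear slice of the loss landscape.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)
y = (X.sum(dim=1) > 0).long()
loss_fn = nn.CrossEntropyLoss()

def make_model():
    return nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

model = make_model()
theta_init = [p.detach().clone() for p in model.parameters()]

# Train briefly so we have a second endpoint for the interpolation.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()
theta_final = [p.detach().clone() for p in model.parameters()]

# Sweep alpha from 0 to 1 and report the loss of the interpolated weights.
probe = make_model()
for i in range(11):
    alpha = i / 10
    with torch.no_grad():
        for p, a, b in zip(probe.parameters(), theta_init, theta_final):
            p.copy_((1 - alpha) * a + alpha * b)
        print(f"alpha={alpha:.1f}  loss={loss_fn(probe(X), y).item():.4f}")
```

Such a slice captures only one direction in weight space, which is exactly the limitation the PHATE-based visualization above is meant to address.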
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.