Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs
- URL: http://arxiv.org/abs/2102.00485v1
- Date: Sun, 31 Jan 2021 16:30:50 GMT
- Title: Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs
- Authors: Stefan Horoi, Jessie Huang, Guy Wolf, Smita Krishnaswamy
- Abstract summary: Training artificial neural networks requires the optimization of highly non-convex loss functions.
Visualization tools have played a key role in uncovering key geometric characteristics of the loss-landscape of ANNs.
We propose the modern dimensionality reduction method PHATE, which represents the SOTA in capturing both local and global structures.
- Score: 15.689418447376587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training artificial neural networks requires the optimization of highly
non-convex loss functions. Throughout the years, the scientific community has
developed an extensive set of tools and architectures that render this
optimization task tractable and a general intuition has been developed for
choosing hyperparameters that help the models reach minima that generalize
well to unseen data. However, for the most part, the differences in trainability
between architectures and tasks, as well as the gap in network generalization
abilities, remain unexplained. Visualization tools have played a key role
in uncovering key geometric characteristics of the loss-landscape of ANNs and
how they impact trainability and generalization capabilities. However, most
visualization methods proposed so far have been relatively limited in their
capabilities, since they are linear in nature and only capture features in a
limited number of dimensions. We propose the use of the modern dimensionality
reduction method PHATE which represents the SOTA in terms of capturing both
global and local structures of high-dimensional data. We apply this method to
visualize the loss landscape during and after training. Our visualizations
reveal differences in training trajectories and generalization capabilities
when used to make comparisons between optimization methods, initializations,
architectures, and datasets. Given this success, we anticipate this method will
be used to make informed choices about these aspects of neural networks.
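In practice, the visualization described above amounts to sampling the network's weights along the optimization trajectory and embedding those high-dimensional samples with PHATE. Below is a minimal sketch of that workflow, assuming the open-source `phate` Python package and a toy PyTorch model; the checkpoint frequency and PHATE settings are illustrative choices, not taken from the paper.

```python
# Sketch: embed a training trajectory in 2D with PHATE (illustrative settings,
# not the authors' configuration).
import numpy as np
import torch
import torch.nn as nn
import phate                      # open-source PHATE implementation
import matplotlib.pyplot as plt

torch.manual_seed(0)

# Toy data and model; the paper applies the same idea to full-scale ANNs.
X = torch.randn(256, 20)
y = (X.sum(dim=1) > 0).long()
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

snapshots, losses = [], []
for step in range(300):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    if step % 5 == 0:  # sample the trajectory every few steps
        flat = torch.cat([p.detach().flatten() for p in model.parameters()])
        snapshots.append(flat.numpy())
        losses.append(loss.item())

# Each row is one point in weight space; PHATE embeds the trajectory in 2D
# while aiming to preserve both local and global structure.
W = np.stack(snapshots)
embedding = phate.PHATE(n_components=2, knn=5, random_state=0).fit_transform(W)

plt.scatter(embedding[:, 0], embedding[:, 1], c=losses, cmap="viridis", s=15)
plt.colorbar(label="training loss")
plt.title("PHATE embedding of an optimization trajectory (sketch)")
plt.show()
```

Comparing such embeddings across optimizers, initializations, architectures, or datasets is how the paper exposes differences in training trajectories and generalization.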
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z) - Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally [2.645067871482715]
In machine learning tasks, one searches for an optimal function within a certain functional space.
This constrains the evolution of the function during training to lie within what is expressible with the chosen architecture.
We show that information about desirable architectural changes due to expressivity bottlenecks can be extracted from backpropagation.
arXiv Detail & Related papers (2024-05-30T08:23:56Z) - Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification [3.0398616939692777]
Techniques like adversarial learning, contrastive learning, diffusion denoising learning, and ordinary reconstruction learning have become standard.
The study aims to elucidate the advantages of pre-training techniques and fine-tuning strategies to enhance the learning process of neural networks.
arXiv Detail & Related papers (2024-05-29T15:44:51Z) - Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications [0.0]
The field of neuroscience and the development of artificial neural networks (ANNs) have mutually influenced each other.
In the first part of this chapter, we provide an overview of the principles, models, and applications of ANNs.
The second part of this chapter focuses on quantifying geometric properties and visualizing loss functions associated with deep ANNs.
arXiv Detail & Related papers (2024-04-05T13:54:58Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach is featured by fully-explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.
Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z) - What can linear interpolation of neural network loss landscapes tell us? [11.753360538833139]
Loss landscapes are notoriously difficult to visualize in a human-comprehensible fashion.
One common way to address this problem is to plot linear slices of the landscape (a minimal sketch of this baseline appears after this list).
arXiv Detail & Related papers (2021-06-30T11:54:04Z) - On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves robustness to distributional shift.
arXiv Detail & Related papers (2020-07-16T18:39:04Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
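For contrast with the nonlinear PHATE view above, the linear-slice baseline mentioned in the "What can linear interpolation of neural network loss landscapes tell us?" entry can be sketched as follows; the toy model, data, and probe settings are assumptions for illustration, not code from that paper.

```python
# Sketch: evaluate the loss along the straight line
#   theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final,
# i.e. a one-dimensional linear slice of the loss landscape.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)
y = (X.sum(dim=1) > 0).long()
loss_fn = nn.CrossEntropyLoss()

def make_model():
    return nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

model = make_model()
theta_init = [p.detach().clone() for p in model.parameters()]

# Train briefly so we have a second endpoint for the interpolation.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()
theta_final = [p.detach().clone() for p in model.parameters()]

# Sweep alpha from 0 to 1 and report the loss of the interpolated weights.
probe = make_model()
for i in range(11):
    alpha = i / 10
    with torch.no_grad():
        for p, a, b in zip(probe.parameters(), theta_init, theta_final):
            p.copy_((1 - alpha) * a + alpha * b)
        print(f"alpha={alpha:.1f}  loss={loss_fn(probe(X), y).item():.4f}")
```

Such a slice captures only one direction in weight space, which is exactly the limitation the PHATE-based visualization above is meant to address.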
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.