The Global Landscape of Neural Networks: An Overview
- URL: http://arxiv.org/abs/2007.01429v1
- Date: Thu, 2 Jul 2020 22:50:20 GMT
- Title: The Global Landscape of Neural Networks: An Overview
- Authors: Ruoyu Sun, Dawei Li, Shiyu Liang, Tian Ding, R Srikant
- Abstract summary: The recent success of neural networks suggests that their loss landscape is not too bad, but what do we know about it? We discuss a few rigorous results on the geometric properties of wide networks, such as "no bad basin", and some modifications that eliminate sub-optimal local minima and/or decreasing paths to infinity.
- Score: 23.79848233534269
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the major concerns for neural network training is that the non-convexity of the associated loss functions may cause a bad landscape. The
recent success of neural networks suggests that their loss landscape is not too
bad, but what specific results do we know about the landscape? In this article,
we review recent findings and results on the global landscape of neural
networks. First, we point out that wide neural nets may have sub-optimal local
minima under certain assumptions. Second, we discuss a few rigorous results on
the geometric properties of wide networks such as "no bad basin", and some
modifications that eliminate sub-optimal local minima and/or decreasing paths
to infinity. Third, we discuss visualization and empirical explorations of the
landscape for practical neural nets. Finally, we briefly discuss some
convergence results and their relation to landscape results.
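As a concrete illustration of the visualization techniques such surveys cover, here is a minimal sketch of random-direction slicing: perturb a model's weights along two fixed random directions and evaluate the loss on the resulting 2-D grid. The tiny MLP, random data, and slice range are arbitrary choices for the sketch; practical studies typically also normalize the directions per filter.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(128, 10), torch.randn(128, 1)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

# Flatten the model's weights and draw two random directions of matching size.
theta = torch.cat([p.detach().flatten() for p in model.parameters()])
d1, d2 = torch.randn_like(theta), torch.randn_like(theta)

def loss_at(offset):
    """Load theta + offset back into the model and evaluate the loss."""
    flat, i = theta + offset, 0
    with torch.no_grad():
        for p in model.parameters():
            p.copy_(flat[i:i + p.numel()].view_as(p))
            i += p.numel()
        return loss_fn(model(X), y).item()

# Loss on a 2-D slice spanned by d1 and d2, centered at theta.
alphas = torch.linspace(-1, 1, 21)
grid = [[loss_at(a * d1 + b * d2) for b in alphas] for a in alphas]
print(f"loss over slice: min {min(map(min, grid)):.3f}, "
      f"max {max(map(max, grid)):.3f}")
```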
Related papers
- Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss [2.07180164747172]
We compare deeper neural networks (DeNNs) with a flexible number of layers and wider neural networks (WeNNs) with limited hidden layers.
We find that a higher number of parameters tends to favor WeNNs, while an increased number of sample points and greater regularity in the loss function lean towards the adoption of DeNNs.
arXiv Detail & Related papers (2024-01-31T20:10:10Z)
- When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work [59.29606307518154]
We show that as long as the width $m \geq 2n/d$ (where $d$ is the input dimension), the network's expressivity is strong, i.e., there exists at least one global minimizer with zero training loss.
We also consider a constrained optimization formulation where the feasible region is the nice local region, and prove that every KKT point is a nearly global minimizer.
arXiv Detail & Related papers (2022-10-21T14:41:26Z)
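A toy empirical companion to the width bound in the entry above: with n random points in d dimensions and exactly m = 2n/d hidden ReLU units, check how close plain training gets to zero loss. The theorem guarantees that a zero-loss global minimizer exists at this width, not that a particular optimizer finds it, so the choice of Adam, seed, and step count are illustrative assumptions only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d = 32, 8
m = 2 * n // d                       # width at the theorem's threshold
X, y = torch.randn(n, d), torch.randn(n, 1)

net = nn.Sequential(nn.Linear(d, m), nn.ReLU(), nn.Linear(m, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(5000):
    opt.zero_grad()
    loss = ((net(X) - y) ** 2).mean()
    loss.backward()
    opt.step()
print(f"training loss with m = {m} neurons: {loss.item():.2e}")
```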
- SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing, as speckle degrades SAR images.
Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z)
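A sketch of one plausible reading of the "overcomplete" design in the entry above: rather than downsampling, the encoder upsamples feature maps, so stacked 3x3 convolutions cover only a small effective receptive field and the network is biased toward low-level speckle statistics. The layer widths, the residual "subtract predicted speckle" head, and all sizes below are our assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class OvercompleteEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Upsampling (instead of pooling) keeps the receptive field small.
        self.block1 = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))
        self.block2 = nn.Sequential(
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))
        self.head = nn.Conv2d(32, 1, 3, padding=1)

    def forward(self, x):
        h = self.block2(self.block1(x))
        # Project back to the input resolution and subtract the predicted
        # speckle component (a residual formulation we assume for the sketch).
        h = nn.functional.interpolate(h, size=x.shape[-2:], mode="bilinear",
                                      align_corners=False)
        return x - self.head(h)

noisy = torch.rand(1, 1, 64, 64)           # stand-in for a speckled SAR patch
print(OvercompleteEncoder()(noisy).shape)  # torch.Size([1, 1, 64, 64])
```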
- FuNNscope: Visual microscope for interactively exploring the loss landscape of fully connected neural networks [77.34726150561087]
We show how to explore high-dimensional landscape characteristics of neural networks.
We generalize observations on small neural networks to more complex systems.
An interactive dashboard opens up a number of possible application scenarios.
arXiv Detail & Related papers (2022-04-09T16:41:53Z)
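The computation underlying such a "microscope" can be sketched as axis-aligned slices: sweep a single weight of a small fully connected network while freezing the rest, and record the loss along the sweep. The network, data, and sweep range below are stand-ins; the actual tool wires many such probes into an interactive dashboard.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(64, 4), torch.randn(64, 1)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()

w = net[0].weight                 # probe one weight tensor...
i, j = 0, 0                       # ...at one coordinate
center = w.data[i, j].item()
with torch.no_grad():
    for t in torch.linspace(-2, 2, 9).tolist():
        w.data[i, j] = center + t
        print(f"w[{i},{j}] = {center + t:+.2f} -> "
              f"loss {loss_fn(net(X), y).item():.4f}")
    w.data[i, j] = center         # restore the original weight
```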
- Exact Solutions of a Deep Linear Network [2.2344764434954256]
This work finds the analytical expression of the global minima of a deep linear network with weight decay and stochastic neurons.
We show that weight decay strongly interacts with the model architecture and can create bad minima at zero in a network with more than $1$ hidden layer.
arXiv Detail & Related papers (2022-02-10T00:13:34Z)
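The "bad minimum at zero" effect from the entry above can be checked numerically in a scalar caricature of the deep linear model: with weights w_1 * ... * w_L, squared loss, and weight decay lam, the origin is a saddle for L = 2 factors, but all mixed second derivatives vanish there for L = 3, making zero a local minimum with strictly positive loss. The scalar reduction and the values x = y = 1, lam = 0.1 are our simplifying assumptions.

```python
import torch

def loss(w, x=1.0, y=1.0, lam=0.1):
    # deep linear "network" with scalar weights: prediction = w_1 * ... * w_L * x
    pred = x
    for wi in w:
        pred = pred * wi
    return (pred - y) ** 2 + lam * (w ** 2).sum()

for depth in (2, 3):  # number of weight factors (layers)
    H = torch.autograd.functional.hessian(loss, torch.zeros(depth))
    print(f"{depth} factors: eigenvalues at 0 = "
          f"{torch.linalg.eigvalsh(H).tolist()}")
# 2 factors: eigenvalues 2*lam -/+ 2*x*y -> one negative, so 0 is a saddle.
# 3 factors: all eigenvalues equal 2*lam > 0 -> 0 is a local minimum whose
#            loss y^2 = 1 exceeds the global minimum's loss: a bad minimum.
```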
- Taxonomizing local versus global structure in neural network loss landscapes [60.206524503782006]
We show that the best test accuracy is obtained when the loss landscape is globally well-connected.
We also show that globally poorly-connected landscapes can arise when models are small or when they are trained on lower-quality data.
arXiv Detail & Related papers (2021-07-23T13:37:14Z)
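A minimal version of the connectivity probe behind such taxonomies: train two copies of a small network from different seeds and measure the loss along the straight line between their weight vectors; a pronounced bump is a barrier between the two minima. The paper's connectivity metrics are richer, so treat this as the basic measurement only, with all sizes and hyperparameters below chosen arbitrarily.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(100, 5), torch.randn(100, 1)

def train(seed):
    torch.manual_seed(seed)
    net = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        loss = ((net(X) - y) ** 2).mean()
        loss.backward()
        opt.step()
    return net

net_a, net_b = train(1), train(2)
probe = copy.deepcopy(net_a)
with torch.no_grad():
    for t in torch.linspace(0, 1, 11).tolist():
        # Linearly interpolate every parameter tensor between the two nets.
        for p, pa, pb in zip(probe.parameters(), net_a.parameters(),
                             net_b.parameters()):
            p.copy_((1 - t) * pa + t * pb)
        loss = ((probe(X) - y) ** 2).mean().item()
        print(f"t = {t:.1f}: loss {loss:.4f}")
```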
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the loss function's gradient flow.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
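A low-dimensional caricature of the gradient-flow viewpoint in the entry above: run gradient descent from many random initializations on a two-minimum toy surface and group the trajectories by the critical point they reach, crudely recovering the basin structure that a Morse-complex analysis formalizes. The surface and step sizes are invented for the sketch.

```python
import numpy as np

def grad(p):
    # gradient of f(x, y) = (x^2 - 1)^2 + y^2, which has minima at (+/-1, 0)
    x, y = p
    return np.array([4 * x * (x ** 2 - 1), 2 * y])

rng = np.random.default_rng(0)
endpoints = []
for _ in range(200):
    p = rng.uniform(-2, 2, size=2)
    for _ in range(2000):          # descend to (near) a critical point
        p = p - 0.01 * grad(p)
    endpoints.append(np.round(p, 3))

basins = {tuple(e) for e in endpoints}
print(f"distinct minima reached: {sorted(basins)}")  # ~ (-1, 0) and (1, 0)
```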
- Piecewise linear activations substantially shape the loss surfaces of neural networks [95.73230376153872]
This paper shows how piecewise linear activation functions substantially shape the loss surfaces of neural networks.
We first prove that the loss surfaces of many neural networks have infinitely many spurious local minima, defined as local minima with higher empirical risk than the global minima.
For one-hidden-layer networks, we prove that all local minima in a cell constitute an equivalence class; they are concentrated in a valley; and they are all global minima in the cell.
arXiv Detail & Related papers (2020-03-27T04:59:34Z)
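The "local minima live in flat cells" picture from the entry above is easy to see in the smallest possible case, a single ReLU unit fit to two points (our toy construction, not the paper's): inside the activation cell where the unit is dead on both inputs, the loss is constant, so the cell interior is a valley of equally-bad non-strict local minima, strictly above the zero-loss global minimum.

```python
import numpy as np

X = np.array([-1.0, 1.0])
Y = np.array([1.0, 0.0])

def loss(w, b):
    pred = np.maximum(w * X + b, 0.0)   # single ReLU unit f(x) = relu(w*x + b)
    return np.mean((pred - Y) ** 2)

print(loss(-0.5, 0.5))   # 0.0  (global minimum: fits both points exactly)
print(loss(0.3, -1.0))   # 0.5  (dead cell: w*x + b < 0 for both inputs)
print(loss(0.4, -2.0))   # 0.5  (same cell, same loss -> a flat valley)
```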
- Avoiding Spurious Local Minima in Deep Quadratic Networks [0.0]
We characterize the landscape of the mean squared error for networks with quadratic activation functions.
We prove that deep over-parameterized neural networks with quadratic activations benefit from similar landscape properties.
arXiv Detail & Related papers (2019-12-31T22:31:11Z)
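Part of why quadratic activations are tractable is a standard algebraic reduction (not specific to the paper above): a one-hidden-layer network with activation z -> z^2 computes a quadratic form, f(x) = sum_j a_j (w_j . x)^2 = x^T (W^T diag(a) W) x, so its landscape matches that of a low-rank matrix-factorization problem. A quick numerical check of the identity:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 3
W = rng.standard_normal((m, d))   # hidden-layer weights, one row per neuron
a = rng.standard_normal(m)        # output-layer weights
x = rng.standard_normal(d)

net_out = np.sum(a * (W @ x) ** 2)      # network with squared activation
M = W.T @ np.diag(a) @ W                # the equivalent symmetric matrix
print(np.isclose(net_out, x @ M @ x))   # True
```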
- Barcodes as Summary of Loss Function Topology [65.3479573549873]
We show that increasing the neural network's depth and width lowers the barcodes of local minima.
This has some natural implications for the neural network's learning and for its generalization properties.
arXiv Detail & Related papers (2019-11-29T19:22:36Z)
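A sketch of what a barcode of minima means on a toy objective, assuming the gudhi TDA library is available: in the 0-dimensional sublevel-set barcode, each bar corresponds to a local minimum, and the bar's length measures how far the objective must rise before that minimum's basin merges into a deeper one.

```python
import numpy as np
import gudhi

xs = np.linspace(-2, 2, 401)
f = xs ** 4 - xs ** 2 + 0.1 * xs   # tilted double well: two minima, one saddle

# Sublevel-set persistence of the sampled function via a cubical complex.
cc = gudhi.CubicalComplex(top_dimensional_cells=f)
for dim, (birth, death) in cc.persistence():
    if dim == 0:
        print(f"minimum born at f = {birth:.3f}, merges at f = {death:.3f}")
```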