Embedding Principle of Loss Landscape of Deep Neural Networks
- URL: http://arxiv.org/abs/2105.14573v1
- Date: Sun, 30 May 2021 15:32:32 GMT
- Title: Embedding Principle of Loss Landscape of Deep Neural Networks
- Authors: Yaoyu Zhang, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
- Abstract summary: We show that the loss landscape of a deep neural network (DNN) "contains" all the critical points of all the narrower DNNs.
We find that a wide DNN is often attracted by highly-degenerate critical points that are embedded from narrow DNNs.
- Score: 1.1958610985612828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the structure of loss landscape of deep neural networks
(DNNs) is obviously important. In this work, we prove an embedding principle
that the loss landscape of a DNN "contains" all the critical points of all the
narrower DNNs. More precisely, we propose a critical embedding such that any
critical point, e.g., local or global minima, of a narrower DNN can be embedded
to a critical point/hyperplane of the target DNN with higher degeneracy and
preserving the DNN output function. The embedding structure of critical points
is independent of loss function and training data, showing a stark difference
from other nonconvex problems such as protein-folding. Empirically, we find
that a wide DNN is often attracted by highly-degenerate critical points that
are embedded from narrow DNNs. The embedding principle provides an explanation
for the general easy optimization of wide DNNs and unravels a potential
implicit low-complexity regularization during the training. Overall, our work
provides a skeleton for the study of loss landscape of DNNs and its
implication, by which a more exact and comprehensive understanding can be
anticipated in the near future.
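As a concrete illustration of the critical embedding described above, the simplest case is a neuron-splitting construction: duplicate one hidden neuron of a narrow network and split its outgoing weight between the two copies. The sketch below is a minimal numpy illustration under assumptions made for this example (a one-hidden-layer tanh network and this particular splitting rule; it is not claimed to be the paper's exact construction). It verifies only that the output function is preserved for every split coefficient alpha, so one parameter point of the narrow network lifts to a one-parameter family of the wider network, consistent with the higher degeneracy stated in the abstract.

```python
import numpy as np

# Hypothetical neuron-splitting embedding for f(x) = a^T tanh(W x):
# duplicate hidden neuron k and split its outgoing weight a_k into
# alpha * a_k and (1 - alpha) * a_k. The output function is unchanged
# for every alpha, giving a one-parameter family in the wider network.

rng = np.random.default_rng(0)

def forward(W, a, x):
    """Output of the one-hidden-layer network a^T tanh(W x)."""
    return np.tanh(W @ x) @ a

def split_embed(W, a, k, alpha):
    """Embed (W, a) into a network with one extra hidden neuron by splitting neuron k."""
    W_wide = np.vstack([W, W[k:k + 1]])                 # copy the incoming weights of neuron k
    a_wide = np.concatenate([a, [(1 - alpha) * a[k]]])  # new neuron gets (1 - alpha) * a_k
    a_wide[k] = alpha * a[k]                            # original neuron keeps alpha * a_k
    return W_wide, a_wide

# Narrow network: 3 hidden neurons, 5-dimensional input.
W = rng.standard_normal((3, 5))
a = rng.standard_normal(3)
x = rng.standard_normal(5)

for alpha in (0.0, 0.3, 1.5):  # any alpha preserves the output function
    W_wide, a_wide = split_embed(W, a, k=1, alpha=alpha)
    assert np.allclose(forward(W, a, x), forward(W_wide, a_wide, x))
print("output function preserved for all tested alpha")
```

If such an embedding also maps critical points to critical points, as the abstract states for the paper's critical embedding, then a single critical point of the narrow network corresponds to a whole critical line (more generally a hyperplane) of the wider network; the sketch checks only the output-preservation part.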
Related papers
- Harnessing Neuron Stability to Improve DNN Verification [42.65507402735545]
We present VeriStable, a novel extension of the recently proposed DPLL-based constraint DNN verification approach.
We evaluate the effectiveness of VeriStable across a range of challenging benchmarks, including fully-connected feedforward networks (FNNs), convolutional neural networks (CNNs), and residual networks (ResNets).
Preliminary results show that VeriStable is competitive and outperforms state-of-the-art verification tools, including $\alpha$-$\beta$-CROWN and MN-BaB, the first and second performers of the VNN-COMP, respectively.
arXiv Detail & Related papers (2024-01-19T23:48:04Z) - Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK
Approach [8.994430921243767]
Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs)
We non-trivially extend the neural tangent kernel (NTK) theory to AT and prove that an adversarially trained wide DNN can be well approximated by a linearized DNN.
For squared loss, closed-form AT dynamics for the linearized DNN can be derived, which reveals a new AT degeneration phenomenon.
arXiv Detail & Related papers (2023-10-09T19:40:25Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Embedding Principle in Depth for the Loss Landscape Analysis of Deep
Neural Networks [3.5208869573271446]
We prove an embedding principle in depth: the loss landscape of an NN "contains" all critical points of the loss landscapes of shallower NNs.
We empirically demonstrate that, through suppressing layer linearization, batch normalization helps avoid the lifted critical manifold.
arXiv Detail & Related papers (2022-05-26T11:42:44Z) - Embedding Principle: a hierarchical structure of loss landscape of deep
neural networks [3.0871079010101963]
We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs).
We provide a gross estimate of the dimension of critical submanifolds embedded from critical points of narrower NNs.
arXiv Detail & Related papers (2021-11-30T16:15:50Z) - CAP: Co-Adversarial Perturbation on Weights and Features for Improving
Generalization of Graph Neural Networks [59.692017490560275]
Adversarial training has been widely demonstrated to improve a model's robustness against adversarial attacks.
It remains unclear how adversarial training could improve the generalization abilities of GNNs in graph analytics problems.
We construct the co-adversarial perturbation (CAP) optimization problem in terms of weights and features, and design the alternating adversarial perturbation algorithm to flatten the weight and feature loss landscapes alternately.
arXiv Detail & Related papers (2021-10-28T02:28:13Z) - Optimization of Graph Neural Networks: Implicit Acceleration by Skip
Connections and More Depth [57.10183643449905]
Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization.
We study the training dynamics of GNNs, focusing on how skip connections and greater depth implicitly accelerate optimization.
Our results provide the first theoretical support for the success of GNNs.
arXiv Detail & Related papers (2021-05-10T17:59:01Z) - Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z) - Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs [95.63153473559865]
Graph Neural Networks (GNNs) are emerging machine learning models on graphs.
Most existing GNN models in practice are shallow and essentially feature-centric.
We show empirically and analytically that the existing shallow GNNs cannot preserve graph structures well.
We propose Eigen-GNN, a plug-in module to boost GNNs' ability to preserve graph structures.
arXiv Detail & Related papers (2020-06-08T02:47:38Z) - CodNN -- Robust Neural Networks From Coded Classification [27.38642191854458]
Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution.
DNNs are highly sensitive to noise, whether adversarial or random.
This poses a fundamental challenge for hardware implementations of DNNs, and for their deployment in critical applications such as autonomous driving.
In our approach, either the data or the internal layers of the DNN are coded with error-correcting codes, and successful computation under noise is guaranteed.
arXiv Detail & Related papers (2020-04-22T17:07:15Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional
Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.