Depth Selection for Deep ReLU Nets in Feature Extraction and
Generalization
- URL: http://arxiv.org/abs/2004.00245v1
- Date: Wed, 1 Apr 2020 06:03:01 GMT
- Title: Depth Selection for Deep ReLU Nets in Feature Extraction and
Generalization
- Authors: Zhi Han, Siquan Yu, Shao-Bo Lin, Ding-Xuan Zhou
- Abstract summary: We show that implementing the classical empirical risk minimization on deep nets can achieve the optimal generalization performance for numerous learning tasks.
Our results are verified by a series of numerical experiments including toy simulations and a real application of earthquake seismic intensity prediction.
- Score: 22.696129751033983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning is recognized to be capable of discovering deep features for
representation learning and pattern recognition without requiring elaborate
feature engineering techniques that draw on human ingenuity and prior
knowledge. It has therefore triggered enormous research activity in machine
learning and pattern recognition. One of the most important challenges of deep
learning is to figure out the relation between a feature and the depth of deep
neural networks (deep nets for short), so as to reflect the necessity of depth. Our
purpose is to quantify this feature-depth correspondence in feature extraction
and generalization. We present the adaptivity of features to depths, and
vice versa, by showing a depth-parameter trade-off in extracting both single
features and composite features. Based on these results, we prove that
implementing classical empirical risk minimization on deep nets can achieve
optimal generalization performance for numerous learning tasks. Our
theoretical results are verified by a series of numerical experiments, including
toy simulations and a real application to earthquake seismic intensity
prediction.
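To make the setting concrete, the following is a minimal sketch (not the paper's construction) of empirical risk minimization with a deep ReLU network on a toy 1-D regression task. The target function, layer widths, learning rate, and step count are all arbitrary choices for illustration; the net is trained by full-batch gradient descent on the mean squared error using plain NumPy and manual backpropagation.

```python
# Hedged sketch: ERM with a deep ReLU net on toy data, in plain NumPy.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a nonlinear target f(x) = |sin(3x)| on [-1, 1].
n = 256
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.abs(np.sin(3.0 * X))

# Deep ReLU net: widths are an arbitrary illustrative choice.
widths = [1, 32, 32, 32, 1]
Ws = [rng.normal(0.0, np.sqrt(2.0 / widths[i]),
                 size=(widths[i], widths[i + 1]))
      for i in range(len(widths) - 1)]
bs = [np.zeros(widths[i + 1]) for i in range(len(widths) - 1)]

def forward(x):
    """Return the activations of every layer (hidden layers use ReLU)."""
    acts = [x]
    for W, b in zip(Ws[:-1], bs[:-1]):
        acts.append(np.maximum(0.0, acts[-1] @ W + b))
    acts.append(acts[-1] @ Ws[-1] + bs[-1])  # linear output layer
    return acts

lr = 0.05
for step in range(2000):
    acts = forward(X)
    # Empirical risk: mean squared error over the sample.
    grad = 2.0 * (acts[-1] - y) / n          # d(risk)/d(prediction)
    for i in reversed(range(len(Ws))):
        grad_W = acts[i].T @ grad            # gradient w.r.t. weights
        grad_b = grad.sum(axis=0)            # gradient w.r.t. biases
        if i > 0:                            # backprop through ReLU
            grad = (grad @ Ws[i].T) * (acts[i] > 0)
        Ws[i] -= lr * grad_W
        bs[i] -= lr * grad_b

risk = float(np.mean((forward(X)[-1] - y) ** 2))
print(f"final empirical risk: {risk:.4f}")
```

Minimizing the empirical risk over the sample is exactly the "classical empirical risk minimization on deep nets" referred to in the abstract; the paper's theoretical contribution is to show how the depth of such a net should be selected so that this minimizer generalizes optimally, which this toy script does not attempt to capture.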
Related papers
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural
Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z) - Generalized Uncertainty of Deep Neural Networks: Taxonomy and
Applications [1.9671123873378717]
We show that the uncertainty of deep neural networks is not only important in a sense of interpretability and transparency, but also crucial in further advancing their performance.
We generalize the definition of the uncertainty of deep neural networks to any number or vector that is associated with an input or an input-label pair, and catalog existing methods for "mining" such uncertainty from a deep model.
arXiv Detail & Related papers (2023-02-02T22:02:33Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Deep Active Learning by Leveraging Training Dynamics [57.95155565319465]
We propose a theory-driven deep active learning method (dynamicAL) which selects samples to maximize training dynamics.
We show that dynamicAL not only outperforms other baselines consistently but also scales well on large deep learning models.
arXiv Detail & Related papers (2021-10-16T16:51:05Z) - Expressive Power and Loss Surfaces of Deep Learning Models [0.0]
This paper serves as an expository tutorial on the working of deep learning models.
The second goal is to complement the current results on the expressive power of deep learning models with novel insights and results.
arXiv Detail & Related papers (2021-08-08T06:28:09Z) - Self-Guided Instance-Aware Network for Depth Completion and Enhancement [6.319531161477912]
Existing methods directly interpolate the missing depth measurements based on pixel-wise image content and the corresponding neighboring depth values.
We propose a novel self-guided instance-aware network (SG-IANet) that utilizes a self-guided mechanism to extract the instance-level features needed for depth restoration.
arXiv Detail & Related papers (2021-05-25T19:41:38Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Variational Structured Attention Networks for Deep Visual Representation
Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z) - Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.