Depth Selection for Deep ReLU Nets in Feature Extraction and
Generalization
- URL: http://arxiv.org/abs/2004.00245v1
- Date: Wed, 1 Apr 2020 06:03:01 GMT
- Title: Depth Selection for Deep ReLU Nets in Feature Extraction and
Generalization
- Authors: Zhi Han, Siquan Yu, Shao-Bo Lin, Ding-Xuan Zhou
- Abstract summary: We show that implementing the classical empirical risk minimization on deep nets can achieve the optimal generalization performance for numerous learning tasks.
Our results are verified by a series of numerical experiments including toy simulations and a real application of earthquake seismic intensity prediction.
- Score: 22.696129751033983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning is recognized to be capable of discovering deep features for
representation learning and pattern recognition without requiring elaborate
feature engineering that draws on human ingenuity and prior knowledge. It has
therefore triggered enormous research activity in machine learning and pattern
recognition. One of the most important challenges in deep learning is to figure
out the relation between features and the depth of deep neural networks (deep
nets for short), so as to reflect the necessity of depth. Our
purpose is to quantify this feature-depth correspondence in feature extraction
and generalization. We present the adaptivity of features to depths and
vice versa by showing a depth-parameter trade-off in extracting both single
features and composite features. Based on these results, we prove that
implementing the classical empirical risk minimization on deep nets can achieve
the optimal generalization performance for numerous learning tasks. Our
theoretical results are verified by a series of numerical experiments including
toy simulations and a real application of earthquake seismic intensity
prediction.
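The paper's central object, empirical risk minimization (ERM) over deep ReLU nets of a chosen depth, can be sketched in a few lines. The sketch below is illustrative only: the network widths, learning rate, step count, and the target function f(x) = |x| (a simple feature a ReLU net can represent exactly) are assumptions for the example, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_net(widths):
    """Initialize a deep ReLU net; widths = [d_in, h1, ..., hL, d_out]."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(widths[:-1], widths[1:])]

def forward(params, x):
    """Forward pass; returns all layer activations (needed for backprop)."""
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        # ReLU on hidden layers, identity on the output layer
        acts.append(np.maximum(z, 0.0) if i < len(params) - 1 else z)
    return acts

def erm_step(params, x, y, lr=0.1):
    """One full-batch gradient step on the empirical risk (mean squared error)."""
    acts = forward(params, x)
    n = x.shape[0]
    delta = 2.0 * (acts[-1] - y) / n              # d(risk)/d(output)
    for i in reversed(range(len(params))):
        W, b = params[i]
        gW = acts[i].T @ delta
        gb = delta.sum(axis=0)
        if i > 0:
            delta = (delta @ W.T) * (acts[i] > 0)  # ReLU derivative mask
        params[i] = (W - lr * gW, b - lr * gb)
    return float(np.mean((acts[-1] - y) ** 2))

# Fit f(x) = |x| by ERM with a depth-3 ReLU net (two hidden layers).
x = rng.uniform(-1.0, 1.0, size=(256, 1))
y = np.abs(x)
net = init_net([1, 16, 16, 1])
for _ in range(2000):
    risk = erm_step(net, x, y)
print(f"final empirical risk: {risk:.4f}")
```

Varying the `widths` list is one concrete way to probe the depth-parameter trade-off the abstract describes: a deeper net can match a given feature with fewer parameters per layer, at the cost of a harder optimization landscape.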
Related papers
- The Computational Advantage of Depth: Learning High-Dimensional Hierarchical Functions with Gradient Descent [28.999394988111106]
We introduce a class of target functions that incorporate a hierarchy of latent subspace dimensionalities.
Our main theorem shows that feature learning with gradient descent reduces the effective dimensionality.
These findings open the way to further quantitative studies of the crucial role of depth in learning hierarchical structures with deep networks.
arXiv Detail & Related papers (2025-02-19T18:58:28Z) - Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Semantics-Oriented Multitask Learning for DeepFake Detection: A Joint Embedding Approach [77.65459419417533]
We propose an automatic dataset expansion technique to support semantics-oriented DeepFake detection tasks.
We also resort to joint embedding of face images and their corresponding labels for prediction.
Our method improves the generalizability of DeepFake detection and renders some degree of model interpretation by providing human-understandable explanations.
arXiv Detail & Related papers (2024-08-29T07:11:50Z) - Convergence Analysis for Deep Sparse Coding via Convolutional Neural Networks [7.956678963695681]
We explore intersections between sparse coding and deep learning to enhance our understanding of feature extraction capabilities.
We derive convergence rates for convolutional neural networks (CNNs) in their ability to extract sparse features.
Inspired by the strong connection between sparse coding and CNNs, we explore training strategies to encourage neural networks to learn more sparse features.
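The "strong connection between sparse coding and CNNs" mentioned above builds on classical sparse coding, which can be illustrated with the standard ISTA algorithm. This is a generic textbook sketch, not the paper's method; the dictionary size, sparsity level, and penalty weight are assumptions chosen for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam=0.05, steps=200):
    """ISTA: minimize 0.5*||x - D @ a||^2 + lam*||a||_1 over the code a."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

rng = np.random.default_rng(1)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)           # unit-norm dictionary atoms
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.0, -2.0, 1.5]   # a 3-sparse ground-truth code
x = D @ a_true
a_hat = ista(D, x)
print("recovered support:", np.nonzero(np.abs(a_hat) > 0.1)[0])
```

Each ISTA iteration is a linear map followed by a pointwise nonlinearity, which is the structural analogy to a CNN layer that motivates studying convergence of sparse feature extraction by CNNs.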
arXiv Detail & Related papers (2024-08-10T12:43:55Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Self-Guided Instance-Aware Network for Depth Completion and Enhancement [6.319531161477912]
Existing methods directly interpolate the missing depth measurements based on pixel-wise image content and the corresponding neighboring depth values.
We propose a novel self-guided instance-aware network (SG-IANet) that utilizes a self-guided mechanism to extract the instance-level features needed for depth restoration.
arXiv Detail & Related papers (2021-05-25T19:41:38Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Variational Structured Attention Networks for Deep Visual Representation
Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z) - Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability on some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.