Expressive Power and Loss Surfaces of Deep Learning Models
- URL: http://arxiv.org/abs/2108.03579v2
- Date: Tue, 10 Aug 2021 01:34:42 GMT
- Title: Expressive Power and Loss Surfaces of Deep Learning Models
- Authors: Simant Dube
- Abstract summary: The first goal of this paper is to serve as an expository tutorial on the working of deep learning models.
The second goal is to complement the current results on the expressive power of deep learning models with novel insights and results.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The goals of this paper are two-fold. The first goal is to serve as an
expository tutorial on the working of deep learning models which emphasizes
geometrical intuition about the reasons for success of deep learning. The
second goal is to complement the current results on the expressive power of
deep learning models and their loss surfaces with novel insights and results.
In particular, we describe how deep neural networks carve out manifolds,
especially when multiplication neurons are introduced. Multiplication appears
in dot products and in the attention mechanism, and it is employed in capsule
networks and self-attention-based transformers. We also describe how random
polynomial, random matrix, spin glass and computational complexity perspectives
on the loss surfaces are interconnected.
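To make the role of multiplication concrete, below is a minimal NumPy sketch of scaled dot-product self-attention (an illustrative example, not code from the paper): both the query-key dot products and the attention-weighted sum over values multiply one activation by another, rather than multiplying activations only by fixed weights.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention for a single head.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # linear maps of the activations
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # activation-times-activation products
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V                            # another activation-times-activation product

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                       # 5 tokens, 8-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)               # shape (5, 8)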
Related papers
- Generating visual explanations from deep networks using implicit neural representations [0.6056822594090163]
In this work, we demonstrate that implicit neural representations (INRs) constitute a good framework for generating visual explanations.
We present an iterative INR-based method that can be used to generate multiple non-overlapping attribution masks for the same image.
arXiv Detail & Related papers (2025-01-20T23:17:57Z)
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations [1.9580473532948401]
This thesis explores the theoretical foundations of deep learning by studying the relationship between the architecture of these models and the inherent structures found within the data they process.
We ask what drives the efficacy of deep learning algorithms and allows them to beat the so-called curse of dimensionality.
Our methodology takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models.
arXiv Detail & Related papers (2023-10-24T19:50:41Z)
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
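As background for the two ResNet-related entries above and below, here is a minimal NumPy sketch of a plain (Euclidean) residual block, y = x + f(x); the Riemannian paper generalizes this construction to manifolds, and all names and shapes here are illustrative.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, b1, W2, b2):
    # The defining feature of a ResNet: the input is added back to the
    # output of a small learned transformation (a skip connection).
    h = relu(x @ W1 + b1)
    return x + (h @ W2 + b2)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 16))                      # batch of 4, width 16
W1 = 0.1 * rng.normal(size=(16, 16))
W2 = 0.1 * rng.normal(size=(16, 16))
b1, b2 = np.zeros(16), np.zeros(16)
y = residual_block(x, W1, b1, W2, b2)             # shape (4, 16)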
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
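To make the system-identification setup above concrete, here is a toy NumPy sketch that fits a one-step-ahead predictor y[t] ≈ f(y[t-1], u[t-1]) from input-output data of an unknown system; a linear least-squares map is used purely for illustration, whereas the surveyed deep models replace it with feedforward, convolutional, or recurrent networks.

import numpy as np

rng = np.random.default_rng(2)
T = 200
u = rng.normal(size=T)                        # input signal
y = np.zeros(T)
for t in range(1, T):                         # the unknown "true" system
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + 0.05 * rng.normal()

# Regress y[t] on (y[t-1], u[t-1]) to identify a predictive model.
X = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
y_pred = X @ theta                            # one-step-ahead predictions
print(theta)                                  # close to the true [0.8, 0.5]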
- Convergence Analysis of Deep Residual Networks [3.274290296343038]
Deep Residual Networks (ResNets) are of particular importance because they have demonstrated great usefulness in computer vision.
We aim at characterizing the convergence of deep ResNets as the depth tends to infinity in terms of the parameters of the networks.
arXiv Detail & Related papers (2022-05-13T11:53:09Z)
- Tensor Methods in Computer Vision and Deep Learning [120.3881619902096]
Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions.
With the advent of the deep learning paradigm shift in computer vision, tensors have become even more fundamental.
This article provides an in-depth and practical review of tensors and tensor methods in the context of representation learning and deep learning.
arXiv Detail & Related papers (2021-07-07T18:42:45Z)
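As a small illustration of the point above, a batch of color images is naturally a 4-dimensional tensor, and common deep-learning operations act on it directly (shapes and names below are purely illustrative).

import numpy as np

# A mini-batch of 8 RGB images of size 32x32 is a 4-D tensor.
images = np.random.default_rng(3).random((8, 32, 32, 3))    # (batch, height, width, channels)
channel_means = images.mean(axis=(0, 1, 2))                 # reduce over batch and space -> (3,)
flattened = images.reshape(8, -1)                           # matricization: (8, 32*32*3)
print(images.shape, channel_means.shape, flattened.shape)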
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$Nets are a new class of function approximators based on polynomial expansions.
$\Pi$Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
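To illustrate what a polynomial expansion of the input looks like, here is a hand-rolled degree-2 feature map followed by a linear layer in NumPy; this is only a sketch of the general idea, not the $\Pi$Nets architecture itself.

import numpy as np

def degree2_features(x):
    # Concatenate the raw features with all pairwise products x_i * x_j
    # (i <= j), i.e. an explicit degree-2 polynomial expansion.
    n = x.shape[-1]
    i, j = np.triu_indices(n)
    return np.concatenate([x, x[..., i] * x[..., j]], axis=-1)

rng = np.random.default_rng(4)
x = rng.normal(size=(10, 5))                  # 10 samples, 5 features
phi = degree2_features(x)                     # shape (10, 5 + 15)
W = rng.normal(size=(phi.shape[-1], 1))
y = phi @ W                                   # linear model in the polynomial features
print(phi.shape, y.shape)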
- Depth Selection for Deep ReLU Nets in Feature Extraction and Generalization [22.696129751033983]
We show that implementing classical empirical risk minimization on deep nets can achieve optimal generalization performance for numerous learning tasks.
Our results are verified by a series of numerical experiments including toy simulations and a real application of earthquake seismic intensity prediction.
arXiv Detail & Related papers (2020-04-01T06:03:01Z)