Deep Learning is Singular, and That's Good
- URL: http://arxiv.org/abs/2010.11560v1
- Date: Thu, 22 Oct 2020 09:33:59 GMT
- Title: Deep Learning is Singular, and That's Good
- Authors: Daniel Murfet, Susan Wei, Mingming Gong, Hui Li, Jesse Gell-Redman,
Thomas Quella
- Abstract summary: In singular models, the optimal set of parameters forms an analytic set with singularities and classical statistical inference cannot be applied.
This is significant for deep learning as neural networks are singular and thus "dividing" by the determinant of the Hessian or employing the Laplace approximation are not appropriate.
Despite its potential for addressing fundamental issues in deep learning, singular learning theory appears to have made little inroads into the developing canon of deep learning theory.
- Score: 31.985399645173022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In singular models, the optimal set of parameters forms an analytic set with
singularities and classical statistical inference cannot be applied to such
models. This is significant for deep learning as neural networks are singular
and thus "dividing" by the determinant of the Hessian or employing the Laplace
approximation are not appropriate. Despite its potential for addressing
fundamental issues in deep learning, singular learning theory appears to have
made little inroads into the developing canon of deep learning theory. Via a
mix of theory and experiment, we present an invitation to singular learning
theory as a vehicle for understanding deep learning and suggest important
future work to make singular learning theory directly applicable to how deep
learning is performed in practice.
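To make the abstract's central point concrete, here is a brief worked sketch, following Watanabe's singular learning theory (which the paper surveys), of why "dividing by the determinant of the Hessian" fails for singular models:

```latex
% Laplace approximation to the model evidence Z_n: it divides by
% \sqrt{\det H(\hat{w})}, so it requires a nondegenerate Hessian at the
% optimum \hat{w}.
Z_n = \int \exp\bigl(-n L_n(w)\bigr)\,\varphi(w)\,dw
    \approx \exp\bigl(-n L_n(\hat{w})\bigr)
      \left(\frac{2\pi}{n}\right)^{d/2}
      \frac{\varphi(\hat{w})}{\sqrt{\det H(\hat{w})}}
% In a singular model the optimal parameters form an analytic set on which
% \det H = 0, so this expression is undefined. Watanabe's theory instead
% yields the asymptotic free energy
F_n = -\log Z_n = n L_n(\hat{w}) + \lambda \log n - (m-1)\log\log n + O_p(1)
% where \lambda is the real log canonical threshold (RLCT) and m its
% multiplicity; for regular models \lambda = d/2 and m = 1, recovering BIC.
```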
Related papers
- Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning [14.909298522361306]
Notes are based on a lecture delivered by NC in March 2021, as part of an advanced course at Princeton University on the mathematical understanding of deep learning.
They present a theory (developed by NC, NR and collaborators) of linear neural networks -- a fundamental model in the study of optimization and generalization in deep learning.
arXiv Detail & Related papers (2024-08-25T08:24:48Z)
- Applying statistical learning theory to deep learning [21.24637996678039]
The goal of these lectures is to provide an overview of some of the main questions that arise when attempting to understand deep learning.
We discuss implicit bias in the context of benign overfitting.
We provide a detailed study of the implicit bias of gradient descent on linear diagonal networks for various regression tasks.
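As an illustration of the kind of result these lectures study, here is a minimal sketch, with assumed data, initialization, and step size, of gradient descent on a diagonal linear network; small initialization is known to bias such networks toward sparse solutions:

```python
# Minimal sketch (assumed setup, not the lecture notes' exact experiment):
# gradient descent on a "diagonal" linear network, where the effective
# linear predictor is beta = u * v (elementwise).
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 100
X = rng.normal(size=(n, d))
beta_true = np.zeros(d)
beta_true[:3] = [2.0, -1.5, 1.0]           # sparse ground truth
y = X @ beta_true

alpha = 1e-3                               # small initialization scale
u = np.full(d, alpha)
v = np.full(d, alpha)
lr = 0.01

for _ in range(20000):
    beta = u * v                           # effective linear weights
    resid = X @ beta - y
    g = X.T @ resid / n                    # gradient w.r.t. beta
    u, v = u - lr * g * v, v - lr * g * u  # chain rule through beta = u*v

print("recovered leading coefficients:", np.round((u * v)[:5], 2))
```

Despite the overparameterization (d > n), the small initialization drives the trained predictor toward the sparse ground truth, which is the implicit-bias phenomenon the lectures analyze.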
arXiv Detail & Related papers (2023-11-26T20:00:53Z)
- A Theory of Human-Like Few-Shot Learning [14.271690184738205]
We derive a theory of human-like few-shot learning from the von Neumann-Landauer principle.
We find that deep generative models such as the variational autoencoder (VAE) can be used to approximate our theory.
arXiv Detail & Related papers (2023-01-03T11:22:37Z)
- Envisioning Future Deep Learning Theories: Some Basic Concepts and Characteristics [30.365274034429508]
We argue that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using gradient-based methods, and information from the data that evolves compressively.
We integrate these characteristics into a graphical model called neurashed, which effectively explains some common empirical patterns in deep learning.
arXiv Detail & Related papers (2021-12-17T19:51:26Z)
- An Empirical Investigation into Deep and Shallow Rule Learning [0.0]
In this paper, we empirically compare deep and shallow rule learning with a uniform general algorithm.
Our experiments on both artificial and real-world benchmark data indicate that deep rule networks outperform shallow networks.
arXiv Detail & Related papers (2021-06-18T17:43:17Z)
- Ten Quick Tips for Deep Learning in Biology [116.78436313026478]
Machine learning is concerned with the development and applications of algorithms that can recognize patterns in data and use them for predictive modeling.
Deep learning has become its own subfield of machine learning.
In the context of biological research, deep learning has been increasingly used to derive novel insights from high-dimensional biological data.
arXiv Detail & Related papers (2021-05-29T21:02:44Z)
- Exploring Bayesian Deep Learning for Urgent Instructor Intervention Need in MOOC Forums [58.221459787471254]
Massive Open Online Courses (MOOCs) have become a popular choice for e-learning thanks to their great flexibility.
Due to large numbers of learners and their diverse backgrounds, it is taxing to offer real-time support.
With the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
This paper is the first to explore Bayesian deep learning on learner-based text posts, using two methods: Monte Carlo Dropout and Variational Inference.
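A hedged sketch of the first of these two methods, Monte Carlo Dropout, in the style of Gal and Ghahramani; the architecture, dimensions, and names below are illustrative assumptions, not the paper's model:

```python
# Monte Carlo Dropout sketch: dropout is left active at inference time and
# T stochastic forward passes give a predictive mean plus an uncertainty
# estimate, which can flag posts needing instructor intervention.
import torch
import torch.nn as nn

class TextClassifier(nn.Module):            # hypothetical toy classifier
    def __init__(self, in_dim=300, hidden=64, n_classes=2, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Dropout(p),                   # stays on during MC sampling
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=50):
    model.train()                            # keeps dropout stochastic; no
    with torch.no_grad():                    # parameters are updated here
        probs = torch.stack([
            torch.softmax(model(x), dim=-1) for _ in range(n_samples)
        ])
    return probs.mean(0), probs.std(0)       # predictive mean, uncertainty

model = TextClassifier()
x = torch.randn(8, 300)                      # 8 dummy post embeddings
mean, std = mc_dropout_predict(model, x)
```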
arXiv Detail & Related papers (2021-04-26T15:12:13Z)
- Demystification of Few-shot and One-shot Learning [63.58514532659252]
Few-shot and one-shot learning have been the subject of active and intensive research in recent years.
We show that if the ambient or latent decision space of a learning machine is sufficiently high-dimensional, then a large class of objects in this space can indeed be easily learned from few examples.
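A hedged numerical illustration of this claim under an assumed Gaussian model: in high dimensions, a randomly drawn point is, with high probability, linearly separable from a large i.i.d. sample by a simple hyperplane, which is what makes learning from one example feasible.

```python
# Assumed toy setup, not the paper's theorems: check how often a single
# new point can be cut off from n background points by the hyperplane
# with normal x passing through x/2. The separable fraction rises toward
# 1 as the dimension d grows.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 100
for d in (5, 50, 500):
    X = rng.normal(size=(n, d))           # i.i.d. "background" sample
    separable = 0
    for _ in range(trials):
        x = rng.normal(size=d)            # the single new example
        if (X @ x).max() < (x @ x) / 2:   # all samples on the far side?
            separable += 1
    print(f"d={d}: separable in {separable / trials:.2f} of trials")
```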
arXiv Detail & Related papers (2021-04-25T14:47:05Z)
- Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms [91.3755431537592]
We analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression.
We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice.
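A hedged sketch of one such strategy, pseudo-outcome regression, here with an IPW-style pseudo-outcome under a known randomization probability; the data-generating process and the choice of regressor are illustrative assumptions, not the paper's estimators:

```python
# Pseudo-outcome regression sketch: with a randomized treatment T and known
# propensity e, the pseudo-outcome (T/e - (1-T)/(1-e)) * Y has conditional
# mean tau(X), so regressing it on X estimates the heterogeneous effect.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
T = rng.binomial(1, 0.5, size=n)                  # randomized treatment
tau = 1.0 + X[:, 0]                               # true heterogeneous effect
Y = X[:, 1] + T * tau + rng.normal(size=n)

e = 0.5                                           # known propensity
pseudo = (T / e - (1 - T) / (1 - e)) * Y          # E[pseudo | X] = tau(X)
cate = RandomForestRegressor().fit(X, pseudo)     # second-stage regression
print(cate.predict(np.array([[1.0, 0.0, 0.0]])))  # roughly tau = 2
```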
arXiv Detail & Related papers (2021-01-26T17:11:40Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
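A hedged toy simulation of a catapult-like effect, using a two-parameter model f = u*v on a single example rather than the paper's wide-network setting; the initialization, target, and learning rates are assumed for illustration:

```python
# Below the stability threshold (roughly 2 / (u^2 + v^2) at initialization)
# the loss decays monotonically; above it, the loss first spikes, the
# kernel-like quantity u^2 + v^2 shrinks, and training then converges
# in a flatter region -- the catapult-like behavior.
def run(lr, u=1.0, v=0.05, steps=50):
    losses, kernels = [], []
    for _ in range(steps):
        f = u * v                         # model output, target is 0
        u, v = u - lr * f * v, v - lr * f * u
        losses.append(0.5 * f ** 2)
        kernels.append(u ** 2 + v ** 2)   # curvature/NTK analogue
    return losses, kernels

for lr in (0.5, 3.0):                     # below vs. above the threshold ~2
    losses, kernels = run(lr)
    print(f"lr={lr}: peak loss {max(losses):.4f}, "
          f"final kernel {kernels[-1]:.3f}")
```

With the small learning rate the peak loss is just the initial loss and the kernel barely moves; with the large one the loss spikes well above its initial value before settling, and the final kernel is markedly smaller.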
arXiv Detail & Related papers (2020-03-04T17:52:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.