Generalization in Deep Learning
- URL: http://arxiv.org/abs/1710.05468v9
- Date: Tue, 22 Aug 2023 03:04:22 GMT
- Title: Generalization in Deep Learning
- Authors: Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio
- Abstract summary: This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima.
We also discuss approaches to provide non-vacuous generalization guarantees for deep learning.
- Score: 103.91623583928852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature. We also discuss approaches to provide non-vacuous generalization guarantees for deep learning. Based on theoretical observations, we propose new open problems and discuss the limitations of our results.
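For context, the central quantity in such guarantees is the generalization gap; a minimal statement in standard learning-theory notation (the symbols below are conventional, not quoted from the paper):

```latex
% Generalization gap of a hypothesis f on a sample S of size m:
% expected risk minus empirical risk.
\[
  R(f) - \hat{R}_S(f)
  \;=\;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(f(x),y)\big]
  \;-\;
  \frac{1}{m}\sum_{i=1}^{m} \ell\big(f(x_i),y_i\big).
\]
% A generalization guarantee upper-bounds this gap with high probability
% over the draw of S; it is "non-vacuous" when the implied bound on R(f)
% is better than trivial (e.g., below 1 for the 0-1 loss).
```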
Related papers
- Computability of Classification and Deep Learning: From Theoretical Limits to Practical Feasibility through Quantization [53.15874572081944]
We study computability in the deep learning framework from two perspectives.
We show algorithmic limitations in training deep neural networks even in cases where the underlying problem is well-behaved.
Finally, we show that in quantized versions of classification and deep network training, computability restrictions do not arise or can be overcome to a certain degree.
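As a rough illustration of the quantization idea (a generic sketch, not the paper's construction): restricting weights to a finite grid makes every network exactly representable, so the obstructions to computable training over the reals no longer apply. A minimal example with illustrative parameters:

```python
import numpy as np

def quantize(weights: np.ndarray, num_bits: int = 8, w_max: float = 1.0) -> np.ndarray:
    """Map real-valued weights onto a finite, uniformly spaced grid.

    With 2**num_bits levels on [-w_max, w_max], every quantized network is
    exactly representable, so training reduces to search over a finite set.
    Illustrative only; not the construction used in the paper.
    """
    levels = 2 ** num_bits                 # number of representable values
    step = 2 * w_max / (levels - 1)        # grid spacing
    clipped = np.clip(weights, -w_max, w_max)
    idx = np.round((clipped + w_max) / step)  # integer grid index in [0, levels-1]
    return idx * step - w_max

w = np.random.randn(4, 4)
w_q = quantize(w, num_bits=4)
print(np.unique(w_q).size, "distinct values after 4-bit quantization")
```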
arXiv Detail & Related papers (2024-08-12T15:02:26Z)
- Soft Reasoning on Uncertain Knowledge Graphs [85.1968214421899]
We study the setting of soft queries on uncertain knowledge, which is motivated by the establishment of soft constraint programming.
We propose an ML-based approach with both forward inference and backward calibration to answer soft queries on large-scale, incomplete, and uncertain knowledge graphs.
arXiv Detail & Related papers (2024-03-03T13:13:53Z)
- A Survey Analyzing Generalization in Deep Reinforcement Learning [14.141453107129403]
We will formalize and analyze generalization in deep reinforcement learning.
We will explain the fundamental reasons why deep reinforcement learning policies encounter overfitting problems that limit their generalization capabilities.
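Operationally, generalization in deep RL is often measured as the gap between returns on training environments and on held-out environments. A minimal sketch, where `policy` and `make_env` are hypothetical callables assumed purely for illustration:

```python
import numpy as np

def generalization_gap(policy, make_env, train_seeds, test_seeds, episodes=10):
    """Estimate an RL generalization gap: average return on training
    environments minus average return on held-out environments.

    `policy(obs) -> action` and `make_env(seed) -> env` (with reset()/step())
    are hypothetical interfaces assumed for this sketch.
    """
    def avg_return(seeds):
        returns = []
        for seed in seeds:
            env = make_env(seed)
            for _ in range(episodes):
                obs, done, total = env.reset(), False, 0.0
                while not done:
                    obs, reward, done, _ = env.step(policy(obs))
                    total += reward
                returns.append(total)
        return np.mean(returns)

    return avg_return(train_seeds) - avg_return(test_seeds)
```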
arXiv Detail & Related papers (2024-01-04T16:45:01Z)
- Deep Causal Learning: Representation, Discovery and Inference [2.696435860368848]
Causal learning reveals the essential relationships that underpin phenomena and delineates the mechanisms by which the world evolves.
Traditional causal learning methods face numerous challenges and limitations, including high-dimensional variables, unstructured variables, optimization problems, unobserved confounders, selection biases, and estimation inaccuracies.
Deep causal learning, which leverages deep neural networks, offers innovative insights and solutions for addressing these challenges.
arXiv Detail & Related papers (2022-11-07T09:00:33Z)
- Theoretical Perspectives on Deep Learning Methods in Inverse Problems [115.93934028666845]
We focus on generative priors, untrained neural network priors, and unfolding algorithms.
In addition to summarizing existing results in these topics, we highlight several ongoing challenges and open problems.
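For concreteness, the generative-prior formulation common in this line of work can be written as follows (standard notation assumed here, not quoted from the survey):

```latex
% Linear inverse problem with a generative prior.
% Observe y = A x^* + \eta with measurement matrix A and noise \eta;
% the prior constrains x^* to (approximately) lie in the range of a
% generator G. Recovery is posed as optimization over the latent code z:
\[
  \hat{z} \in \arg\min_{z \in \mathbb{R}^k} \;\big\| A\, G(z) - y \big\|_2^2,
  \qquad \hat{x} = G(\hat{z}).
\]
% An untrained-network prior replaces the pretrained G with a randomly
% initialized network whose weights are optimized instead of z.
```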
arXiv Detail & Related papers (2022-06-29T02:37:50Z)
- The Modern Mathematics of Deep Learning [8.939008609565368]
We describe the new field of mathematical analysis of deep learning.
This field emerged around a list of research questions that were not answered within the classical framework of learning theory.
For selected approaches, we describe the main ideas in more detail.
arXiv Detail & Related papers (2021-05-09T21:30:42Z)
- NeurIPS 2020 Competition: Predicting Generalization in Deep Learning [0.0]
Understanding generalization in deep learning is arguably one of the most important questions in deep learning.
We invite the community to propose complexity measures that can accurately predict generalization of models.
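As an example of the kind of measure solicited, a common norm-based candidate from the literature is the product of per-layer spectral norms, computed from a trained model and correlated with its observed generalization gap. A minimal sketch with hypothetical weight matrices (illustrative, not a competition baseline):

```python
import numpy as np

def spectral_norm_product(weight_matrices) -> float:
    """Product of per-layer spectral norms (largest singular values).

    A classic norm-based complexity measure; larger values are often
    associated with a larger generalization gap. Illustrative only.
    """
    measure = 1.0
    for W in weight_matrices:
        measure *= np.linalg.norm(W, ord=2)  # largest singular value
    return measure

# Hypothetical 3-layer network weights.
layers = [np.random.randn(64, 32), np.random.randn(32, 32), np.random.randn(32, 10)]
print("complexity measure:", spectral_norm_product(layers))
```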
arXiv Detail & Related papers (2020-12-14T22:21:37Z)
- In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
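For reference, the three quantities named above are related by a standard decomposition (notation assumed here, not quoted from the paper): for a learned hypothesis, an empirical risk minimizer, and the best-in-class hypothesis,

```latex
% Let \hat{f} be the learned hypothesis, f_S an empirical risk minimizer,
% and f^* = \arg\min_{f \in \mathcal{F}} R(f) the best in class. Then
\[
  \underbrace{R(\hat{f}) - R(f^*)}_{\text{excess risk}}
  \;\le\;
  \underbrace{\hat{R}_S(\hat{f}) - \hat{R}_S(f_S)}_{\text{optimization error}}
  \;+\;
  \underbrace{2 \sup_{f \in \mathcal{F}} \big| R(f) - \hat{R}_S(f) \big|}_{\text{generalization error (uniform)}} .
\]
```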
arXiv Detail & Related papers (2020-10-22T17:54:25Z)
- A Chain Graph Interpretation of Real-World Neural Networks [58.78692706974121]
We propose an alternative interpretation that identifies NNs as chain graphs (CGs) and feed-forward as an approximate inference procedure.
The CG interpretation specifies the nature of each NN component within the rich theoretical framework of probabilistic graphical models.
We demonstrate with concrete examples that the CG interpretation can provide novel theoretical support and insights for various NN techniques.
arXiv Detail & Related papers (2020-06-30T14:46:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.