Bias-Variance Trade-off and Overlearning in Dynamic Decision Problems
- URL: http://arxiv.org/abs/2011.09349v2
- Date: Wed, 12 May 2021 16:37:06 GMT
- Title: Bias-Variance Trade-off and Overlearning in Dynamic Decision Problems
- Authors: A. Max Reppen and H. Mete Soner
- Abstract summary: Modern Monte Carlo-type approaches to dynamic decision problems are reformulated as empirical loss minimization.
These computational methods are then analyzed in this framework to demonstrate their effectiveness as well as their susceptibility to generalization error.
- Score: 1.2183405753834562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern Monte Carlo-type approaches to dynamic decision problems are
reformulated as empirical loss minimization, allowing direct applications of
classical results from statistical machine learning. These computational
methods are then analyzed in this framework to demonstrate their effectiveness
as well as their susceptibility to generalization error. Standard uses of
classical results prove the potential for overlearning, and thus a bias-variance trade-off,
by connecting over-trained networks to anticipating controls. On the other
hand, non-asymptotic estimates based on Rademacher complexity show the
convergence of these algorithms for sufficiently large training sets. A
numerically studied stylized example illustrates these possibilities, including
the importance of problem dimension in the degree of overlearning, and the
effectiveness of this approach.
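The reformulation described in the abstract can be made concrete with a small experiment. The sketch below is a minimal, illustrative setup rather than the paper's exact model: a neural network feedback control is trained by minimizing an empirical loss over simulated Monte Carlo paths of a stylized investment problem, and the gap between training and test loss is the overlearning discussed above. The dimensions, objective, and hyperparameters are assumptions chosen for brevity.

```python
import torch

torch.manual_seed(0)
T, n_train, n_test = 5, 64, 10_000      # few training paths invite overlearning

def simulate_returns(n_paths):
    """I.i.d. Gaussian period returns of a single risky asset (illustrative)."""
    return 0.02 + 0.2 * torch.randn(n_paths, T)

policy = torch.nn.Sequential(           # feedback control a_t = pi(t, x_t)
    torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))

def empirical_loss(returns, target=1.1):
    x = torch.ones(returns.shape[0])    # initial wealth on every path
    for t in range(T):
        state = torch.stack([torch.full_like(x, t / T), x], dim=1)
        a = policy(state).squeeze(-1)   # position chosen from the current state only
        x = x + a * returns[:, t]
    return torch.mean((x - target) ** 2)  # stylized quadratic objective

train, test = simulate_returns(n_train), simulate_returns(n_test)
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    empirical_loss(train).backward()
    opt.step()

with torch.no_grad():                   # a large gap indicates overlearning
    print(f"train {empirical_loss(train).item():.4f}  "
          f"test {empirical_loss(test).item():.4f}")
```

Increasing `n_train` should close the train/test gap, in line with the convergence for sufficiently large training sets noted in the abstract.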
Related papers
- Data-driven approaches to inverse problems [12.614421935598317]
Inverse problems serve as critical tools for visualizing internal structures beyond what is visible to the naked eye. A more recent paradigm considers deriving solutions to inverse problems in a data-driven manner. These notes offer an introduction to this data-driven paradigm for inverse problems.
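As a generic illustration of the data-driven paradigm mentioned above (not a specific method from these notes), the sketch below learns a reconstruction map for a linear inverse problem from paired examples instead of hand-crafting a regularizer; the forward operator, sizes, and architecture are arbitrary assumptions.

```python
import torch

torch.manual_seed(0)
n, m = 32, 24                                   # signal and measurement sizes
A = torch.randn(m, n) / m ** 0.5                # known linear forward operator

x_true = torch.randn(4096, n)                   # training signals
y = x_true @ A.T + 0.05 * torch.randn(4096, m)  # noisy measurements

net = torch.nn.Sequential(                      # learned reconstruction map y -> x
    torch.nn.Linear(m, 64), torch.nn.ReLU(), torch.nn.Linear(64, n))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    torch.nn.functional.mse_loss(net(y), x_true).backward()
    opt.step()

# The learned map is then applied to new measurements from the same forward model.
x_hat = net(torch.randn(1, n) @ A.T)
```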
arXiv Detail & Related papers (2025-06-13T12:44:32Z)
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization.
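A loose sketch of the dropping-based regularization idea (not the paper's exact architecture): an aggregation network receives the base models' predictions and, during training, randomly drops each base model's contribution, analogous to dropout applied to base predictions rather than hidden units.

```python
import torch

class DroppingEnsembler(torch.nn.Module):
    def __init__(self, n_base, n_classes, p_drop=0.3):
        super().__init__()
        self.p_drop = p_drop
        self.combine = torch.nn.Linear(n_base * n_classes, n_classes)

    def forward(self, base_preds):            # base_preds: (batch, n_base, n_classes)
        if self.training:
            # Drop each base model's prediction independently per example.
            keep = (torch.rand(base_preds.shape[:2]) > self.p_drop).float()
            base_preds = base_preds * keep.unsqueeze(-1)
        return self.combine(base_preds.flatten(1))

# Usage: base_preds would come from frozen, pre-trained base models.
model = DroppingEnsembler(n_base=5, n_classes=10)
logits = model(torch.softmax(torch.randn(8, 5, 10), dim=-1))
```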
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Most Influential Subset Selection: Challenges, Promises, and Beyond [9.479235005673683]
We study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of training samples with the greatest collective influence.
We conduct a comprehensive analysis of the prevailing approaches in MISS, elucidating their strengths and weaknesses.
We demonstrate that an adaptive version of these approaches, which applies them iteratively, can effectively capture the interactions among samples.
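The adaptive idea can be sketched as a greedy loop that recomputes influence after every removal, so that interactions among samples affect later choices. The influence measure below (brute-force refitting of a small ridge regression) and the data are illustrative assumptions, not the estimators studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X, w = rng.normal(size=(60, 5)), rng.normal(size=5)
y = X @ w + 0.1 * rng.normal(size=60)
x_test = rng.normal(size=5)                       # fixed test point of interest

def fit(idx, lam=1e-3):
    A = X[idx].T @ X[idx] + lam * np.eye(5)
    return np.linalg.solve(A, X[idx].T @ y[idx])

def influence(idx, i):
    """Change in the test prediction when sample i is removed from idx."""
    rest = [j for j in idx if j != i]
    return abs(x_test @ fit(idx) - x_test @ fit(rest))

remaining, selected = list(range(60)), []
for _ in range(5):                                # pick a subset of size 5
    best = max(remaining, key=lambda i: influence(remaining, i))
    selected.append(best)
    remaining.remove(best)                        # scores are recomputed next round
print("adaptively selected subset:", selected)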
arXiv Detail & Related papers (2024-09-25T20:00:23Z)
- Machine Learning for predicting chaotic systems [0.0]
We show that well-tuned simple methods, as well as untuned baseline methods, often outperform state-of-the-art deep learning models.
These findings underscore the importance of matching prediction methods to data characteristics and available computational resources.
arXiv Detail & Related papers (2024-07-29T16:34:47Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities for analyzing closed-form training dynamics.
The unhinged loss also allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
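For reference, the unhinged loss in its common binary form is linear in the margin, l(y, f(x)) = 1 - y f(x) with labels y in {-1, +1}, which is what makes closed-form dynamics tractable. The sketch below shows this standard definition; the paper's multi-class setting may differ.

```python
import torch

def unhinged_loss(scores, labels):
    """scores: (batch,) real-valued outputs; labels: (batch,) in {-1, +1}."""
    return (1.0 - labels * scores).mean()

scores = torch.tensor([0.7, -0.2, 1.5], requires_grad=True)
labels = torch.tensor([1.0, -1.0, 1.0])
loss = unhinged_loss(scores, labels)
loss.backward()          # gradient w.r.t. each score is simply -y / batch_size
```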
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes.
On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
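For context, the two loss modifications being analyzed are commonly written as follows. The sketch shows their standard forms (the class priors and temperature tau are illustrative), not the paper's new bound.

```python
import torch
import torch.nn.functional as F

priors = torch.tensor([0.90, 0.07, 0.03])         # imbalanced class frequencies
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))

# Re-weighting: inverse-frequency weights on the standard cross-entropy.
rw_loss = F.cross_entropy(logits, targets, weight=1.0 / priors)

# Logit adjustment: shift the logits by tau * log(prior) inside the loss.
tau = 1.0
la_loss = F.cross_entropy(logits + tau * priors.log(), targets)
```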
arXiv Detail & Related papers (2023-10-07T09:15:08Z)
- Learning Interpretable Deep Disentangled Neural Networks for Hyperspectral Unmixing [16.02193274044797]
We propose a new interpretable deep learning method for hyperspectral unmixing that accounts for nonlinearity and endmember variability.
The model is learned end-to-end using backpropagation, and trained using a self-supervised strategy.
Experimental results on synthetic and real datasets illustrate the performance of the proposed method.
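For orientation, a common autoencoder-style unmixing baseline is sketched below: the encoder outputs simplex-constrained abundances and the decoder weight matrix plays the role of the endmembers. The paper's model additionally handles nonlinearity and endmember variability, which this minimal sketch deliberately omits; all sizes are assumptions.

```python
import torch

n_bands, n_endmembers = 100, 4

class LinearUnmixingAE(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(n_bands, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, n_endmembers))
        self.endmembers = torch.nn.Parameter(torch.rand(n_endmembers, n_bands))

    def forward(self, pixels):
        abundances = torch.softmax(self.encoder(pixels), dim=-1)  # sum-to-one
        return abundances @ self.endmembers, abundances           # reconstruction

model = LinearUnmixingAE()
pixels = torch.rand(256, n_bands)                    # synthetic pixel spectra
recon, abund = model(pixels)
loss = torch.nn.functional.mse_loss(recon, pixels)   # self-supervised objective
```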
arXiv Detail & Related papers (2023-10-03T18:21:37Z)
- On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
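A loose, untrained illustration of adding a self-attention term on top of a coarse numerical solution; the architecture of Attr itself is not reproduced here, and the solver, dimensions, and correction below are assumptions.

```python
import torch

def euler_solve(f, y0, t0, t1, n_steps):
    ys, y, h = [y0], y0, (t1 - t0) / n_steps
    for k in range(n_steps):
        y = y + h * f(t0 + k * h, y)
        ys.append(y)
    return torch.stack(ys)                       # (n_steps + 1, dim)

dim = 2
attn = torch.nn.MultiheadAttention(embed_dim=dim, num_heads=1, batch_first=True)

def corrected_solve(f, y0, t0, t1, n_steps):
    traj = euler_solve(f, y0, t0, t1, n_steps)
    seq = traj.unsqueeze(0)                      # (1, time, dim)
    correction, _ = attn(seq, seq, seq)          # self-attention over the trajectory
    return traj + correction.squeeze(0)          # additive learned correction

f = lambda t, y: torch.stack([y[1], -y[0]])      # harmonic oscillator
traj = corrected_solve(f, torch.tensor([1.0, 0.0]), 0.0, 6.28, 50)
```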
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
- On the generalization of learning algorithms that do not converge [54.122745736433856]
Generalization analyses of deep learning typically assume that the training converges to a fixed point.
Recent results indicate that in practice, the weights of deep neural networks optimized with gradient descent often oscillate indefinitely.
arXiv Detail & Related papers (2022-08-16T21:22:34Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
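A heavily simplified sketch in the spirit of Fisher-embedding-based selection, not the published BAIT algorithm: each candidate contributes a rank-one Fisher term from its gradient embedding, and points are greedily chosen so the selected batch's Fisher information covers that of the pool under a trace criterion. The embeddings and sizes are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(200, 8))                 # per-example gradient embeddings
F_pool = G.T @ G / len(G)                     # pool Fisher information
lam, budget = 1e-2, 10

def objective(batch_idx):
    F_batch = sum(np.outer(G[i], G[i]) for i in batch_idx) + lam * np.eye(8)
    return np.trace(np.linalg.solve(F_batch, F_pool))

selected = []
for _ in range(budget):
    candidates = [i for i in range(len(G)) if i not in selected]
    best = min(candidates, key=lambda i: objective(selected + [i]))
    selected.append(best)                     # greedy forward selection
print("queried indices:", selected)
```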
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
- Robust Unsupervised Learning via L-Statistic Minimization [38.49191945141759]
We present a general approach to this problem focusing on unsupervised learning.
The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models.
We prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning.
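One natural instance of such a criterion, shown purely as an illustration rather than the paper's general formulation, is a trimmed mean of per-sample losses: an L-statistic that discards the largest losses, which are the ones attributed to the perturbing distribution. The sketch below applies it to unsupervised centroid estimation with outliers.

```python
import torch

torch.manual_seed(0)
inliers = torch.randn(200, 2)                        # clean cluster around the origin
outliers = torch.randn(20, 2) + 8.0                  # perturbing distribution: large losses
data = torch.cat([inliers, outliers])

center = data.mean(dim=0).detach().clone().requires_grad_(True)
opt = torch.optim.SGD([center], lr=0.1)
keep = int(0.9 * len(data))                          # trim the 10% largest losses
for _ in range(200):
    opt.zero_grad()
    losses = ((data - center) ** 2).sum(dim=1)       # per-sample squared distances
    trimmed = torch.topk(losses, keep, largest=False).values.mean()
    trimmed.backward()
    opt.step()
print("robust centroid:", center.detach())           # close to 0 despite the outliers
```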
arXiv Detail & Related papers (2020-12-14T10:36:06Z)
- Uses and Abuses of the Cross-Entropy Loss: Case Studies in Modern Deep Learning [29.473503894240096]
We focus on the use of the categorical cross-entropy loss to model data that is not strictly categorical, but rather takes values on the simplex.
This practice is standard in neural network architectures with label smoothing and actor-mimic reinforcement learning, amongst others.
We propose probabilistically-inspired alternatives to these models, providing an approach that is more principled and theoretically appealing.
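For concreteness, the standard practice the summary refers to looks like the following: categorical cross-entropy evaluated against soft, simplex-valued targets such as smoothed labels. The paper's proposed probabilistic alternatives are not reproduced in this sketch.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 5)
hard = torch.tensor([2, 0, 4, 1])
eps, k = 0.1, 5

# Label smoothing produces targets strictly inside the probability simplex.
soft_targets = F.one_hot(hard, k).float() * (1 - eps) + eps / k

# Cross-entropy against a distribution rather than a class index.
loss = -(soft_targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```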
arXiv Detail & Related papers (2020-11-10T16:44:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.