Local Risk Bounds for Statistical Aggregation
- URL: http://arxiv.org/abs/2306.17151v1
- Date: Thu, 29 Jun 2023 17:51:42 GMT
- Title: Local Risk Bounds for Statistical Aggregation
- Authors: Jaouad Mourtada and Tomas Vaškevičius and Nikita Zhivotovskiy
- Abstract summary: We prove localized versions of the classical bound for the exponential weights estimator due to Leung and Barron and deviation-optimal bounds for the Q-aggregation estimator.
These bounds improve over the results of Dai, Rigollet and Zhang for fixed design regression and the results of Lecué and Rigollet for random design regression.
- Score: 5.940699390639279
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the problem of aggregation, the aim is to combine a given class of base
predictors to achieve predictions nearly as accurate as the best one. In this
flexible framework, no assumption is made on the structure of the class or the
nature of the target. Aggregation has been studied in both sequential and
statistical contexts. Despite some important differences between the two
problems, the classical results in both cases feature the same global
complexity measure. In this paper, we revisit and tighten classical results in
the theory of aggregation in the statistical setting by replacing the global
complexity with a smaller, local one. Some of our proofs build on the PAC-Bayes
localization technique introduced by Catoni. Among other results, we prove
localized versions of the classical bound for the exponential weights estimator
due to Leung and Barron and deviation-optimal bounds for the Q-aggregation
estimator. These bounds improve over the results of Dai, Rigollet and Zhang for
fixed design regression and the results of Lecué and Rigollet for random
design regression.
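The exponential weights estimator mentioned in the abstract aggregates a class of base predictors by weighting each one according to its empirical loss. A minimal sketch of this idea, assuming precomputed per-predictor losses and an illustrative temperature parameter `eta` (both choices are assumptions, not the paper's exact setup):

```python
import numpy as np

def exponential_weights(losses, eta):
    """Compute aggregation weights proportional to exp(-eta * loss).

    `losses` holds one empirical loss per base predictor; `eta` is an
    illustrative inverse-temperature parameter controlling concentration.
    """
    losses = np.asarray(losses, dtype=float)
    # Subtract the minimum loss before exponentiating for numerical stability;
    # this leaves the normalized weights unchanged.
    w = np.exp(-eta * (losses - losses.min()))
    return w / w.sum()

def aggregate(predictions, losses, eta=1.0):
    """Combine predictions (one row per base predictor) with
    exponential weights computed from their losses."""
    w = exponential_weights(losses, eta)
    return w @ np.asarray(predictions, dtype=float)
```

Predictors with smaller loss receive larger weight, and as `eta` grows the aggregate approaches selecting the single best predictor; the localized bounds in the paper concern the risk of such convex combinations, not this particular implementation.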
Related papers
- Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit.
Our focus is on the generalization error and variable selection properties of the estimators.
arXiv Detail & Related papers (2024-10-25T19:58:36Z) - Aggregation Weighting of Federated Learning via Generalization Bound Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.
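The weighting strategy summarized above can be sketched as a weighted average of client parameters, where weights come from per-client generalization-bound estimates instead of sample proportions. The inverse-bound rule and all names below are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def bound_based_weights(bounds):
    """Illustrative rule: a smaller estimated generalization bound
    yields a larger aggregation weight (inverse weighting, normalized).
    The paper derives its weights differently; this is only a sketch."""
    b = np.asarray(bounds, dtype=float)
    inv = 1.0 / b
    return inv / inv.sum()

def weighted_average(client_params, weights):
    """Federated aggregation step: normalized weighted average of the
    clients' parameter vectors (shape: num_clients x dim)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(client_params, dtype=float)
```

Standard FedAvg corresponds to setting `weights` to each client's sample count; the paper's contribution is replacing that choice with bound-driven weights.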
arXiv Detail & Related papers (2023-11-10T08:50:28Z) - Sample Complexity Bounds for Score-Matching: Causal Discovery and Generative Modeling [82.36856860383291]
We demonstrate that accurate estimation of the score function is achievable by training a standard deep ReLU neural network.
We establish bounds on the error rate of recovering causal relationships using the score-matching-based causal discovery method.
arXiv Detail & Related papers (2023-10-27T13:09:56Z) - Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z) - Exponential Tail Local Rademacher Complexity Risk Bounds Without the Bernstein Condition [30.401770841788718]
The local Rademacher toolbox is one of the most successful general-purpose toolboxes.
Applying the Bernstein theory to problems where optimal performance is only achievable via improper settings yields an exponential-tail excess risk bound.
Our results apply to improper prediction regimes not covered by the toolbox.
arXiv Detail & Related papers (2022-02-23T12:27:53Z) - Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the IPW-learner's risk converges to zero if the propensity score is known.
arXiv Detail & Related papers (2022-02-10T18:51:52Z) - Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression [35.78863301525758]
We study a localized notion of uniform convergence known as an "optimistic rate".
Our refined analysis avoids the hidden constant and logarithmic factor in existing results.
arXiv Detail & Related papers (2021-12-08T18:55:00Z) - Treeging [0.0]
Treeging combines the flexible mean structure of regression trees with the covariance-based prediction strategy of kriging into the base learner of an ensemble prediction algorithm.
We investigate the predictive accuracy of treeging across a thorough and widely varied battery of spatial and space-time simulation scenarios.
arXiv Detail & Related papers (2021-10-03T17:48:18Z) - Deconfounding Scores: Feature Representations for Causal Effect
Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.