Sequential prediction under log-loss and misspecification
- URL: http://arxiv.org/abs/2102.00050v1
- Date: Fri, 29 Jan 2021 20:28:23 GMT
- Title: Sequential prediction under log-loss and misspecification
- Authors: Meir Feder and Yury Polyanskiy
- Abstract summary: We consider the question of sequential prediction under the log-loss in terms of cumulative regret.
We show that cumulative regrets in the well-specified and misspecified cases asymptotically coincide.
We provide an $o(1)$ characterization of the distribution-free or PAC regret.
- Score: 47.66467420098395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the question of sequential prediction under the log-loss in terms
of cumulative regret. Namely, given a hypothesis class of distributions,
the learner sequentially predicts the (distribution of the) next letter in the sequence,
and its performance is compared to the baseline of the best constant predictor
from the hypothesis class. The well-specified case corresponds to an additional
assumption that the data-generating distribution belongs to the hypothesis
class as well. Here we present results in the more general misspecified case.
Due to special properties of the log-loss, the same problem arises in the
context of competitive optimality in density estimation and in model selection.
For the $d$-dimensional Gaussian location hypothesis class, we show that
cumulative regrets in the well-specified and misspecified cases asymptotically
coincide. In other words, we provide an $o(1)$ characterization of the
distribution-free (or PAC) regret in this case -- the first such result as far
as we know. We recall that the worst-case (or individual-sequence) regret in
this case is larger by an additive constant $\frac{d}{2} + o(1)$. Surprisingly,
neither traditional Bayesian estimators nor Shtarkov's normalized
maximum likelihood achieves the PAC regret, and our estimator requires special
"robustification" against heavy-tailed data. In addition, we show two general
results for misspecified regret: the existence and uniqueness of the optimal
estimator, and the bound sandwiching the misspecified regret between
well-specified regrets with (asymptotically) close hypothesis classes.
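For concreteness, the cumulative regret compared here is, roughly,
$$\mathrm{Reg}_n = \mathbb{E}\Big[\sum_{t=1}^n \log\frac{1}{q_t(X_t\mid X^{t-1})}\Big] - \inf_{\theta\in\Theta} \mathbb{E}\Big[\sum_{t=1}^n \log\frac{1}{p_\theta(X_t)}\Big],$$
where $q_t$ is the learner's sequential predictive distribution and the infimum ranges over the constant predictors in the class. Below is a minimal sketch, not the paper's robustified estimator: it estimates this regret on a single sequence for the 1-dimensional Gaussian location class $\{\mathcal{N}(\theta,1)\}$, using a Bayesian mixture predictor with an assumed $\mathcal{N}(0,\sigma_0^2)$ prior; the function name, prior, and heavy-tailed test input are illustrative choices, with the latter standing in for the misspecified case.

```python
import numpy as np
from scipy.stats import norm

def mixture_regret(x, prior_var=100.0):
    """Cumulative log-loss of a Bayesian mixture predictor for the class
    {N(theta, 1)}, minus the loss of the best constant predictor in hindsight."""
    loss_learner = 0.0
    for t in range(len(x)):
        # Posterior of theta given x[:t] under the N(0, prior_var) prior:
        # N(mu_t, v_t) with precision 1/prior_var + t.
        v_t = 1.0 / (1.0 / prior_var + t)
        mu_t = v_t * x[:t].sum()
        # Predictive distribution for x[t] is N(mu_t, 1 + v_t).
        loss_learner -= norm.logpdf(x[t], loc=mu_t, scale=np.sqrt(1.0 + v_t))
    # Best constant predictor from the class: theta = sample mean.
    loss_best = -norm.logpdf(x, loc=x.mean(), scale=1.0).sum()
    return loss_learner - loss_best

rng = np.random.default_rng(0)
n = 1000
print(mixture_regret(rng.normal(0.5, 1.0, size=n)))  # well-specified data
print(mixture_regret(rng.standard_t(df=3, size=n)))  # misspecified, heavy-tailed
```

In the well-specified case this mixture predictor attains the classical regret of order $\frac{1}{2}\log n$; the abstract's point is that plain Bayesian mixtures (and Shtarkov's NML) do not achieve the PAC regret under misspecification without extra robustification against heavy tails.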
Related papers
- Enhanced $H$-Consistency Bounds [30.389055604165222]
We present a framework for establishing enhanced $H$-consistency bounds based on more general inequalities relating conditional regrets.
Our theorems subsume existing results as special cases but also enable the derivation of more favorable bounds in various scenarios.
These include standard multi-class classification, binary and multi-class classification under Tsybakov noise conditions, and bipartite ranking.
arXiv Detail & Related papers (2024-07-18T17:22:40Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Universal Batch Learning Under The Misspecification Setting [4.772817128620037]
We consider the problem of universal batch learning in a misspecification setting with log-loss.
We derive the optimal universal learner, a mixture over the set of data-generating distributions, and obtain a closed-form expression for the min-max regret.
arXiv Detail & Related papers (2024-05-12T11:16:05Z)
- $(\epsilon, u)$-Adaptive Regret Minimization in Heavy-Tailed Bandits [29.966828248335972]
We study the regret minimization problem when $\epsilon$ and $u$ are unknown to the learner.
AdaR-UCB is the first algorithm that enjoys regret guarantees nearly matching those of the non-adaptive heavy-tailed case.
arXiv Detail & Related papers (2023-10-04T17:11:15Z)
- Variational Prediction [95.00085314353436]
We present a technique for learning a variational approximation to the posterior predictive distribution using a variational bound.
This approach can provide good predictive distributions without test time marginalization costs.
arXiv Detail & Related papers (2023-07-14T18:19:31Z)
- Robust PAC$^m$: Training Ensemble Models Under Misspecification and Outliers [46.38465729190199]
PAC-Bayes theory demonstrates that the free energy criterion minimized by Bayesian learning is a bound on the generalization error for Gibbs predictors.
This work presents a novel robust free energy criterion that combines the generalized score function with PAC$^m$ ensemble bounds.
arXiv Detail & Related papers (2022-03-03T17:11:07Z)
- Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors [3.5788754401889014]
We study the problem of predicting as well as the best linear predictor in a bounded Euclidean ball with respect to the squared loss.
We discuss additional distributional assumptions sufficient to guarantee an $O(d/n)$ excess risk rate for the least squares estimator.
arXiv Detail & Related papers (2020-09-19T21:39:46Z)
- The Vector Poisson Channel: On the Linearity of the Conditional Mean Estimator [82.5577471797883]
This work studies properties of the conditional mean estimator in vector Poisson noise.
The first result shows that the conditional mean estimator cannot be linear when the dark current parameter of the Poisson noise is non-zero.
The second result produces a quantitative refinement of the first result.
arXiv Detail & Related papers (2020-03-19T18:21:33Z)
- Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)