Adaptive Estimation and Learning under Temporal Distribution Shift
- URL: http://arxiv.org/abs/2505.15803v1
- Date: Wed, 21 May 2025 17:56:07 GMT
- Title: Adaptive Estimation and Learning under Temporal Distribution Shift
- Authors: Dheeraj Baby, Yifei Tang, Hieu Duy Nguyen, Yu-Xiang Wang, Rohit Pyati,
- Abstract summary: We study the problem of estimation and learning under temporal distribution shift.<n>Consider an observation sequence of length $n$, which is a noisy realization of a time-varying groundtruth sequence.<n>We show that a wavelet soft-thresholding estimator provides an optimal estimation error bound for the groundtruth.
- Score: 17.82448143883071
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the problem of estimation and learning under temporal distribution shift. Consider an observation sequence of length $n$, which is a noisy realization of a time-varying groundtruth sequence. Our focus is to develop methods to estimate the groundtruth at the final time-step while providing sharp point-wise estimation error rates. We show that, without prior knowledge on the level of temporal shift, a wavelet soft-thresholding estimator provides an optimal estimation error bound for the groundtruth. Our proposed estimation method generalizes existing researches Mazzetto and Upfal (2023) by establishing a connection between the sequence's non-stationarity level and the sparsity in the wavelet-transformed domain. Our theoretical findings are validated by numerical experiments. Additionally, we applied the estimator to derive sparsity-aware excess risk bounds for binary classification under distribution shift and to develop computationally efficient training objectives. As a final contribution, we draw parallels between our results and the classical signal processing problem of total-variation denoising (Mammen and van de Geer,1997; Tibshirani, 2014), uncovering novel optimal algorithms for such task.
Related papers
- Minimax Optimal Two-Stage Algorithm For Moment Estimation Under Covariate Shift [10.35788775775647]
We investigate the minimax lower bound of the problem when the source and target distributions are known.<n>Specifically, it first trains an optimal estimator for the function under the source distribution, and then uses a likelihood ratio reweighting procedure to calibrate the moment estimator.<n>To solve this problem, we propose a truncated version of the estimator that ensures double robustness and provide the corresponding upper bound.
arXiv Detail & Related papers (2025-06-30T01:32:36Z) - BAPE: Learning an Explicit Bayes Classifier for Long-tailed Visual Recognition [78.70453964041718]
Current deep learning algorithms usually solve for the optimal classifier by emphimplicitly estimating the posterior probabilities.<n>This simple methodology has been proven effective for meticulously balanced academic benchmark datasets.<n>However, it is not applicable to the long-tailed data distributions in the real world.<n>This paper presents a novel approach (BAPE) that provides a more precise theoretical estimation of the data distributions.
arXiv Detail & Related papers (2025-06-29T15:12:50Z) - Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z) - In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
We show that amortized point estimators generally outperform posterior inference, though the latter remain competitive in some low-dimensional problems.<n>Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remain competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z) - A Tale of Sampling and Estimation in Discounted Reinforcement Learning [50.43256303670011]
We present a minimax lower bound on the discounted mean estimation problem.
We show that estimating the mean by directly sampling from the discounted kernel of the Markov process brings compelling statistical properties.
arXiv Detail & Related papers (2023-04-11T09:13:17Z) - Adapting to Continuous Covariate Shift via Online Density Ratio Estimation [64.8027122329609]
Dealing with distribution shifts is one of the central challenges for modern machine learning.
We propose an online method that can appropriately reuse historical information.
Our density ratio estimation method is proven to perform well by enjoying a dynamic regret bound.
arXiv Detail & Related papers (2023-02-06T04:03:33Z) - Efficient Estimation for Longitudinal Networks via Adaptive Merging [21.62069959992736]
We propose an efficient estimation framework for longitudinal network, leveraging strengths of adaptive network merging, tensor decomposition and point process.
It merges neighboring sparse networks so as to enlarge the number of observed edges and reduce estimation variance.
A projected descent algorithm is proposed to facilitate estimation, where upper bound of the estimation error in each iteration is established.
arXiv Detail & Related papers (2022-11-15T03:17:11Z) - Learning Prediction Intervals for Regression: Generalization and
Calibration [12.576284277353606]
We study the generation of prediction intervals in regression for uncertainty quantification.
We use a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes.
We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
arXiv Detail & Related papers (2021-02-26T17:55:30Z) - A Random Matrix Theory Approach to Damping in Deep Learning [0.7614628596146599]
We conjecture that the inherent difference in generalisation between adaptive and non-adaptive gradient methods in deep learning stems from the increased estimation noise.
We develop a novel random matrix theory based damping learner for second order optimiser inspired by linear shrinkage estimation.
arXiv Detail & Related papers (2020-11-15T18:19:42Z) - Learning Calibrated Uncertainties for Domain Shift: A Distributionally
Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts.
In particular, the density ratio estimation reflects the closeness of a target (test) sample to the source (training) distribution.
We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.