Invariance-Based Dynamic Regret Minimization
- URL: http://arxiv.org/abs/2603.03843v1
- Date: Wed, 04 Mar 2026 08:47:02 GMT
- Title: Invariance-Based Dynamic Regret Minimization
- Authors: Margherita Lazzaretto, Jonas Peters, Niklas Pfister
- Abstract summary: We consider non-stationary linear bandits where the linear parameter connecting contexts to the reward changes over time. We propose to leverage historical data while adapting to changes by assuming the reward model decomposes into stationary and non-stationary components.
- Score: 8.349786817840858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider stochastic non-stationary linear bandits where the linear parameter connecting contexts to the reward changes over time. Existing algorithms in this setting localize the policy by gradually discarding or down-weighting past data, effectively shrinking the time horizon over which learning can occur. However, in many settings historical data may still carry partial information about the reward model. We propose to leverage such data while adapting to changes, by assuming the reward model decomposes into stationary and non-stationary components. Based on this assumption, we introduce ISD-linUCB, an algorithm that uses past data to learn invariances in the reward model and subsequently exploits them to improve online performance. We show both theoretically and empirically that leveraging invariance reduces the problem dimensionality, yielding significant regret improvements in fast-changing environments when sufficient historical data is available.
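The decomposition idea in the abstract can be illustrated with a minimal sketch. The names and assumptions below are my own for illustration (a known drifting subspace spanned by `B`, mean-zero drift over the history, and plain LinUCB run on the residual); the paper's actual ISD-linUCB algorithm differs in how it learns the invariant component.

```python
import numpy as np

# Sketch: rewards follow x @ (theta_stat + B @ theta_t), where theta_stat is
# invariant and theta_t drifts within a low-dimensional subspace.
rng = np.random.default_rng(0)
d, d_ns, K = 6, 2, 10                 # ambient dim, drifting-subspace dim, arms
theta_stat = rng.normal(size=d)       # invariant (stationary) component
B, _ = np.linalg.qr(rng.normal(size=(d, d_ns)))  # drifting-subspace basis

# Offline phase: with mean-zero drift, the non-stationary part averages out
# over a long history, so ridge regression recovers the stationary component.
n_hist = 5000
X = rng.normal(size=(n_hist, d))
drift = rng.normal(scale=0.5, size=(n_hist, d_ns))       # fresh every round
y = (X @ theta_stat
     + np.einsum("ij,jk,ik->i", X, B, drift)
     + 0.1 * rng.normal(size=n_hist))
theta_hat = np.linalg.solve(X.T @ X + np.eye(d), X.T @ y)

# Online phase: run LinUCB only on the d_ns-dimensional residual problem;
# this is the dimensionality reduction the abstract refers to.
A = np.eye(d_ns)                       # ridge Gram matrix in the reduced space
b = np.zeros(d_ns)
theta_t = rng.normal(size=d_ns)        # current (unknown) drifting parameter
alpha = 1.0
for t in range(200):
    contexts = rng.normal(size=(K, d))
    Z = contexts @ B                                   # reduced features
    A_inv = np.linalg.inv(A)
    ucb = (contexts @ theta_hat + Z @ (A_inv @ b)
           + alpha * np.sqrt(np.einsum("ij,jk,ik->i", Z, A_inv, Z)))
    a = int(np.argmax(ucb))
    reward = contexts[a] @ theta_stat + Z[a] @ theta_t + 0.1 * rng.normal()
    residual = reward - contexts[a] @ theta_hat        # strip stationary part
    A += np.outer(Z[a], Z[a])
    b += residual * Z[a]
```

After a change in `theta_t`, only the `d_ns`-dimensional statistics need to be relearned, while the estimate of the stationary component carries over from the history.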
Related papers
- Holdout-Loss-Based Data Selection for LLM Finetuning via In-Context Learning [19.677969862434708]
We present a theoretically grounded, resource-efficient framework for data selection and reweighting. At its core is an In-Context Approximation (ICA) that estimates the holdout loss a model would incur after training on a candidate example. We derive per-example weights from ICA scores, dynamically reweighting gradient updates as model parameters evolve.
arXiv Detail & Related papers (2025-10-16T09:00:39Z) - Disentangled Deep Smoothed Bootstrap for Fair Imbalanced Regression [1.2289361708127877]
Imbalanced distribution learning is a common and significant challenge in predictive modeling, often reducing the performance of standard algorithms. We propose using Variational Autoencoders (VAEs) to model and define a latent representation of data distributions. To address this, we develop an innovative data generation method that combines a disentangled VAE with a Smoothed Bootstrap applied in the latent space.
arXiv Detail & Related papers (2025-08-19T13:40:04Z) - Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training. We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO. As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [93.90047628101155]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks. To address this, some methods propose replaying data from previous tasks while learning new ones. However, storing such data is often impractical due to memory constraints and data privacy issues.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Instance-Conditional Timescales of Decay for Non-Stationary Learning [11.90763787610444]
Slow concept drift is a ubiquitous, yet under-studied problem in machine learning systems.
We propose an optimization-driven approach towards balancing instance importance over large training windows.
Experiments on a large real-world dataset of 39M photos over a 9-year period show up to 15% relative gains in accuracy.
arXiv Detail & Related papers (2022-12-12T14:16:26Z) - Non-Parametric Temporal Adaptation for Social Media Topic Classification [41.52878699836363]
We study temporal adaptation through the task of longitudinal hashtag prediction.
Our method improves by 64.12% over the best parametric baseline without any of its costly gradient-based updating.
Our dense retrieval approach is also well-suited to dynamically deleted user data in line with data privacy laws, with negligible computational cost and performance loss.
arXiv Detail & Related papers (2022-09-13T03:31:38Z) - ORFit: One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [5.430441358049335]
We investigate the problem of one-pass learning, in which a model is trained on sequentially arriving data without retraining on previous datapoints. We propose Orthogonal Recursive Fitting (ORFit), an algorithm for one-pass learning which seeks to perfectly fit each new datapoint while minimally altering the predictions on previous datapoints.
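The one-pass fitting idea in this summary can be sketched for linear regression. This is based only on my reading of the abstract, not the authors' exact algorithm: each new point is fit exactly by stepping along the part of its context that is orthogonal to all previously seen contexts, so predictions on earlier points never change.

```python
import numpy as np

def orfit_step(w, basis, x, y):
    """One-pass update: fit (x, y) exactly without disturbing past fits."""
    d = x - basis @ (basis.T @ x)       # component of x orthogonal to the past
    nd2 = d @ d
    if nd2 > 1e-12:                     # genuinely new direction: exact fit
        w = w + ((y - w @ x) / nd2) * d
        basis = np.hstack([basis, (d / np.sqrt(nd2))[:, None]])
    return w, basis

rng = np.random.default_rng(1)
dim = 4
w_true = rng.normal(size=dim)
w, basis = np.zeros(dim), np.zeros((dim, 0))
stream = rng.normal(size=(dim, dim))    # data arrives one point at a time
for x in stream:
    w, basis = orfit_step(w, basis, x, x @ w_true)
```

Since the step direction is orthogonal to every previously seen context, earlier predictions are untouched, which mirrors the "minimally altering" property the summary describes; the recursive least-squares connection in the title comes from maintaining this solution incrementally.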
arXiv Detail & Related papers (2022-07-28T02:01:31Z) - Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay [52.251188477192336]
Few-shot class-incremental learning (FSCIL) has been proposed aiming to enable a deep learning system to incrementally learn new classes with limited data.
We show through empirical results that adopting the data replay is surprisingly favorable.
We propose using data-free replay that can synthesize data by a generator without accessing real data.
arXiv Detail & Related papers (2022-07-22T17:30:51Z) - Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We study a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
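The core idea is simple enough to sketch directly: at test time, normalize with the statistics of the current (possibly shifted) batch instead of the running statistics collected during training. The toy single-layer setup below is my own illustration, not the paper's experimental protocol.

```python
import numpy as np

def batchnorm(x, mean, var, gamma, beta, eps=1e-5):
    """Standard batch-normalization transform with given statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
gamma, beta = 1.0, 0.0
train = rng.normal(loc=0.0, scale=1.0, size=(10_000, 8))
run_mean, run_var = train.mean(axis=0), train.var(axis=0)  # training-time stats

# Covariate-shifted test batch: inputs drift to a new mean and scale.
test = rng.normal(loc=3.0, scale=2.0, size=(256, 8))

standard = batchnorm(test, run_mean, run_var, gamma, beta)   # frozen stats
pred_time = batchnorm(test, test.mean(axis=0), test.var(axis=0), gamma, beta)
```

Under shift, the frozen training statistics leave the activations badly off-center, while recomputing the statistics on the test batch re-centers them, which is the robustness benefit the summary describes.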
arXiv Detail & Related papers (2020-06-19T05:08:43Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
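An extrapolation update of the kind this last entry unifies can be sketched as an extra-gradient-style step: evaluate the gradient at a lookahead point, then update from the original iterate. This is one plausible instance under my own assumptions; the paper's unified scheme and its variants may differ.

```python
import numpy as np

def extragradient_step(w, grad_fn, lr):
    """Extrapolation step: use the gradient at a lookahead point."""
    w_half = w - lr * grad_fn(w)        # extrapolation (lookahead) point
    return w - lr * grad_fn(w_half)     # update with the lookahead gradient

# Toy quadratic f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([4.0, -2.0])
for _ in range(100):
    w = extragradient_step(w, lambda v: v, lr=0.1)
```

On this convex toy problem the iterates contract toward the minimizer at the origin by a factor of 1 - lr + lr**2 per step; the smoothing effect of the lookahead gradient is what makes such schemes attractive for large-batch training.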
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.