Future Gradient Descent for Adapting the Temporal Shifting Data
Distribution in Online Recommendation Systems
- URL: http://arxiv.org/abs/2209.01143v1
- Date: Fri, 2 Sep 2022 15:55:31 GMT
- Title: Future Gradient Descent for Adapting the Temporal Shifting Data
Distribution in Online Recommendation Systems
- Authors: Mao Ye, Ruichen Jiang, Haoxiang Wang, Dhruv Choudhary, Xiaocong Du,
Bhargav Bhushanam, Aryan Mokhtari, Arun Kejariwal, Qiang Liu
- Abstract summary: We learn a meta future gradient generator that forecasts the gradient information of the future data distribution for training.
Compared with Batch Update, our theory suggests that the proposed algorithm achieves smaller temporal domain generalization error.
- Score: 30.88268793277078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the key challenges of learning an online recommendation model is the
temporal domain shift, which causes the mismatch between the training and
testing data distributions and hence domain generalization error. To overcome this,
we propose to learn a meta future gradient generator that forecasts the
gradient information of the future data distribution for training so that the
recommendation model can be trained as if we were able to look ahead at the
future of its deployment. Compared with Batch Update, a widely used paradigm,
our theory suggests that the proposed algorithm achieves smaller temporal
domain generalization error measured by a gradient variation term in a local
regret. We demonstrate the empirical advantage by comparing with various
representative baselines.
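The look-ahead mechanism can be pictured with a short sketch. Below is a minimal PyTorch-style illustration, assuming a hypothetical meta network `meta_gen` that maps the current flattened gradient to a forecast of the next period's gradient, and a squared-error meta objective; the paper's actual generator architecture, objective, and update schedule differ.

```python
import torch

def flat_grad(model):
    """Concatenate all parameter gradients into a single vector."""
    return torch.cat([
        (p.grad if p.grad is not None else torch.zeros_like(p)).reshape(-1)
        for p in model.parameters()
    ])

def train_round(model, meta_gen, meta_opt, loss_fn, batch_now, batch_next, lr=1e-2):
    # 1) Gradient of the loss on the current period's data.
    model.zero_grad()
    loss_fn(model, batch_now).backward()
    g_now = flat_grad(model).detach()

    # 2) The meta generator forecasts the gradient under the future
    #    data distribution from the current gradient.
    g_future = meta_gen(g_now)

    # 3) Update the model with the forecast, as if looking ahead at
    #    the data it will be deployed on.
    with torch.no_grad():
        i = 0
        for p in model.parameters():
            n = p.numel()
            p -= lr * g_future[i:i + n].reshape(p.shape)
            i += n

    # 4) Once the next period's data actually arrives, train the
    #    generator so its forecast matches the realized gradient.
    model.zero_grad()
    loss_fn(model, batch_next).backward()
    g_real = flat_grad(model).detach()
    meta_opt.zero_grad()
    ((g_future - g_real) ** 2).mean().backward()
    meta_opt.step()
```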
Related papers
- Future-Guided Learning: A Predictive Approach To Enhance Time-Series Forecasting [4.866362841501992]
We introduce Future-Guided Learning, an approach that enhances time-series event forecasting.
Our approach involves two models: a detection model that analyzes future data to identify critical events and a forecasting model that predicts these events based on present data.
When discrepancies arise between the forecasting and detection models, the forecasting model undergoes more substantial updates.
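One way to read "more substantial updates" is a discrepancy-scaled learning rate; the following toy sketch makes that assumption explicit (the models, the scaling rule, and the squared-error discrepancy are illustrative, not the paper's).

```python
import torch

def guided_step(forecaster, detector, opt, x_present, x_future, base_lr=1e-3):
    """One hypothetical training step: the detector labels critical events
    from future data, the forecaster predicts them from present data, and
    the size of their disagreement scales how strongly the forecaster is
    corrected."""
    with torch.no_grad():
        target = detector(x_future)        # event signal seen in future data
    pred = forecaster(x_present)           # forecast made from present data
    discrepancy = (pred - target).pow(2).mean()
    # Larger discrepancy -> a proportionally larger ("more substantial") update.
    for group in opt.param_groups:
        group["lr"] = base_lr * (1.0 + discrepancy.item())
    opt.zero_grad()
    discrepancy.backward()
    opt.step()
    return discrepancy.item()
```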
arXiv Detail & Related papers (2024-10-19T21:22:55Z) - Pre-trained Recommender Systems: A Causal Debiasing Perspective [19.712997823535066]
We develop a generic recommender that captures universal interaction patterns by training on generic user-item interaction data extracted from different domains.
Our empirical studies show that the proposed model could significantly improve the recommendation performance in zero- and few-shot learning settings.
arXiv Detail & Related papers (2023-10-30T03:37:32Z) - Learning Rate Schedules in the Presence of Distribution Shift [18.310336156637774]
We design learning rate schedules that minimize regret for online learning in the presence of a changing data distribution.
We provide experiments on high-dimensional regression models to illustrate these schedules and their cumulative regret.
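As intuition only, a drift-aware schedule might decay while the distribution looks stationary but keep a floor proportional to the estimated drift; this rule is an assumption for illustration, not the schedule the paper derives from its regret analysis.

```python
def drift_aware_lr(base_lr, step, drift_estimate, decay=0.5):
    """Illustrative rule: decay the rate as ~1/sqrt(t) under stationarity,
    but floor it at a level proportional to the estimated drift so the
    model keeps adapting when the distribution keeps moving."""
    stationary_lr = base_lr / (1.0 + step) ** decay
    return max(stationary_lr, base_lr * min(drift_estimate, 1.0))
```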
arXiv Detail & Related papers (2023-03-27T23:29:02Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
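A minimal sketch of how a self-supervised error signal could enter a split-conformal interval, assuming a simple normalized nonconformity score (the normalization and helper names are illustrative, not the paper's exact construction):

```python
import numpy as np

def conformal_interval(resid_cal, ss_err_cal, y_hat_test, ss_err_test, alpha=0.1):
    """Split-conformal sketch: absolute residuals on a calibration set are
    normalized by the auxiliary self-supervised error (treated here as a
    per-example difficulty estimate), the (1 - alpha) quantile is taken,
    and intervals are rescaled by the test-time difficulty estimate."""
    scores = np.abs(resid_cal) / (ss_err_cal + 1e-8)
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level)
    half_width = q * (ss_err_test + 1e-8)
    return y_hat_test - half_width, y_hat_test + half_width
```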
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Debiased Fine-Tuning for Vision-language Models by Prompt Regularization [50.41984119504716]
We present a new paradigm for fine-tuning large-scale vision pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg).
ProReg uses the prediction by prompting the pretrained model to regularize the fine-tuning.
We show the consistently strong performance of ProReg compared with conventional fine-tuning, zero-shot prompt, prompt tuning, and other state-of-the-art methods.
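A minimal sketch of the regularization idea, assuming a cross-entropy task loss combined with a KL term pulling the fine-tuned model toward the frozen model's prompted predictions; `lam` and `tau` are hypothetical hyperparameters, not the paper's reported settings.

```python
import torch.nn.functional as F

def proreg_loss(finetune_logits, labels, prompt_logits, lam=0.5, tau=1.0):
    """Task loss plus a KL regularizer toward the frozen pretrained model's
    zero-shot prompted predictions on the same inputs."""
    task_loss = F.cross_entropy(finetune_logits, labels)
    kl = F.kl_div(
        F.log_softmax(finetune_logits / tau, dim=-1),
        F.softmax(prompt_logits / tau, dim=-1),
        reduction="batchmean",
    ) * (tau * tau)
    return task_loss + lam * kl
```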
arXiv Detail & Related papers (2023-01-29T11:53:55Z) - Towards Out-of-Distribution Sequential Event Prediction: A Causal
Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, the next-event prediction models are trained on sequential data collected at a single point in time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z) - A Multi-stage Framework with Mean Subspace Computation and Recursive
Feedback for Online Unsupervised Domain Adaptation [9.109788577327503]
We propose a novel framework for real-world situations in which the target data are unlabeled and arrive online sequentially in batches.
The framework introduces a novel method to project the data from the source and target domains onto a common subspace and manipulate the projected data in real time.
Experiments on six datasets were conducted to investigate in depth the effect and contribution of each stage in our proposed framework.
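As a toy stand-in for the projection stage, a classic subspace-alignment step (assumed here purely for illustration; the paper's multi-stage framework adds mean subspace computation and recursive feedback on top) might look like:

```python
import numpy as np

def project_to_common_subspace(X_src, X_tgt_batch, k=10):
    """Compute k principal directions per domain, align the source basis
    to the target basis, and project both domains into the shared space."""
    def top_k_basis(X, k):
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Vt[:k].T                       # d x k orthonormal basis
    B_src = top_k_basis(X_src, k)
    B_tgt = top_k_basis(X_tgt_batch, k)
    M = B_src.T @ B_tgt                       # k x k alignment matrix
    return X_src @ B_src @ M, X_tgt_batch @ B_tgt
```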
arXiv Detail & Related papers (2022-06-24T03:50:34Z) - An Empirical Study on Distribution Shift Robustness From the Perspective
of Pre-Training and Data Augmentation [91.62129090006745]
This paper studies the distribution shift problem from the perspective of pre-training and data augmentation.
We provide the first comprehensive empirical study focusing on pre-training and data augmentation.
arXiv Detail & Related papers (2022-05-25T13:04:53Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - A Variational Bayesian Approach to Learning Latent Variables for
Acoustic Knowledge Transfer [55.20627066525205]
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models.
Our proposed VB approach can obtain good improvements on target devices, and consistently outperforms 13 state-of-the-art knowledge transfer algorithms.
arXiv Detail & Related papers (2021-10-16T15:54:01Z) - New Perspectives on the Use of Online Learning for Congestion Level
Prediction over Traffic Data [6.664111208927475]
This work focuses on classification over time series data.
When a time series is generated by non-stationary phenomena, the pattern relating the series with the class to be predicted may evolve over time.
Online learning methods incrementally learn from new data samples arriving over time, and accommodate eventual changes along the data stream.
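A minimal sketch of this incremental, test-then-train pattern using scikit-learn's `partial_fit` (the estimator choice and evaluation loop are illustrative assumptions, not the paper's setup):

```python
from sklearn.linear_model import SGDClassifier

def learn_over_stream(batches, classes):
    """Online classification over a non-stationary stream: each batch is
    scored before it is trained on ("test-then-train"), and the model is
    updated incrementally so it can follow a pattern that evolves over time."""
    clf = SGDClassifier(loss="log_loss")
    accuracies = []
    for i, (X_batch, y_batch) in enumerate(batches):
        if i > 0:                              # score on unseen data first
            accuracies.append(clf.score(X_batch, y_batch))
        clf.partial_fit(X_batch, y_batch, classes=classes)
    return clf, accuracies
```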
arXiv Detail & Related papers (2020-03-27T09:44:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.