DDG-DA: Data Distribution Generation for Predictable Concept Drift
Adaptation
- URL: http://arxiv.org/abs/2201.04038v1
- Date: Tue, 11 Jan 2022 16:34:29 GMT
- Title: DDG-DA: Data Distribution Generation for Predictable Concept Drift
Adaptation
- Authors: Wendi Li, Xiao Yang, Weiqing Liu, Yingce Xia, Jiang Bian
- Abstract summary: We propose a novel method DDG-DA, that can effectively forecast the evolution of data distribution.
Specifically, we first train a predictor to estimate the future data distribution, then leverage it to generate training samples, and finally train models on the generated data.
- Score: 34.03849669133438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many real-world scenarios, we often deal with streaming data that is
sequentially collected over time. Due to the non-stationary nature of the
environment, the streaming data distribution may change in unpredictable ways,
which is known as concept drift. To handle concept drift, previous methods
first detect when/where the concept drift happens and then adapt models to fit
the distribution of the latest data. However, there are still many cases that
some underlying factors of environment evolution are predictable, making it
possible to model the future concept drift trend of the streaming data, while
such cases are not fully explored in previous work.
In this paper, we propose a novel method DDG-DA, that can effectively
forecast the evolution of data distribution and improve the performance of
models. Specifically, we first train a predictor to estimate the future data
distribution, then leverage it to generate training samples, and finally train
models on the generated data. We conduct experiments on three real-world tasks
(forecasting on stock price trend, electricity load and solar irradiance) and
obtain significant improvement on multiple widely-used models.
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z) - Addressing Concept Shift in Online Time Series Forecasting: Detect-then-Adapt [37.98336090671441]
Concept textbfDrift textbfDetection antextbfD textbfAdaptation (D3A)
It first detects drifting conception and then aggressively adapts the current model to the drifted concepts after the detection for rapid adaption.
It helps mitigate the data distribution gap, a critical factor contributing to train-test performance inconsistency.
arXiv Detail & Related papers (2024-03-22T04:44:43Z) - Ask Your Distribution Shift if Pre-Training is Right for You [74.18516460467019]
In practice, fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others.
We focus on two possible failure modes of models under distribution shift: poor extrapolation and biases in the training data.
Our study suggests that, as a rule of thumb, pre-training can help mitigate poor extrapolation but not dataset biases.
arXiv Detail & Related papers (2024-02-29T23:46:28Z) - Humanoid Locomotion as Next Token Prediction [84.21335675130021]
Our model is a causal transformer trained via autoregressive prediction of sensorimotor trajectories.
We show that our model enables a full-sized humanoid to walk in San Francisco zero-shot.
Our model can transfer to the real world even when trained on only 27 hours of walking data, and can generalize commands not seen during training like walking backward.
arXiv Detail & Related papers (2024-02-29T18:57:37Z) - Online Test-Time Adaptation of Spatial-Temporal Traffic Flow Forecasting [13.770733370640565]
This paper conducts the first study of the online test-time adaptation techniques for spatial-temporal traffic flow forecasting problems.
We propose an Adaptive Double Correction by Series Decomposition (ADCSD) method, which first decomposes the output of the trained model into seasonal and trend-cyclical parts.
In the proposed ADCSD method, instead of fine-tuning the whole trained model during the testing phase, a lite network is attached after the trained model, and only the lite network is fine-tuned in the testing process each time a data entry is observed.
arXiv Detail & Related papers (2024-01-08T12:04:39Z) - MemDA: Forecasting Urban Time Series with Memory-based Drift Adaptation [24.284969264008733]
We propose a new urban time series prediction model for the concept drift problem, which encodes the drift by considering the periodicity in the data.
Our design significantly outperforms state-of-the-art methods and can be well generalized to existing prediction backbones.
arXiv Detail & Related papers (2023-09-25T15:22:28Z) - Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting.
We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them.
We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z) - Handling Concept Drift in Global Time Series Forecasting [10.732102570751392]
We propose two new concept drift handling methods, namely: Error Contribution Weighting (ECW) and Gradient Descent Weighting (GDW)
These methods use two forecasting models which are separately trained with the most recent series and all series, and finally, the weighted average of the forecasts provided by the two models are considered as the final forecasts.
arXiv Detail & Related papers (2023-04-04T03:46:25Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
'Backfill' phenomenon and its effect on model performance has been barely studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Learning Interpretable Deep State Space Model for Probabilistic Time
Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of future based on its history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z) - Learning Parameter Distributions to Detect Concept Drift in Data Streams [13.20231558027132]
We propose a novel framework for the detection of real concept drift, called ERICS.
By treating the parameters of a predictive model as random variables, we show that concept drift corresponds to a change in the distribution of optimal parameters.
ERICS is also capable to detect concept drift at the input level, which is a significant advantage over existing approaches.
arXiv Detail & Related papers (2020-10-19T11:19:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.