Bridging the Gap Between Training and Inference for Spatio-Temporal
Forecasting
- URL: http://arxiv.org/abs/2005.09343v1
- Date: Tue, 19 May 2020 10:14:43 GMT
- Title: Bridging the Gap Between Training and Inference for Spatio-Temporal
Forecasting
- Authors: Hong-Bin Liu, Ickjai Lee
- Abstract summary: We propose a novel curriculum learning based strategy named Temporal Progressive Growing Sampling to bridge the gap between training and inference for spatio-temporal sequence forecasting.
Experimental results demonstrate that our proposed method better models long term dependencies and outperforms baseline approaches on two competitive datasets.
- Score: 16.06369357595426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatio-temporal sequence forecasting is one of the fundamental tasks in
spatio-temporal data mining. It facilitates many real world applications such
as precipitation nowcasting, citywide crowd flow prediction and air pollution
forecasting. Recently, a few Seq2Seq based approaches have been proposed, but
one drawback of Seq2Seq models is that small errors can accumulate quickly
along the generated sequence at the inference stage, owing to the different
distributions of the training and inference phases. That is because Seq2Seq
models minimise only single-step errors during training, whereas the entire
sequence must be generated during inference, which creates a discrepancy
between training and inference. In this work, we propose a novel
curriculum learning based strategy named Temporal Progressive Growing Sampling
to effectively bridge the gap between training and inference for
spatio-temporal sequence forecasting, by transforming the training process from
a fully-supervised manner which utilises all available previous ground-truth
values to a less-supervised manner which replaces some of the ground-truth
context with generated predictions. To do that we sample the target sequence
from midway outputs from intermediate models trained with bigger timescales
through a carefully designed decaying strategy. Experimental results
demonstrate that our proposed method better models long term dependencies and
outperforms baseline approaches on two competitive datasets.
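The abstract describes shifting the decoder from always conditioning on ground truth toward conditioning on its own predictions via a decaying schedule. The sketch below illustrates that general idea with a scheduled-sampling-style mix (per-step coin flip under an inverse-sigmoid decay); it is a simplified, hypothetical illustration, not the paper's exact Temporal Progressive Growing Sampling procedure, and the decay constant `k` is an assumed default.

```python
import math
import random

def inverse_sigmoid_decay(step, k=500.0):
    """Probability of feeding the ground truth at a given training step.

    Starts near 1.0 (fully supervised) and decays toward 0.0, shifting
    the decoder toward its own predictions. `k` is a hypothetical
    constant controlling how quickly the schedule decays.
    """
    return k / (k + math.exp(step / k))

def choose_decoder_inputs(ground_truth, predictions, step):
    """For each position in the target sequence, pick either the
    ground-truth value or the model's prediction from the previous
    step, with the ground-truth probability decaying over training.

    A simplified scheduled-sampling-style sketch, not the exact
    TPGS decaying strategy from the paper.
    """
    eps = inverse_sigmoid_decay(step)
    return [gt if random.random() < eps else pred
            for gt, pred in zip(ground_truth, predictions)]
```

Early in training the decoder sees mostly ground truth; late in training it sees mostly its own outputs, so the training-time input distribution approaches the inference-time one.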
Related papers
- Loss Shaping Constraints for Long-Term Time Series Forecasting [79.3533114027664]
We present a Constrained Learning approach for long-term time series forecasting that respects a user-defined upper bound on the loss at each time-step.
We propose a practical Primal-Dual algorithm to tackle it, and demonstrate that it exhibits competitive average performance on time series benchmarks while shaping the errors across the predicted window.
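A per-timestep loss bound enforced by dual ascent can be sketched as follows. This is a generic, hypothetical simplification of a primal-dual scheme (one multiplier per timestep, constraint loss_t <= epsilon), not the paper's actual algorithm; `lr_dual` is an assumed hyperparameter.

```python
def primal_dual_step(step_losses, lambdas, epsilon, lr_dual=0.01):
    """One dual update for per-timestep constraints loss_t <= epsilon.

    Lagrangian: mean(loss) + sum_t lambda_t * (loss_t - epsilon).
    Multipliers grow where the constraint is violated and are
    projected back to zero where it is satisfied.
    """
    new_lambdas = [max(0.0, lam + lr_dual * (l - epsilon))
                   for lam, l in zip(lambdas, step_losses)]
    lagrangian = sum(step_losses) / len(step_losses) + sum(
        lam * (l - epsilon) for lam, l in zip(new_lambdas, step_losses))
    return lagrangian, new_lambdas
```

Minimising the Lagrangian in the primal step then penalises timesteps whose loss exceeds the user-defined bound more heavily, shaping errors across the predicted window.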
arXiv Detail & Related papers (2024-02-14T18:20:44Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Can Diffusion Model Achieve Better Performance in Text Generation? Bridging the Gap between Training and Inference! [14.979893207094221]
Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space.
There exist nonnegligible gaps between training and inference, owing to the absence of the forward process during inference.
We propose two simple yet effective methods to bridge the gaps mentioned above, named Distance Penalty and Adaptive Decay Sampling.
arXiv Detail & Related papers (2023-05-08T05:32:22Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, the next-event prediction models are trained with sequential data collected at one time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- Flipped Classroom: Effective Teaching for Time Series Forecasting [0.0]
Sequence-to-sequence models based on LSTM and GRU are among the most popular choices for forecasting time series data.
The two most common training strategies in this context are teacher forcing (TF) and free running (FR).
We propose several new curricula, and systematically evaluate their performance in two experimental sets.
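The TF/FR distinction comes down to what the decoder is conditioned on at each step. The toy rollout below makes the contrast concrete; `model_step` is a hypothetical stand-in for a single LSTM/GRU decoder step, and this is an illustration, not the paper's curricula.

```python
def rollout(model_step, history, targets, teacher_forcing):
    """Generate a forecast one step at a time with a toy one-step model.

    With teacher_forcing=True (TF), each step is conditioned on the
    true previous value; with False (FR), on the model's own previous
    prediction, so early errors can compound.
    """
    preds = []
    prev = history[-1]
    for t in range(len(targets)):
        pred = model_step(prev)
        preds.append(pred)
        # TF feeds the ground truth forward; FR feeds the prediction.
        prev = targets[t] if teacher_forcing else pred
    return preds
```

A curriculum in this setting interpolates between the two, e.g. by choosing TF or FR per step or per epoch according to a schedule.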
arXiv Detail & Related papers (2022-10-17T11:53:25Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Meta-Forecasting by combining Global Deep Representations with Local Adaptation [12.747008878068314]
We introduce a novel forecasting method called Meta Global-Local Auto-Regression (Meta-GLAR)
It adapts to each time series by learning in closed-form the mapping from the representations produced by a recurrent neural network (RNN) to one-step-ahead forecasts.
Our method is competitive with the state-of-the-art in out-of-sample forecasting accuracy reported in earlier work.
arXiv Detail & Related papers (2021-11-05T11:45:02Z)
- Quantifying Uncertainty in Deep Spatiotemporal Forecasting [67.77102283276409]
We describe two types of forecasting problems: regular grid-based and graph-based.
We analyze UQ methods from both the Bayesian and the frequentist points of view, casting them in a unified framework via statistical decision theory.
Through extensive experiments on real-world road network traffic, epidemics, and air quality forecasting tasks, we reveal the statistical computational trade-offs for different UQ methods.
arXiv Detail & Related papers (2021-05-25T14:35:46Z)
- BERT Loses Patience: Fast and Robust Inference with Early Exit [91.26199404912019]
We propose Patience-based Early Exit as a plug-and-play technique to improve the efficiency and robustness of a pretrained language model.
Our approach improves inference efficiency as it allows the model to make a prediction with fewer layers.
arXiv Detail & Related papers (2020-06-07T13:38:32Z)
- A machine learning approach for forecasting hierarchical time series [4.157415305926584]
We propose a machine learning approach for forecasting hierarchical time series.
Forecast reconciliation is the process of adjusting forecasts to make them coherent across the hierarchy.
We exploit the ability of a deep neural network to extract information capturing the structure of the hierarchy.
arXiv Detail & Related papers (2020-05-31T22:26:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.