Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future
- URL: http://arxiv.org/abs/2106.04420v1
- Date: Tue, 8 Jun 2021 14:48:20 GMT
- Title: Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future
- Authors: Harshavardhan Kamarthi, Alexander Rodríguez, B. Aditya Prakash
- Abstract summary: In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and a neural framework, Back2Future, that aims to refine a given model's predictions in real time.
- Score: 73.03458424369657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In real-time forecasting in public health, data collection is a non-trivial
and demanding task. Often, after its initial release, the data undergoes several
revisions (possibly due to human or technical constraints); as a result, it
may take weeks until the data reaches a stable value. This so-called
'backfill' phenomenon and its effect on model performance have barely been
studied in the prior literature. In this paper, we introduce the multi-variate
backfill problem using COVID-19 as the motivating example. We construct a
detailed dataset composed of relevant signals over the past year of the
pandemic. We then systematically characterize several patterns in backfill
dynamics and leverage our observations to formulate a novel problem and
neural framework, Back2Future, that aims to refine a given model's predictions
in real time. Our extensive experiments demonstrate that our method refines the
performance of top models for COVID-19 forecasting, in contrast to non-trivial
baselines, yielding an 18% improvement over baselines and enabling us to obtain
new SOTA performance. In addition, we show that our model improves model evaluation
too; hence policy-makers can better understand the true accuracy of forecasting
models in real-time.
Related papers
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Stochastic Diffusion: A Diffusion Probabilistic Model for Stochastic Time Series Forecasting [8.232475807691255]
We propose a novel Stochastic Diffusion (StochDiff) model which learns data-driven prior knowledge at each time step.
The learnt prior knowledge helps the model to capture complex temporal dynamics and the inherent uncertainty of the data.
arXiv Detail & Related papers (2024-06-05T00:13:38Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- Online learning techniques for prediction of temporal tabular datasets with regime changes [0.0]
We propose a modular machine learning pipeline for ranking predictions on temporal panel datasets.
The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks.
Online learning techniques, which require no retraining of models, can be used post-prediction to enhance the results.
arXiv Detail & Related papers (2022-12-30T17:19:00Z)
- Evaluation of Time-Series Forecasting Models for Chickenpox Cases Estimation in Hungary [0.0]
We use time-series forecasting techniques to model and predict the future incidence of chickenpox.
We implement and simulate multiple models and data preprocessing techniques on a Hungary-collected dataset.
arXiv Detail & Related papers (2022-09-28T14:27:07Z)
- DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [78.6363825307044]
We propose DeepVol, a model based on Dilated Causal Convolutions to forecast day-ahead volatility by using high-frequency data.
We show that the dilated convolutional filters are ideally suited to extract relevant information from intraday financial data.
arXiv Detail & Related papers (2022-09-23T16:13:47Z)
- Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case Study of the COVID-19 Epidemic Curves [0.0]
We investigate ensembling techniques in forecasting and examine their potential for use in nonseasonal time-series.
We propose using late data fusion, using a stacked ensemble of two forecasting models and two meta-features that prove their predictive power during a preliminary forecasting stage.
arXiv Detail & Related papers (2021-08-19T14:44:46Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.