Resilient Neural Forecasting Systems
- URL: http://arxiv.org/abs/2203.08492v1
- Date: Wed, 16 Mar 2022 09:37:49 GMT
- Title: Resilient Neural Forecasting Systems
- Authors: Michael Bohlke-Schneider, Shubham Kapoor, Tim Januschowski
- Abstract summary: Industrial machine learning systems face data challenges that are often under-explored in the academic literature.
In this paper, we discuss data challenges and solutions in the context of a Neural Forecasting application on labor planning.
We address changes in data distribution with a periodic retraining scheme and discuss the critical importance of model stability in this setting.
- Score: 10.709321760368137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Industrial machine learning systems face data challenges that are often
under-explored in the academic literature. Common data challenges are data
distribution shifts, missing values and anomalies. In this paper, we discuss
data challenges and solutions in the context of a Neural Forecasting
application on labor planning. We discuss how to make this forecasting system
resilient to these data challenges. We address changes in data distribution
with a periodic retraining scheme and discuss the critical importance of model
stability in this setting. Furthermore, we show how our deep learning model
deals with missing values natively without requiring imputation. Finally, we
describe how we detect anomalies in the input data and mitigate their effect
before they impact the forecasts. This results in a fully autonomous
forecasting system that compares favorably to a hybrid system consisting of the
algorithm and human overrides.
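The abstract names three concrete mechanisms behind this resilience: periodic retraining against distribution shift, native handling of missing values without imputation, and anomaly detection on the input data. The paper does not include code, so the sketch below is only an illustration of how such a pipeline can be wired together; the weekly retraining cadence, the MAD-based anomaly score, and all names (`should_retrain`, `build_model_inputs`, `masked_mae`) are assumptions for this example, not the authors' implementation. A common way to avoid imputation is to pass an observation mask alongside the values and to compute the training loss only over observed points.

```python
# Minimal sketch (not the authors' code) of the three resilience mechanisms
# described in the abstract: a periodic retraining trigger, missing values
# handled through an observation mask instead of imputation, and robust
# anomaly detection that masks suspicious inputs before they reach the model.
import numpy as np
from datetime import date, timedelta

RETRAIN_EVERY_DAYS = 7  # assumed cadence; the paper only says "periodic"

def should_retrain(last_trained: date, today: date) -> bool:
    """Periodic retraining: refresh the model on a fixed schedule so it can
    track gradual shifts in the data distribution."""
    return today - last_trained >= timedelta(days=RETRAIN_EVERY_DAYS)

def mad_zscore(x: np.ndarray) -> np.ndarray:
    """Robust z-score based on median/MAD, so anomalies do not distort the scale."""
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med)) + 1e-8
    return 0.6745 * (x - med) / mad

def build_model_inputs(series: np.ndarray, anomaly_threshold: float = 5.0):
    """Turn a raw series (NaN = missing) into (values, mask). The mask is fed
    to the model as an extra feature and also gates the training loss, so no
    imputation of the missing points is required."""
    observed = ~np.isnan(series)
    z = mad_zscore(series)
    # Treat extreme points as unobserved as well, so they cannot leak into
    # the forecast; this mimics mitigating anomalies before they have impact.
    usable = observed & (np.abs(z) <= anomaly_threshold)
    values = np.where(usable, series, 0.0)  # placeholder, ignored via the mask
    return values, usable.astype(np.float32)

def masked_mae(y_true, y_pred, mask):
    """Training/evaluation loss computed only over observed, non-anomalous points."""
    return float(np.sum(mask * np.abs(y_true - y_pred)) / (mask.sum() + 1e-8))

# Toy example: a short demand series with one gap (NaN) and one spike (300).
ts = np.array([10, 11, 9, 10, np.nan, 12, 300, 11, 10, 9], dtype=float)
values, mask = build_model_inputs(ts)
print(mask)                                                 # gap and spike masked out
print(should_retrain(date(2022, 3, 9), date(2022, 3, 16)))  # True
```

In a production setting the mask would typically be fed to the network as an additional input channel, and the same mask would gate the loss during every scheduled retraining run.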
Related papers
- Multivariate Data Augmentation for Predictive Maintenance using Diffusion [35.286105732902065]
Predictive maintenance has been used to optimize system repairs in the industrial, medical, and financial domains.
There is a lack of fault data to train these models, due to organizations working to keep fault occurrences and down time to a minimum.
For newly installed systems, no fault data exists since they have yet to fail.
arXiv Detail & Related papers (2024-11-06T16:57:09Z) - A Mathematical Model of the Hidden Feedback Loop Effect in Machine Learning Systems [44.99833362998488]
We introduce a repeated learning process to jointly describe several phenomena attributed to unintended hidden feedback loops.
A distinctive feature of such repeated learning setting is that the state of the environment becomes causally dependent on the learner itself over time.
We present a novel dynamical systems model of the repeated learning process and prove the limiting set of probability distributions for positive and negative feedback loop modes.
arXiv Detail & Related papers (2024-05-04T17:57:24Z) - A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation [35.46631415365955]
We introduce a conditional diffusion framework called C$^2$TSD, which incorporates disentangled temporal (trend and seasonality) representations as conditional information.
Our experiments on three real-world datasets demonstrate the superior performance of our approach compared to a number of state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-18T11:59:04Z) - Measuring and Mitigating Local Instability in Deep Neural Networks [23.342675028217762]
We study how the predictions of a model change, even when it is retrained on the same data, as a consequence of stochasticity in the training process.
For Natural Language Understanding (NLU) tasks, we find instability in predictions for a significant fraction of queries.
We propose new data-centric methods that exploit our local stability estimates (a minimal churn-measurement sketch follows after this list).
arXiv Detail & Related papers (2023-05-18T00:34:15Z) - Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z) - Predicting Seriousness of Injury in a Traffic Accident: A New Imbalanced Dataset and Benchmark [62.997667081978825]
The paper introduces a new dataset to assess the performance of machine learning algorithms in the prediction of the seriousness of injury in a traffic accident.
The dataset is created by aggregating publicly available datasets from the UK Department for Transport.
arXiv Detail & Related papers (2022-05-20T21:15:26Z) - Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z) - On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning [69.48387059607387]
We consider the problem of using expert data with unobserved confounders for imitation and reinforcement learning.
We analyze the limitations of learning from confounded expert data with and without external reward.
We validate our claims empirically on challenging assistive healthcare and recommender system simulation tasks.
arXiv Detail & Related papers (2021-10-13T07:31:31Z) - Using Data Assimilation to Train a Hybrid Forecast System that Combines Machine-Learning and Knowledge-Based Components [52.77024349608834]
We consider the problem of data-assisted forecasting of chaotic dynamical systems when the available data is noisy partial measurements.
We show that by using partial measurements of the state of the dynamical system, we can train a machine learning model to improve predictions made by an imperfect knowledge-based model.
arXiv Detail & Related papers (2021-02-15T19:56:48Z) - How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)
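The main abstract stresses that model stability matters when a forecaster is retrained periodically, and the related entry above on local instability studies exactly that phenomenon: predictions that change between retrains on identical data. As a purely illustrative sketch (not taken from either paper), one can quantify this by retraining the same model twice with different random seeds and measuring how often the resulting forecasts disagree; the bootstrap-based `train_linear_forecaster`, the 5% tolerance, and the `churn` metric below are all assumptions.

```python
# Hypothetical sketch of measuring local prediction instability ("churn"):
# retrain the same model twice on identical data, with randomness only in the
# training procedure, and report how often and by how much forecasts disagree.
import numpy as np

def train_linear_forecaster(X: np.ndarray, y: np.ndarray, seed: int) -> np.ndarray:
    """Toy 'retraining': least squares on a bootstrap resample, so two runs on
    the same data differ only through training randomness."""
    rs = np.random.default_rng(seed)
    idx = rs.integers(0, len(X), size=len(X))
    w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return w

def churn(pred_a: np.ndarray, pred_b: np.ndarray, tol: float = 0.05):
    """Fraction of points whose predictions differ by more than `tol` in
    relative terms, plus the mean absolute disagreement."""
    rel = np.abs(pred_a - pred_b) / (np.abs(pred_a) + 1e-8)
    return float(np.mean(rel > tol)), float(np.mean(np.abs(pred_a - pred_b)))

# Synthetic regression data standing in for lag features predicting demand.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + 0.3 * rng.normal(size=200)

w1 = train_linear_forecaster(X, y, seed=1)
w2 = train_linear_forecaster(X, y, seed=2)
unstable_fraction, mean_gap = churn(X @ w1, X @ w2)
print(f"unstable fraction: {unstable_fraction:.2%}, mean gap: {mean_gap:.3f}")
```

A stability-aware retraining scheme would track such a churn metric across model releases and only promote a retrained model when its disagreement with the incumbent stays within an acceptable band.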