M5 Competition Uncertainty: Overdispersion, distributional forecasting,
GAMLSS and beyond
- URL: http://arxiv.org/abs/2107.06675v1
- Date: Wed, 14 Jul 2021 13:05:55 GMT
- Title: M5 Competition Uncertainty: Overdispersion, distributional forecasting,
GAMLSS and beyond
- Authors: Florian Ziel
- Abstract summary: We show that the M5 competition data exhibits strong overdispersion and sporadic demand, especially zero demand.
We discuss resulting modeling issues concerning adequate probabilistic forecasting of such count data processes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The M5 competition uncertainty track aims for probabilistic forecasting of
sales of thousands of Walmart retail goods. We show that the M5 competition
data exhibits strong overdispersion and sporadic demand, especially zero demand.
We discuss the resulting modeling issues concerning adequate probabilistic
forecasting of such count data processes. Unfortunately, the majority of
popular prediction methods used in the M5 competition (e.g. lightgbm and
xgboost GBMs) fail to address these data characteristics due to the
objective functions considered. Distributional forecasting provides a suitable
modeling approach to overcome those problems. The GAMLSS framework
allows flexible probabilistic forecasting using low-dimensional distributions.
We illustrate how the GAMLSS approach can be applied to the M5 competition
data by modeling the location and scale parameters of various distributions,
e.g. the negative binomial distribution. Finally, we discuss software packages
for distributional modeling and their drawbacks, such as the R package gamlss with
its package extensions, and (deep) distributional forecasting libraries such as
TensorFlow Probability.
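The overdispersion diagnosis and negative binomial modeling described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's code: it computes the variance-to-mean dispersion index for a simulated sporadic demand series (the demand-generating process is an assumption for illustration) and fits a negative binomial distribution by the method of moments, using the same mean/size parameterization that a GAMLSS model would link to covariates.

```python
import math
import random
import statistics

def dispersion_index(counts):
    """Variance-to-mean ratio; > 1 indicates overdispersion (a Poisson has ratio 1)."""
    return statistics.variance(counts) / statistics.mean(counts)

def neg_binomial_moments_fit(counts):
    """Method-of-moments fit of the negative binomial with mean mu and size r.
    Var = mu + mu^2 / r, hence r = mu^2 / (Var - mu); requires overdispersion."""
    mu = statistics.mean(counts)
    var = statistics.variance(counts)
    if var <= mu:
        raise ValueError("no overdispersion: negative binomial not appropriate")
    return mu, mu * mu / (var - mu)

def neg_binomial_pmf(k, mu, r):
    """P(X = k) under the negative binomial parameterized by mean mu and size r."""
    p = r / (r + mu)  # success probability implied by (mu, r)
    return math.exp(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
    )

# Simulated sporadic demand: mostly zeros with occasional bursts,
# loosely mimicking the intermittent sales series in the M5 data.
random.seed(42)
demand = [0 if random.random() < 0.7 else random.randint(1, 20) for _ in range(1000)]

print(f"dispersion index: {dispersion_index(demand):.2f}")  # well above 1
mu, r = neg_binomial_moments_fit(demand)
print(f"fitted mu={mu:.2f}, size r={r:.2f}")
print(f"P(demand = 0) = {neg_binomial_pmf(0, mu, r):.3f}")
```

In a full GAMLSS fit, mu and r would not be global constants but functions of covariates (price, calendar effects), estimated by maximum likelihood rather than moments; the closed-form moment fit above only sketches the distributional assumption.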
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Scalable Probabilistic Forecasting in Retail with Gradient Boosted Trees: A Practitioner's Approach [4.672665650064167]
We propose a top-down approach of forecasting at an aggregated level, with fewer series and less intermittency.
Direct training at the lower level with subsamples can also be an alternative way of scaling.
We are able to show the differences in characteristics of the e-commerce and brick-and-mortar retail datasets.
arXiv Detail & Related papers (2023-11-02T04:46:32Z)
- Dr. FERMI: A Stochastic Distributionally Robust Fair Empirical Risk Minimization Framework [12.734559823650887]
In the presence of distribution shifts, fair machine learning models may behave unfairly on test data.
Existing algorithms require full access to the data and cannot be applied when only small batches are available.
This paper proposes the first distributionally robust fairness framework with convergence guarantees that do not require knowledge of the causal graph.
arXiv Detail & Related papers (2023-09-20T23:25:28Z)
- A Nonparametric Approach with Marginals for Modeling Consumer Choice [4.880424147378901]
The marginal distribution model (MDM) is inspired by the utility of similar characterizations for the random utility model (RUM)
This paper aims to establish necessary and sufficient conditions for given choice data to be consistent with the MDM hypothesis.
Numerical experiments show that MDM provides better representational power and prediction accuracy than multinomial logit.
arXiv Detail & Related papers (2022-08-12T04:43:26Z)
- WeatherBench Probability: A benchmark dataset for probabilistic medium-range weather forecasting along with deep learning baseline models [22.435002906710803]
WeatherBench is a benchmark dataset for medium-range weather forecasting of geopotential, temperature and precipitation.
WeatherBench Probability extends this to probabilistic forecasting by adding a set of established probabilistic verification metrics.
arXiv Detail & Related papers (2022-05-02T12:49:05Z)
- Robust Nonparametric Distribution Forecast with Backtest-based Bootstrap and Adaptive Residual Selection [14.398720944586803]
Distribution forecast can quantify forecast uncertainty and provide various forecast scenarios with corresponding estimated probabilities.
We propose a practical and robust distribution forecast framework that relies on backtest-based bootstrap and adaptive residual selection.
arXiv Detail & Related papers (2022-02-16T09:53:48Z)
- CovarianceNet: Conditional Generative Model for Correct Covariance Prediction in Human Motion Prediction [71.31516599226606]
We present a new method to correctly predict the uncertainty associated with the predicted distribution of future trajectories.
Our approach, CovarianceNet, is based on a Conditional Generative Model with Gaussian latent variables.
arXiv Detail & Related papers (2021-09-07T09:38:24Z)
- Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that despite its simplicity DoC consistently outperforms other quantifications of distributional difference.
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
- Distributed NLI: Learning to Predict Human Opinion Distributions for Language Reasoning [76.17436599516074]
We introduce distributed NLI, a new NLU task with a goal to predict the distribution of human judgements for natural language inference.
We show that models can capture human judgement distribution by applying additional distribution estimation methods, namely, Monte Carlo (MC) Dropout, Deep Ensemble, Re-Calibration, and Distribution Distillation.
arXiv Detail & Related papers (2021-04-18T01:25:19Z)
- Probabilistic electric load forecasting through Bayesian Mixture Density Networks [70.50488907591463]
Probabilistic load forecasting (PLF) is a key component in the extended tool-chain required for efficient management of smart energy grids.
We propose a novel PLF approach, framed on Bayesian Mixture Density Networks.
To achieve reliable and computationally scalable estimators of the posterior distributions, both Mean Field variational inference and deep ensembles are integrated.
arXiv Detail & Related papers (2020-12-23T16:21:34Z)
- Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.