Dealing with zero-inflated data: achieving SOTA with a two-fold machine
learning approach
- URL: http://arxiv.org/abs/2310.08088v1
- Date: Thu, 12 Oct 2023 07:26:41 GMT
- Title: Dealing with zero-inflated data: achieving SOTA with a two-fold machine
learning approach
- Authors: Jože M. Rožanec, Gašper Petelin, João Costa, Blaž
Bertalanič, Gregor Cerar, Marko Guček, Gregor Papa, Dunja Mladenić
- Abstract summary: This paper showcases two real-world use cases (home appliances classification and airport shuttle demand prediction) where a hierarchical model applied in the context of zero-inflated data leads to excellent results.
It is estimated that the proposed approach is also four times more energy efficient than the SOTA approach against which it was compared.
- Score: 0.18846515534317262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many cases, a machine learning model must learn to correctly predict a few
data points with particular values of interest in a broader range of data where
many target values are zero. Zero-inflated data can be found in diverse
scenarios, such as lumpy and intermittent demands, power consumption for home
appliances being turned on and off, impurities measurement in distillation
processes, and even airport shuttle demand prediction. The presence of zeroes
affects the models' learning and may result in poor performance. Furthermore,
zeroes also distort the metrics used to compute the model's prediction quality.
This paper showcases two real-world use cases (home appliances classification
and airport shuttle demand prediction) where a hierarchical model applied in
the context of zero-inflated data leads to excellent results. In particular,
for home appliances classification, the weighted average of Precision, Recall,
F1, and AUC ROC was increased by 27%, 34%, 49%, and 27%, respectively.
Furthermore, it is estimated that the proposed approach is also four times more
energy efficient than the SOTA approach against which it was compared.
Two-fold models performed best in all cases when predicting airport shuttle
demand, and the difference against other models has been proven to be
statistically significant.
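The abstract does not spell out the internals of the two-fold model, so the following is only a minimal sketch of one common way to build such a hierarchy, assuming a hurdle-style design: a first stage decides whether the target is zero, and a second stage, trained only on the non-zero samples, predicts the value. The toy data, the threshold classifier, the least-squares regressor, and every function name here are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean


def fit_stump(xs, is_nonzero):
    """Stage 1 (assumed): choose threshold t so that 'x >= t' best predicts a non-zero target."""
    best_t, best_errors = None, len(xs) + 1
    for t in sorted(set(xs)):
        errors = sum((x >= t) != nz for x, nz in zip(xs, is_nonzero))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def fit_two_fold(xs, ys):
    """Fit the zero/non-zero classifier, then a regressor on non-zero samples only."""
    t = fit_stump(xs, [y != 0 for y in ys])
    nonzero = [(x, y) for x, y in zip(xs, ys) if y != 0]
    a, b = fit_line([x for x, _ in nonzero], [y for _, y in nonzero])
    return t, a, b


def predict_two_fold(model, x):
    """Return 0 when stage 1 predicts 'zero', otherwise the stage 2 estimate."""
    t, a, b = model
    return a * x + b if x >= t else 0.0


def mae(ys, preds):
    """Mean absolute error."""
    return mean(abs(y - p) for y, p in zip(ys, preds))


# Hypothetical zero-inflated toy data: 7 of 10 targets are zero.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 0, 0, 8, 10, 12]

model = fit_two_fold(xs, ys)
two_fold_preds = [predict_two_fold(model, x) for x in xs]

# Single-stage baseline: one regression line fitted to all points, zeros included.
a_all, b_all = fit_line(xs, ys)
baseline_preds = [a_all * x + b_all for x in xs]

print("two-fold MAE:", mae(ys, two_fold_preds))
print("baseline MAE:", mae(ys, baseline_preds))
```

On this toy set the single regression line is pulled toward the zeros and misses both regimes, while the two-stage model fits each separately. With real data, either stage could be swapped for a stronger classifier or regressor without changing the overall structure.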
Related papers
- Time-Series Foundation Model for Value-at-Risk [9.090616417812306]
Foundation models, pre-trained on vast and varied datasets, can be used in a zero-shot setting with relatively minimal data.
We compare the performance of Google's model, called TimesFM, against conventional parametric and non-parametric models.
arXiv Detail & Related papers (2024-10-15T16:53:44Z)
- Using Generative Models to Produce Realistic Populations of the United Kingdom Windstorms [0.0]
This dissertation explores the application of generative models to produce realistic synthetic wind field data.
Three models, including standard GANs, WGAN-GP, and U-net diffusion models, were employed to generate wind maps of the UK.
The results reveal that while all models are effective in capturing the general spatial characteristics, each model exhibits distinct strengths and weaknesses.
arXiv Detail & Related papers (2024-09-16T19:53:33Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- CaFA: Global Weather Forecasting with Factorized Attention on Sphere [7.687215328455751]
We propose a factorized-attention-based model tailored for spherical geometries to mitigate this issue.
The deterministic forecasting accuracy of the proposed model on $1.5^\circ$ and 0-7 days' lead time is on par with state-of-the-art purely data-driven machine learning weather prediction models.
arXiv Detail & Related papers (2024-05-12T23:18:14Z)
- Air Quality Forecasting Using Machine Learning: A Global perspective with Relevance to Low-Resource Settings [0.0]
Air pollution stands as the fourth leading cause of death globally.
This study proposes a novel machine learning approach for accurate air quality prediction using two months of air quality data.
arXiv Detail & Related papers (2024-01-09T05:52:02Z)
- Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs.
Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative.
The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z)
- A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance.
We find that the power law, the de facto principle to estimate model performance, leads to large error when using a small dataset.
We introduce a novel piecewise power law (PPL) that handles the two data regimes differently.
arXiv Detail & Related papers (2023-03-02T21:48:22Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- A Data-Driven Machine Learning Approach for Consumer Modeling with Load Disaggregation [1.6058099298620423]
We propose a generic class of data-driven semiparametric models derived from consumption data of residential consumers.
In the first stage, disaggregation of the load into fixed and shiftable components is accomplished by means of a hybrid algorithm.
In the second stage, the model parameters are estimated using an L2-norm, epsilon-insensitive regression approach.
arXiv Detail & Related papers (2020-11-04T13:36:11Z)