Model Compression for Dynamic Forecast Combination
- URL: http://arxiv.org/abs/2104.01830v1
- Date: Mon, 5 Apr 2021 09:55:35 GMT
- Title: Model Compression for Dynamic Forecast Combination
- Authors: Vitor Cerqueira, Luis Torgo, Carlos Soares, Albert Bifet
- Abstract summary: We show that compressing dynamic forecasting ensembles into an individual model leads to a comparable predictive performance.
We also show that the compressed individual model with the best average rank is a rule-based regression model.
- Score: 9.281199058905017
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The predictive advantage of combining several different predictive models is
widely accepted. Particularly in time series forecasting problems, this
combination is often dynamic to cope with potential non-stationary sources of
variation present in the data. Despite their superior predictive performance,
ensemble methods entail two main limitations: high computational costs and lack
of transparency. These issues often preclude the deployment of such approaches,
in favour of simpler yet more efficient and reliable ones. In this paper, we
leverage the idea of model compression to address this problem in time series
forecasting tasks. Model compression approaches have been mostly unexplored for
forecasting. Their application in time series is challenging due to the
evolving nature of the data. Further, while the literature focuses on neural
networks, we apply model compression to distinct types of methods. In an
extensive set of experiments, we show that compressing dynamic forecasting
ensembles into an individual model leads to a comparable predictive performance
and a drastic reduction in computational costs. Further, the compressed
individual model with the best average rank is a rule-based regression model. Thus,
model compression also leads to benefits in terms of model interpretability.
The experiments carried out in this paper are fully reproducible.
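The compression idea described in the abstract can be sketched end-to-end: fit a small forecasting ensemble, then train a single compact model on the ensemble's predictions rather than the raw targets (distillation). The following is a minimal NumPy sketch under stated assumptions; the synthetic series, the two-member averaged ensemble, and the linear student are illustrative choices, not the paper's exact setup (the paper combines many heterogeneous models dynamically, and its best-performing student is a rule-based regressor).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic random-walk series, embedded into a supervised matrix of
# lagged values -- a stand-in for a real forecasting task.
n, lags = 500, 5
series = np.cumsum(rng.normal(size=n + lags))
X = np.stack([series[i:i + lags] for i in range(n)])
y = series[lags:lags + n]
Xb = np.hstack([X, np.ones((n, 1))])  # add an intercept column
split = int(n * 0.8)

# "Teacher" ensemble: two simple forecasters combined by averaging.
w_ar, *_ = np.linalg.lstsq(Xb[:split], y[:split], rcond=None)  # AR model
ar_pred = Xb @ w_ar
naive_pred = X[:, -1]  # naive forecaster: repeat the last observed value
teacher_pred = 0.5 * (ar_pred + naive_pred)

# "Student": one compact model fit to the teacher's predictions instead
# of the raw targets -- the distillation step. A linear student keeps
# this sketch short; a rule-based regressor would be used in practice.
w_student, *_ = np.linalg.lstsq(Xb[:split], teacher_pred[:split], rcond=None)
student_pred = Xb @ w_student

# Gap between the compressed model and the ensemble on held-out data.
mae_gap = float(np.mean(np.abs(student_pred[split:] - teacher_pred[split:])))
```

At inference time only the single student model is evaluated, which is where the drastic reduction in computational cost comes from.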
Related papers
- On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z) - An Experimental Study on the Rashomon Effect of Balancing Methods in Imbalanced Classification [0.0]
This paper examines the impact of balancing methods on predictive multiplicity using the Rashomon effect.
This matters because blindly selecting one model from a set of approximately equally accurate models is risky in data-centric AI.
arXiv Detail & Related papers (2024-03-22T13:08:22Z) - Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or processes governed by a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the underlying tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence for this phenomenon, the relationship between neural network pruning and induced bias is not well-understood.
arXiv Detail & Related papers (2023-04-25T07:42:06Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero- and few-shot adaptation in low-data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - What do Compressed Large Language Models Forget? Robustness Challenges
in Model Compression [68.82486784654817]
We study two popular model compression techniques including knowledge distillation and pruning.
We show that compressed models are significantly less robust than their pre-trained language model (PLM) counterparts on adversarial test sets.
We develop a regularization strategy for model compression based on sample uncertainty.
arXiv Detail & Related papers (2021-10-16T00:20:04Z) - Consistent Counterfactuals for Deep Models [25.1271020453651]
Counterfactual examples are used to explain predictions of machine learning models in key areas such as finance and medical diagnosis.
This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions.
arXiv Detail & Related papers (2021-10-06T23:48:55Z) - Simultaneously Reconciled Quantile Forecasting of Hierarchically Related
Time Series [11.004159006784977]
We propose a flexible nonlinear model that optimizes a quantile regression loss coupled with suitable regularization terms to maintain consistency of forecasts across hierarchies.
The theoretical framework introduced herein can be applied to any forecasting model with an underlying differentiable loss function.
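The quantile regression loss mentioned above is the pinball loss, which penalizes over- and under-prediction asymmetrically so the fitted model targets a chosen quantile. A minimal sketch (the function name and example values are illustrative, not from the paper):

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: an asymmetric penalty that steers
    predictions toward the q-th quantile of the target distribution."""
    diff = y_true - y_pred
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

# Over-prediction by 1.0 is costly for a low quantile (q=0.1)
# but cheap for a high quantile (q=0.9).
y = np.array([1.0, 2.0, 3.0])
low = pinball_loss(y, y + 1.0, q=0.1)
high = pinball_loss(y, y + 1.0, q=0.9)
```

Because the loss is differentiable almost everywhere, it can be combined with hierarchy-consistency regularization terms and optimized by gradient descent, as the summary notes.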
arXiv Detail & Related papers (2021-02-25T00:59:01Z) - Pattern Similarity-based Machine Learning Methods for Mid-term Load
Forecasting: A Comparative Study [0.0]
We use pattern similarity-based methods for forecasting monthly electricity demand exhibiting annual seasonality.
An integral part of the models is the time series representation using patterns of time series sequences.
We consider four such models: nearest neighbor model, fuzzy neighborhood model, kernel regression model and general regression neural network.
arXiv Detail & Related papers (2020-03-03T12:14:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.