Distillation and Interpretability of Ensemble Forecasts of ENSO Phase using Entropic Learning
- URL: http://arxiv.org/abs/2602.16857v1
- Date: Sun, 15 Feb 2026 05:49:16 GMT
- Title: Distillation and Interpretability of Ensemble Forecasts of ENSO Phase using Entropic Learning
- Authors: Michael Groom, Davide Bassetti, Illia Horenko, Terence J. O'Kane
- Abstract summary: This paper introduces a distillation framework for an ensemble of entropy-optimal Sparse Probabilistic Approximation (eSPA) models to predict ENSO phase up to 24 months in advance. We show how to compress the ensemble into a compact set of "distilled" models by aggregating the structure of only those ensemble members that make correct predictions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a distillation framework for an ensemble of entropy-optimal Sparse Probabilistic Approximation (eSPA) models, trained exclusively on satellite-era observational and reanalysis data to predict ENSO phase up to 24 months in advance. While eSPA ensembles yield state-of-the-art forecast skill, they are harder to interpret than individual eSPA models. We show how to compress the ensemble into a compact set of "distilled" models by aggregating the structure of only those ensemble members that make correct predictions. This process yields a single, diagnostically tractable model for each forecast lead time that preserves forecast performance while also enabling diagnostics that are impractical to implement on the full ensemble. An analysis of the regime persistence of the distilled model "superclusters", as well as cross-lead clustering consistency, shows that the discretised system accurately captures the spatiotemporal dynamics of ENSO. By considering the effective dimension of the feature importance vectors, the complexity of the input space required for correct ENSO phase prediction is shown to peak when forecasts must cross the boreal spring predictability barrier. Spatial importance maps derived from the feature importance vectors are introduced to identify where predictive information resides in each field and are shown to include known physical precursors at certain lead times. Case studies of key events are also presented, showing how fields reconstructed from distilled model centroids trace the evolution from extratropical and inter-basin precursors to the mature ENSO state. Overall, the distillation framework enables a rigorous investigation of long-range ENSO predictability that complements real-time data-driven operational forecasts.
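Two ideas from the abstract can be made concrete in a short sketch: aggregating structure from only the correctly predicting ensemble members, and measuring input-space complexity via the effective dimension of a feature-importance vector. The member/importance interface and the entropy-based effective-dimension formula below are illustrative assumptions, not the paper's exact definitions.

```python
import math

def effective_dimension(w):
    """Effective dimension of a nonnegative feature-importance vector,
    here taken as the exponential of its Shannon entropy (one common
    definition; the paper's exact formula may differ)."""
    total = sum(w)
    p = [x / total for x in w if x > 0]
    h = -sum(q * math.log(q) for q in p)
    return math.exp(h)

def distil(members, y_true):
    """Aggregate feature-importance vectors over only those ensemble
    members whose prediction matches the true ENSO phase (hypothetical
    interface; the paper distils full eSPA model structure, not just
    importance vectors)."""
    correct = [m for m in members if m["pred"] == y_true]
    d = len(correct[0]["importance"])
    return [sum(m["importance"][j] for m in correct) / len(correct)
            for j in range(d)]

# A vector concentrated on one feature has effective dimension ~1;
# a uniform vector over d features has effective dimension d.
print(round(effective_dimension([1.0, 0.0, 0.0, 0.0]), 3))  # 1.0
print(round(effective_dimension([0.25] * 4), 3))            # 4.0
```

On this definition, a peak in effective dimension at leads crossing the boreal spring barrier means correct forecasts there draw on a broader set of input features rather than a few dominant ones.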
Related papers
- Accurate and Efficient Hybrid-Ensemble Atmospheric Data Assimilation in Latent Space with Uncertainty Quantification [20.877039031702605]
We propose a three-dimensional hybrid-ensemble DA method that operates in an atmospheric latent space learned via an autoencoder (AE). HLOBA maps both model forecasts and observations into a shared latent space via the AE encoder and an end-to-end Observation-to-Latent-space mapping network (O2Lnet). Experiments show that the quantified uncertainty highlights large-error regions and captures their seasonal variability.
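The encode-blend-decode pattern behind such latent-space assimilation can be sketched as follows; the linear encoder/decoder and scalar gain are toy stand-ins for illustration, not HLOBA's learned components.

```python
def encode(x):
    """Stand-in for the AE encoder (HLOBA uses a learned network)."""
    return [v * 0.5 for v in x]

def decode(z):
    """Stand-in for the AE decoder."""
    return [v * 2.0 for v in z]

def assimilate_latent(forecast, observation, gain=0.3):
    """Blend forecast and observation in a shared latent space, then
    decode the analysis back to state space. HLOBA maps observations
    via O2Lnet rather than the state encoder used here."""
    zf = encode(forecast)
    zo = encode(observation)
    # Scalar-gain update; ensemble methods derive the gain from
    # flow-dependent covariances instead.
    za = [f + gain * (o - f) for f, o in zip(zf, zo)]
    return decode(za)
```

With a gain of 0.3, the analysis lands 30% of the way from the forecast toward the observation, mirroring the usual Kalman-style compromise.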
arXiv Detail & Related papers (2026-03-04T18:58:27Z) - EIDOS: Latent-Space Predictive Learning for Time Series Foundation Models [37.917978019436674]
EIDOS is a foundation model family that shifts pretraining from future-value prediction to latent-space predictive learning. We train a causal Transformer to predict the evolution of latent representations, encouraging the emergence of structured and temporally coherent latent states.
arXiv Detail & Related papers (2026-02-15T07:07:20Z) - Demystifying Data-Driven Probabilistic Medium-Range Weather Forecasting [63.8116386935854]
We demonstrate that state-of-the-art probabilistic skill requires neither intricate architectural constraints nor specialized training. We introduce a scalable framework for learning multi-scale atmospheric dynamics by combining a directly downsampled latent space with a history-conditioned local projector. We find that our framework design is robust to the choice of probabilistic estimators, seamlessly supporting interpolants, diffusion models, and CRPS-based ensemble training.
arXiv Detail & Related papers (2026-01-26T03:52:16Z) - DAWP: A framework for global observation forecasting via Data Assimilation and Weather Prediction in satellite observation space [60.729377189859]
We propose the DAWP framework to enable AIWPs to operate in a complete observation space. The AIDA module applies a masked multi-modality autoencoder to assimilate irregular satellite observation tokens. We show that AIDA significantly improves the roll-out and efficiency of AIWP and holds promising potential for application in global high-resolution precipitation forecasting.
arXiv Detail & Related papers (2025-10-13T03:13:35Z) - Faithful and Interpretable Explanations for Complex Ensemble Time Series Forecasts using Surrogate Models and Forecastability Analysis [1.5751034894694789]
We develop a surrogate-based explanation methodology that bridges the accuracy-interpretability gap. We integrate spectral predictability analysis to quantify each series' inherent forecastability. The resulting framework delivers interpretable, instance-level explanations for state-of-the-art ensemble forecasts.
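One common way to quantify inherent forecastability, sketched here under the assumption that the paper's measure resembles it, is one minus the normalized spectral entropy of the periodogram: a series whose power concentrates at a few frequencies scores near 1, broadband noise scores near 0.

```python
import math

def spectral_predictability(x):
    """Forecastability proxy in [0, 1]: 1 minus the normalized spectral
    entropy of the periodogram (an illustrative construction; the
    paper's exact measure may differ)."""
    n = len(x)
    mean = sum(x) / n
    xs = [v - mean for v in x]
    # Periodogram via a direct DFT over the positive frequencies.
    power = []
    for k in range(1, n // 2 + 1):
        re = sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(xs))
        im = sum(-v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(xs))
        power.append(re * re + im * im)
    total = sum(power)
    if total == 0:
        return 0.0
    p = [q / total for q in power if q > 0]
    h = -sum(q * math.log(q) for q in p)
    return 1.0 - h / math.log(len(power))
```

A pure sinusoid scores close to 1 (all power in one bin), while a pseudo-random series spreads power across bins and scores much lower.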
arXiv Detail & Related papers (2025-10-09T18:49:45Z) - Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching [9.465542901469815]
Conditional Guided Flow Matching (CGFM) is a model-agnostic framework that extends flow matching by integrating outputs from an auxiliary predictive model. CGFM incorporates historical data as both conditions and guidance, uses two-sided conditional paths, and employs affine paths to expand the path space. Experiments across datasets and baselines show that CGFM consistently outperforms state-of-the-art models.
arXiv Detail & Related papers (2025-07-09T18:03:31Z) - A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation [55.53426007439564]
Estimating individualized treatment effects from observational data is a central challenge in causal inference. Inverse probability weighting (IPW) is a well-established solution to this problem, but its integration into modern deep learning frameworks remains limited. We propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation.
arXiv Detail & Related papers (2025-05-16T17:00:52Z) - When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting.
Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions.
We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.
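The "soft consistency" idea can be sketched as a regularizer that penalizes divergence between a parent node's forecast distribution and the distribution implied by summing its children, rather than enforcing exact aggregation. The independent-Gaussian parameterization and KL penalty here are illustrative assumptions, not PROFHiT's exact loss.

```python
import math

def gaussian_kl(mu1, s1, mu2, s2):
    """KL(N(mu1, s1^2) || N(mu2, s2^2)) in closed form."""
    return math.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5

def soft_consistency(parent, children, lam=0.5):
    """Soft consistency regularizer: divergence between the parent's
    forecast N(mu_p, s_p^2) and the sum of independent child Gaussians.
    Zero when the hierarchy is exactly coherent; lam trades off accuracy
    against coherence instead of hard reconciliation."""
    mu_p, s_p = parent
    mu_sum = sum(mu for mu, _ in children)
    s_sum = math.sqrt(sum(s ** 2 for _, s in children))
    return lam * gaussian_kl(mu_p, s_p, mu_sum, s_sum)
```

Because the penalty is soft, a slightly incoherent but better-calibrated parent forecast is allowed, which is the point of the "when rigidity hurts" framing.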
arXiv Detail & Related papers (2023-10-17T20:30:16Z) - Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case Study of the COVID-19 Epidemic Curves [0.0]
We investigate ensembling techniques in forecasting and examine their potential for use on nonseasonal time series.
We propose late data fusion with a stacked ensemble of two forecasting models and two meta-features that prove their predictive power during a preliminary forecasting stage.
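A minimal sketch of that late-fusion idea, under the assumption that the meta-learner reduces to a logistic mixing weight over two base forecasts (the actual paper's stacking model may be richer):

```python
import math

def stacked_forecast(f1, f2, meta, weights):
    """Blend two base-model forecasts f1, f2 using a mixing coefficient
    produced from meta-features by a linear meta-learner (hypothetical
    interface for illustration)."""
    z = weights[0] + sum(w * m for w, m in zip(weights[1:], meta))
    alpha = 1.0 / (1.0 + math.exp(-z))  # logistic link keeps weight in (0, 1)
    return [alpha * a + (1 - alpha) * b for a, b in zip(f1, f2)]
```

With zero weights the blend degenerates to a simple average; the meta-features learned in the preliminary stage shift the weight toward whichever base model they indicate is more reliable.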
arXiv Detail & Related papers (2021-08-19T14:44:46Z) - Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of future values based on the series' history.
We propose a deep state space model for probabilistic time series forecasting in which the non-linear emission and transition models are parameterized by neural networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
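The forecasting mechanism of such a model can be sketched as a Monte Carlo rollout: propagate the latent state through the transition model with process noise, map each state through the emission model, and read the forecast distribution off the sample paths. The toy callables below stand in for the learned networks.

```python
import random

def rollout(z0, transition, emission, horizon, n_samples=200):
    """Monte Carlo forecast from a toy nonlinear state-space model.
    `transition` and `emission` are stand-ins for the learned neural
    networks; Gaussian process noise makes the forecast probabilistic."""
    paths = []
    for _ in range(n_samples):
        z = z0
        path = []
        for _ in range(horizon):
            z = transition(z) + random.gauss(0.0, 0.1)  # latent step + noise
            path.append(emission(z))                    # observation model
        paths.append(path)
    return paths
```

Per-step quantiles of the sampled paths give the sharp, calibrated forecast intervals that this line of work evaluates.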
arXiv Detail & Related papers (2021-01-31T06:49:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.