LSTM networks provide efficient cyanobacterial blooms forecasting even with incomplete spatio-temporal data
- URL: http://arxiv.org/abs/2410.08237v1
- Date: Wed, 9 Oct 2024 15:13:24 GMT
- Title: LSTM networks provide efficient cyanobacterial blooms forecasting even with incomplete spatio-temporal data
- Authors: Claudia Fournier, Raul Fernandez-Fernandez, Samuel Cirés, José A. López-Orozco, Eva Besada-Portas, Antonio Quesada,
- Abstract summary: Early warning systems (EWS) for cyanobacterial blooms allow timely implementation of management measures.
In this paper, we propose an effective EWS for cyanobacterial bloom forecasting, which uses 6 years incomplete high-frequency-temporal data.
Results were analyzed for seven forecasting time horizons ranging from 4 to 28 days evaluated with a hybrid system.
- Score: 2.5537500385691594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cyanobacteria are the most frequent dominant species of algal blooms in inland waters, threatening ecosystem function and water quality, especially when toxin-producing strains predominate. Enhanced by anthropogenic activities and global warming, cyanobacterial blooms are expected to increase in frequency and global distribution. Early warning systems (EWS) for cyanobacterial blooms development allow timely implementation of management measures, reducing the risks associated to these blooms. In this paper, we propose an effective EWS for cyanobacterial bloom forecasting, which uses 6 years of incomplete high-frequency spatio-temporal data from multiparametric probes, including phycocyanin (PC) fluorescence as a proxy for cyanobacteria. A probe agnostic and replicable method is proposed to pre-process the data and to generate time series specific for cyanobacterial bloom forecasting. Using these pre-processed data, six different non-site/species-specific predictive models were compared including the autoregressive and multivariate versions of Linear Regression, Random Forest, and Long-Term Short-Term (LSTM) neural networks. Results were analyzed for seven forecasting time horizons ranging from 4 to 28 days evaluated with a hybrid system that combined regression metrics (MSE, R2, MAPE) for PC values, classification metrics (Accuracy, F1, Kappa) for a proposed alarm level of 10 ug PC/L, and a forecasting-specific metric to measure prediction improvement over the displaced signal (skill). The multivariate version of LSTM showed the best and most consistent results across all forecasting horizons and metrics, achieving accuracies of up to 90% in predicting the proposed PC alarm level. Additionally, positive skill values indicated its outstanding effectiveness to forecast cyanobacterial blooms from 16 to 28 days in advance.
Related papers
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z) - Harmful algal bloom forecasting. A comparison between stream and batch
learning [0.7067443325368975]
Harmful Algal Blooms (HABs) pose risks to public health and the shellfish industry.
This study develops a machine learning workflow for predicting the number of cells of a toxic dinoflagellate.
The model DoME emerged as the most effective and interpretable predictor, outperforming the other algorithms.
arXiv Detail & Related papers (2024-02-20T15:01:11Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data by introducing an end-to-end approach that adopts a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - Forecast reconciliation for vaccine supply chain optimization [61.13962963550403]
Vaccine supply chain optimization can benefit from hierarchical time series forecasting.
Forecasts of different hierarchy levels become incoherent when higher levels do not match the sum of the lower levels forecasts.
We tackle the vaccine sale forecasting problem by modeling sales data from GSK between 2010 and 2021 as a hierarchical time series.
arXiv Detail & Related papers (2023-05-02T14:34:34Z) - Plant species richness prediction from DESIS hyperspectral data: A
comparison study on feature extraction procedures and regression models [1.8757823231879849]
This study provides a quantitative assessment on the ability of DESIS hyperspectral data for predicting plant species richness in two different habitat types in southeast Australia.
Relative importance analysis for the DESIS spectral bands showed that the red-edge, red, and blue spectral regions were more important for predicting plant species richness than the green bands and the near-infrared bands beyond red-edge.
arXiv Detail & Related papers (2023-01-05T05:33:56Z) - Interpretable Battery Cycle Life Range Prediction Using Early
Degradation Data at Cell Level [0.8137198664755597]
Quantile Regression Forest (QRF) model is introduced to make cycle life range prediction with uncertainty quantified.
Data-driven methods have been proposed for point prediction of battery cycle life with minimum knowledge of the battery degradation mechanisms.
The interpretability of the final QRF model is explored with two global model-agnostic methods.
arXiv Detail & Related papers (2022-04-26T16:26:27Z) - Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case
Study of the COVID-19 Epidemic Curves [0.0]
We investigate ensembling techniques in forecasting and examine their potential for use in nonseasonal time-series.
We propose using late data fusion, using a stacked ensemble of two forecasting models and two meta-features that prove their predictive power during a preliminary forecasting stage.
arXiv Detail & Related papers (2021-08-19T14:44:46Z) - When in Doubt: Neural Non-Parametric Uncertainty Quantification for
Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z) - Comparison of Traditional and Hybrid Time Series Models for Forecasting
COVID-19 Cases [0.5849513679510832]
The coronavirus outbreak of December 2019 has already infected millions all over the world and continues to spread on.
Just when the curve of the outbreak had started to flatten, many countries have again started to witness a rise in cases.
A thorough analysis of time-series forecasting models is therefore required to equip state authorities and health officials with immediate strategies for future times.
arXiv Detail & Related papers (2021-05-05T14:56:27Z) - STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological
Regularization [76.57716281104938]
We develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously.
STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations.
We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic.
arXiv Detail & Related papers (2020-12-08T21:21:47Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD)
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to $19%$ over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.