Related papers: Harmful algal bloom forecasting. A comparison between stream and batch learning

URL: http://arxiv.org/abs/2402.13304v1
Date: Tue, 20 Feb 2024 15:01:11 GMT
Title: Harmful algal bloom forecasting. A comparison between stream and batch learning
Authors: Andres Molares-Ulloa, Elisabet Rocruz, Daniel Rivero, Xosé A. Padin, Rita Nolasco, Jesús Dubert and Enrique Fernandez-Blanco
Abstract summary: Harmful Algal Blooms (HABs) pose risks to public health and the shellfish industry. This study develops a machine learning workflow for predicting the number of cells of a toxic dinoflagellate. The model DoME emerged as the most effective and interpretable predictor, outperforming the other algorithms.
Score: 0.7067443325368975
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diarrhetic Shellfish Poisoning (DSP) is a global health threat arising from shellfish contaminated with toxins produced by dinoflagellates. The condition, with its widespread incidence, high morbidity rate, and persistent shellfish toxicity, poses risks to public health and the shellfish industry. High-biomass events of toxin-producing algae, such as those that cause DSP, are known as Harmful Algal Blooms (HABs). Monitoring and forecasting systems are crucial for mitigating the impact of HABs. Predicting harmful algal blooms is a time-series problem with a strong historical seasonal component; however, recent anomalies due to changes in meteorological and oceanographic events have been observed. Stream Learning stands out as one of the most promising approaches for addressing time-series problems with concept drift. However, its efficacy in predicting HABs remains unproven and needs to be tested against Batch Learning. Historical data availability is a critical point in developing predictive systems. In oceanography, data collection can be subject to constraints and limitations, which has led to the exploration of new tools for obtaining more exhaustive time series. In this study, a machine learning workflow for predicting the number of cells of a toxic dinoflagellate, Dinophysis acuminata, was developed with several key advancements. Seven machine learning algorithms were compared within two learning paradigms. Notably, the output data from CROCO, the ocean hydrodynamic model, was employed as the primary dataset, mitigating the lack of time-continuous historical data. This study highlights the value of model interpretability, a fair model-comparison methodology, and the incorporation of Stream Learning models. The model DoME, with an average R2 of 0.77 in the 3-day-ahead prediction, emerged as the most effective and interpretable predictor, outperforming the other algorithms.

Related papers

Deep Learning for Disease Outbreak Prediction: A Robust Early Warning Signal for Transcritical Bifurcations [6.616648875013729]
Early Warning Signals (EWSs) are vital for implementing preventive measures before a disease turns into a pandemic. Measurements during disease outbreaks are often corrupted by different noise sources. This study bridges advancements in deep learning with the ability to provide robust early warning signals in noisy environments.
arXiv Detail & Related papers (2025-01-14T00:47:05Z)
Analyzing Spatio-Temporal Dynamics of Dissolved Oxygen for the River Thames using Superstatistical Methods and Machine Learning [0.0]
We use superstatistical methods and machine learning to predict dissolved oxygen levels in the River Thames. For long-term forecasting, the Informer model consistently delivers superior performance.
arXiv Detail & Related papers (2025-01-10T16:54:52Z)
Development and Comparative Analysis of Machine Learning Models for Hypoxemia Severity Triage in CBRNE Emergency Scenarios Using Physiological and Demographic Data from Medical-Grade Devices [0.0]
Gradient Boosting Models (GBMs) outperformed sequential models in terms of training speed, interpretability, and reliability. A 5-minute prediction window was chosen for timely intervention, with minute-levels standardizing the data. This study highlights ML's potential to improve triage and reduce alarm fatigue.
arXiv Detail & Related papers (2024-10-30T23:24:28Z)
Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs [0.0]
We train and evaluate machine learning models to accurately predict diarrhetic shellfish poisoning events. The random forest model provided the best prediction of positive toxicity results based on the F1 score. Key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) were the best predictors of DSP outbreaks.
arXiv Detail & Related papers (2024-05-07T14:55:42Z)
Hybrid Machine Learning techniques in the management of harmful algal blooms impact [0.7864304771129751]
Mollusc farming can be affected by harmful algal blooms (HABs). HABs are episodes of high concentrations of algae that are potentially toxic for human consumption. To avoid the risk to human consumption, harvesting is prohibited when toxicity is detected.
arXiv Detail & Related papers (2024-02-14T15:59:22Z)
An Extreme-Adaptive Time Series Prediction Model Based on Probability-Enhanced LSTM Neural Networks [6.5700527395783315]
We propose a novel probability-enhanced neural network model, called NEC+, which concurrently learns extreme and normal prediction functions. We evaluate the proposed model on the difficult 3-day ahead hourly water level prediction task applied to 9 reservoirs in California.
arXiv Detail & Related papers (2022-11-29T03:01:59Z)
Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task. The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature. We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions. Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations. We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
Gaussian Process Nowcasting: Application to COVID-19 Mortality Reporting [2.8712862578745018]
Updating observations of a signal due to the delays in the measurement process is a common problem in signal processing. We present a flexible approach using a latent Gaussian process that is capable of describing the changing auto-correlation structure present in the reporting time-delay surface. This approach also yields robust estimates of uncertainty for the estimated nowcasted numbers of deaths.
arXiv Detail & Related papers (2021-02-22T18:32:44Z)
STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization [76.57716281104938]
We develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously. STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations. We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic.
arXiv Detail & Related papers (2020-12-08T21:21:47Z)
DeepRite: Deep Recurrent Inverse TreatmEnt Weighting for Adjusting Time-varying Confounding in Modern Longitudinal Observational Data [68.29870617697532]
We propose Deep Recurrent Inverse TreatmEnt weighting (DeepRite) for time-varying confounding in longitudinal data. DeepRite is shown to recover the ground truth from synthetic data, and estimate unbiased treatment effects from real data.
arXiv Detail & Related papers (2020-10-28T15:05:08Z)
A General Framework for Survival Analysis and Multi-State Modelling [70.31153478610229]
We use neural ordinary differential equations as a flexible and general method for estimating multi-state survival models. We show that our model exhibits state-of-the-art performance on popular survival data sets and demonstrate its efficacy in a multi-state setting.
arXiv Detail & Related papers (2020-06-08T19:24:54Z)
