Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs
- URL: http://arxiv.org/abs/2405.04372v2
- Date: Thu, 9 May 2024 09:46:35 GMT
- Title: Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs
- Authors: Martin Marzidovšek, Janja Francé, Vid Podpečan, Stanka Vadnjal, Jožica Dolenc, Patricija Mozetič
- Abstract summary: We train and evaluate machine learning models to accurately predict diarrhetic shellfish poisoning events.
The random forest model provided the best prediction of positive toxicity results based on the F1 score.
Key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) were the best predictors of DSP outbreaks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, explainable machine learning techniques are applied to predict the toxicity of mussels in the Gulf of Trieste (Adriatic Sea) caused by harmful algal blooms. By analysing a newly created 28-year dataset containing records of toxic phytoplankton in mussel farming areas and toxin concentrations in mussels (Mytilus galloprovincialis), we train and evaluate the performance of ML models to accurately predict diarrhetic shellfish poisoning (DSP) events. The random forest model provided the best prediction of positive toxicity results based on the F1 score. Explainability methods such as permutation importance and SHAP identified key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) as the best predictors of DSP outbreaks. These findings are important for improving early warning systems and supporting sustainable aquaculture practices.
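The abstract names two techniques that a small example can make concrete: the F1 score on the positive (toxic) class, and permutation importance. The following is a minimal standard-library sketch, not the authors' pipeline. The data rows, the feature names (`dinophysis_cells`, `noise`) and the threshold rule standing in for the random forest are all hypothetical, invented purely for illustration.

```python
import random


def f1_score(y_true, y_pred):
    """F1 score for the positive class (1 = toxic sample, 0 = non-toxic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in F1 when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = f1_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = f1_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical data: [dinophysis_cells, noise]; label 1 = toxic sample.
X = [[250, 3], [180, 7], [300, 1], [120, 9], [10, 4], [40, 8], [0, 2], [60, 6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def predict(row):
    """Stand-in model (an assumption): flag toxicity above 100 cells."""
    return 1 if row[0] > 100 else 0


baseline, importances = permutation_importance(predict, X, y)
print("baseline F1:", baseline)
print("importance per feature:", importances)
```

Because the stand-in model ignores the second feature, shuffling it leaves the F1 score unchanged and its importance is zero, while shuffling the first feature degrades the score. The same reasoning underlies the ranking of predictors described in the abstract, where a fitted random forest takes the place of the threshold rule.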
Related papers
- Smoke and Mirrors in Causal Downstream Tasks [59.90654397037007]
This paper looks at the causal inference task of treatment effect estimation, where the outcome of interest is recorded in high-dimensional observations.
We compare 6,480 models fine-tuned from state-of-the-art visual backbones, and find that the sampling and modeling choices significantly affect the accuracy of the causal estimate.
Our results suggest that future benchmarks should carefully consider real downstream scientific questions, especially causal ones.
arXiv Detail & Related papers (2024-05-27T13:26:34Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Harmful algal bloom forecasting. A comparison between stream and batch learning [0.7067443325368975]
Harmful Algal Blooms (HABs) pose risks to public health and the shellfish industry.
This study develops a machine learning workflow for predicting the number of cells of a toxic dinoflagellate.
The model DoME emerged as the most effective and interpretable predictor, outperforming the other algorithms.
arXiv Detail & Related papers (2024-02-20T15:01:11Z)
- Hybrid Machine Learning techniques in the management of harmful algal blooms impact [0.7864304771129751]
Mollusc farming can be affected by harmful algal blooms (HABs).
HABs are episodes of high concentrations of algae that are potentially toxic for human consumption.
To avoid the risk to human consumption, harvesting is prohibited when toxicity is detected.
arXiv Detail & Related papers (2024-02-14T15:59:22Z)
- Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias [64.81358555107788]
Pre-trained Language Models (PLMs) may be poisonous with backdoors or bias injected by the suspicious attacker during the fine-tuning process.
We propose the Fine-purifying approach, which utilizes the diffusion theory to study the dynamic process of fine-tuning for finding potentially poisonous dimensions.
To the best of our knowledge, we are the first to study the dynamics guided by the diffusion theory for safety or defense purposes.
arXiv Detail & Related papers (2023-05-08T08:40:30Z)
- Understanding Adverse Biological Effect Predictions Using Knowledge Graphs [11.607236829607135]
We extrapolate effects based on a knowledge graph (KG) consisting of the most relevant effect data as domain-specific background knowledge.
Background knowledge improves the model prediction performance by up to 40% in terms of $R^2$ (i.e., the coefficient of determination).
arXiv Detail & Related papers (2022-10-28T08:32:11Z)
- Predicting Chemical Hazard across Taxa through Machine Learning [0.3262230127283452]
We analyze the relevance of taxonomy and experimental setup, and show that taking them into account can lead to considerable improvements in the classification performance.
We use our approach with standard machine learning models (K-nearest neighbors, random forests and deep neural networks), as well as the recently proposed Read-Across Structure Activity Relationship (RASAR) models.
arXiv Detail & Related papers (2021-10-07T15:33:58Z)
- Early Detection of Fish Diseases by Analyzing Water Quality Using Machine Learning Algorithm [0.0]
A state-of-the-art machine learning algorithm has been adopted in this paper to detect and predict the degradation of water quality in a timely and accurate manner.
The experimental results on real datasets show high accuracy in detecting fish diseases associated with specific water quality conditions.
arXiv Detail & Related papers (2021-02-15T18:52:58Z)
- STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization [76.57716281104938]
We develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously.
STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations.
We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic.
arXiv Detail & Related papers (2020-12-08T21:21:47Z)
- Movement Tracks for the Automatic Detection of Fish Behavior in Videos [63.85815474157357]
We offer a dataset of sablefish (Anoplopoma fimbria) startle behaviors in underwater videos, and investigate the use of deep learning (DL) methods for behavior detection on it.
Our proposed detection system identifies fish instances using DL-based frameworks, determines trajectory tracks, derives novel behavior-specific features, and employs Long Short-Term Memory (LSTM) networks to identify startle behavior in sablefish.
arXiv Detail & Related papers (2020-11-28T05:51:19Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.