Hybrid Machine Learning techniques in the management of harmful algal
blooms impact
- URL: http://arxiv.org/abs/2402.09271v1
- Date: Wed, 14 Feb 2024 15:59:22 GMT
- Title: Hybrid Machine Learning techniques in the management of harmful algal
blooms impact
- Authors: Andres Molares-Ulloa, Daniel Rivero, Jesus Gil Ruiz, Enrique
Fernandez-Blanco and Luis de-la-Fuente-Valentín
- Abstract summary: Mollusc farming can be affected by harmful algal blooms (HABs).
HABs are episodes of high concentrations of algae that are potentially toxic for human consumption.
To avoid the risk to human consumption, harvesting is prohibited when toxicity is detected.
- Score: 0.7864304771129751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Harmful algal blooms (HABs) are episodes of high concentrations of algae that
are potentially toxic for human consumption. Mollusc farming can be affected by
HABs because molluscs, as filter feeders, can accumulate high concentrations of
marine biotoxins in their tissues. To avoid the risk to human consumption,
harvesting is prohibited when toxicity is detected. At present, the closure of
production areas is based on expert knowledge, and a predictive model would
help when conditions are complex and sampling is not possible.
Although the concentration of toxin in meat is the measure most commonly used
by experts in the control of shellfish production areas, it is rarely used as a
target by automatic prediction models. This is largely because the established
sampling programs produce irregular data. As an alternative, the
activity status of production areas has been proposed as a target variable
based on whether mollusc meat has a toxicity level below or above the legal
limit. This new option is the most similar to the actual functioning of the
control of shellfish production areas. For this purpose, we compared hybrid
machine learning models such as Neural-Network-Adding Bootstrap (BAGNET) and
Discriminative Nearest Neighbor Classification (SVM-KNN) at estimating the
state of production areas. The study has been carried out
in several estuaries with different levels of complexity in the episodes of
algal blooms to demonstrate the generalization capacity of the models in bloom
detection. As a result, we could observe that, with an average recall value of
93.41% and without dropping below 90% in any of the estuaries, BAGNET
outperforms the other models both in terms of results and robustness.
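The target variable and the evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the threshold value of 160.0, the choice of "closed" as the positive class, and the sample data are all assumptions made for illustration. It derives a binary activity status from a toxin concentration and a legal limit, then reports recall per estuary together with the average and the worst case, which is how the abstract summarizes the results.

```python
from collections import defaultdict


def activity_status(toxin_conc, legal_limit):
    """Return 1 (area closed) if toxicity exceeds the legal limit, else 0 (open)."""
    return 1 if toxin_conc > legal_limit else 0


def recall(y_true, y_pred):
    """Recall of the positive class: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def recall_by_estuary(records):
    """Group (estuary, y_true, y_pred) triples by estuary and compute recall for each."""
    grouped = defaultdict(lambda: ([], []))
    for estuary, t, p in records:
        grouped[estuary][0].append(t)
        grouped[estuary][1].append(p)
    return {name: recall(ts, ps) for name, (ts, ps) in grouped.items()}


# Hypothetical threshold and made-up samples: (estuary, toxin concentration, model prediction).
LEGAL_LIMIT = 160.0
samples = [
    ("estuary_a", 210.0, 1),
    ("estuary_a", 90.0, 0),
    ("estuary_a", 175.0, 1),
    ("estuary_b", 300.0, 1),
    ("estuary_b", 165.0, 0),
    ("estuary_b", 40.0, 0),
]

records = [(name, activity_status(conc, LEGAL_LIMIT), pred) for name, conc, pred in samples]
per_estuary = recall_by_estuary(records)
mean_recall = sum(per_estuary.values()) / len(per_estuary)
worst_recall = min(per_estuary.values())

print(per_estuary, mean_recall, worst_recall)
```

Reporting the minimum alongside the mean reflects the robustness claim in the abstract: a model can have a high average recall while still failing in one estuary.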
Related papers
- Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks are proven to be vulnerable to data poisoning attacks.
It is quite beneficial and challenging to detect poisoned samples from a mixed dataset.
We propose an Iterative Filtering approach for UEs identification.
arXiv Detail & Related papers (2024-08-15T13:26:13Z)
- Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs [0.0]
We train and evaluate machine learning models to accurately predict diarrhetic shellfish poisoning events.
The random forest model provided the best prediction of positive toxicity results based on the F1 score.
Key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) were the best predictors of DSP outbreaks.
arXiv Detail & Related papers (2024-05-07T14:55:42Z)
- Harmful algal bloom forecasting. A comparison between stream and batch learning [0.7067443325368975]
Harmful Algal Blooms (HABs) pose risks to public health and the shellfish industry.
This study develops a machine learning workflow for predicting the number of cells of a toxic dinoflagellate.
The model DoME emerged as the most effective and interpretable predictor, outperforming the other algorithms.
arXiv Detail & Related papers (2024-02-20T15:01:11Z)
- Machine Learning in management of precautionary closures caused by lipophilic biotoxins [43.51581973358462]
Mussel farming is one of the most important aquaculture industries.
The main risk to mussel farming is harmful algal blooms (HABs), which pose a risk to human consumption.
This work proposes a predictive model capable of supporting the application of precautionary closures.
arXiv Detail & Related papers (2024-02-14T15:51:58Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.
By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias [64.81358555107788]
Pre-trained Language Models (PLMs) may be poisonous with backdoors or bias injected by the suspicious attacker during the fine-tuning process.
We propose the Fine-purifying approach, which utilizes the diffusion theory to study the dynamic process of fine-tuning for finding potentially poisonous dimensions.
To the best of our knowledge, we are the first to study the dynamics guided by the diffusion theory for safety or defense purposes.
arXiv Detail & Related papers (2023-05-08T08:40:30Z)
- Multi-Target Tobit Models for Completing Water Quality Data [0.0]
The Tobit model is a well-known linear regression model for analyzing censored data.
In this study, we devised a novel extension of the classical Tobit model, called the multi-target Tobit model, to handle multiple censored variables simultaneously.
Experiments conducted using several real-world water quality datasets provided evidence that estimating multiple columns jointly gains a great advantage over estimating them separately.
arXiv Detail & Related papers (2023-02-21T13:06:19Z)
- Predicting Chemical Hazard across Taxa through Machine Learning [0.3262230127283452]
We analyze the relevance of taxonomy and experimental setup, and show that taking them into account can lead to considerable improvements in the classification performance.
We use our approach with standard machine learning models (K-nearest neighbors, random forests and deep neural networks), as well as the recently proposed Read-Across Structure Activity Relationship (RASAR) models.
arXiv Detail & Related papers (2021-10-07T15:33:58Z)
- Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z)
- Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating in the large sample space encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z)
- A General Framework for Survival Analysis and Multi-State Modelling [70.31153478610229]
We use neural ordinary differential equations as a flexible and general method for estimating multi-state survival models.
We show that our model exhibits state-of-the-art performance on popular survival data sets and demonstrate its efficacy in a multi-state setting.
arXiv Detail & Related papers (2020-06-08T19:24:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.