Coastal water quality prediction based on machine learning with feature
interpretation and spatio-temporal analysis
- URL: http://arxiv.org/abs/2107.03230v2
- Date: Fri, 9 Jul 2021 07:09:03 GMT
- Title: Coastal water quality prediction based on machine learning with feature
interpretation and spatio-temporal analysis
- Authors: Luka Grbčić, Siniša Družeta, Goran Mauša, Tomislav
Lipić, Darija Vukić Lušić, Marta Alvir, Ivana Lučin, Ante
Sikirica, Davor Davidović, Vanja Travaš, Daniela Kalafatović,
Kristina Pikelj, Hana Fajković, Toni Holjević and Lado Kranjčević
- Abstract summary: Poor coastal water quality can harbor pathogens that are dangerous to human health.
Routine monitoring data of $Escherichia\ Coli$ and enterococci across 15 public beaches in Rijeka, Croatia, were used to build machine learning models.
The Catboost algorithm performed best with R$^2$ values of 0.71 and 0.68 for predicting $E.\ Coli$ and enterococci, respectively.
- Score: 1.1124907412872893
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Coastal water quality management is a public health concern, as poor coastal
water quality can harbor pathogens that are dangerous to human health.
Tourism-oriented countries need to actively monitor the condition of coastal
water at popular tourist sites during the summer season. In this study, routine
monitoring data of $Escherichia\ Coli$ and enterococci across 15 public beaches
in the city of Rijeka, Croatia, were used to build machine learning models for
predicting their levels based on environmental parameters as well as to
investigate their relationships with environmental stressors. Gradient Boosting
(Catboost, Xgboost), Random Forests, Support Vector Regression and Artificial
Neural Networks were trained with measurements from all sampling sites and used
to predict $E.\ Coli$ and enterococci values based on environmental features.
Evaluating the stability and generalizability of the machine learning models
with 10-fold cross validation showed that the Catboost algorithm
performed best with R$^2$ values of 0.71 and 0.68 for predicting $E.\ Coli$ and
enterococci, respectively, ahead of the other evaluated ML algorithms
(Xgboost, Random Forests, Support Vector Regression and Artificial Neural
Networks). We also use the SHapley Additive exPlanations technique to identify
and interpret which features have the most predictive power. The results show
that the salinity measured at the site is the most important feature for
forecasting both $E.\ Coli$ and enterococci levels. Finally, the spatial and
temporal accuracy of both ML models was examined at sites with the lowest
coastal water quality.
The spatial $E.\ Coli$ and enterococci models achieved strong R$^2$ values of
0.85 and 0.83, respectively, while the temporal models achieved R$^2$ values of
0.74 and 0.67. The temporal models also achieved moderate R$^2$ values of 0.44
and 0.46 at a site with high coastal water quality.
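The evaluation protocol described in the abstract (10-fold cross validation scored with R$^2$) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration: the data are synthetic rather than the Rijeka monitoring data, the single "salinity" feature and log-concentration target are hypothetical, and a one-variable least-squares line stands in for Catboost. It shows only the mechanics of splitting into folds and scoring held-out predictions, not the authors' pipeline.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_r2(xs, ys, k=10, seed=0):
    """Return the held-out R^2 score of each of the k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [a + b * xs[i] for i in test_idx]
        scores.append(r2_score([ys[i] for i in test_idx], preds))
    return scores


# Synthetic data (NOT the paper's): hypothetical salinity feature and a
# noisy log-concentration target that decreases with salinity.
rng = random.Random(42)
salinity = [rng.uniform(5.0, 38.0) for _ in range(200)]
log_conc = [3.0 - 0.06 * s + rng.gauss(0.0, 0.3) for s in salinity]

scores = k_fold_r2(salinity, log_conc, k=10)
mean_r2 = sum(scores) / len(scores)
print(f"mean R^2 over {len(scores)} folds: {mean_r2:.2f}")
```

Reporting the mean of the per-fold scores, as done here, is one common way to summarize cross-validated R$^2$; the spread across folds gives a rough indication of model stability.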
Related papers
- Fourier Neural Operator based surrogates for $CO_2$ storage in realistic geologies [57.23978190717341]
We develop a Fourier Neural Operator (FNO) based model for real-time, high-resolution simulation of $CO_2$ plume migration.
The model is trained on a comprehensive dataset generated from realistic subsurface parameters.
We present various strategies for improving the reliability of predictions from the model, which is crucial while assessing actual geological sites.
arXiv Detail & Related papers (2025-03-14T02:58:24Z) - Integrating Boosted learning with Differential Evolution (DE) Optimizer: A Prediction of Groundwater Quality Risk Assessment in Odisha [0.0]
This study developed a machine learning-based predictive model to evaluate the Groundwater Quality Index (GWQI).
This was achieved with a hybrid machine learning model, LCBoost Fusion.
arXiv Detail & Related papers (2025-02-25T07:47:41Z) - Analyzing Spatio-Temporal Dynamics of Dissolved Oxygen for the River Thames using Superstatistical Methods and Machine Learning [0.0]
We use superstatistical methods and machine learning to predict dissolved oxygen levels in the River Thames.
For long-term forecasting, the Informer model consistently delivers superior performance.
arXiv Detail & Related papers (2025-01-10T16:54:52Z) - WaterQualityNeT: Prediction of Seasonal Water Quality of Nepal Using Hybrid Deep Learning Models [0.0]
This paper presents a hybrid deep learning model for predicting Nepal's seasonal water quality using a small dataset.
The model integrates convolutional neural networks (CNN) and recurrent neural networks (RNN) to exploit temporal and spatial patterns in the data.
arXiv Detail & Related papers (2024-09-17T05:26:59Z) - Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z) - Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs.
Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative.
The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z) - Short-term prediction of stream turbidity using surrogate data and a
meta-model approach [0.0]
We build and compare the ability of dynamic regression (ARIMA), long short-term memory neural nets (LSTM), and generalized additive models (GAM) to forecast stream turbidity.
We construct a meta-model, trained on time-series features of turbidity, to take advantage of the strengths of each model over different time points.
Our findings indicate that temperature and light-associated variables, for example underwater illuminance, may hold promise as cost-effective surrogates of turbidity.
arXiv Detail & Related papers (2022-10-11T23:05:32Z) - Generalizing electrocardiogram delineation: training convolutional
neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent.
This article has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based on probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces.
Second, two novel segmentation-based loss functions have been developed, which aim to enforce the prediction of an exact number of independent structures and to produce closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z) - Test-time Batch Statistics Calibration for Covariate Shift [66.7044675981449]
We propose to adapt the deep models to the novel environment during inference.
We present a general formulation $\alpha$-BN to calibrate the batch statistics.
We also present a novel loss function to form a unified test time adaptation framework Core.
arXiv Detail & Related papers (2021-10-06T08:45:03Z) - Artificial Intelligence Hybrid Deep Learning Model for Groundwater Level
Prediction Using MLP-ADAM [0.0]
In this paper, a multi-layer perceptron is applied to simulate groundwater level.
The adaptive moment estimation algorithm is also used for this purpose.
Results indicate that deep learning algorithms can demonstrate a high accuracy prediction.
arXiv Detail & Related papers (2021-07-29T10:11:45Z) - Instance Segmentation of Microscopic Foraminifera [0.0629976670819788]
We present a deep learning-based instance segmentation model for classifying, detecting, and segmenting microscopic foraminifera.
Our model is based on the Mask R-CNN architecture, using model weight parameters that have been learned on the COCO detection dataset.
arXiv Detail & Related papers (2021-05-15T10:46:22Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to $19\%$.
arXiv Detail & Related papers (2020-10-22T02:28:11Z) - Automatic sleep stage classification with deep residual networks in a
mixed-cohort setting [63.52264764099532]
We developed a novel deep neural network model to assess the generalizability of several large-scale cohorts.
Overall classification accuracy improved with increasing fractions of training data.
arXiv Detail & Related papers (2020-08-21T10:48:35Z) - Statistical Downscaling of Temperature Distributions from the Synoptic
Scale to the Mesoscale Using Deep Convolutional Neural Networks [0.0]
One of the promising applications is developing a statistical surrogate model that converts the output images of low-resolution dynamic models to high-resolution images.
Our study evaluates a surrogate model that downscales synoptic temperature fields to mesoscale temperature fields every 6 hours.
If the surrogate models are implemented at short time intervals, they will provide high-resolution weather forecast guidance or environment emergency alerts at low cost.
arXiv Detail & Related papers (2020-07-20T06:24:08Z)