Unveiling 3D Ocean Biogeochemical Provinces: A Machine Learning Approach for Systematic Clustering and Validation
- URL: http://arxiv.org/abs/2504.18181v1
- Date: Fri, 25 Apr 2025 08:55:40 GMT
- Title: Unveiling 3D Ocean Biogeochemical Provinces: A Machine Learning Approach for Systematic Clustering and Validation
- Authors: Yvonne Jenniges, Maike Sonnewald, Sebastian Maneth, Are Olsen, Boris P. Koch,
- Abstract summary: The aim was to objectively define regions of the North Atlantic. About 300 million measured salinity, temperature, and oxygen, nitrate, phosphate and silicate concentration values served as input.
- Score: 1.2932412290302258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Defining ocean regions and water masses helps to understand marine processes and can serve downstream tasks such as defining marine protected areas. However, such definitions are often the result of subjective decisions, potentially producing misleading, unreproducible results. Here, the aim was to objectively define regions of the North Atlantic. For this, a data-driven, systematic machine learning approach was applied to generate and validate ocean clusters employing external, internal and relative validation techniques. About 300 million measured salinity, temperature, and oxygen, nitrate, phosphate and silicate concentration values served as input for various clustering methods (KMeans, agglomerative Ward, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN)). Uniform Manifold Approximation and Projection (UMAP) emphasised (dis-)similarities in the data while reducing dimensionality. Based on a systematic validation of the considered clustering methods and their hyperparameters, the results showed that UMAP-DBSCAN best represented the data. To address stochastic variability, 100 UMAP-DBSCAN clustering runs were conducted and aggregated using Native Emergent Manifold Interrogation (NEMI), producing a final set of 321 clusters. Reproducibility was evaluated by calculating the ensemble overlap (88.81 ± 1.8%) and the mean grid cell-wise uncertainty estimated by NEMI (15.49 ± 20%). The presented clustering results agreed very well with common water mass definitions. This study revealed a more detailed regionalization compared to previous concepts such as the Longhurst provinces. The applied method is objective, efficient and reproducible and will support future research focusing on biogeochemical differences and changes in oceanic regions.
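The abstract reports an ensemble overlap across repeated clustering runs but does not define it here. The following is a minimal sketch of one plausible way to measure such an overlap, not the authors' or NEMI's procedure. It assumes each run is a list of cluster IDs per grid cell, that cluster IDs are arbitrary between runs, and that the first run serves as the reference. The function names `align_labels`, `overlap_percent` and `ensemble_overlap` are hypothetical.

```python
from collections import Counter, defaultdict


def align_labels(reference, run):
    """Relabel `run` so each of its clusters takes the reference label it overlaps most."""
    votes = defaultdict(Counter)
    for ref_label, run_label in zip(reference, run):
        votes[run_label][ref_label] += 1
    # Majority vote per cluster; several run clusters may map to one reference label.
    mapping = {label: counts.most_common(1)[0][0] for label, counts in votes.items()}
    return [mapping[label] for label in run]


def overlap_percent(reference, run):
    """Percentage of cells whose aligned label equals the reference label."""
    aligned = align_labels(reference, run)
    matches = sum(a == r for a, r in zip(aligned, reference))
    return 100.0 * matches / len(reference)


def ensemble_overlap(runs):
    """Mean and population standard deviation of the overlap with the first run."""
    reference = runs[0]
    scores = [overlap_percent(reference, run) for run in runs[1:]]
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean, variance ** 0.5


# Toy ensemble of three runs over eight grid cells (made-up labels).
runs = [
    [0, 0, 0, 1, 1, 1, 2, 2],  # reference run
    [5, 5, 5, 7, 7, 7, 9, 9],  # same partition, different cluster IDs
    [0, 0, 1, 1, 1, 1, 2, 2],  # one cell assigned to a neighbouring cluster
]
mean, std = ensemble_overlap(runs)
print(f"ensemble overlap: {mean:.2f} +- {std:.2f}%")
```

Because the alignment is a many-to-one majority vote, a run that splits a reference cluster into several smaller ones still scores 100%; a one-to-one matching would be stricter.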
Related papers
- An Enhanced Classification Method Based on Adaptive Multi-Scale Fusion for Long-tailed Multispectral Point Clouds [67.96583737413296]
We propose an enhanced classification method based on adaptive multi-scale fusion for multispectral point clouds (MPCs) with long-tailed distributions. In the training set generation stage, a grid-balanced sampling strategy is designed to reliably generate training samples from sparse labeled datasets. In the feature learning stage, a multi-scale feature fusion module is proposed to fuse shallow features of land-covers at different scales.
arXiv Detail & Related papers (2024-12-16T03:21:20Z)
- Clustering Based on Density Propagation and Subcluster Merging [92.15924057172195]
We propose a density-based node clustering approach that automatically determines the number of clusters and can be applied in both data space and graph space.
Unlike traditional density-based clustering methods, which necessitate calculating the distance between any two nodes, our proposed technique determines density through a propagation process.
arXiv Detail & Related papers (2024-11-04T04:09:36Z)
- Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation [51.66997548477913]
We propose a novel feature-level consistency learning framework named Density-Descending Feature Perturbation (DDFP).
Inspired by the low-density separation assumption in semi-supervised learning, our key insight is that feature density can shed light on the most promising direction for the segmentation classifier to explore.
The proposed DDFP outperforms other designs on feature-level perturbations and shows state-of-the-art performance on both the Pascal VOC and Cityscapes datasets.
arXiv Detail & Related papers (2024-03-11T06:59:05Z)
- Integration of geoelectric and geochemical data using Self-Organizing Maps (SOM) to characterize a landfill [0.0]
The risk of affecting the aquifers for public use is imminent in most cases.
Geoelectric data (resistivity and IP), and surface methane measurements, are integrated and classified using an unsupervised Neural Network.
A precise delimitation of the affected areas in the studied landfill was obtained, integrating the input variables via SOMs.
arXiv Detail & Related papers (2023-09-17T05:38:54Z)
- GFDC: A Granule Fusion Density-Based Clustering with Evidential Reasoning [22.526274021556755]
Density-based clustering algorithms are widely applied because they can detect clusters with arbitrary shapes.
This paper proposes a granule fusion density-based clustering with evidential reasoning (GFDC).
Both local and global densities of samples are measured by a sparse degree metric first.
Then information granules are generated in high-density and low-density regions, assisting in processing clusters with significant density differences.
arXiv Detail & Related papers (2023-05-20T06:27:31Z)
- SALT: Sea lice Adaptive Lattice Tracking -- An Unsupervised Approach to Generate an Improved Ocean Model [72.3183990520267]
We propose SALT: Sea lice Adaptive Lattice Tracking approach for efficient estimation of sea lice dispersion and distribution.
Specifically, an adaptive spatial mesh is generated by merging nodes in the lattice graph of the Ocean Model based on local ocean properties.
The proposed SALT technique shows promise for enhancing proactive aquaculture management through predictive modelling of sea lice infestation pressure maps in a changing climate.
arXiv Detail & Related papers (2021-06-24T17:29:42Z)
- Hyperspectral and Multispectral Classification for Coastal Wetland Using Depthwise Feature Interaction Network [20.896413926049398]
Depthwise Feature Interaction Network (DFINet) is proposed for wetland classification.
DFINet is optimized by coordinating consistency loss, discrimination loss, and classification loss.
Comprehensive experimental results on two hyperspectral and multispectral wetland datasets demonstrate that the proposed DFINet outperforms other competitive methods in terms of overall accuracy.
arXiv Detail & Related papers (2021-06-13T01:56:28Z)
- Surface Warping Incorporating Machine Learning Assisted Domain Likelihood Estimation: A New Paradigm in Mine Geology Modelling and Automation [68.8204255655161]
A Bayesian warping technique has been proposed to reshape modeled surfaces based on geochemical and spatial constraints imposed by newly acquired blasthole data.
This paper focuses on incorporating machine learning in this warping framework to make the likelihood generalizable.
Its foundation is laid by a Bayesian computation in which the geological domain likelihood given the chemistry, p(g|c), plays a similar role to p(y(c)|g).
arXiv Detail & Related papers (2021-02-15T10:37:52Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Learning excursion sets of vector-valued Gaussian random fields for autonomous ocean sampling [0.41998444721319217]
We develop efficient spatial sampling methods for characterizing regions defined by simultaneous exceedances above prescribed thresholds of several responses.
Specifically, we define a design criterion based on uncertainty in the excursions of vector-valued Gaussian random fields.
We demonstrate how this criterion can be used to prioritize sampling efforts at locations that are ambiguous, making exploration more effective.
arXiv Detail & Related papers (2020-07-07T18:23:46Z)
- Fuzziness-based Spatial-Spectral Class Discriminant Information Preserving Active Learning for Hyperspectral Image Classification [0.456877715768796]
This work proposes a novel fuzziness-based method (FLG) that preserves spatial-spectral within-class and between-class discriminant information at both the local and global level.
Experimental results on benchmark HSI datasets demonstrate the effectiveness of the FLG method on Generative, Extreme Learning Machine and Sparse Multinomial Logistic Regression.
arXiv Detail & Related papers (2020-05-28T18:58:11Z)