Fusing Multi- and Hyperspectral Satellite Data for Harmful Algal Bloom Monitoring with Self-Supervised and Hierarchical Deep Learning
- URL: http://arxiv.org/abs/2510.02763v2
- Date: Fri, 10 Oct 2025 19:04:32 GMT
- Title: Fusing Multi- and Hyperspectral Satellite Data for Harmful Algal Bloom Monitoring with Self-Supervised and Hierarchical Deep Learning
- Authors: Nicholas LaHaye, Kelly M. Luis, Michelle M. Gierach,
- Abstract summary: We present a self-supervised machine learning framework for detecting and mapping harmful algal bloom severity and speciation. By fusing reflectance data from operational instruments with TROPOMI solar-induced fluorescence (SIF), our framework, called SIT-FUSE, generates HAB severity and speciation products without requiring per-instrument labeled datasets. The framework employs self-supervised representation learning and hierarchical deep clustering to segment phytoplankton concentrations and speciations into interpretable classes, validated against in-situ data from the Gulf of Mexico and Southern California.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a self-supervised machine learning framework for detecting and mapping harmful algal bloom (HAB) severity and speciation using multi-sensor satellite data. By fusing reflectance data from operational instruments (VIIRS, MODIS, Sentinel-3, PACE) with TROPOMI solar-induced fluorescence (SIF), our framework, called SIT-FUSE, generates HAB severity and speciation products without requiring per-instrument labeled datasets. The framework employs self-supervised representation learning and hierarchical deep clustering to segment phytoplankton concentrations and speciations into interpretable classes, validated against in-situ data from the Gulf of Mexico and Southern California (2018-2025). Results show strong agreement with total phytoplankton, Karenia brevis, Alexandrium spp., and Pseudo-nitzschia spp. measurements. This work advances scalable HAB monitoring in label-scarce environments while enabling exploratory analysis via hierarchical embeddings: a critical step toward operationalizing self-supervised learning for global aquatic biogeochemistry.
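The fuse-then-cluster idea in the abstract can be illustrated with a toy sketch. This is not the SIT-FUSE implementation: the abstract describes learned self-supervised representations and deep clustering, whereas the sketch below swaps in plain k-means on standardized, concatenated features. The function names (`fuse`, `kmeans`, `hierarchical_labels`), the cluster counts, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fuse(reflectance, sif):
    """Stack per-pixel reflectance bands with a SIF value and standardize.

    Hypothetical helper: simple concatenation stands in for the
    learned fusion described in the abstract.
    """
    stacked = np.hstack([reflectance, sif.reshape(-1, 1)])
    return (stacked - stacked.mean(axis=0)) / (stacked.std(axis=0) + 1e-9)


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one integer label per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels


def hierarchical_labels(features, k_coarse=3, k_fine=2):
    """Two-level clustering: coarse classes, then sub-classes within each."""
    coarse = kmeans(features, k_coarse)
    fine = np.zeros(len(features), dtype=int)
    for c in range(k_coarse):
        idx = np.flatnonzero(coarse == c)
        if len(idx) >= k_fine:
            fine[idx] = kmeans(features[idx], k_fine, seed=c + 1)
    return coarse, fine


# Synthetic stand-in data: 200 pixels, 5 reflectance bands, 1 SIF value each.
rng = np.random.default_rng(42)
reflectance = rng.random((200, 5))
sif = rng.random(200)

features = fuse(reflectance, sif)
coarse, fine = hierarchical_labels(features)
```

In a setup like this, the coarse labels would loosely correspond to broad severity classes and the fine labels to subdivisions within each class. Mapping clusters to physical meaning would still require comparison against in-situ measurements, as the abstract describes.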
Related papers
- REDNET-ML: A Multi-Sensor Machine Learning Pipeline for Harmful Algal Bloom Risk Detection Along the Omani Coast [0.0]
Harmful algal blooms (HABs) can threaten coastal infrastructure, fisheries, and desalination-dependent water supplies. This project develops a reproducible machine learning pipeline for HAB risk detection along the Omani coastline.
arXiv Detail & Related papers (2026-03-04T15:36:52Z) - BenthiCat: An opti-acoustic dataset for advancing benthic classification and habitat mapping [0.0]
This paper introduces a thorough multi-modal dataset, comprising about a million side-scan sonar (SSS) tiles collected along the coast of Catalonia (Spain). About 36000 of the SSS tiles have been manually annotated with segmentation masks to enable supervised fine-tuning of classification models. All the raw sensor data, together with mosaics, are also released to support further exploration and algorithm development.
arXiv Detail & Related papers (2025-10-06T15:00:20Z) - Hallucination Detection in LLMs with Topological Divergence on Attention Graphs [64.74977204942199]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models. We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z) - Weakly Supervised Framework Considering Multi-temporal Information for Large-scale Cropland Mapping with Satellite Imagery [11.157693752084214]
This study presents a weakly supervised framework considering multi-temporal information for large-scale cropland mapping. We extract high-quality labels according to their consistency among global land cover (GLC) products to construct the supervised learning signal. The proposed framework has been experimentally validated for strong adaptability across three study areas in large-scale cropland mapping.
arXiv Detail & Related papers (2024-11-27T16:11:52Z) - MPT: A Large-scale Multi-Phytoplankton Tracking Benchmark [36.37530623015916]
We propose a benchmark dataset, Multiple Phytoplankton Tracking (MPT), which covers diverse background information and variations in motion during observation. The dataset includes 27 species of phytoplankton and zooplankton, 14 different backgrounds to simulate diverse and complex underwater environments, and a total of 140 videos. We introduce an additional feature extractor to predict the residuals of the standard feature extractor's output, and compute multi-scale frame-to-frame similarity based on features from different layers of the extractor.
arXiv Detail & Related papers (2024-10-22T04:57:28Z) - TopoFR: A Closer Look at Topology Alignment on Face Recognition [58.45515807380505]
We propose TopoFR, a novel FR model that leverages a topological structure alignment strategy called PTSA and a hard sample mining strategy named SDE. PTSA uses persistent homology to align the topological structures of the input and latent spaces, effectively preserving the structure information and improving the generalization performance of the FR model. Experimental results on popular face benchmarks demonstrate the superiority of our TopoFR over the state-of-the-art methods.
arXiv Detail & Related papers (2024-10-14T14:58:30Z) - LiDAR data acquisition and processing for ecology applications [0.0]
Terrestrial laser scanners (TLS) have been used in ecology to reconstruct the 3D structure of vegetation.
The orientation of LiDAR was modified to make observations in the vertical plane and a motor was integrated for its rotation.
From the data generated, histograms of point density variation along the vegetation height were created.
arXiv Detail & Related papers (2024-01-11T13:03:27Z) - FLOGA: A machine learning ready dataset, a benchmark and a novel deep learning model for burnt area mapping with Sentinel-2 [41.28284355136163]
Wildfires pose significant threats to human and animal lives, ecosystems, and socio-economic stability.
In this work, we create and introduce a machine-learning ready dataset we name FLOGA (Forest wiLdfire Observations for the Greek Area).
This dataset is unique as it comprises satellite imagery acquired before and after a wildfire event.
We use FLOGA to provide a thorough comparison of multiple Machine Learning and Deep Learning algorithms for the automatic extraction of burnt areas.
arXiv Detail & Related papers (2023-11-06T18:42:05Z) - Spectral Analysis of Marine Debris in Simulated and Observed Sentinel-2/MSI Images using Unsupervised Classification [0.0]
This study uses Radiative Transfer Model (RTM) simulated data and data from the Multispectral Instrument (MSI) of the Sentinel-2 mission in combination with machine learning algorithms.
The results indicate that the spectral behavior of pollutants is influenced by factors such as the type of polymer and pixel coverage percentage.
These insights can guide future research in remote sensing applications for detecting marine plastic pollution.
arXiv Detail & Related papers (2023-06-26T18:46:47Z) - Deep Omni-supervised Learning for Rib Fracture Detection from Chest Radiology Images [41.62893318123283]
Deep learning (DL)-based rib fracture detection has shown promise of playing an important role in preventing mortality and improving patient outcome.
DL-based object detection models require a huge number of bounding box annotations.
Annotating medical data is time-consuming and expertise-demanding, making obtaining a large amount of fine-grained annotations extremely infeasible.
We present a novel omni-supervised object detection network, ORF-Netv2, to leverage as much available supervision as possible.
arXiv Detail & Related papers (2023-06-23T05:36:03Z) - Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z) - Efficient Unsupervised Learning for Plankton Images [12.447149371717]
Monitoring plankton populations in situ is fundamental to preserve the aquatic ecosystem.
The adoption of machine learning algorithms to classify such data may be affected by the significant cost of manual annotation.
We propose an efficient unsupervised learning pipeline to provide accurate classification of plankton microorganisms.
arXiv Detail & Related papers (2022-09-14T15:33:16Z) - Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra [68.8204255655161]
We focus on unsupervised techniques for analyzing spectral data from transiting exoplanets.
We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations.
We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes.
arXiv Detail & Related papers (2022-01-07T22:26:33Z) - Unsupervised Scale-consistent Depth Learning from Video [131.3074342883371]
We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training.
Thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system.
The proposed hybrid Pseudo-RGBD SLAM shows compelling results in KITTI, and it generalizes well to the KAIST dataset without additional training.
arXiv Detail & Related papers (2021-05-25T02:17:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.