BenthiCat: An opti-acoustic dataset for advancing benthic classification and habitat mapping
- URL: http://arxiv.org/abs/2510.04876v2
- Date: Mon, 13 Oct 2025 18:02:35 GMT
- Title: BenthiCat: An opti-acoustic dataset for advancing benthic classification and habitat mapping
- Authors: Hayat Rajani, Valerio Franchi, Borja Martinez-Clavel Valles, Raimon Ramos, Rafael Garcia, Nuno Gracias
- Abstract summary: This paper introduces a comprehensive multi-modal dataset, comprising about a million side-scan sonar (SSS) tiles collected along the coast of Catalonia (Spain). About 36,000 of the SSS tiles have been manually annotated with segmentation masks to enable supervised fine-tuning of classification models. All the raw sensor data, together with mosaics, are also released to support further exploration and algorithm development.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Benthic habitat mapping is fundamental for understanding marine ecosystems, guiding conservation efforts, and supporting sustainable resource management. Yet, the scarcity of large, annotated datasets limits the development and benchmarking of machine learning models in this domain. This paper introduces a comprehensive multi-modal dataset, comprising about a million side-scan sonar (SSS) tiles collected along the coast of Catalonia (Spain), complemented by bathymetric maps and a set of co-registered optical images from targeted surveys using an autonomous underwater vehicle (AUV). Approximately 36,000 of the SSS tiles have been manually annotated with segmentation masks to enable supervised fine-tuning of classification models. All the raw sensor data, together with mosaics, are also released to support further exploration and algorithm development. To address challenges in multi-sensor data fusion for AUVs, we spatially associate optical images with corresponding SSS tiles, facilitating self-supervised, cross-modal representation learning. Accompanying open-source preprocessing and annotation tools are provided to enhance accessibility and encourage research. This resource aims to establish a standardized benchmark for underwater habitat mapping, promoting advancements in autonomous seafloor classification and multi-sensor integration.
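To make the cross-modal association concrete, below is a minimal sketch of pairing georeferenced optical images with SSS tiles by spatial proximity. This is not the authors' released tooling: the function names, coordinate conventions, and the 5 m matching radius are illustrative assumptions.

```python
# Illustrative sketch only: spatially pairing georeferenced optical images
# with SSS tiles. Names, coordinate conventions, and the matching radius
# are assumptions, not the dataset's released tooling.
import numpy as np
from scipy.spatial import cKDTree

def associate_optical_to_sss(image_positions, tile_centers, max_dist_m=5.0):
    """Match each optical image to its nearest SSS tile center.

    image_positions: (N, 2) array of easting/northing in meters.
    tile_centers:    (M, 2) array of SSS tile centers in meters.
    Returns (image_idx, tile_idx) pairs whose separation is <= max_dist_m.
    """
    tree = cKDTree(tile_centers)
    dists, idxs = tree.query(image_positions, k=1)
    return [(i, int(j)) for i, (d, j) in enumerate(zip(dists, idxs))
            if d <= max_dist_m]

# Synthetic example: ten optical images scattered near some of 100 tiles.
rng = np.random.default_rng(0)
tiles = rng.uniform(0, 1000, size=(100, 2))
images = tiles[:10] + rng.normal(0.0, 1.0, size=(10, 2))
pairs = associate_optical_to_sss(images, tiles)
print(pairs[:3])  # e.g. [(0, 0), (1, 1), (2, 2)]
```

Pairs produced this way could then serve as positives in a cross-modal contrastive objective, with non-matching tiles acting as negatives.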
Related papers
- Exploring the Underwater World Segmentation without Extra Training [55.291219073365546]
We introduce AquaOV255, the first large-scale and fine-grained underwater segmentation dataset. We also present Earth2Ocean, a training-free open-vocabulary (OV) segmentation framework.
arXiv Detail & Related papers (2025-11-11T07:22:56Z)
- A Method for Identifying Farmland System Habitat Types Based on the Dynamic-Weighted Feature Fusion Network Model [0.0]
This study developed an annotated ultra-high-resolution remote sensing image dataset encompassing 15 categories of cultivated land system habitats. We propose a Dynamic-Weighted Feature Fusion Network (DWFF-Net) to extract foundational features. The proposed model achieves a mean Intersection over Union (mIoU) of 0.6979 and an F1-score of 0.8049, outperforming the baseline network by 0.021 and 0.0161, respectively.
arXiv Detail & Related papers (2025-11-11T02:44:38Z)
- Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method [54.461213497603154]
Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities. Nuplan-Occ is the largest occupancy dataset to date, constructed from the widely used Nuplan benchmark. We develop a unified framework that jointly synthesizes high-quality occupancy, multi-view videos, and LiDAR point clouds.
arXiv Detail & Related papers (2025-10-27T03:52:45Z)
- Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection [54.1960918379255]
Neptune-X is a data-centric generative-selection framework for maritime object detection. X-to-Maritime is a multi-modality-conditioned generative model that synthesizes diverse and realistic maritime scenes. Our approach sets a new benchmark in maritime scene synthesis, significantly improving detection accuracy.
arXiv Detail & Related papers (2025-09-25T04:59:02Z)
- TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery. Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism. TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
- EarthMind: Leveraging Cross-Sensor Data for Advanced Earth Observation Interpretation with a Unified Multimodal LLM [103.7537991413311]
Earth Observation (EO) data analysis is vital for monitoring environmental and human dynamics. Recent Multimodal Large Language Models (MLLMs) show potential in EO understanding but remain restricted to single-sensor inputs. We propose EarthMind, a unified vision-language framework that handles both single- and cross-sensor inputs.
arXiv Detail & Related papers (2025-06-02T13:36:05Z)
- A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning [5.299218284699214]
High-performance segmentation models are challenged by annotation scarcity and variability across sensors, illumination, and geography. This paper introduces a domain generalization approach to leveraging emerging geospatial foundation models by combining soft-alignment pseudo-labeling with source-to-target generative pre-training. Experiments with hyperspectral and multispectral remote sensing datasets confirm our method's effectiveness in enhancing adaptability and segmentation.
arXiv Detail & Related papers (2025-05-02T19:52:02Z)
- SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey [11.642711706384212]
We introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across 5 geological layers.
The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, and 696K detailed language descriptions.
arXiv Detail & Related papers (2024-10-31T19:37:47Z)
- Towards Natural Image Matting in the Wild via Real-Scenario Prior [69.96414467916863]
We propose a new matting dataset based on the COCO dataset, namely COCO-Matting.
COCO-Matting comprises an extensive collection of 38,251 human instance-level alpha mattes in complex natural scenarios.
For network architecture, the proposed feature-aligned transformer learns to extract fine-grained edge and transparency features.
The proposed matte-aligned decoder aims to segment matting-specific objects and convert coarse masks into high-precision mattes.
arXiv Detail & Related papers (2024-10-09T06:43:19Z)
- SeePerSea: Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles [10.732732686425308]
This paper introduces the first publicly accessible labeled multi-modal perception dataset for autonomous maritime navigation. It focuses on in-water obstacles within the aquatic environment to enhance situational awareness for Autonomous Surface Vehicles (ASVs).
arXiv Detail & Related papers (2024-04-29T04:00:19Z)
- Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D Semantic Segmentation (Navya3DSeg), with a diverse label space corresponding to a large-scale, production-grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z)
- SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots [16.242924916178282]
This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots.
Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images.
arXiv Detail & Related papers (2020-11-12T08:17:21Z)
- Towards Adaptive Benthic Habitat Mapping [9.904746542801838]
We show how a habitat model can be used to plan efficient Autonomous Underwater Vehicle (AUV) surveys.
A Bayesian neural network is used to predict visually-derived habitat classes when given broad-scale bathymetric data.
We demonstrate how the network's structured uncertainty estimates can be utilised to improve the model with fewer samples (a minimal illustrative sketch of this style of uncertainty estimation follows this list).
arXiv Detail & Related papers (2020-06-20T01:03:41Z)
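Since the entry above hinges on uncertainty-aware habitat prediction, the following is a minimal sketch of Monte Carlo dropout, one common approximation to the predictive uncertainty a Bayesian neural network provides. The toy architecture, feature dimension, and class count are assumptions for illustration, not the cited paper's model.

```python
# Hedged sketch of Monte Carlo dropout as a stand-in for Bayesian-NN
# uncertainty. Architecture, input dimension, and class count are
# illustrative assumptions.
import torch
import torch.nn as nn

class HabitatNet(nn.Module):
    """Toy classifier mapping broad-scale bathymetric features to habitat classes."""
    def __init__(self, in_dim=8, n_classes=5, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=50):
    """Average class probabilities over stochastic passes; variance ~ uncertainty."""
    model.train()  # keep dropout active at inference time
    probs = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
    )
    return probs.mean(dim=0), probs.var(dim=0)

model = HabitatNet()
features = torch.randn(4, 8)      # four synthetic bathymetric feature vectors
mean_p, var_p = mc_dropout_predict(model, features)
print(mean_p.argmax(dim=-1))      # predicted habitat classes
print(var_p.max(dim=-1).values)   # per-sample peak predictive variance
```

High-variance predictions flag regions where additional optical samples would most improve the model, which is the intuition behind the cited paper's survey-planning use of uncertainty.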
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.