SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey
- URL: http://arxiv.org/abs/2411.00172v2
- Date: Thu, 07 Nov 2024 04:41:32 GMT
- Title: SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey
- Authors: Kien X. Nguyen, Fengchun Qiao, Arthur Trembanis, Xi Peng
- Abstract summary: We introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across five geological layers.
The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, and 696K detailed language descriptions.
- Abstract: A major obstacle to the advancement of machine learning models in marine science, particularly in sonar imagery analysis, is the scarcity of AI-ready datasets. While there have been efforts to make AI-ready sonar image datasets publicly available, they suffer from limitations in environment setting and scale. To bridge this gap, we introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across five geological layers, curated in collaboration with marine scientists. We further extend the dataset to SeafloorGenAI by incorporating a language component in order to facilitate the development of both vision- and language-capable machine learning models for sonar imagery. The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, 696K detailed language descriptions, and approximately 7M question-answer pairs. By making our data processing source code publicly available, we aim to engage the marine science community to enrich the data pool and inspire the machine learning community to develop more robust models. This collaborative approach will enhance the capabilities and applications of our datasets within both fields.
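The abstract pairs each sonar image with a segmentation mask, a language description, and question-answer pairs. A minimal sketch of how one such multi-modal record might be organized is below; the field names and file paths are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical sketch of a SeafloorAI-style record pairing vision and
# language annotations. Field names and paths are assumptions for
# illustration only, not the published schema.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SeafloorRecord:
    sonar_image: str                      # path to a sonar image tile
    segmentation_mask: str                # path to its annotated mask
    description: str                      # detailed language description
    qa_pairs: List[Tuple[str, str]] = field(default_factory=list)

# One hypothetical record combining all four modalities from the abstract.
record = SeafloorRecord(
    sonar_image="surveys/042/tile_0001.png",
    segmentation_mask="surveys/042/tile_0001_mask.png",
    description="Rippled sandy seafloor with scattered boulders.",
    qa_pairs=[("What is the dominant substrate?", "Sand.")],
)
print(len(record.qa_pairs))
```

A structure like this would let a vision-only model consume only the image and mask fields, while a vision-language model could additionally train on the descriptions and question-answer pairs.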
Related papers
- Introducing VaDA: Novel Image Segmentation Model for Maritime Object Segmentation Using New Dataset [3.468621550644668]
The maritime shipping industry is undergoing rapid evolution driven by advances in computer vision and artificial intelligence (AI).
Object recognition in maritime environments faces challenges such as light reflection, interference, intense lighting, and varied weather conditions.
Existing AI recognition models and datasets have limited suitability for composing autonomous navigation systems.
arXiv Detail & Related papers (2024-07-12T05:48:53Z) - BenthicNet: A global compilation of seafloor images for deep learning applications [25.466405216505166]
BenthicNet is a global compilation of seafloor imagery.
An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments.
A large deep learning model was trained on this compilation and preliminary results suggest it has utility for automating large and small-scale image analysis tasks.
arXiv Detail & Related papers (2024-05-08T17:37:57Z) - Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching [60.645802236700035]
Navigating drones through natural language commands remains challenging due to the dearth of accessible multi-modal datasets.
We introduce GeoText-1652, a new natural language-guided geo-localization benchmark.
This dataset is systematically constructed through an interactive human-computer process.
arXiv Detail & Related papers (2023-11-21T17:52:30Z) - An Open Hyperspectral Dataset with Sea-Land-Cloud Ground-Truth from the HYPSO-1 Satellite [0.0]
The HYPSO-1 Sea-Land-Cloud-Labeled dataset is an open dataset with 200 diverse hyperspectral images from the HYPSO-1 mission.
38 of these images from different countries include pixel-level ground-truth labels, totaling about 25 million spectral signatures labeled as sea, land, or cloud.
arXiv Detail & Related papers (2023-08-25T21:35:22Z) - A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning [70.14372215250535]
Recent studies in Vision-and-Language Navigation (VLN) train RL agents to execute natural-language navigation instructions in photorealistic environments.
Given the scarcity of human instruction data and limited diversity in the training environments, these agents still struggle with complex language grounding and spatial language understanding.
We take 500+ indoor environments captured in densely sampled 360-degree panoramas, construct navigation trajectories through these panoramas, and generate a visually grounded instruction for each trajectory.
The resulting dataset of 4.2M instruction-trajectory pairs is two orders of magnitude larger than existing human-annotated datasets.
arXiv Detail & Related papers (2022-10-06T17:59:08Z) - Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal, multi-temporal dataset composed of publicly available remote sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z) - FathomNet: A global underwater image training set for enabling artificial intelligence in the ocean [0.0]
Ocean-going platforms are integrating high-resolution camera feeds for observation and navigation, producing a deluge of visual data.
Recent advances in machine learning enable fast, sophisticated analysis of visual data, but have had limited success in the oceanographic world.
We will demonstrate how machine learning models trained on FathomNet data can be applied across different institutional video data.
arXiv Detail & Related papers (2021-09-29T18:08:42Z) - Paradigm selection for Data Fusion of SAR and Multispectral Sentinel data applied to Land-Cover Classification [63.072664304695465]
In this letter, four data fusion paradigms, based on Convolutional Neural Networks (CNNs), are analyzed and implemented.
The goal is to provide a systematic procedure for choosing the data fusion framework that yields the best classification results.
The procedure has been validated for land-cover classification, but it can be transferred to other cases.
arXiv Detail & Related papers (2021-06-18T11:36:54Z) - REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to sustain the modeling of relationships among objects and grasps.
Our dataset is collected in both forms of 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z) - DeepSatData: Building large scale datasets of satellite images for training machine learning models [77.17638664503215]
This report presents design considerations for automatically generating satellite imagery datasets for training machine learning models.
We discuss issues faced from the point of view of deep neural network training and evaluation.
arXiv Detail & Related papers (2021-04-28T15:13:12Z) - FathomNet: An underwater image training database for ocean exploration and discovery [0.0]
FathomNet is a novel baseline image training set optimized to accelerate development of modern, intelligent, and automated analysis of underwater imagery.
To date, there are more than 80,000 images and 106,000 localizations for 233 different classes, including midwater and benthic organisms.
While prediction results on this new dataset are promising, our findings indicate that a larger dataset is ultimately needed for ocean exploration.
arXiv Detail & Related papers (2020-06-30T21:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.