A comprehensive review of datasets and deep learning techniques for vision in Unmanned Surface Vehicles
- URL: http://arxiv.org/abs/2412.01461v1
- Date: Mon, 02 Dec 2024 12:54:18 GMT
- Title: A comprehensive review of datasets and deep learning techniques for vision in Unmanned Surface Vehicles
- Authors: Linh Trinh, Siegfried Mercelis, Ali Anwar
- Abstract summary: Unmanned Surface Vehicles (USVs) have emerged as a major platform in maritime operations.
USVs can help reduce labor costs, increase safety, save energy, and allow for difficult unmanned tasks in harsh maritime environments.
With the rapid development of USVs, vision tasks such as detection and segmentation have become increasingly important.
- Abstract: Unmanned Surface Vehicles (USVs) have emerged as a major platform in maritime operations, capable of supporting a wide range of applications. USVs can help reduce labor costs, increase safety, save energy, and perform difficult unmanned tasks in harsh maritime environments. With the rapid development of USVs, vision tasks such as detection and segmentation have become increasingly important. Datasets play an important role in encouraging and improving the research and development of reliable vision algorithms for USVs, and a large number of recent studies have focused on releasing vision datasets for USVs. Alongside these datasets, a variety of deep learning techniques have also been studied with a focus on USVs. However, a systematic review covering both datasets and vision techniques is still lacking, so there is no comprehensive picture of the current development of vision on USVs, including its limitations and trends. In this study, we provide a comprehensive review of both USV datasets and deep learning techniques for vision tasks. Our review draws on a large number of vision datasets from USVs. Based on a thorough analysis of current datasets and deep learning techniques, we elaborate several challenges and potential opportunities for research and development in USV vision.
Related papers
- Vision-Language Models for Edge Networks: A Comprehensive Survey [32.05172973290599]
Vision Large Language Models (VLMs) combine visual understanding with natural language processing, enabling tasks like image captioning, visual question answering, and video analysis.
VLMs show impressive capabilities across domains such as autonomous vehicles, smart surveillance, and healthcare.
Their deployment on resource-constrained edge devices remains challenging due to processing power, memory, and energy limitations.
arXiv Detail & Related papers (2025-02-11T14:04:43Z)
- UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking [0.0]
Unmanned Aerial Vehicles (UAVs) have revolutionized the process of gathering and analyzing data in diverse research domains.
UAV datasets consist of various types of data, such as satellite imagery, images captured by drones, and videos.
These datasets play a crucial role in disaster damage assessment, aerial surveillance, object recognition, and tracking.
arXiv Detail & Related papers (2024-09-05T04:47:36Z)
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs [61.143381152739046]
We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach.
Our study uses LLMs and visual instruction tuning as an interface to evaluate various visual representations.
We provide model weights, code, supporting tools, datasets, and detailed instruction-tuning and evaluation recipes.
arXiv Detail & Related papers (2024-06-24T17:59:42Z)
- A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research.
Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
- Collaborative Perception Datasets in Autonomous Driving: A Survey [0.0]
The paper systematically analyzes a variety of datasets, comparing them based on aspects such as diversity, sensor setup, quality, public availability, and their applicability to downstream tasks.
The importance of addressing privacy and security concerns in the development of datasets is emphasized, regarding data sharing and dataset creation.
arXiv Detail & Related papers (2024-04-22T09:36:17Z)
- SynDrone -- Multi-modal UAV Dataset for Urban Scenarios [11.338399194998933]
The scarcity of large-scale real datasets with pixel-level annotations poses a significant challenge to researchers.
We propose a multimodal synthetic dataset containing both images and 3D data taken at multiple flying heights.
The dataset will be made publicly available to support the development of novel computer vision methods targeting UAV applications.
arXiv Detail & Related papers (2023-08-21T06:22:10Z)
- Vision-Language Models for Vision Tasks: A Survey [62.543250338410836]
Vision-Language Models (VLMs) learn rich vision-language correlation from web-scale image-text pairs.
This paper provides a systematic review of visual language models for various visual recognition tasks.
arXiv Detail & Related papers (2023-04-03T02:17:05Z)
- Vision-Centric BEV Perception: A Survey [92.98068828762833]
Vision-centric Bird's Eye View (BEV) perception has garnered significant interest from both industry and academia.
The rapid advancements in deep learning have led to the proposal of numerous methods for addressing vision-centric BEV perception challenges.
This paper compiles and organizes up-to-date knowledge, offering a systematic review and summary of prevalent algorithms.
arXiv Detail & Related papers (2022-08-04T17:53:17Z)
- A Survey on RGB-D Datasets [69.73803123972297]
This paper reviews and categorizes image datasets that include depth information.
We gathered 203 datasets that contain accessible data and grouped them into three categories: scene/objects, body, and medical.
arXiv Detail & Related papers (2022-01-15T05:35:19Z)
- Deep Learning for Embodied Vision Navigation: A Survey [108.13766213265069]
The "embodied visual navigation" problem requires an agent to navigate in a 3D environment relying mainly on its first-person observations.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)