Inland Waterway Object Detection in Multi-environment: Dataset and Approach
- URL: http://arxiv.org/abs/2504.04835v1
- Date: Mon, 07 Apr 2025 08:45:00 GMT
- Title: Inland Waterway Object Detection in Multi-environment: Dataset and Approach
- Authors: Shanshan Wang, Haixiang Xu, Hui Feng, Xiaoqian Wang, Pei Song, Sijie Liu, Jianhua He
- Abstract summary: This paper introduces the Multi-environment Inland Waterway Vessel Dataset (MEIWVD). MEIWVD comprises 32,478 high-quality images from diverse scenarios, including sunny, rainy, foggy, and artificial lighting conditions. This paper proposes a scene-guided image enhancement module to adaptively improve water surface images based on environmental conditions.
- Score: 12.00732943849236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of deep learning in intelligent ship visual perception relies heavily on rich image data. However, dedicated datasets for inland waterway vessels remain scarce, limiting the adaptability of visual perception systems in complex environments. Inland waterways, characterized by narrow channels, variable weather, and urban interference, pose significant challenges to object detection systems based on existing datasets. To address these issues, this paper introduces the Multi-environment Inland Waterway Vessel Dataset (MEIWVD), comprising 32,478 high-quality images from diverse scenarios, including sunny, rainy, foggy, and artificial lighting conditions. MEIWVD covers common vessel types in the Yangtze River Basin, emphasizing diversity, sample independence, environmental complexity, and multi-scale characteristics, making it a robust benchmark for vessel detection. Leveraging MEIWVD, this paper proposes a scene-guided image enhancement module to improve water surface images based on environmental conditions adaptively. Additionally, a parameter-limited dilated convolution enhances the representation of vessel features, while a multi-scale dilated residual fusion method integrates multi-scale features for better detection. Experiments show that MEIWVD provides a more rigorous benchmark for object detection algorithms, and the proposed methods significantly improve detector performance, especially in complex multi-environment scenarios.
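The abstract names dilated convolution and a multi-scale dilated residual fusion but gives no implementation details. The block below is only a minimal 1D sketch of the general idea, not the authors' method: the function names are hypothetical, averaging the branches is an assumed fusion rule, and the paper's "parameter-limited" variant is not modeled. It illustrates that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions without adding weights.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded, same-length 1D dilated cross-correlation (odd kernels only)."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    pad = (effective_kernel_size(k, dilation) - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    # Kernel taps are spaced `dilation` samples apart.
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_residual_fusion(signal, kernel, dilations=(1, 2, 3)):
    """Average several dilated branches (assumed rule), then add the input back."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(values) / len(branches) for values in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]
```

In a real detector the same pattern would apply to 2D feature maps with learned kernels; the shared kernel across branches here is a simplification for illustration.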
Related papers
- Learning Underwater Active Perception in Simulation [51.205673783866146]
Turbidity can jeopardise the whole mission as it may prevent correct visual documentation of the inspected structures.
Previous works have introduced methods to adapt to turbidity and backscattering.
We propose a simple yet efficient approach to enable high-quality image acquisition of assets in a broad range of water conditions.
arXiv Detail & Related papers (2025-04-23T06:48:38Z) - WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer [4.768265044725289]
Water surface object detection faces challenges from blurred edges and diverse object scales.
Existing approaches suffer from cross-modal feature conflicts, which negatively affect model robustness.
We propose a robust vision-radar fusion model WS-DETR, which achieves state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2025-04-10T04:16:46Z) - Image-Based Relocalization and Alignment for Long-Term Monitoring of Dynamic Underwater Environments [57.59857784298534]
We propose an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. This method enables robust identification of revisited areas, estimation of rigid transformations, and downstream analysis of ecosystem changes.
arXiv Detail & Related papers (2025-03-06T05:13:19Z) - Real-Time Multi-Scene Visibility Enhancement for Promoting Navigational Safety of Vessels Under Complex Weather Conditions [48.529493393948435]
The visible-light camera has emerged as an essential imaging sensor for marine surface vessels in intelligent waterborne transportation systems.
The visual imaging quality inevitably suffers from several kinds of degradations under complex weather conditions.
We develop a general-purpose multi-scene visibility enhancement method to restore degraded images captured under different weather conditions.
arXiv Detail & Related papers (2024-09-02T23:46:27Z) - AMANet: Advancing SAR Ship Detection with Adaptive Multi-Hierarchical Attention Network [0.5437298646956507]
A novel adaptive multi-hierarchical attention module (AMAM) is proposed to learn multi-scale features and adaptively aggregate salient features from various feature layers.
We first fuse information from adjacent feature layers to enhance the detection of smaller targets, thereby achieving multi-scale feature enhancement.
Thirdly, we present a novel adaptive multi-hierarchical attention network (AMANet) by embedding the AMAM between the backbone network and the feature pyramid network.
arXiv Detail & Related papers (2024-01-24T03:56:33Z) - MuLA-GAN: Multi-Level Attention GAN for Enhanced Underwater Visibility [1.9272863690919875]
We introduce MuLA-GAN, a novel approach that leverages the synergistic power of Generative Adversarial Networks (GANs) and Multi-Level Attention mechanisms for comprehensive underwater image enhancement.
Our model excels in capturing and preserving intricate details in underwater imagery, essential for various applications.
This work not only addresses a significant research gap in underwater image enhancement but also underscores the pivotal role of Multi-Level Attention in enhancing GANs.
arXiv Detail & Related papers (2023-12-25T07:33:47Z) - An Efficient Detection and Control System for Underwater Docking using Machine Learning and Realistic Simulation: A Comprehensive Approach [5.039813366558306]
This work compares different deep-learning architectures to perform underwater docking detection and classification.
A Generative Adversarial Network (GAN) is used to do image-to-image translation, converting the Gazebo simulation image into an underwater-looking image.
Results show an improvement of 20% in the high turbidity scenarios regardless of the underwater currents.
arXiv Detail & Related papers (2023-11-02T18:10:20Z) - Learning Heavily-Degraded Prior for Underwater Object Detection [59.5084433933765]
This paper seeks transferable prior knowledge from detector-friendly images.
It is based on statistical observations that the heavily degraded regions of detector-friendly (DFUI) and underwater images have evident feature distribution gaps.
Our method, with higher speed and fewer parameters, still performs better than transformer-based detectors.
arXiv Detail & Related papers (2023-08-24T12:32:46Z) - Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z) - M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z) - A Parallel Down-Up Fusion Network for Salient Object Detection in Optical Remote Sensing Images [82.87122287748791]
We propose a novel Parallel Down-up Fusion network (PDF-Net) for salient object detection in optical remote sensing images (RSIs).
It takes full advantage of the in-path low- and high-level features and cross-path multi-resolution features to distinguish diversely scaled salient objects and suppress the cluttered backgrounds.
Experiments on the ORSSD dataset demonstrate that the proposed network is superior to the state-of-the-art approaches both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-10-02T05:27:57Z)