An Empirical Study on the Robustness of YOLO Models for Underwater Object Detection
- URL: http://arxiv.org/abs/2509.17561v1
- Date: Mon, 22 Sep 2025 10:55:21 GMT
- Title: An Empirical Study on the Robustness of YOLO Models for Underwater Object Detection
- Authors: Edwine Nabahirwa, Wei Song, Minghua Zhang, Shufan Chen
- Abstract summary: We present one of the first comprehensive evaluations of recent YOLO variants (YOLOv8-YOLOv12) across six simulated underwater environments. Our findings show that YOLOv12 delivers the strongest overall performance but is highly vulnerable to noise. Experiments revealed that image counts and instance frequency primarily drive detection performance, while object appearance exerts only a secondary influence.
- Score: 5.084022830578536
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Underwater object detection (UOD) remains a critical challenge in computer vision due to underwater distortions, which degrade low-level features and compromise the reliability of even state-of-the-art detectors. While YOLO models have become the backbone of real-time object detection, little work has systematically examined their robustness under these uniquely challenging conditions. This raises a critical question: are YOLO models genuinely robust when operating under the chaotic and unpredictable conditions of underwater environments? In this study, we present one of the first comprehensive evaluations of recent YOLO variants (YOLOv8-YOLOv12) across six simulated underwater environments. Using a unified dataset of 10,000 annotated images from DUO and Roboflow100, we not only benchmark model robustness but also analyze how distortions affect key low-level features such as texture, edges, and color. Our findings show that (1) YOLOv12 delivers the strongest overall performance but is highly vulnerable to noise, and (2) noise disrupts edge and texture features, explaining the poor detection performance on noisy images. Class imbalance is a persistent challenge in UOD; our experiments revealed that (3) image counts and instance frequency primarily drive detection performance, while object appearance exerts only a secondary influence. Finally, we evaluated two lightweight training-aware strategies: noise-aware sample injection, which improves robustness in both noisy and real-world conditions, and fine-tuning with advanced enhancement, which boosts accuracy in enhanced domains but slightly lowers performance on the original data, demonstrating strong potential for domain adaptation. Together, these insights provide practical guidance for building resilient and cost-efficient UOD systems.
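The noise-aware sample injection strategy mentioned in the abstract can be sketched as a simple augmentation step: mix a fraction of synthetically distorted copies into the clean training set so the detector is exposed to corrupted inputs during training. The specific distortion below (a blue-green color cast plus Gaussian noise) and the 20% injection ratio are illustrative assumptions, not the paper's actual corruption pipeline or hyperparameters.

```python
import numpy as np

def add_underwater_distortion(img, rng):
    """Crude simulated underwater corruption for an HxWx3 float image in
    [0, 1]: a blue-green color cast plus additive Gaussian noise. An
    illustrative stand-in, not the paper's actual corruption pipeline."""
    cast = np.array([0.6, 0.9, 1.0])                 # attenuate red channel
    noisy = img * cast + rng.normal(0.0, 0.05, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def inject_noisy_samples(images, ratio=0.2, seed=0):
    """Noise-aware sample injection: append distorted copies of a random
    subset so the detector sees corrupted inputs during training."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(images), size=int(len(images) * ratio), replace=False)
    return list(images) + [add_underwater_distortion(images[i], rng) for i in picks]

clean = [np.full((4, 4, 3), 0.5) for _ in range(10)]     # toy "clean" set
train_set = inject_noisy_samples(clean, ratio=0.2)
print(len(train_set))  # 12: 10 clean images + 2 distorted copies
```

In a real pipeline the bounding-box annotations of each injected copy are reused unchanged, since the distortion is purely photometric.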
Related papers
- Adaptive Enhancement and Dual-Pooling Sequential Attention for Lightweight Underwater Object Detection with YOLOv10 [0.0]
This manuscript introduces a streamlined yet robust framework for underwater object detection, grounded in the YOLOv10 architecture. The proposed method integrates a Multi-Stage Adaptive Enhancement module to improve image quality and a Dual-Pooling Sequential Attention mechanism to strengthen multi-scale feature representation.
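The paper's exact Dual-Pooling Sequential Attention design is not described in this summary; as a rough illustration only, a CBAM-style channel-attention block that combines average- and max-pooled descriptors (a common "dual pooling" pattern) might look like:

```python
import numpy as np

def dual_pool_channel_attention(feat, w1, w2):
    """Channel attention from both average- and max-pooled descriptors,
    passed through a shared two-layer MLP and a sigmoid gate (CBAM-style
    sketch; the paper's actual module may differ).
    feat: (C, H, W); w1: (C // r, C); w2: (C, C // r), reduction r."""
    avg = feat.mean(axis=(1, 2))                         # (C,) average pool
    mx = feat.max(axis=(1, 2))                           # (C,) max pool
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)         # shared MLP, ReLU
    gate = 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))   # sigmoid gate, (C,)
    return feat * gate[:, None, None]                    # reweight channels

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 5, 5))                    # toy feature map
w1 = 0.1 * rng.standard_normal((2, 8))                   # reduction r = 4
w2 = 0.1 * rng.standard_normal((8, 2))
out = dual_pool_channel_attention(feat, w1, w2)
print(out.shape)  # (8, 5, 5)
```

Combining both pooled statistics lets the gate react to average channel activity and to strong localized responses, which a single pooling operator would miss.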
arXiv Detail & Related papers (2026-03-04T07:39:57Z) - RoSe: Robust Self-supervised Stereo Matching under Adverse Weather Conditions [58.37558408672509]
We propose a robust self-supervised training paradigm consisting of two key steps: robust self-supervised scene correspondence learning and adverse weather distillation. Experiments demonstrate the effectiveness and versatility of our proposed solution, which outperforms existing state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2025-09-23T15:41:40Z) - Underwater Waste Detection Using Deep Learning A Performance Comparison of YOLOv7 to 10 and Faster RCNN [0.0]
We investigated the performance of five cutting-edge object recognition algorithms: YOLOv7, YOLOv8, YOLOv9, YOLOv10, and Faster Region-based Convolutional Neural Network (R-CNN). YOLOv8 outperformed the others, with a mean Average Precision (mAP) of 80.9%, indicating a significant performance advantage. These findings highlight the YOLOv8 model's potential as an effective tool in the global fight against pollution.
arXiv Detail & Related papers (2025-07-25T05:36:37Z) - RoHOI: Robustness Benchmark for Human-Object Interaction Detection [78.18946529195254]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z) - YOLO-Based Pipeline Monitoring in Challenging Visual Environments [0.0]
Condition monitoring of subsea pipelines in low-visibility underwater environments poses significant challenges due to turbidity, light distortion, and image degradation. Traditional visual-based inspection systems often fail to provide reliable data for mapping, object recognition, or defect detection in such conditions. This study explores the integration of advanced artificial intelligence (AI) techniques to enhance image quality, detect pipeline structures, and support autonomous fault diagnosis.
arXiv Detail & Related papers (2025-06-30T14:47:30Z) - Sonar-based Deep Learning in Underwater Robotics: Overview, Robustness and Challenges [0.46873264197900916]
The predominant use of sonar in underwater environments, characterized by limited training data and inherent noise, poses challenges to model robustness. This paper studies sonar-based perception task models, such as classification, object detection, segmentation, and SLAM. It systematizes sonar-based state-of-the-art datasets, simulators, and robustness methods such as neural network verification, out-of-distribution detection, and adversarial attacks.
arXiv Detail & Related papers (2024-12-16T15:03:08Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks in the era of generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - Exploring the Physical World Adversarial Robustness of Vehicle Detection [13.588120545886229]
Adversarial attacks can compromise the robustness of real-world detection models.
We propose an innovative instant-level data generation pipeline using the CARLA simulator.
Our findings highlight diverse model performances under adversarial conditions.
arXiv Detail & Related papers (2023-08-07T11:09:12Z) - OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images [59.51657161097337]
OOD-CV-v2 is a benchmark dataset that includes out-of-distribution examples of 10 object categories in terms of pose, shape, texture, context and the weather conditions.
In addition to this novel dataset, we contribute extensive experiments using popular baseline methods.
arXiv Detail & Related papers (2023-04-17T20:39:25Z) - Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, generating hard pseudo-labels by a teacher model on unlabeled data as supervisory signals.
We analyze the challenges these methods meet with the empirical experiment results.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
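SED's scale-equivariant objective is not spelled out in this summary; as background, the standard temperature-scaled KL distillation loss that such teacher-student frameworks typically build on can be sketched as:

```python
import numpy as np

def softmax(logits, t=1.0):
    z = np.asarray(logits, dtype=float) / t
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_kl(student_logits, teacher_logits, t=2.0):
    """Temperature-scaled KL divergence KL(teacher || student), the usual
    soft-label distillation loss, rescaled by t^2 as in Hinton et al.
    A generic building block, not SED's full objective."""
    p = softmax(teacher_logits, t)       # soft teacher targets
    q = softmax(student_logits, t)       # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))) * t * t)

print(distill_kl([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0: perfect match
```

A higher temperature softens both distributions, exposing the teacher's relative preferences among non-target classes, which is where much of the distillation signal lives.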
arXiv Detail & Related papers (2022-03-23T07:33:37Z) - Underwater Object Classification and Detection: first results and open challenges [1.1549572298362782]
This work reviews the problem of object detection in underwater environments.
We analyse and quantify the shortcomings of conventional state-of-the-art (SOTA) algorithms.
arXiv Detail & Related papers (2022-01-04T04:54:08Z) - Assessing out-of-domain generalization for robust building damage detection [78.6363825307044]
Building damage detection can be automated by applying computer vision techniques to satellite imagery.
Models must be robust to a shift in distribution between disaster imagery available for training and the images of the new event.
We argue that future work should focus on the OOD regime instead.
arXiv Detail & Related papers (2020-11-20T10:30:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.