Related papers: RSOD: Reliability-Guided Sonar Image Object Detection with Extremely Limited Labels

URL: http://arxiv.org/abs/2601.12715v1
Date: Mon, 19 Jan 2026 04:37:34 GMT
Title: RSOD: Reliability-Guided Sonar Image Object Detection with Extremely Limited Labels
Authors: Chengzhou Li, Ping Guo, Guanchen Meng, Qi Jia, Jinyuan Liu, Zhu Liu, Xiaokang Liu, Yu Liu, Zhongxuan Luo, Xin Fan
Abstract summary: Object detection in sonar images is a key technology in underwater detection systems. We propose a teacher-student framework called RSOD to fully learn the characteristics of sonar images. We introduce an object mixed pseudo-label method to tackle the shortage of labeled data in sonar images.
Score: 31.951604817203656
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Object detection in sonar images is a key technology in underwater detection systems. Compared to natural images, sonar images contain fewer texture details and are more susceptible to noise, making it difficult for non-experts to distinguish subtle differences between classes. This leads to their inability to provide precise annotation data for sonar images. Therefore, designing effective object detection methods for sonar images with extremely limited labels is particularly important. To address this, we propose a teacher-student framework called RSOD, which aims to fully learn the characteristics of sonar images and develop a pseudo-label strategy suitable for these images to mitigate the impact of limited labels. First, RSOD calculates a reliability score by assessing the consistency of the teacher's predictions across different views. To leverage this score, we introduce an object mixed pseudo-label method to tackle the shortage of labeled data in sonar images. Finally, we optimize the performance of the student by implementing a reliability-guided adaptive constraint. By taking full advantage of unlabeled data, the student can perform well even in situations with extremely limited labels. Notably, on the UATD dataset, our method, using only 5% of labeled data, achieves results that can compete against those of our baseline algorithm trained on 100% labeled data. We also collected a new dataset to provide more valuable data for research in the field of sonar.
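
Illustrative sketch (not from the paper): the abstract says the reliability score comes from the consistency of the teacher's predictions across different views, but it gives no formula. The Python below is one plausible reading, under loudly stated assumptions: consistency is measured as the best IoU between a box predicted in one view and any box predicted in a second view, with the second view's boxes already mapped back into the first view's coordinates. The names `iou`, `reliability_scores`, and `filter_pseudo_labels`, and the 0.5 threshold, are hypothetical and are not taken from RSOD.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates of a shared reference frame.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def reliability_scores(view_a: List[Box], view_b: List[Box]) -> List[float]:
    """Assumed score: for each box in view_a, its best IoU with any box in view_b.

    A box the teacher predicts consistently in both views scores near 1;
    a box with no counterpart in the other view scores 0.
    """
    return [max((iou(a, b) for b in view_b), default=0.0) for a in view_a]


def filter_pseudo_labels(
    boxes: List[Box], scores: List[float], threshold: float = 0.5
) -> List[Box]:
    """Keep only boxes whose (assumed) reliability score reaches the threshold."""
    return [box for box, score in zip(boxes, scores) if score >= threshold]


# Hypothetical teacher predictions for two views of the same sonar image.
teacher_view_a = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
teacher_view_b = [(0.0, 0.0, 10.0, 10.0)]

scores = reliability_scores(teacher_view_a, teacher_view_b)
reliable_boxes = filter_pseudo_labels(teacher_view_a, scores)
```

In this toy input the first box appears in both views and is kept, while the second has no counterpart and is dropped. The same per-box score could also serve as a loss weight instead of a hard filter, which may be closer in spirit to the reliability-guided adaptive constraint the abstract mentions, though the abstract does not specify that either.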

Related papers

Learning from Noisy Pseudo-labels for All-Weather Land Cover Mapping [20.979328369582486]
SAR imagery lacks detailed information and is plagued by significant speckle noise. Recent efforts have resorted to annotating paired optical-SAR images to generate pseudo-labels. We introduce a more precise method for generating pseudo-labels by incorporating semi-supervised learning alongside a novel image resolution alignment augmentation.
arXiv Detail & Related papers (2025-04-18T04:24:47Z)
Learning Camouflaged Object Detection from Noisy Pseudo Label [60.9005578956798]
This paper introduces the first weakly semi-supervised Camouflaged Object Detection (COD) method. It aims for budget-efficient and high-precision camouflaged object segmentation with an extremely limited number of fully labeled images. We propose a noise correction loss that facilitates the model's learning of correct pixels in the early learning stage. When using only 20% of fully labeled data, our method shows superior performance over the state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T04:53:51Z)
Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection [157.18560601328534]
RichSem is a robust method to learn rich semantics from coarse locations without the need for accurate bounding boxes. We add a semantic branch to our detector to learn these soft semantics and enhance feature representations for long-tailed object detection. Our method achieves state-of-the-art performance without requiring complex training and testing procedures.
arXiv Detail & Related papers (2023-10-18T17:59:41Z)
Semi-supervised Ranking for Object Image Blur Assessment [37.778436378659656]
We establish a large-scale realistic face image blur assessment dataset with reliable labels. We propose a method to obtain the blur scores only with the pairwise rank labels as supervision. To further improve the performance, we propose a self-supervised method based on quadruplet ranking consistency.
arXiv Detail & Related papers (2022-07-13T09:49:22Z)
Boosting Facial Expression Recognition by A Semi-Supervised Progressive Teacher [54.50747989860957]
We propose a semi-supervised learning algorithm named Progressive Teacher (PT) to utilize reliable FER datasets as well as large-scale unlabeled expression images for effective training. Experiments on the widely used databases RAF-DB and FERPlus validate the effectiveness of our method, which achieves state-of-the-art performance with an accuracy of 89.57% on RAF-DB.
arXiv Detail & Related papers (2022-05-28T07:47:53Z)
Deep Image Retrieval is not Robust to Label Noise [0.0]
We show that image retrieval methods are less robust to label noise than image classification ones. For the first time, we investigate different types of label noise specific to image retrieval tasks.
arXiv Detail & Related papers (2022-05-23T11:04:09Z)
Mixed Supervision Learning for Whole Slide Image Classification [88.31842052998319]
We propose a mixed supervision learning framework for super high-resolution images. During the patch training stage, this framework can make use of coarse image-level labels to refine self-supervised learning. A comprehensive strategy is proposed to suppress pixel-level false positives and false negatives.
arXiv Detail & Related papers (2021-07-02T09:46:06Z)
Distilling effective supervision for robust medical image segmentation with noisy labels [21.68138582276142]
We propose a novel framework to address segmenting with noisy labels by distilling effective supervision information from both pixel and image levels. In particular, we explicitly estimate the uncertainty of every pixel as pixel-wise noise estimation. We present an image-level robust learning method to accommodate more information as a complement to pixel-level learning.
arXiv Detail & Related papers (2021-06-21T13:33:38Z)
Attention-Aware Noisy Label Learning for Image Classification [97.26664962498887]
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision. The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr. This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z)
Data-driven Meta-set Based Fine-Grained Visual Classification [61.083706396575295]
We propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition. Specifically, guided by a small amount of clean meta-set, we train a selection net in a meta-learning manner to distinguish in- and out-of-distribution noisy images.
arXiv Detail & Related papers (2020-08-06T03:04:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.