Edge-guided Representation Learning for Underwater Object Detection
- URL: http://arxiv.org/abs/2306.00440v1
- Date: Thu, 1 Jun 2023 08:29:44 GMT
- Title: Edge-guided Representation Learning for Underwater Object Detection
- Authors: Linhui Dai, Hong Liu, Pinhao Song, Hao Tang, Runwei Ding, Shengquan Li
- Abstract summary: Underwater object detection is crucial for marine economic development, environmental protection, and the planet's sustainable development.
Main challenges of this task arise from low-contrast, small objects, and mimicry of aquatic organisms.
We propose an Edge-guided Representation Learning Network, termed ERL-Net, that aims to achieve discriminative representation learning and aggregation.
- Score: 15.832646455660278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Underwater object detection (UOD) is crucial for marine economic development,
environmental protection, and the planet's sustainable development. The main
challenges of this task arise from low-contrast, small objects, and mimicry of
aquatic organisms. The key to addressing these challenges is to focus the model
on obtaining more discriminative information. We observe that the edges of
underwater objects are highly distinctive, so the objects can be distinguished
from low-contrast or mimicry environments by their edges. Motivated by this observation, we
propose an Edge-guided Representation Learning Network, termed ERL-Net, that
aims to achieve discriminative representation learning and aggregation under
the guidance of edge cues. Firstly, we introduce an edge-guided attention
module to model the explicit boundary information, which generates more
discriminative features. Secondly, a feature aggregation module is proposed to
aggregate the multi-scale discriminative features by regrouping them into three
levels, effectively aggregating global and local information for locating and
recognizing underwater objects. Finally, we propose a wide and asymmetric
receptive field block to enable features to have a wider receptive field,
allowing the model to focus on more small object information. Comprehensive
experiments on three challenging underwater datasets show that our method
achieves superior performance on the UOD task.
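The idea of using edge cues to guide features, as described in the abstract, can be illustrated with a minimal sketch. This is not ERL-Net's edge-guided attention module, whose details are not given on this page. It assumes a fixed Sobel filter as the edge detector and a simple residual reweighting of a single-channel feature map. The names `sobel_edge_map` and `edge_guided_reweight` are hypothetical.

```python
from typing import List

Grid = List[List[float]]

# 3x3 Sobel kernels for horizontal and vertical intensity gradients.
# Assumption: a fixed Sobel filter stands in for a learned edge branch.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edge_map(image: Grid) -> Grid:
    """Return gradient magnitudes scaled to [0, 1]; border pixels stay 0."""
    h, w = len(image), len(image[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            edges[y][x] = (gx * gx + gy * gy) ** 0.5
    peak = max(max(row) for row in edges)
    if peak > 0:
        edges = [[v / peak for v in row] for row in edges]
    return edges


def edge_guided_reweight(features: Grid, edges: Grid) -> Grid:
    """Hypothetical residual spatial attention: feature * (1 + edge weight)."""
    return [[f * (1.0 + e) for f, e in zip(f_row, e_row)]
            for f_row, e_row in zip(features, edges)]


if __name__ == "__main__":
    # Toy 5x5 image with a vertical step edge between columns 1 and 2.
    image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
    features = [[1.0] * 5 for _ in range(5)]
    edges = sobel_edge_map(image)
    print(edge_guided_reweight(features, edges)[2])  # [1.0, 2.0, 2.0, 1.0, 1.0]
```

In this toy case the features next to the step edge are doubled while flat regions are left unchanged, which is the general effect an edge-guided attention map is meant to have on boundary regions.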
Related papers
- Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts [25.187507472845944]
This study proposes a recognition framework called M3 to enhance the model's ability to capture robust patterns.
In this framework, an auxiliary task that focuses on target properties, such as estimating target size, is designed.
M3 incorporates multi-expert and multi-gate mechanisms, allowing for the allocation of distinct parameter spaces to various underwater signals.
arXiv Detail & Related papers (2024-11-05T03:52:36Z)
- Mining and Transferring Feature-Geometry Coherence for Unsupervised Point Cloud Registration [23.909530805458605]
Motivated by this observation, we propose a novel unsupervised registration method termed INTEGER to incorporate high-level contextual information for reliable pseudo-label mining.
Specifically, we propose the Feature-Geometry Coherence Mining module to dynamically adapt the teacher for each mini-batch of data during training and discover reliable pseudo-labels.
Lastly, we introduce a Mixed-Density Student to learn density-invariant features, addressing challenges related to density variation and low overlap in the outdoor scenario.
arXiv Detail & Related papers (2024-11-04T07:57:44Z)
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- MuLA-GAN: Multi-Level Attention GAN for Enhanced Underwater Visibility [1.9272863690919875]
We introduce MuLA-GAN, a novel approach that leverages the synergistic power of Generative Adversarial Networks (GANs) and Multi-Level Attention mechanisms for comprehensive underwater image enhancement.
Our model excels in capturing and preserving intricate details in underwater imagery, essential for various applications.
This work not only addresses a significant research gap in underwater image enhancement but also underscores the pivotal role of Multi-Level Attention in enhancing GANs.
arXiv Detail & Related papers (2023-12-25T07:33:47Z)
- ADOD: Adaptive Domain-Aware Object Detection with Residual Attention for Underwater Environments [1.2624532490634643]
This research presents ADOD, a novel approach to address domain generalization in underwater object detection.
Our method enhances the model's ability to generalize across diverse and unseen domains, ensuring robustness in various underwater environments.
arXiv Detail & Related papers (2023-12-11T19:20:56Z)
- Bidirectional Knowledge Reconfiguration for Lightweight Point Cloud Analysis [74.00441177577295]
Point cloud analysis faces computational system overhead, limiting its application on mobile or edge devices.
This paper explores feature distillation for lightweight point cloud models.
We propose bidirectional knowledge reconfiguration to distill informative contextual knowledge from the teacher to the student.
arXiv Detail & Related papers (2023-10-08T11:32:50Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Salient Object Detection via Integrity Learning [104.13483971954233]
Integrity is the concept of highlighting all parts that belong to a certain salient object.
To facilitate integrity learning for salient object detection, we design a novel Integrity Cognition Network (ICON).
ICON explores three important components to learn strong integrity features.
arXiv Detail & Related papers (2021-01-19T14:53:12Z)
- Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment [73.9469267445146]
First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as AI2Thor pose significant sample-efficiency challenges for reinforcement learning agents.
We show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task.
arXiv Detail & Related papers (2020-10-28T19:27:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.