FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation
- URL: http://arxiv.org/abs/2409.16600v1
- Date: Wed, 25 Sep 2024 03:54:01 GMT
- Title: FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation
- Authors: Jingyi Tang, Gu Wang, Zeyu Chen, Shengquan Li, Xiu Li, Xiangyang Ji,
- Abstract summary: We introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs)
Our framework relies solely on the 3D model and RGB images, alleviating the need for any real pose annotations or other-modality data like depths.
We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements compared to state-of-the-art methods.
- Score: 65.01601309903971
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Although methods for estimating the pose of objects in indoor scenes have achieved great success, the pose estimation of underwater objects remains challenging due to difficulties brought by the complex underwater environment, such as degraded illumination, blurring, and the substantial cost of obtaining real annotations. In response, we introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs). Essentially, we first train a frequency-aware flow-based pose estimator on synthetic data, where an FFT-based augmentation approach is proposed to facilitate the network in capturing domain-invariant features and target domain styles from a frequency perspective. Further, we perform self-supervised training by enforcing flow-aided multi-level consistencies to adapt it to the real-world underwater environment. Our framework relies solely on the 3D model and RGB images, alleviating the need for any real pose annotations or other-modality data like depths. We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements compared to state-of-the-art methods. Code is available at github.com/tjy0703/FAFA.
Related papers
- Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation [68.81887041766373]
We introduce a diffusion-based paradigm for domain-generalized 9-DoF object pose estimation.
We propose an effective diffusion model to redefine 9-DoF object pose estimation from a generative perspective.
We show that our method achieves state-of-the-art domain generalization performance.
arXiv Detail & Related papers (2025-02-04T17:46:34Z) - Sonar-based Deep Learning in Underwater Robotics: Overview, Robustness and Challenges [0.46873264197900916]
The predominant use of sonar in underwater environments, characterized by limited training data and inherent noise, poses challenges to model robustness.
This paper studies sonar-based perception task models, such as classification, object detection, segmentation, and SLAM.
It systematizes sonar-based state-of-the-art datasets, simulators, and robustness methods such as neural network verification, out-of-distribution, and adversarial attacks.
arXiv Detail & Related papers (2024-12-16T15:03:08Z) - UW-SDF: Exploiting Hybrid Geometric Priors for Neural SDF Reconstruction from Underwater Multi-view Monocular Images [63.32490897641344]
We propose a framework for reconstructing target objects from multi-view underwater images based on neural SDF.
We introduce hybrid geometric priors to optimize the reconstruction process, markedly enhancing the quality and efficiency of neural SDF reconstruction.
arXiv Detail & Related papers (2024-10-10T16:33:56Z) - A Sonar-based AUV Positioning System for Underwater Environments with Low Infrastructure Density [2.423370951696279]
We present a novel real-time sonar-based global positioning algorithm for AUVs (Autonomous Underwater Vehicles) designed for environments with a sparse distribution of human-made assets.
Preliminary experiments carried out on a simulated environment resembling a real underwater plant provided promising results.
arXiv Detail & Related papers (2024-05-03T09:53:28Z) - ADOD: Adaptive Domain-Aware Object Detection with Residual Attention for
Underwater Environments [1.2624532490634643]
This research presents ADOD, a novel approach to address domain generalization in underwater object detection.
Our method enhances the model's ability to generalize across diverse and unseen domains, ensuring robustness in various underwater environments.
arXiv Detail & Related papers (2023-12-11T19:20:56Z) - PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with
Dual-Discriminators [120.06891448820447]
How to obtain clear and visually pleasant images has become a common concern of people.
The task of underwater image enhancement (UIE) has also emerged as the times require.
In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN.
Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z) - Fully Self-Supervised Depth Estimation from Defocus Clue [79.63579768496159]
We propose a self-supervised framework that estimates depth purely from a sparse focal stack.
We show that our framework circumvents the needs for the depth and AIF image ground-truth, and receives superior predictions.
arXiv Detail & Related papers (2023-03-19T19:59:48Z) - Model-Based Underwater 6D Pose Estimation from RGB [1.9160624126555885]
We propose an approach that leverages 2D object detection to reliably compute 6D pose estimates in different underwater scenarios.
All objects and scenes are made available in an open-source dataset that includes annotations for object detection and pose estimation.
arXiv Detail & Related papers (2023-02-14T04:27:03Z) - SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater
Robots [16.242924916178282]
This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots.
Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images.
arXiv Detail & Related papers (2020-11-12T08:17:21Z) - Leveraging Photometric Consistency over Time for Sparsely Supervised
Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.