Related papers: FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation

URL: http://arxiv.org/abs/2409.16600v1
Date: Wed, 25 Sep 2024 03:54:01 GMT
Title: FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation
Authors: Jingyi Tang, Gu Wang, Zeyu Chen, Shengquan Li, Xiu Li, Xiangyang Ji
Abstract summary: We introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs). Our framework relies solely on the 3D model and RGB images, alleviating the need for any real pose annotations or other-modality data like depths. We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements compared to state-of-the-art methods.
Score: 65.01601309903971
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Although methods for estimating the pose of objects in indoor scenes have achieved great success, the pose estimation of underwater objects remains challenging due to difficulties brought by the complex underwater environment, such as degraded illumination, blurring, and the substantial cost of obtaining real annotations. In response, we introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs). Essentially, we first train a frequency-aware flow-based pose estimator on synthetic data, where an FFT-based augmentation approach is proposed to facilitate the network in capturing domain-invariant features and target domain styles from a frequency perspective. Further, we perform self-supervised training by enforcing flow-aided multi-level consistencies to adapt it to the real-world underwater environment. Our framework relies solely on the 3D model and RGB images, alleviating the need for any real pose annotations or other-modality data like depths. We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements compared to state-of-the-art methods. Code is available at github.com/tjy0703/FAFA.
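The abstract mentions an FFT-based augmentation that exposes the network to target-domain styles from a frequency perspective. As a rough illustration only, the sketch below shows one generic way such an augmentation is often done: replacing the low-frequency amplitude spectrum of a synthetic image with that of a real image while keeping the synthetic phase. This is an assumption about the general idea, not the FAFA implementation; the function name `swap_low_freq_amplitude` and the parameter `beta` are hypothetical.

```python
import numpy as np


def swap_low_freq_amplitude(src, tgt, beta=0.05):
    """Give `src` the low-frequency amplitude of `tgt`, keeping the phase of `src`.

    src, tgt: float arrays of identical shape (H, W, C).
    beta: hypothetical knob; the swapped window has half-width
          floor(min(H, W) * beta) around the zero frequency.
    """
    if src.shape != tgt.shape:
        raise ValueError("src and tgt must have the same shape")
    if not 0.0 <= beta < 0.5:
        raise ValueError("beta must be in [0, 0.5)")

    # 2D FFT over the spatial axes, computed per channel.
    fft_src = np.fft.fft2(src, axes=(0, 1))
    fft_tgt = np.fft.fft2(tgt, axes=(0, 1))

    # Split into amplitude and phase; only the source phase is reused.
    amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
    amp_tgt = np.abs(fft_tgt)

    # Shift so the zero frequency sits at (H // 2, W // 2).
    amp_src = np.fft.fftshift(amp_src, axes=(0, 1))
    amp_tgt = np.fft.fftshift(amp_tgt, axes=(0, 1))

    h, w = src.shape[:2]
    b = int(np.floor(min(h, w) * beta))
    ch, cw = h // 2, w // 2

    # Replace a centered window that is symmetric around the zero frequency.
    amp_src[ch - b:ch + b + 1, cw - b:cw + b + 1] = \
        amp_tgt[ch - b:ch + b + 1, cw - b:cw + b + 1]

    # Undo the shift, recombine with the source phase, and invert.
    amp_src = np.fft.ifftshift(amp_src, axes=(0, 1))
    mixed = amp_src * np.exp(1j * pha_src)
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synthetic = rng.random((64, 64, 3))  # stand-in for a rendered image
    real = rng.random((64, 64, 3))       # stand-in for an underwater photo
    augmented = swap_low_freq_amplitude(synthetic, real, beta=0.05)
    print(augmented.shape)
```

Because only the amplitude is exchanged, the object layout carried by the phase of the synthetic image is preserved, while global appearance such as color cast and overall brightness moves toward the real image. In practice the output would typically be clipped to the valid pixel range before use.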

Related papers

Underwater Monocular Metric Depth Estimation: Real-World Benchmarks and Synthetic Fine-Tuning with Vision Foundation Models [0.0]
We present a benchmark of zero-shot and fine-tuned monocular metric depth estimation models on real-world underwater datasets. Our results show that large-scale models trained on terrestrial data (real or synthetic) are effective in in-air settings, but perform poorly underwater. This study presents a detailed evaluation and visualization of monocular metric depth estimation in underwater scenes.
arXiv Detail & Related papers (2025-07-02T21:06:39Z)
Learning Underwater Active Perception in Simulation [51.205673783866146]
Turbidity can jeopardise the whole mission as it may prevent correct visual documentation of the inspected structures. Previous works have introduced methods to adapt to turbidity and backscattering. We propose a simple yet efficient approach to enable high-quality image acquisition of assets in a broad range of water conditions.
arXiv Detail & Related papers (2025-04-23T06:48:38Z)
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation [68.81887041766373]
We introduce a diffusion-based paradigm for domain-generalized 9-DoF object pose estimation. We propose an effective diffusion model to redefine 9-DoF object pose estimation from a generative perspective. We show that our method achieves state-of-the-art domain generalization performance.
arXiv Detail & Related papers (2025-02-04T17:46:34Z)
PIGUIQA: A Physical Imaging Guided Perceptual Framework for Underwater Image Quality Assessment [59.9103803198087]
We propose a Physical Imaging Guided perceptual framework for Underwater Image Quality Assessment (UIQA). By leveraging underwater radiative transfer theory, we integrate physics-based imaging estimations to establish quantitative metrics for these distortions. The proposed model accurately predicts image quality scores and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-12-20T03:31:45Z)
Sonar-based Deep Learning in Underwater Robotics: Overview, Robustness and Challenges [0.46873264197900916]
The predominant use of sonar in underwater environments, characterized by limited training data and inherent noise, poses challenges to model robustness. This paper studies sonar-based perception task models, such as classification, object detection, segmentation, and SLAM. It systematizes sonar-based state-of-the-art datasets, simulators, and robustness methods such as neural network verification, out-of-distribution, and adversarial attacks.
arXiv Detail & Related papers (2024-12-16T15:03:08Z)
UW-SDF: Exploiting Hybrid Geometric Priors for Neural SDF Reconstruction from Underwater Multi-view Monocular Images [63.32490897641344]
We propose a framework for reconstructing target objects from multi-view underwater images based on neural SDF. We introduce hybrid geometric priors to optimize the reconstruction process, markedly enhancing the quality and efficiency of neural SDF reconstruction.
arXiv Detail & Related papers (2024-10-10T16:33:56Z)
TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs [5.6168844664788855]
This work presents TanDepth, a practical, online scale recovery method for obtaining metric depth results from relative estimations at inference-time. Tailored for Unmanned Aerial Vehicle (UAV) applications, our method leverages sparse measurements from Global Digital Elevation Models (GDEM) by projecting them to the camera view. An adaptation to the Cloth Simulation Filter is presented, which allows selecting ground points from the estimated depth map to then correlate with the projected reference points.
arXiv Detail & Related papers (2024-09-08T15:54:43Z)
A Sonar-based AUV Positioning System for Underwater Environments with Low Infrastructure Density [2.423370951696279]
We present a novel real-time sonar-based global positioning algorithm for AUVs (Autonomous Underwater Vehicles) designed for environments with a sparse distribution of human-made assets. Preliminary experiments carried out on a simulated environment resembling a real underwater plant provided promising results.
arXiv Detail & Related papers (2024-05-03T09:53:28Z)
ADOD: Adaptive Domain-Aware Object Detection with Residual Attention for Underwater Environments [1.2624532490634643]
This research presents ADOD, a novel approach to address domain generalization in underwater object detection. Our method enhances the model's ability to generalize across diverse and unseen domains, ensuring robustness in various underwater environments.
arXiv Detail & Related papers (2023-12-11T19:20:56Z)
PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
How to obtain clear and visually pleasant images has become a common concern of people. The task of underwater image enhancement (UIE) has also emerged as the times require. In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN. Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z)
Fully Self-Supervised Depth Estimation from Defocus Clue [79.63579768496159]
We propose a self-supervised framework that estimates depth purely from a sparse focal stack. We show that our framework circumvents the needs for the depth and AIF image ground-truth, and receives superior predictions.
arXiv Detail & Related papers (2023-03-19T19:59:48Z)
Model-Based Underwater 6D Pose Estimation from RGB [1.9160624126555885]
We propose an approach that leverages 2D object detection to reliably compute 6D pose estimates in different underwater scenarios. All objects and scenes are made available in an open-source dataset that includes annotations for object detection and pose estimation.
arXiv Detail & Related papers (2023-02-14T04:27:03Z)
CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation [67.12857074801731]
We introduce a novel method, CPPF++, designed for sim-to-real pose estimation. To address the challenge posed by vote collision, we propose a novel approach that involves modeling the voting uncertainty. We incorporate several innovative modules, including noisy pair filtering, online alignment optimization, and a feature ensemble.
arXiv Detail & Related papers (2022-11-24T03:27:00Z)
SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots [16.242924916178282]
This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots. Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images.
arXiv Detail & Related papers (2020-11-12T08:17:21Z)
Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video. Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses. We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.