Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks
- URL: http://arxiv.org/abs/2502.03877v1
- Date: Thu, 06 Feb 2025 08:48:34 GMT
- Title: Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks
- Authors: Yuhui Jin, Yaqiong Zhang, Zheyuan Xu, Wenqing Zhang, Jingyu Xu,
- Abstract summary: This study proposes an improved 6D object detection and pose estimation pipeline based on the existing 6D-VNet framework.
By leveraging the strengths of HTC's multi-stage refinement process and HRNet's ability to maintain high-resolution representations, our approach significantly improves detection accuracy and pose estimation precision.
- Score: 7.403403805442553
- License:
- Abstract: In the field of computer vision, 6D object detection and pose estimation are critical for applications such as robotics, augmented reality, and autonomous driving. Traditional methods often struggle with achieving high accuracy in both object detection and precise pose estimation simultaneously. This study proposes an improved 6D object detection and pose estimation pipeline based on the existing 6D-VNet framework, enhanced by integrating a Hybrid Task Cascade (HTC) and a High-Resolution Network (HRNet) backbone. By leveraging the strengths of HTC's multi-stage refinement process and HRNet's ability to maintain high-resolution representations, our approach significantly improves detection accuracy and pose estimation precision. Furthermore, we introduce advanced post-processing techniques and a novel model integration strategy that collectively contribute to superior performance on public and private benchmarks. Our method demonstrates substantial improvements over state-of-the-art models, making it a valuable contribution to the domain of 6D object detection and pose estimation.
Related papers
- 6DOPE-GS: Online 6D Object Pose Estimation using Gaussian Splatting [7.7145084897748974]
We present 6DOPE-GS, a novel method for online 6D object pose estimation & tracking with a single RGB-D camera.
We show that 6DOPE-GS matches the performance of state-of-the-art baselines for model-free simultaneous 6D pose tracking and reconstruction.
We also demonstrate the method's suitability for live, dynamic object tracking and reconstruction in a real-world setting.
arXiv Detail & Related papers (2024-12-02T14:32:19Z) - FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation [0.0]
6D object pose estimation involves determining the three-dimensional translation and rotation of an object within a scene.
Current models, both classical and deep-learning-based, often struggle with the trade-off between accuracy and latency.
We employ several techniques to reduce the model size and improve inference time.
arXiv Detail & Related papers (2024-09-18T12:30:02Z) - ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation [5.117781843071097]
In medical and industrial domains, providing guidance for assembly processes can be critical to ensure efficiency and safety.
In order to enable in-situ visualization, 6D pose estimation can be leveraged to identify the correct location for an augmentation.
We build upon the strengths of YOLOv8, a real-time capable object detection framework, to address the challenges of 6D pose estimation in combination with assembly state detection.
arXiv Detail & Related papers (2024-03-25T03:30:37Z) - Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery [0.0]
This study addresses the challenge of accurate 6D pose estimation in Augmented Reality (AR)
We propose a novel approach that strategically decomposes the estimation of z-axis translation and focal length.
This methodology not only streamlines the 6D pose estimation process but also significantly enhances the accuracy of 3D object overlaying in AR settings.
arXiv Detail & Related papers (2024-03-20T09:22:22Z) - Instance-aware Multi-Camera 3D Object Detection with Structural Priors
Mining and Self-Boosting Learning [93.71280187657831]
Camera-based bird-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
arXiv Detail & Related papers (2023-12-13T09:24:42Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Rigidity-Aware Detection for 6D Object Pose Estimation [60.88857851869196]
Most recent 6D object pose estimation methods first use object detection to obtain 2D bounding boxes before actually regressing the pose.
We propose a rigidity-aware detection method exploiting the fact that, in 6D pose estimation, the target objects are rigid.
Key to the success of our approach is a visibility map, which we propose to build using a minimum barrier distance between every pixel in the bounding box and the box boundary.
arXiv Detail & Related papers (2023-03-22T09:02:54Z) - FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open set problem; the few-shot 6D object poses estimation: estimating the 6D pose of an unknown object by a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z) - Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z) - Self6D: Self-Supervised Monocular 6D Object Pose Estimation [114.18496727590481]
We propose the idea of monocular 6D pose estimation by means of self-supervised learning.
We leverage recent advances in neural rendering to further self-supervise the model on unannotated real RGB-D data.
arXiv Detail & Related papers (2020-04-14T13:16:36Z) - Object 6D Pose Estimation with Non-local Attention [29.929911622127502]
We propose a network that integrate 6D object pose parameter estimation into the object detection framework.
The proposed method reaches the state-of-the-art performance on the YCB-video and the Linemod datasets.
arXiv Detail & Related papers (2020-02-20T14:23:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.