AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment
- URL: http://arxiv.org/abs/2512.20538v1
- Date: Tue, 23 Dec 2025 17:29:08 GMT
- Title: AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment
- Authors: Anna Šárová Mikeštíková, Médéric Fourmy, Martin Cífka, Josef Sivic, Vladimir Petrik
- Abstract summary: We introduce AlignPose, a 6D object pose estimation method that aggregates information from multiple RGB views. The key component of this approach is a new multi-view feature-metric refinement specifically designed for object pose. It optimizes a single, consistent world-frame object pose by minimizing the feature discrepancy between on-the-fly rendered object features and observed image features.
- Score: 18.198789096671245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-view RGB model-based object pose estimation methods achieve strong generalization but are fundamentally limited by depth ambiguity, clutter, and occlusions. Multi-view pose estimation methods have the potential to solve these issues, but existing works rely on precise single-view pose estimates or lack generalization to unseen objects. We address these challenges via the following three contributions. First, we introduce AlignPose, a 6D object pose estimation method that aggregates information from multiple extrinsically calibrated RGB views and does not require any object-specific training or symmetry annotation. Second, the key component of this approach is a new multi-view feature-metric refinement specifically designed for object pose. It optimizes a single, consistent world-frame object pose by minimizing the feature discrepancy between on-the-fly rendered object features and observed image features across all views simultaneously. Third, we report extensive experiments on four datasets (YCB-V, T-LESS, ITODD-MV, HouseCat6D) using the BOP benchmark evaluation and show that AlignPose outperforms other published methods, especially on challenging industrial datasets where multiple views are readily available in practice.
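The multi-view feature-metric refinement described in the abstract can be illustrated with a toy sketch: a single shared world-frame pose is refined by minimizing stacked feature residuals across all calibrated views simultaneously. Everything below is a hedged stand-in, not the AlignPose implementation: the 6D pose is reduced to a 3-vector, and each view's on-the-fly feature renderer is replaced by a fixed linear map (`A`, `b` are hypothetical).

```python
import numpy as np

# Toy sketch of multi-view feature-metric pose refinement (an illustration,
# NOT the AlignPose implementation). The pose is simplified to a 3-vector
# and the per-view differentiable feature renderer is stood in by a linear
# map; all names here (A, b, residuals) are hypothetical.
rng = np.random.default_rng(0)
n_views, feat_dim = 4, 8
true_pose = np.array([0.3, -0.1, 0.5])

# Per-view stand-ins for "rendered object features as a function of pose".
A = [rng.normal(size=(feat_dim, 3)) for _ in range(n_views)]
b = [rng.normal(size=feat_dim) for _ in range(n_views)]
# "Observed image features" in each view, generated from the true pose.
observed = [A[v] @ true_pose + b[v] for v in range(n_views)]

def residuals(pose):
    # Stack the feature discrepancies of all views, so one consistent
    # world-frame pose is optimized against every view at once.
    return np.concatenate([A[v] @ pose + b[v] - observed[v]
                           for v in range(n_views)])

# Gauss-Newton iterations on the single shared pose.
pose = np.zeros(3)
for _ in range(5):
    J = np.vstack(A)  # Jacobian of the stacked residuals w.r.t. the pose
    pose = pose - np.linalg.solve(J.T @ J, J.T @ residuals(pose))

print(np.round(pose, 3))  # recovers true_pose
```

Because the stand-in renderer is linear, Gauss-Newton converges in one step here; the point of the sketch is only the structure of the objective, i.e. one pose shared by all views rather than per-view estimates fused afterwards.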
Related papers
- PoseGAM: Robust Unseen Object Pose Estimation via Geometry-Aware Multi-View Reasoning [49.66437612420291]
PoseGAM is a geometry-aware multi-view framework that directly predicts object pose from a query image and multiple template images. We construct a large-scale synthetic dataset containing more than 190k objects under diverse environmental conditions.
arXiv Detail & Related papers (2025-12-11T17:29:25Z) - One2Any: One-Reference 6D Pose Estimation for Any Object [98.50085481362808]
6D object pose estimation remains challenging for many applications due to dependencies on complete 3D models, multi-view images, or training limited to specific object categories. We propose a novel method, One2Any, that estimates the relative 6-degrees-of-freedom (DOF) object pose using only a single reference and a single query RGB-D image. Experiments on multiple benchmark datasets demonstrate that our model generalizes well to novel objects, achieving state-of-the-art accuracy and even rivaling methods that require multi-view or CAD inputs, at a fraction of the compute.
arXiv Detail & Related papers (2025-05-07T03:54:59Z) - BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation [81.24160191975781]
This paper presents a general RGB-based approach for object pose estimation, specifically designed to address challenges in sparse-view settings. To overcome these limitations, we introduce the corner points of the object bounding box as an intermediate representation of the object pose. The 3D object corners can be reliably recovered from sparse input views, while the 2D corner points in the target view are estimated through a novel reference-based approach.
arXiv Detail & Related papers (2025-04-10T17:58:35Z) - Active 6D Pose Estimation for Textureless Objects using Multi-View RGB Frames [10.859307261818362]
Estimating the 6D pose of textureless objects from RGB images is an important problem in robotics. We propose a comprehensive active perception framework for estimating the 6D poses of textureless objects using only RGB images.
arXiv Detail & Related papers (2025-03-05T18:28:32Z) - Generalizable Single-view Object Pose Estimation by Two-side Generating and Matching [19.730504197461144]
We present a novel generalizable object pose estimation method to determine the object pose using only one RGB image.
Our method offers generalization to unseen objects without extensive training, operates with a single reference image of the object, and eliminates the need for 3D object models or multiple views of the object.
arXiv Detail & Related papers (2024-11-24T14:31:50Z) - POPE: 6-DoF Promptable Pose Estimation of Any Object, in Any Scene, with One Reference [72.32413378065053]
We propose a general paradigm for object pose estimation, called Promptable Object Pose Estimation (POPE).
POPE enables zero-shot 6DoF object pose estimation for any target object in any scene, while only a single reference is adopted as the support view.
Comprehensive experimental results demonstrate that POPE exhibits unrivaled robust performance in zero-shot settings.
arXiv Detail & Related papers (2023-05-25T05:19:17Z) - Multi-view object pose estimation from correspondence distributions and epipolar geometry [0.0]
We present a multi-view pose estimation method which aggregates learned 2D-3D distributions from multiple views for both the initial estimate and optional refinement.
Our method reduces pose estimation errors by 80-91% compared to the best single-view method, and we present state-of-the-art results on T-LESS with four views, even compared with methods using five and eight views.
arXiv Detail & Related papers (2022-10-03T13:30:40Z) - Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z) - CosyPose: Consistent multi-view multi-object 6D pose estimation [48.097599674329004]
First, we present a single-view single-object 6D pose estimation method, which we use to generate 6D object pose hypotheses.
Second, we develop a robust method for matching individual 6D object pose hypotheses across different input images.
Third, we develop a method for global scene refinement given multiple object hypotheses and their correspondences across views.
arXiv Detail & Related papers (2020-08-19T14:11:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.