SE(3)-PoseFlow: Estimating 6D Pose Distributions for Uncertainty-Aware Robotic Manipulation
- URL: http://arxiv.org/abs/2511.01501v1
- Date: Mon, 03 Nov 2025 12:11:35 GMT
- Title: SE(3)-PoseFlow: Estimating 6D Pose Distributions for Uncertainty-Aware Robotic Manipulation
- Authors: Yufeng Jin, Niklas Funk, Vignesh Prasad, Zechu Li, Mathias Franzius, Jan Peters, Georgia Chalvatzaki
- Abstract summary: We propose a novel probabilistic framework that leverages flow matching on the SE(3) manifold for estimating 6D object pose distributions. We achieve state-of-the-art results on Real275, YCB-V, and LM-O, and demonstrate how our sample-based pose estimates can be leveraged in downstream robotic manipulation tasks.
- Score: 21.433019604658366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object pose estimation is a fundamental problem in robotics and computer vision, yet it remains challenging due to partial observability, occlusions, and object symmetries, which inevitably lead to pose ambiguity and multiple hypotheses consistent with the same observation. While deterministic deep networks achieve impressive performance under well-constrained conditions, they are often overconfident and fail to capture the multi-modality of the underlying pose distribution. To address these challenges, we propose a novel probabilistic framework that leverages flow matching on the SE(3) manifold for estimating 6D object pose distributions. Unlike existing methods that regress a single deterministic output, our approach models the full pose distribution with a sample-based estimate and enables reasoning about uncertainty in ambiguous cases such as symmetric objects or severe occlusions. We achieve state-of-the-art results on Real275, YCB-V, and LM-O, and demonstrate how our sample-based pose estimates can be leveraged in downstream robotic manipulation tasks such as active perception for disambiguating uncertain viewpoints or guiding grasp synthesis in an uncertainty-aware manner.
Related papers
- Adaptive Dual Uncertainty Optimization: Boosting Monocular 3D Object Detection under Test-Time Shifts [80.32933059529135]
Test-Time Adaptation (TTA) methods have emerged to adapt to target distributions during inference. We propose Dual Uncertainty Optimization (DUO), the first TTA framework designed to jointly minimize both uncertainties for robust M3OD. In parallel, we design a semantic-aware normal field constraint that preserves geometric coherence in regions with clear semantic cues.
arXiv Detail & Related papers (2025-08-28T07:09:21Z)
- Uncertainty-aware Probabilistic 3D Human Motion Forecasting via Invertible Networks [6.671593490919892]
3D human motion forecasting aims to enable autonomous applications. We propose ProbHMI, which introduces invertible networks to parameterize poses in a disentangled latent space. A forecasting module then explicitly predicts future latent distributions, allowing effective uncertainty quantification.
arXiv Detail & Related papers (2025-07-19T17:02:07Z)
- ProPLIKS: Probablistic 3D human body pose estimation [7.397323069796547]
We present a novel approach for 3D human pose estimation by employing probabilistic modeling. Specifically, our method employs normalizing flow tailored to the SO(3) rotational group, incorporating a coupling mechanism based on the Möbius transformation. We also reinterpret the challenge of reconstructing 3D human figures from 2D pixel-aligned inputs as the task of mapping these inputs to a range of probable poses.
arXiv Detail & Related papers (2024-12-05T23:21:05Z)
- HandFlow: Quantifying View-Dependent 3D Ambiguity in Two-Hand Reconstruction with Normalizing Flow [73.7895717883622]
We explicitly model the distribution of plausible reconstructions in a conditional normalizing flow framework.
We show that explicit ambiguity modeling is better-suited for this challenging problem.
arXiv Detail & Related papers (2022-10-04T15:42:22Z)
- Ki-Pode: Keypoint-based Implicit Pose Distribution Estimation of Rigid Objects [1.209625228546081]
We propose a novel pose distribution estimation method.
An implicit formulation of the probability distribution over object pose is derived from an intermediary representation of an object as a set of keypoints.
The method has been evaluated on the task of rotation distribution estimation on the YCB-V and T-LESS datasets.
arXiv Detail & Related papers (2022-09-20T11:59:05Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [47.31074799708132]
We introduce a method to estimate arbitrary, non-parametric distributions on SO(3).
Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability given the input image and a candidate pose.
We achieve state-of-the-art performance on Pascal3D+ and ModelNet10-SO(3) benchmarks.
arXiv Detail & Related papers (2021-06-10T17:57:23Z)
- Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose Estimation [74.76155168705975]
Deep Bingham Networks (DBN) can handle pose-related uncertainties and ambiguities arising in almost all real life applications concerning 3D data.
DBN extends state-of-the-art direct pose regression networks with a multi-hypotheses prediction head that can yield different distribution modes.
We propose new training strategies so as to avoid mode or posterior collapse during training and to improve numerical stability.
arXiv Detail & Related papers (2020-12-20T19:20:26Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations named as forward-kinematics, camera-projection and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.