SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow
- URL: http://arxiv.org/abs/2504.09160v1
- Date: Sat, 12 Apr 2025 09:48:01 GMT
- Title: SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow
- Authors: Qingyuan Wang, Rui Song, Jiaojiao Li, Kerui Cheng, David Ferstl, Yinlin Hu
- Abstract summary: SCFlow2 is a plug-and-play refinement framework for 6D object pose estimation. It formulates the additional depth as a regularization in the iteration via 3D scene flow for RGBD frames.
- Score: 13.668161011991865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce SCFlow2, a plug-and-play refinement framework for 6D object pose estimation. Most recent 6D object pose methods rely on refinement to achieve accurate results; however, existing refinement methods either suffer from noise when establishing correspondences or require retraining for novel objects. SCFlow2 is based on the SCFlow model, which was designed for refinement under a shape constraint, but additionally formulates depth as a regularization in the iteration via 3D scene flow for RGBD frames. The key design of SCFlow2 is the introduction of geometric constraints into the training of the recurrent matching network, combining the rigid-motion embeddings of 3D scene flow with a 3D shape prior of the target. We train SCFlow2 on a combination of the Objaverse, GSO, and ShapeNet datasets and evaluate it on BOP datasets with novel objects. When used as a post-processing step, our method significantly improves the results of most state-of-the-art methods without any retraining or fine-tuning. The source code is available at https://scflow2.github.io.
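To make the refinement idea concrete, here is a minimal sketch of one shape-constrained update step, assuming a Kabsch-style least-squares fit and a simple blending weight for the depth term; all function names and the exact regularization form are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a shape-constrained pose update (not the official SCFlow2 code).
# Idea: a predicted 3D scene flow on the object's visible points is projected onto
# the nearest rigid motion, so each iteration updates the 6D pose instead of freely
# deforming the point cloud; the observed depth enters as a regularizer.
import numpy as np

def rigid_fit(src, dst):
    """Kabsch least-squares fit: rotation R and translation t with dst ~ R @ src + t."""
    c_src, c_dst = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - c_src).T @ (dst - c_dst))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, c_dst - R @ c_src

def refine_step(R, t, pts_model, flow3d, pts_depth, lam=0.5):
    """One hypothetical iteration: blend the predicted 3D flow with a pull toward
    the observed depth points, then project onto the nearest rigid motion."""
    pts_cam = pts_model @ R.T + t                  # model points under current pose
    target = pts_cam + flow3d                      # where the network says they moved
    target = (1 - lam) * target + lam * pts_depth  # depth as regularization (assumed form)
    dR, dt = rigid_fit(pts_cam, target)            # shape constraint: rigid update only
    return dR @ R, dR @ t + dt                     # composed refined pose
```

Because every update is a rigid motion by construction, the refiner can consume dense flow predictions without ever deforming the object, which is the essence of the shape constraint.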
Related papers
- T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions.
We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z)
- ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video [26.01796507893086]
This paper proposes a 3D motion perception method called ScaleFlow++ that is easy to generalize.
With just a pair of RGB images, ScaleFlow++ can robustly estimate optical flow and motion-in-depth (MID).
On KITTI, ScaleFlow++ achieved the best monocular scene flow estimation performance, reducing SF-all from 6.21 to 5.79.
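For intuition, motion-in-depth is commonly defined as the depth ratio of a matched point between two frames, which under a pinhole camera can be recovered from a patch's scale change; the toy helper below illustrates that relationship and is not ScaleFlow++ itself.

```python
# Toy illustration of motion-in-depth (MID), assumed here as the depth ratio
# Z2 / Z1 of a matched point between two frames. Under a pinhole camera an
# object's image scale is inversely proportional to depth, so a matched patch
# that changes size by factor s corresponds to MID ~ 1 / s.
def motion_in_depth(scale_ratio: float) -> float:
    """scale_ratio = patch size in frame 2 / patch size in frame 1."""
    if scale_ratio <= 0:
        raise ValueError("scale ratio must be positive")
    return 1.0 / scale_ratio

# A patch that appears 0.8x as large has moved away: MID = 1.25 (depth grew 25%).
print(motion_in_depth(0.8))
```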
arXiv Detail & Related papers (2024-09-16T11:59:27Z)
- DiffComplete: Diffusion-based Generative 3D Shape Completion [114.43353365917015]
We introduce a new diffusion-based approach for shape completion on 3D range scans.
We strike a balance between realism, multi-modality, and high fidelity.
DiffComplete sets a new SOTA performance on two large-scale 3D shape completion benchmarks.
arXiv Detail & Related papers (2023-06-28T16:07:36Z)
- CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task.
Recent studies have shown the great potential of dense correspondence-based solutions.
We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z)
- BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects [89.2314092102403]
We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence.
Our method works for arbitrary rigid objects, even when visual texture is largely absent.
arXiv Detail & Related papers (2023-03-24T17:13:49Z)
- Exploiting Implicit Rigidity Constraints via Weight-Sharing Aggregation for Scene Flow Estimation from Point Clouds [21.531037702059933]
We propose a novel weight-sharing aggregation (WSA) method for feature and scene flow up-sampling.
WSA does not rely on estimated poses and segmented objects, and can implicitly enforce rigidity constraints to avoid structure distortion.
We modify PointPWC-Net, integrating the proposed WSA and a deformation degree module into the enhanced network to derive an end-to-end scene flow estimation network called WSAFlowNet.
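The exact WSA operator is not reproduced here; the sketch below shows the generic neighbor-aggregation pattern that flow up-sampling layers of this kind build on, with inverse-distance weights standing in for the learned shared weights.

```python
# Generic coarse-to-dense scene flow up-sampling via weighted neighbor
# aggregation (the baseline pattern that WSA refines; not the WSA operator itself).
import numpy as np

def upsample_flow(dense_pts, coarse_pts, coarse_flow, k=3, eps=1e-8):
    """Interpolate flow at dense_pts from k nearest coarse points
    using inverse-distance weights. Shapes: (N,3), (M,3), (M,3) -> (N,3)."""
    d = np.linalg.norm(dense_pts[:, None, :] - coarse_pts[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]                 # k nearest coarse neighbors
    w = 1.0 / (np.take_along_axis(d, idx, axis=1) + eps)
    w /= w.sum(axis=1, keepdims=True)                  # normalize weights
    return (coarse_flow[idx] * w[..., None]).sum(axis=1)
```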
arXiv Detail & Related papers (2023-03-04T16:55:57Z)
- PointFlowHop: Green and Interpretable Scene Flow Estimation from Consecutive Point Clouds [49.7285297470392]
An efficient 3D scene flow estimation method called PointFlowHop is proposed in this work.
PointFlowHop takes two consecutive point clouds and determines the 3D flow vectors for every point in the first point cloud.
It decomposes the scene flow estimation task into a set of subtasks, including ego-motion compensation, object association and object-wise motion estimation.
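A toy rendering of that decomposition, assuming index-aligned point clouds, known object labels, and a translation-only motion model (all simplifications; PointFlowHop's actual modules are learned components):

```python
# Illustrative decomposition of scene flow into ego-motion compensation plus
# object-wise motion, mirroring the subtasks named above (not PointFlowHop itself).
import numpy as np

def scene_flow(pc1, pc2, labels):
    """pc1, pc2: (N,3) index-aligned point clouds; labels: per-point object id
    in frame 1 (0 = static background). Association is assumed given here."""
    # 1) ego-motion compensation: background displacement explains camera motion
    ego = (pc2[labels == 0] - pc1[labels == 0]).mean(0)
    flow = np.tile(ego, (len(pc1), 1))
    # 2)+3) object association and object-wise motion (translation-only toy model)
    for obj in set(labels.tolist()) - {0}:
        m = labels == obj
        flow[m] = (pc2[m] - pc1[m]).mean(0)
    return flow
```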
arXiv Detail & Related papers (2023-02-27T23:06:01Z)
- CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers [51.142988196855484]
This paper introduces a novel method we call Cascaded Refinement Transformers, or CRT-6D.
We replace the commonly used dense intermediate representation with a sparse set of features sampled from the feature pyramid, called Object Keypoint Features, where each element corresponds to an object keypoint.
We achieve inference 2x faster than the closest real-time state-of-the-art methods while supporting up to 21 objects with a single model.
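The sparse-feature idea can be illustrated on its own: project the object's 3D keypoints with the current pose estimate and bilinearly sample the feature map at those pixels. The helper below is a generic sketch; the names and single-level sampling are assumptions, not CRT-6D's architecture.

```python
# Toy illustration of sampling sparse per-keypoint features from a feature map,
# the general idea behind replacing dense intermediate representations.
import numpy as np

def project(kps3d, R, t, K):
    """Pinhole projection of (N,3) model keypoints under pose (R, t)."""
    cam = kps3d @ R.T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

def sample_bilinear(feat, uv):
    """feat: (H, W, C) feature map; uv: (N, 2) pixel coords -> (N, C) features."""
    H, W, _ = feat.shape
    u = np.clip(uv[:, 0], 0, W - 1.001)
    v = np.clip(uv[:, 1], 0, H - 1.001)
    u0, v0 = u.astype(int), v.astype(int)
    du, dv = (u - u0)[:, None], (v - v0)[:, None]
    return (feat[v0, u0] * (1 - du) * (1 - dv) + feat[v0, u0 + 1] * du * (1 - dv)
            + feat[v0 + 1, u0] * (1 - du) * dv + feat[v0 + 1, u0 + 1] * du * dv)
```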
arXiv Detail & Related papers (2022-10-21T04:06:52Z)
- What Matters for 3D Scene Flow Network [44.02710380584977]
3D scene flow estimation from point clouds is a low-level 3D motion perception task in computer vision.
We propose a novel all-to-all flow embedding layer with backward reliability validation during the initial scene flow estimation.
Our proposed model surpasses all existing methods by at least 38.2% on the FlyingThings3D dataset and 24.7% on the KITTI Scene Flow dataset in the EPE3D metric.
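One plausible reading of backward reliability validation is a forward-backward (cycle) consistency check, sketched generically below; the threshold and nearest-neighbor matching are assumptions, not the paper's exact layer.

```python
# Generic forward-backward consistency check for scene flow (one plausible
# reading of "backward reliability validation"; not the paper's exact layer).
import numpy as np

def reliable_mask(pc1, flow_fwd, pc2, flow_bwd, tau=0.05):
    """Trust a forward vector only if the backward flow at its nearest
    target point maps (approximately) back to the start."""
    warped = pc1 + flow_fwd                                  # pc1 advected to frame 2
    d = np.linalg.norm(warped[:, None, :] - pc2[None, :, :], axis=-1)
    nn = d.argmin(axis=1)                                    # nearest neighbor in pc2
    cycle_err = np.linalg.norm(warped + flow_bwd[nn] - pc1, axis=-1)
    return cycle_err < tau                                   # boolean reliability mask
```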
arXiv Detail & Related papers (2022-07-19T09:27:05Z)
- CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation [15.98323974821097]
We study the problem of jointly estimating the optical flow and scene flow from synchronized 2D and 3D data.
To address the problem, we propose a novel end-to-end framework, called CamLiFlow.
Our method ranks 1st on the KITTI Scene Flow benchmark, outperforming the previous art with 1/7 of the parameters.
arXiv Detail & Related papers (2021-11-20T02:58:38Z)
- Weakly Supervised Learning of Rigid 3D Scene Flow [81.37165332656612]
We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies.
We showcase the effectiveness and generalization capacity of our method on four different autonomous driving datasets.
arXiv Detail & Related papers (2021-02-17T18:58:02Z)