PoseStreamer: A Multi-modal Framework for 3D Tracking of Unseen Moving Objects
- URL: http://arxiv.org/abs/2512.22979v3
- Date: Fri, 02 Jan 2026 12:58:07 GMT
- Title: PoseStreamer: A Multi-modal Framework for 3D Tracking of Unseen Moving Objects
- Authors: Huiming Yang, Linglin Liao, Fei Ding, Sibo Wang, Zijian Zeng,
- Abstract summary: PoseStreamer is a robust multi-modal 6DoF pose estimation framework for high-speed moving scenarios. MoCapCube6D is a novel multi-modal dataset constructed to benchmark performance under rapid motion.
- Score: 4.1334804706669095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Six degree-of-freedom (6DoF) pose estimation for novel objects is a critical task in computer vision, yet it faces significant challenges in high-speed and low-light scenarios where standard RGB cameras suffer from motion blur. While event cameras offer a promising solution due to their high temporal resolution, current 6DoF pose estimation methods typically yield suboptimal performance when objects move at high speed. To address this gap, we propose PoseStreamer, a robust multi-modal 6DoF pose estimation framework designed specifically for high-speed moving scenarios. Our approach integrates three core components: an Adaptive Pose Memory Queue that utilizes historical orientation cues for temporal consistency, an Object-centric 2D Tracker that provides strong 2D priors to boost 3D center recall, and a Ray Pose Filter for geometric refinement along camera rays. Furthermore, we introduce MoCapCube6D, a novel multi-modal dataset constructed to benchmark performance under rapid motion. Extensive experiments demonstrate that PoseStreamer not only achieves superior accuracy in high-speed moving scenarios, but also exhibits strong generalizability as a template-free framework for unseen moving objects.
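The abstract's Adaptive Pose Memory Queue (historical orientation cues for temporal consistency) could, in spirit, look like the minimal sketch below. The class name, queue length, and blending weight are illustrative assumptions, not the paper's actual implementation.

```python
from collections import deque
import math


class PoseMemoryQueue:
    """Hypothetical sketch of an adaptive pose memory queue: keep the
    last-k orientation estimates (unit quaternions, w-x-y-z) and blend
    a new, possibly noisy estimate toward the recent history."""

    def __init__(self, maxlen=5, blend=0.3):
        self.history = deque(maxlen=maxlen)
        self.blend = blend  # weight given to the historical prior

    @staticmethod
    def _normalize(q):
        n = math.sqrt(sum(c * c for c in q))
        return tuple(c / n for c in q)

    def update(self, q_new):
        q_new = self._normalize(q_new)
        if self.history:
            # Average historical quaternions (valid when rotations are
            # close together), then pull the new estimate toward that prior.
            avg = self._normalize(tuple(
                sum(h[i] for h in self.history) / len(self.history)
                for i in range(4)))
            q_new = self._normalize(tuple(
                (1 - self.blend) * n + self.blend * a
                for n, a in zip(q_new, avg)))
        self.history.append(q_new)
        return q_new
```

Linear quaternion averaging is only a local approximation; a real system would likely use slerp or a rotation-manifold mean, but the temporal-smoothing idea is the same.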
Related papers
- Flow4R: Unifying 4D Reconstruction and Tracking with Scene Flow [61.297800738187355]
Flow4R predicts a minimal per-pixel property set (3D point position, scene flow, pose weight, and confidence) from two-view inputs using a Vision Transformer. Trained jointly on static and dynamic datasets, Flow4R achieves state-of-the-art performance on 4D reconstruction and tracking tasks.
arXiv Detail & Related papers (2026-02-15T06:58:08Z)
- Optical Flow-Guided 6DoF Object Pose Tracking with an Event Camera [18.13747114612191]
We present an optical flow-guided 6DoF object pose tracking method with an event camera. We show that our method outperforms event-based state-of-the-art methods in terms of both accuracy and robustness.
arXiv Detail & Related papers (2025-12-24T08:40:57Z)
- OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects [58.38338242973447]
OnlineSplatter is a novel framework generating high-quality, object-centric 3D Gaussians directly from RGB frames. Our approach anchors reconstruction using the first frame and progressively refines the object representation through a dense Gaussian primitive field. Our core contribution is a dual-key memory module combining latent appearance-geometry keys with explicit directional keys.
arXiv Detail & Related papers (2025-10-23T14:37:25Z)
- Color-Pair Guided Robust Zero-Shot 6D Pose Estimation and Tracking of Cluttered Objects on Edge Devices [4.261261166281339]
We present a unified framework explicitly designed for efficient execution on edge devices. Key to our approach is a shared, lighting-invariant color-pair feature representation. For initial estimation, this feature facilitates robust registration between the live RGB-D view and the object's 3D mesh. For tracking, the same feature logic validates temporal correspondences, enabling a lightweight model to reliably regress the object's motion.
arXiv Detail & Related papers (2025-09-28T05:07:49Z)
- 6-DoF Object Tracking with Event-based Optical Flow and Frames [12.63903994540524]
We propose an event-based optical flow algorithm for object motion measurement to implement an object 6-DoF velocity tracker. By integrating the tracked 6-DoF object velocity with the low-frequency pose estimated by the global pose estimator, the method can track pose when objects move at high speed.
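The integration idea in this summary can be sketched minimally: between low-frequency global pose estimates, the high-rate tracked velocity is integrated forward to propagate the pose. The function name and the rates below are illustrative assumptions; rotation would be handled analogously with angular velocity.

```python
def integrate_position(position, velocity, dt):
    """Propagate a 3D position by one high-rate velocity sample.
    Rotation would be propagated analogously with angular velocity."""
    return tuple(p + v * dt for p, v in zip(position, velocity))


# Illustrative rates: integrate event-rate velocity (~1 kHz) across
# one interval between two 30 Hz global pose estimates.
pos = (0.0, 0.0, 1.0)
for _ in range(33):
    pos = integrate_position(pos, (0.5, 0.0, 0.0), 0.001)
```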
arXiv Detail & Related papers (2025-08-20T15:22:51Z)
- DynamicPose: Real-time and Robust 6D Object Pose Tracking for Fast-Moving Cameras and Objects [4.15520326813392]
We present DynamicPose, a retraining-free 6D pose tracking framework. It improves tracking robustness in fast-moving camera and object scenarios.
arXiv Detail & Related papers (2025-08-16T07:25:08Z)
- Any6D: Model-free 6D Pose Estimation of Novel Objects [76.30057578269668]
We introduce Any6D, a model-free framework for 6D object pose estimation. It requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. We evaluate our method on five challenging datasets.
arXiv Detail & Related papers (2025-03-24T13:46:21Z)
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes. By simply estimating a pointmap for each timestep, we can effectively adapt DUSt3R's representation, previously only used for static scenes, to dynamic scenes. We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
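As an illustration of what a per-timestep pointmap is (a 3D point per pixel), the sketch below unprojects a depth grid with a pinhole camera model. MonST3R itself regresses pointmaps directly from images, so this function is only a didactic stand-in with assumed intrinsics.

```python
def pointmap(depth, fx, fy, cx, cy):
    """Build a pointmap from a depth grid (rows of metric depths) and
    pinhole intrinsics: each pixel (u, v) with depth d maps to the 3D
    point ((u - cx) * d / fx, (v - cy) * d / fy, d)."""
    pts = []
    for v, row in enumerate(depth):
        pts.append([((u - cx) * d / fx, (v - cy) * d / fy, d)
                    for u, d in enumerate(row)])
    return pts
```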
arXiv Detail & Related papers (2024-10-04T18:00:07Z)
- PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on a large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
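The memory-saving idea in this summary (store per-object historical trajectories instead of full past point clouds) could be sketched as follows; the class and its fields are hypothetical, not the paper's code.

```python
from collections import deque


class TrajectoryBank:
    """Hypothetical sketch: keep only each object's recent 3D center
    points (a trajectory) rather than whole past point clouds, so the
    memory bank stays bounded at `horizon` entries per object."""

    def __init__(self, horizon=8):
        self.horizon = horizon
        self.tracks = {}  # object id -> deque of past 3D centers

    def update(self, obj_id, center):
        self.tracks.setdefault(
            obj_id, deque(maxlen=self.horizon)).append(center)

    def trajectory(self, obj_id):
        # Oldest-to-newest list of stored centers (empty if unknown id).
        return list(self.tracks.get(obj_id, []))
```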
arXiv Detail & Related papers (2023-12-13T18:59:13Z)
- Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z)
- ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking [7.617467911329272]
We introduce ROFT, a Kalman filtering approach for 6D object pose and velocity tracking from a stream of RGB-D images.
By leveraging real-time optical flow, ROFT synchronizes delayed outputs of low frame rate Convolutional Neural Networks for instance segmentation and 6D object pose estimation.
Results demonstrate that our approach outperforms state-of-the-art methods for 6D object pose tracking, while also providing 6D object velocity tracking.
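A minimal sketch of the synchronization idea behind this summary: the CNN's pose output arrives with some latency, so it is propagated to the current time using the velocity derived from real-time optical flow before fusing it with the filter state. The function name and signature are assumptions for illustration, not ROFT's API.

```python
def resynchronize(delayed_position, velocity, latency):
    """Propagate a delayed 3D pose translation to the present using a
    constant-velocity assumption over the known processing latency.
    Orientation would be propagated analogously with angular velocity."""
    x, y, z = delayed_position
    vx, vy, vz = velocity
    return (x + vx * latency, y + vy * latency, z + vz * latency)
```

In a full Kalman-filter pipeline this propagated measurement would then enter the usual update step; the sketch shows only the latency-compensation part.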
arXiv Detail & Related papers (2021-11-06T07:30:00Z)
- CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds [97.63549045541296]
We propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances and per-part pose tracking for articulated objects.
Our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVC) while running at 12 FPS, the fastest among compared methods.
arXiv Detail & Related papers (2021-04-08T00:14:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.