PoseStreamer: A Multi-modal Framework for 3D Tracking of Unseen Moving Objects
- URL: http://arxiv.org/abs/2512.22979v3
- Date: Fri, 02 Jan 2026 12:58:07 GMT
- Title: PoseStreamer: A Multi-modal Framework for 3D Tracking of Unseen Moving Objects
- Authors: Huiming Yang, Linglin Liao, Fei Ding, Sibo Wang, Zijian Zeng,
- Abstract summary: PoseStreamer is a robust multi-modal 6DoF pose estimation framework for high-speed moving scenarios. MoCapCube6D is a novel multi-modal dataset constructed to benchmark performance under rapid motion.
- Score: 4.1334804706669095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Six degree-of-freedom (6DoF) pose estimation for novel objects is a critical task in computer vision, yet it faces significant challenges in high-speed and low-light scenarios where standard RGB cameras suffer from motion blur. While event cameras offer a promising solution due to their high temporal resolution, current 6DoF pose estimation methods typically yield suboptimal performance when objects move at high speed. To address this gap, we propose PoseStreamer, a robust multi-modal 6DoF pose estimation framework designed specifically for high-speed moving scenarios. Our approach integrates three core components: an Adaptive Pose Memory Queue that utilizes historical orientation cues for temporal consistency, an Object-centric 2D Tracker that provides strong 2D priors to boost 3D center recall, and a Ray Pose Filter for geometric refinement along camera rays. Furthermore, we introduce MoCapCube6D, a novel multi-modal dataset constructed to benchmark performance under rapid motion. Extensive experiments demonstrate that PoseStreamer not only achieves superior accuracy in high-speed moving scenarios, but also exhibits strong generalizability as a template-free framework for unseen moving objects.
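The abstract's Adaptive Pose Memory Queue (historical orientation cues for temporal consistency) could, in spirit, look like the minimal sketch below. The class name, queue length, and blending weight are illustrative assumptions, not the paper's actual implementation.

```python
from collections import deque
import math


class PoseMemoryQueue:
    """Hypothetical sketch of an adaptive pose memory queue: keep the
    last-k orientation estimates (unit quaternions, w-x-y-z) and blend
    a new, possibly noisy estimate toward the recent history."""

    def __init__(self, maxlen=5, blend=0.3):
        self.history = deque(maxlen=maxlen)
        self.blend = blend  # weight given to the historical prior

    @staticmethod
    def _normalize(q):
        n = math.sqrt(sum(c * c for c in q))
        return tuple(c / n for c in q)

    def update(self, q_new):
        q_new = self._normalize(q_new)
        if self.history:
            # Average historical quaternions (valid when rotations are
            # close together), then pull the new estimate toward that prior.
            avg = self._normalize(tuple(
                sum(h[i] for h in self.history) / len(self.history)
                for i in range(4)))
            q_new = self._normalize(tuple(
                (1 - self.blend) * n + self.blend * a
                for n, a in zip(q_new, avg)))
        self.history.append(q_new)
        return q_new
```

Linear quaternion averaging is only a local approximation; a real system would likely use slerp or a rotation-manifold mean, but the temporal-smoothing idea is the same.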
Related papers
- Flow4R: Unifying 4D Reconstruction and Tracking with Scene Flow [61.297800738187355]
Flow4R predicts a minimal per-pixel property set (3D point position, scene flow, pose weight, and confidence) from two-view inputs using a Vision Transformer. Trained jointly on static and dynamic datasets, Flow4R achieves state-of-the-art performance on 4D reconstruction and tracking tasks.
arXiv Detail & Related papers (2026-02-15T06:58:08Z)
- Optical Flow-Guided 6DoF Object Pose Tracking with an Event Camera [18.13747114612191]
We present an optical flow-guided 6DoF object pose tracking method with an event camera. We show that our method outperforms event-based state-of-the-art methods in terms of both accuracy and robustness.
arXiv Detail & Related papers (2025-12-24T08:40:57Z)
- OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects [58.38338242973447]
OnlineSplatter is a novel framework generating high-quality, object-centric 3D Gaussians directly from RGB frames. Our approach anchors reconstruction using the first frame and progressively refines the object representation through a dense Gaussian primitive field. Our core contribution is a dual-key memory module combining latent appearance-geometry keys with explicit directional keys.
arXiv Detail & Related papers (2025-10-23T14:37:25Z)
- Color-Pair Guided Robust Zero-Shot 6D Pose Estimation and Tracking of Cluttered Objects on Edge Devices [4.261261166281339]
We present a unified framework explicitly designed for efficient execution on edge devices. Key to our approach is a shared, lighting-invariant color-pair feature representation. For initial estimation, this feature facilitates robust registration between the live RGB-D view and the object's 3D mesh. For tracking, the same feature logic validates temporal correspondences, enabling a lightweight model to reliably regress the object's motion.
arXiv Detail & Related papers (2025-09-28T05:07:49Z)
- 6-DoF Object Tracking with Event-based Optical Flow and Frames [12.63903994540524]
We propose an event-based optical flow algorithm for object motion measurement to implement an object 6-DoF velocity tracker. By integrating the tracked 6-DoF object velocity with the low-frequency pose estimated by the global pose estimator, the method can track pose when objects move at high speed.
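The integration idea in this summary can be sketched minimally: between low-frequency global pose estimates, the high-rate tracked velocity is integrated forward to propagate the pose. The function name and the rates below are illustrative assumptions; rotation would be handled analogously with angular velocity.

```python
def integrate_position(position, velocity, dt):
    """Propagate a 3D position by one high-rate velocity sample.
    Rotation would be propagated analogously with angular velocity."""
    return tuple(p + v * dt for p, v in zip(position, velocity))


# Illustrative rates: integrate event-rate velocity (~1 kHz) across
# one interval between two 30 Hz global pose estimates.
pos = (0.0, 0.0, 1.0)
for _ in range(33):
    pos = integrate_position(pos, (0.5, 0.0, 0.0), 0.001)
```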
arXiv Detail & Related papers (2025-08-20T15:22:51Z)
- DynamicPose: Real-time and Robust 6D Object Pose Tracking for Fast-Moving Cameras and Objects [4.15520326813392]
We present DynamicPose, a retraining-free 6D pose tracking framework. It improves tracking robustness in fast-moving camera and object scenarios.
arXiv Detail & Related papers (2025-08-16T07:25:08Z)
- Any6D: Model-free 6D Pose Estimation of Novel Objects [76.30057578269668]
We introduce Any6D, a model-free framework for 6D object pose estimation. It requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. We evaluate our method on five challenging datasets.
arXiv Detail & Related papers (2025-03-24T13:46:21Z)
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes. By simply estimating a pointmap for each timestep, we can effectively adapt DUSt3R's representation, previously only used for static scenes, to dynamic scenes. We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
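As an illustration of what a per-timestep pointmap is (a 3D point per pixel), the sketch below unprojects a depth grid with a pinhole camera model. MonST3R itself regresses pointmaps directly from images, so this function is only a didactic stand-in with assumed intrinsics.

```python
def pointmap(depth, fx, fy, cx, cy):
    """Build a pointmap from a depth grid (rows of metric depths) and
    pinhole intrinsics: each pixel (u, v) with depth d maps to the 3D
    point ((u - cx) * d / fx, (v - cy) * d / fy, d)."""
    pts = []
    for v, row in enumerate(depth):
        pts.append([((u - cx) * d / fx, (v - cy) * d / fy, d)
                    for u, d in enumerate(row)])
    return pts
```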
arXiv Detail & Related papers (2024-10-04T18:00:07Z)
- PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on a large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
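The memory-saving idea in this summary (store per-object historical trajectories instead of full past point clouds) could be sketched as follows; the class and its fields are hypothetical, not the paper's code.

```python
from collections import deque


class TrajectoryBank:
    """Hypothetical sketch: keep only each object's recent 3D center
    points (a trajectory) rather than whole past point clouds, so the
    memory bank stays bounded at `horizon` entries per object."""

    def __init__(self, horizon=8):
        self.horizon = horizon
        self.tracks = {}  # object id -> deque of past 3D centers

    def update(self, obj_id, center):
        self.tracks.setdefault(
            obj_id, deque(maxlen=self.horizon)).append(center)

    def trajectory(self, obj_id):
        # Oldest-to-newest list of stored centers (empty if unknown id).
        return list(self.tracks.get(obj_id, []))
```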
arXiv Detail & Related papers (2023-12-13T18:59:13Z)
- Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z)
- ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking [7.617467911329272]
We introduce ROFT, a Kalman filtering approach for 6D object pose and velocity tracking from a stream of RGB-D images.
By leveraging real-time optical flow, ROFT synchronizes delayed outputs of low frame rate Convolutional Neural Networks for instance segmentation and 6D object pose estimation.
Results demonstrate that our approach outperforms state-of-the-art methods for 6D object pose tracking, while also providing 6D object velocity tracking.
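A minimal sketch of the synchronization idea behind this summary: the CNN's pose output arrives with some latency, so it is propagated to the current time using the velocity derived from real-time optical flow before fusing it with the filter state. The function name and signature are assumptions for illustration, not ROFT's API.

```python
def resynchronize(delayed_position, velocity, latency):
    """Propagate a delayed 3D pose translation to the present using a
    constant-velocity assumption over the known processing latency.
    Orientation would be propagated analogously with angular velocity."""
    x, y, z = delayed_position
    vx, vy, vz = velocity
    return (x + vx * latency, y + vy * latency, z + vz * latency)
```

In a full Kalman-filter pipeline this propagated measurement would then enter the usual update step; the sketch shows only the latency-compensation part.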
arXiv Detail & Related papers (2021-11-06T07:30:00Z)
- CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds [97.63549045541296]
We propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances and per-part pose tracking for articulated objects.
Our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVC) while running at 12 FPS, the fastest among compared methods.
arXiv Detail & Related papers (2021-04-08T00:14:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.