A Modular Pipeline for 3D Object Tracking Using RGB Cameras
- URL: http://arxiv.org/abs/2503.04322v1
- Date: Thu, 06 Mar 2025 11:14:59 GMT
- Title: A Modular Pipeline for 3D Object Tracking Using RGB Cameras
- Authors: Lars Bredereke, Yale Hartmann, Tanja Schultz
- Abstract summary: We present a new modular pipeline that calculates 3D trajectories of multiple objects. It is adaptable to various settings where multiple time-synced and stationary cameras record moving objects. It scales to hundreds of table-setting trials with very little human annotation input.
- Score: 17.77519622617273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object tracking is a key challenge of computer vision, with applications that each require different architectures. Most tracking systems are limited in that they constrain all movement to a 2D plane and often track only one object. In this paper, we present a new modular pipeline that calculates 3D trajectories of multiple objects. It is adaptable to various settings where multiple time-synced and stationary cameras record moving objects, using off-the-shelf webcams. Our pipeline was tested on the Table Setting Dataset, where participants are recorded with various sensors as they set a table with tableware objects. We need to track these manipulated objects using 6 RGB webcams. Challenges include: detecting small objects in 9,874,699 camera frames, determining camera poses, discriminating between nearby and overlapping objects, handling temporary occlusions, and finally calculating a 3D trajectory using the right subset of an average of 11.12.456 pixel coordinates per 3-minute trial. We implement a robust pipeline that produces accurate trajectories, with the covariance of the x-, y-, and z-position serving as a confidence metric. It deals dynamically with appearing and disappearing objects by instantiating new Extended Kalman Filters. It scales to hundreds of table-setting trials with very little human annotation input, even when the camera poses of each trial are unknown. The code is available at https://github.com/LarsBredereke/object_tracking
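The abstract's mechanism for appearing and disappearing objects — instantiating a new Extended Kalman Filter per object and reporting the position covariance as a confidence metric — can be sketched as follows. This is not the authors' implementation (their code is at the linked repository); it is a minimal constant-velocity linear Kalman filter with a greedy nearest-neighbour associator, and the class name `Track3D`, the `step` helper, and all thresholds (`gate`, `max_misses`, noise variances) are illustrative assumptions.

```python
import numpy as np

class Track3D:
    """Constant-velocity Kalman filter over the state [x, y, z, vx, vy, vz].

    The 3x3 position block of P plays the role of the per-point
    covariance confidence metric mentioned in the abstract.
    """
    def __init__(self, xyz, dt=1.0 / 30.0, meas_var=0.05, accel_var=1.0):
        self.x = np.concatenate([np.asarray(xyz, float), np.zeros(3)])
        self.P = np.eye(6)                          # state covariance
        self.F = np.eye(6)                          # constant-velocity motion model
        self.F[:3, 3:] = dt * np.eye(3)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        self.R = meas_var * np.eye(3)               # measurement noise
        self.Q = accel_var * dt * np.eye(6)         # crude process noise
        self.misses = 0                             # frames since last detection

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3], self.P[:3, :3]           # position and its covariance

    def update(self, z):
        y = z - self.H @ self.x                     # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        self.misses = 0


def step(tracks, detections, gate=0.5, max_misses=15):
    """One frame: predict, greedily associate, spawn a filter for each
    unmatched detection, and drop filters missing for max_misses frames."""
    unmatched = list(range(len(detections)))
    for t in tracks:
        pred, _ = t.predict()
        if unmatched:
            j = min(unmatched, key=lambda k: np.linalg.norm(detections[k] - pred))
            if np.linalg.norm(detections[j] - pred) < gate:
                t.update(detections[j])
                unmatched.remove(j)
                continue
        t.misses += 1
    for j in unmatched:                             # appearing object: new filter
        tracks.append(Track3D(detections[j]))
    return [t for t in tracks if t.misses <= max_misses]  # disappearing: drop
```

A nearest-neighbour gate is the simplest possible associator; the paper's pipeline also has to discriminate nearby and overlapping objects, which this sketch does not attempt.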
Related papers
- CoMotion: Concurrent Multi-person 3D Motion [88.27833466761234]
We introduce an approach for detecting and tracking detailed 3D poses of multiple people from a single monocular camera stream.
Our model performs both strong per-frame detection and a learned pose update to track people from frame to frame.
We train on numerous image and video datasets leveraging pseudo-labeled annotations to produce a model that matches state-of-the-art systems in 3D pose estimation accuracy.
arXiv Detail & Related papers (2025-04-16T15:40:15Z)
- PickScan: Object discovery and reconstruction from handheld interactions [99.99566882133179]
We develop an interaction-guided and class-agnostic method to reconstruct 3D representations of scenes.
Our main contribution is a novel approach to detecting user-object interactions and extracting the masks of manipulated objects.
Compared to Co-Fusion, the only comparable interaction-based and class-agnostic baseline, this corresponds to a reduction in chamfer distance of 73%.
arXiv Detail & Related papers (2024-11-17T23:09:08Z)
- TAPVid-3D: A Benchmark for Tracking Any Point in 3D [63.060421798990845]
We introduce a new benchmark, TAPVid-3D, for evaluating the task of Tracking Any Point in 3D.
This benchmark will serve as a guidepost to improve our ability to understand precise 3D motion and surface deformation from monocular video.
arXiv Detail & Related papers (2024-07-08T13:28:47Z)
- Delving into Motion-Aware Matching for Monocular 3D Object Tracking [81.68608983602581]
We find that the motion cue of objects along different time frames is critical in 3D multi-object tracking.
We propose MoMA-M3T, a framework that mainly consists of three motion-aware components.
We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate our MoMA-M3T achieves competitive performance against state-of-the-art methods.
arXiv Detail & Related papers (2023-08-22T17:53:58Z)
- ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box [81.45219802386444]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames.
We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes.
In 3D scenarios, it is much easier for the tracker to predict object velocities in the world coordinate.
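The hierarchical association idea above — match high-confidence detections first, then let still-unmatched tracks claim low-score boxes instead of discarding them — can be sketched as below. This is not ByteTrackV2's actual implementation: it substitutes Euclidean distance for the paper's box/motion affinities, and the function name `associate` and the thresholds `hi` and `gate` are illustrative assumptions.

```python
import numpy as np

def associate(tracks, dets, scores, hi=0.6, gate=1.0):
    """Two-stage greedy association.

    tracks: list of predicted track positions (np arrays)
    dets:   list of detection positions (np arrays)
    scores: detection confidence scores
    Returns a list of (track_index, detection_index) matches.
    """
    hi_idx = [i for i, s in enumerate(scores) if s >= hi]   # confident boxes
    lo_idx = [i for i, s in enumerate(scores) if s < hi]    # low-score boxes
    matches, used = [], set()

    def greedy(track_ids, det_ids):
        for t in track_ids:
            best, best_d = None, gate
            for d in det_ids:
                if d in used:
                    continue
                dist = np.linalg.norm(tracks[t] - dets[d])
                if dist < best_d:
                    best, best_d = d, dist
            if best is not None:
                matches.append((t, best))
                used.add(best)
        matched = {m[0] for m in matches}
        return [t for t in track_ids if t not in matched]

    leftover = greedy(range(len(tracks)), hi_idx)  # stage 1: high-score only
    greedy(leftover, lo_idx)                       # stage 2: mine low-score boxes
    return matches
```

The point of the second pass is that a low-score box near a confidently predicted track is usually a true but occluded or blurred object, so it recovers tracks that a single hard score threshold would lose.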
arXiv Detail & Related papers (2023-03-27T15:35:21Z)
- MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries [18.70932813595532]
3D tracking from multiple cameras is a key component in a vision-based autonomous driving system.
We propose an end-to-end MUlti-camera TRacking framework called MUTR3D.
MUTR3D does not explicitly rely on the spatial and appearance similarity of objects.
It outperforms state-of-the-art methods by 5.3 AMOTA on the nuScenes dataset.
arXiv Detail & Related papers (2022-05-02T01:45:41Z)
- EagerMOT: 3D Multi-Object Tracking via Sensor Fusion [68.8204255655161]
Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time.
Existing methods rely on depth sensors (e.g., LiDAR) to detect and track targets in 3D space, but only up to a limited sensing range due to the sparsity of the signal.
We propose EagerMOT, a simple tracking formulation that integrates all available object observations from both sensor modalities to obtain a well-informed interpretation of the scene dynamics.
arXiv Detail & Related papers (2021-04-29T22:30:29Z)
- Simultaneous Multi-View Camera Pose Estimation and Object Tracking with Square Planar Markers [0.0]
This work proposes a novel method to solve camera pose estimation and object tracking simultaneously.
From a video sequence showing a rigid set of planar markers recorded from multiple cameras, the proposed method is able to automatically obtain the three-dimensional configuration of the markers.
Once the parameters are obtained, tracking of the object can be done in real time with a low computational cost.
arXiv Detail & Related papers (2021-03-16T15:33:58Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.