OmniTrack++: Omnidirectional Multi-Object Tracking by Learning Large-FoV Trajectory Feedback
- URL: http://arxiv.org/abs/2511.00510v1
- Date: Sat, 01 Nov 2025 11:28:05 GMT
- Title: OmniTrack++: Omnidirectional Multi-Object Tracking by Learning Large-FoV Trajectory Feedback
- Authors: Kai Luo, Hao Shi, Kunyu Peng, Fei Teng, Sheng Wu, Kaiwei Wang, Kailun Yang
- Abstract summary: This paper investigates Multi-Object Tracking (MOT) in panoramic imagery. Panoramic imagery introduces unique challenges including a 360° Field of View (FoV), resolution dilution, and severe view-dependent distortions. To address panoramic distortion, large search space, and identity ambiguity under a 360° FoV, OmniTrack++ adopts a feedback-driven framework.
- Score: 40.746857157430256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates Multi-Object Tracking (MOT) in panoramic imagery, which introduces unique challenges including a 360° Field of View (FoV), resolution dilution, and severe view-dependent distortions. Conventional MOT methods designed for narrow-FoV pinhole cameras generalize unsatisfactorily under these conditions. To address panoramic distortion, large search space, and identity ambiguity under a 360° FoV, OmniTrack++ adopts a feedback-driven framework that progressively refines perception with trajectory cues. A DynamicSSM block first stabilizes panoramic features, implicitly alleviating geometric distortion. On top of normalized representations, FlexiTrack Instances use trajectory-informed feedback for flexible localization and reliable short-term association. To ensure long-term robustness, an ExpertTrack Memory consolidates appearance cues via a Mixture-of-Experts design, enabling recovery from fragmented tracks and reducing identity drift. Finally, a Tracklet Management module adaptively switches between end-to-end and tracking-by-detection modes according to scene dynamics, offering a balanced and scalable solution for panoramic MOT. To support rigorous evaluation, we establish the EmboTrack benchmark, a comprehensive dataset for panoramic MOT that includes QuadTrack, captured with a quadruped robot, and BipTrack, collected with a bipedal wheel-legged robot. Together, these datasets span wide-angle environments and diverse motion patterns, providing a challenging testbed for real-world panoramic perception. Extensive experiments on JRDB and EmboTrack demonstrate that OmniTrack++ achieves state-of-the-art performance, yielding substantial HOTA improvements of +25.5% on JRDB and +43.07% on QuadTrack over the original OmniTrack. Datasets and code will be made publicly available at https://github.com/xifen523/OmniTrack.
Related papers
- FutrTrack: A Camera-LiDAR Fusion Transformer for 3D Multiple Object Tracking [4.65812324892521]
FutrTrack builds on existing 3D detectors by introducing a transformer-based smoother and a fusion-driven tracker. Our fusion tracker integrates bounding boxes with multimodal bird's-eye-view (BEV) fusion features from multiple cameras and LiDAR. FutrTrack achieves strong performance on 3D MOT benchmarks, reducing identity switches while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-10-22T19:25:01Z)
- Multi-View 3D Point Tracking [67.21282192436031]
We introduce the first data-driven multi-view 3D point tracker, designed to track arbitrary points in dynamic scenes using multiple camera views. Our model directly predicts 3D correspondences using a practical number of cameras. We train on 5K synthetic multi-view Kubric sequences and evaluate on two real-world benchmarks.
arXiv Detail & Related papers (2025-08-28T17:58:20Z)
- Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos [58.156141601478794]
Multi-object tracking (MOT) aims to track multiple objects while maintaining consistent identities across frames of a given video. Existing methods typically model motion cues and appearance separately, overlooking their interplay and resulting in suboptimal tracking performance. We propose AMOT, which exploits appearance and motion cues through two key components: an Appearance-Motion Consistency (AMC) matrix and a Motion-aware Track Continuation (MTC) module.
arXiv Detail & Related papers (2025-08-03T12:06:47Z)
- SpatialTrackerV2: 3D Point Tracking Made Easy [73.0350898700048]
SpatialTrackerV2 is a feed-forward 3D point tracking method for monocular videos. It decomposes world-space 3D motion into scene geometry, camera ego-motion, and pixel-wise object motion. By learning geometry and motion jointly from such heterogeneous data, SpatialTrackerV2 outperforms existing 3D tracking methods by 30%.
arXiv Detail & Related papers (2025-07-16T17:59:03Z)
- Efficient Motion Prompt Learning for Robust Visual Tracking [58.59714916705317]
We propose a lightweight and plug-and-play motion prompt tracking method. It can be easily integrated into existing vision-based trackers to build a joint tracking framework. Experiments on seven tracking benchmarks demonstrate that the proposed motion module significantly improves the robustness of vision-based trackers.
arXiv Detail & Related papers (2025-05-22T07:22:58Z)
- Omnidirectional Multi-Object Tracking [27.858084330925372]
Panoramic imagery offers comprehensive information to support Multi-Object Tracking (MOT). Most MOT algorithms are tailored for pinhole images with limited views, impairing their effectiveness in panoramic settings. We propose OmniTrack, an omni MOT framework that incorporates Tracklet Management to introduce temporal cues, FlexiTrack Instances for object localization and association, and the CircularStatE Module to alleviate image and geometric distortions.
arXiv Detail & Related papers (2025-03-06T15:53:42Z)
- RockTrack: A 3D Robust Multi-Camera-Ken Multi-Object Tracking Framework [28.359633046753228]
We propose RockTrack, a 3D MOT method for multi-camera detectors.
RockTrack incorporates a confidence-guided preprocessing module to extract reliable motion and image observations.
RockTrack achieves state-of-the-art performance on the nuScenes vision-only tracking leaderboard with 59.1% AMOTA.
arXiv Detail & Related papers (2024-09-18T07:08:08Z)
- RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments [62.5830455357187]
We set up an egocentric multi-sensor data collection platform based on 3 main types of sensors (Camera, LiDAR and Fisheye). A large-scale multimodal dataset is constructed, named RoboSense, to facilitate egocentric robot perception.
arXiv Detail & Related papers (2024-08-28T03:17:40Z)
- RaTrack: Moving Object Detection and Tracking with 4D Radar Point Cloud [10.593320435411714]
We introduce RaTrack, an innovative solution tailored for radar-based tracking.
Our method focuses on motion segmentation and clustering, enriched by a motion estimation module.
RaTrack showcases superior tracking precision of moving objects, largely surpassing the performance of the state of the art.
arXiv Detail & Related papers (2023-09-18T13:02:29Z)
- An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z)
- InterTrack: Interaction Transformer for 3D Multi-Object Tracking [9.283656931246645]
3D multi-object tracking (MOT) is a key problem for autonomous vehicles.
Our proposed solution, InterTrack, generates discriminative object representations for data association.
We validate our approach on the nuScenes 3D MOT benchmark, where we observe significant improvements.
arXiv Detail & Related papers (2022-08-17T03:24:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.