TransFiner: A Full-Scale Refinement Approach for Multiple Object
Tracking
- URL: http://arxiv.org/abs/2207.12967v1
- Date: Tue, 26 Jul 2022 15:21:42 GMT
- Title: TransFiner: A Full-Scale Refinement Approach for Multiple Object
Tracking
- Authors: Bin Sun and Jiale Cao
- Abstract summary: Multiple object tracking (MOT) is a task comprising detection and association.
We propose TransFiner, a transformer-based post-refinement approach for MOT.
- Score: 17.784388121222392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiple object tracking (MOT) is a task comprising detection and
association. Plenty of trackers have achieved competitive performance.
Unfortunately, due to the lack of informative exchange between these subtasks,
they are often biased toward one of the two and underperform in complex
scenarios, such as false negatives and mistaken trajectories when targets pass
each other. In this paper, we propose TransFiner, a transformer-based
post-refinement approach for MOT. It is a generic attachment framework that
takes the images and tracking results (locations and class predictions) from
the original tracker as inputs, which then drive the refinement. Moreover,
TransFiner relies on query pairs, which produce paired detection and motion
predictions through the fusion decoder and achieve comprehensive tracking
improvement. We also provide targeted refinement by labeling query pairs
according to different refinement levels. Experiments show that our design is
effective: on the MOT17 benchmark, we elevate CenterTrack from 67.8% MOTA and
64.7% IDF1 to 71.5% MOTA and 66.8% IDF1.
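The post-refinement dataflow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch of the attachment idea (base-tracker boxes seed query pairs, a refinement stage emits paired detection and motion outputs); the names `QueryPair` and `refine`, and the fixed-offset "decoder", are hypothetical stand-ins, not TransFiner's actual API or model.

```python
# Illustrative sketch of a generic post-refinement attachment: the base
# tracker's outputs seed "query pairs" that a refinement stage updates.
# QueryPair and refine() are hypothetical names, not the paper's real API.
from dataclasses import dataclass

@dataclass
class QueryPair:
    box: tuple        # (x, y, w, h) from the base tracker
    score: float      # class confidence from the base tracker
    track_id: int

def refine(pairs, delta=0.05):
    """Stand-in for the fusion decoder: adjust each detection box and
    emit a motion (displacement) estimate per query pair."""
    refined = []
    for p in pairs:
        x, y, w, h = p.box
        # A real decoder would attend over image features; here we just
        # shift boxes by a fixed offset to show the paired outputs.
        detection = (x + delta, y + delta, w, h)
        motion = (delta, delta)
        refined.append((p.track_id, detection, motion))
    return refined
```

The point of the sketch is the interface: each query pair carries the base tracker's state in and yields a coupled (detection, motion) prediction out, which is what lets one module improve detection and association jointly.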
Related papers
- Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z)
- Separable Self and Mixed Attention Transformers for Efficient Object Tracking [3.9160947065896803]
This paper proposes an efficient self and mixed attention transformer-based architecture for lightweight tracking.
With these contributions, the proposed lightweight tracker deploys a transformer-based backbone and head module concurrently for the first time.
Simulations show that our Separable Self and Mixed Attention-based Tracker, SMAT, surpasses the performance of related lightweight trackers on GOT10k, TrackingNet, LaSOT, NfS30, UAV123, and AVisT datasets.
arXiv Detail & Related papers (2023-09-07T19:23:02Z)
- Efficient Joint Detection and Multiple Object Tracking with Spatially Aware Transformer [0.8808021343665321]
We propose a lightweight and highly efficient Joint Detection and Tracking pipeline for the task of Multi-Object Tracking.
It is driven by a transformer-based backbone instead of a CNN, which scales well with the input resolution.
As a result of our modifications, we reduce the overall model size of TransTrack by 58.73% and the complexity by 78.72%.
arXiv Detail & Related papers (2022-11-09T07:19:33Z)
- Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations [1.2891210250935146]
TransCenter is a transformer-based MOT architecture with dense object queries for accurately tracking all the objects.
This paper improves the tracker with a post-processing mechanism based on the Track-by-Detection paradigm.
Our new tracker shows significant improvements in the IDF1 and HOTA metrics and comparable results on the MOTA metric.
arXiv Detail & Related papers (2022-10-24T19:47:58Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- Global Tracking Transformers [76.58184022651596]
We present a novel transformer-based architecture for global multi-object tracking.
The core component is a global tracking transformer that operates on objects from all frames in the sequence.
Our framework seamlessly integrates into state-of-the-art large-vocabulary detectors to track any objects.
arXiv Detail & Related papers (2022-03-24T17:58:04Z)
- VariabilityTrack: Multi-Object Tracking with Variable Speed Object Movement [1.6385815610837167]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos.
We propose a variable speed Kalman filter algorithm based on environmental feedback and improve the matching process.
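The variable-speed idea can be illustrated with a minimal sketch: a 1-D constant-velocity Kalman filter whose process noise is inflated when the innovation (the gap between prediction and measurement) is large. The feedback rule, the class name, and all parameter values here are our illustrative assumptions, not the paper's actual algorithm.

```python
class AdaptiveKalman1D:
    """Constant-velocity Kalman filter on one coordinate; the process
    noise q is scaled up when measurements disagree strongly with the
    prediction, mimicking 'environmental feedback' for variable speed."""
    def __init__(self, x=0.0, v=0.0, q=1e-2, r=1.0):
        self.x, self.v = x, v          # position, velocity
        self.p = 1.0                   # scalar state variance
        self.q, self.r = q, r          # process / measurement noise

    def step(self, z, dt=1.0, boost=4.0, gate=2.0):
        # Predict.
        self.x += self.v * dt
        self.p += self.q
        # Feedback: a large squared innovation inflates process noise,
        # so the filter trusts new measurements more when speed changes.
        innovation = z - self.x
        if innovation ** 2 > gate * (self.p + self.r):
            self.q *= boost
        # Update.
        k = self.p / (self.p + self.r)       # Kalman gain
        self.x += k * innovation
        self.v += k * innovation / dt        # crude velocity correction
        self.p *= (1 - k)
        return self.x
```

A fixed-noise filter lags behind objects that accelerate or stop abruptly; letting the noise grow on large innovations is one simple way to recover faster in such scenes.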
arXiv Detail & Related papers (2022-03-12T12:39:41Z)
- ByteTrack: Multi-Object Tracking by Associating Every Detection Box [51.93588012109943]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos.
Most methods obtain identities by associating detection boxes whose scores are higher than a threshold.
We present a simple, effective and generic association method, called BYTE, tracking BY associaTing every detection box instead of only the high score ones.
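The two-round association described above can be sketched in a few lines: match high-score boxes to tracks first, then give the still-unmatched tracks a second chance with the low-score boxes. Greedy IoU matching here is an illustrative stand-in for the Hungarian assignment an actual tracker would use, and all dictionary keys and thresholds are our assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def byte_associate(tracks, detections, high=0.6, iou_thr=0.3):
    """BYTE-style association: match high-score boxes first, then give
    unmatched tracks a second round with the low-score boxes."""
    high_dets = [d for d in detections if d["score"] >= high]
    low_dets = [d for d in detections if d["score"] < high]
    matches, unmatched = [], list(tracks)
    for pool in (high_dets, low_dets):          # two association rounds
        for det in pool:
            best = max(unmatched, default=None,
                       key=lambda t: iou(t["box"], det["box"]))
            if best and iou(best["box"], det["box"]) >= iou_thr:
                matches.append((best["id"], det))
                unmatched.remove(best)
    return matches, unmatched
```

The second round is the key design choice: low-score boxes are often occluded but real objects, so using them to extend existing tracks (rather than discarding them) reduces false negatives.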
arXiv Detail & Related papers (2021-10-13T17:01:26Z)
- Tracklets Predicting Based Adaptive Graph Tracking [51.352829280902114]
We present an accurate and end-to-end learning framework for multi-object tracking, namely TPAGT.
It re-extracts the features of the tracklets in the current frame based on motion prediction, which is the key to solving the problem of inconsistent features.
arXiv Detail & Related papers (2020-10-18T16:16:49Z)
- Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets [96.98888948518815]
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm.
We propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes.
arXiv Detail & Related papers (2020-07-18T19:51:53Z)
- Multiple Object Tracking by Flowing and Fusing [31.58422046611455]
Flow-Fuse-Tracker (FFT) is a tracking approach that jointly learns an indefinite number of target-wise motions from pixel-level optical flows.
In target fusing, a FuseTracker module refines and fuses targets proposed by FlowTracker and frame-wise object detection.
As an online MOT approach, FFT produced top MOTA scores of 46.3 on the 2DMOT15, 56.5 on the MOT16, and 56.5 on the MOT17 tracking benchmarks.
arXiv Detail & Related papers (2020-01-30T05:17:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.