7th AI Driving Olympics: 1st Place Report for Panoptic Tracking
- URL: http://arxiv.org/abs/2112.05210v1
- Date: Thu, 9 Dec 2021 20:52:28 GMT
- Title: 7th AI Driving Olympics: 1st Place Report for Panoptic Tracking
- Authors: Rohit Mohan, Abhinav Valada
- Abstract summary: Our architecture won the panoptic tracking challenge in the 7th AI Driving Olympics at NeurIPS 2021.
Our approach exploits three consecutive accumulated scans to predict locally consistent panoptic tracking IDs and also the overlap between the scans to predict globally consistent panoptic tracking IDs for a given sequence.
The benchmarking results from the 7th AI Driving Olympics at NeurIPS 2021 show that our model is ranked #1 for the panoptic tracking task on the Panoptic nuScenes dataset.
- Score: 6.226227982115869
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this technical report, we describe our EfficientLPT architecture that won
the panoptic tracking challenge in the 7th AI Driving Olympics at NeurIPS 2021.
Our architecture builds upon the top-down EfficientLPS panoptic segmentation
approach. EfficientLPT consists of a shared backbone with a modified
EfficientNet-B5 model comprising the proximity convolution module as the
encoder followed by the range-aware FPN to aggregate semantically rich
range-aware multi-scale features. Subsequently, we employ two task-specific
heads, the scale-invariant semantic head and hybrid task cascade with feedback
from the semantic head as the instance head. Further, we employ a novel
panoptic fusion module to adaptively fuse logits from each of the heads to
yield the panoptic tracking output. Our approach exploits three consecutive
accumulated scans to predict locally consistent panoptic tracking IDs and also
the overlap between the scans to predict globally consistent panoptic tracking
IDs for a given sequence. The benchmarking results from the 7th AI Driving
Olympics at NeurIPS 2021 show that our model is ranked #1 for the panoptic
tracking task on the Panoptic nuScenes dataset.
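The architecture description above maps onto a top-down pipeline: a shared encoder, a range-aware FPN, a semantic and an instance head, and a fusion step. The following minimal PyTorch sketch shows how such components could be wired together; the class name `EfficientLPTSketch`, the stand-in layers, and the tensor shapes are illustrative assumptions inferred from the abstract, not the authors' released implementation.

```python
# Illustrative skeleton only: module names, stand-in layers, and tensor shapes
# are assumptions inferred from the abstract, not the authors' released code.
import torch
import torch.nn as nn


class EfficientLPTSketch(nn.Module):
    """Top-down panoptic tracking pipeline as outlined in the abstract:
    shared encoder -> range-aware FPN -> semantic + instance heads -> fusion."""

    def __init__(self, num_classes: int = 17, fpn_channels: int = 256):
        super().__init__()
        # Stand-in for the modified EfficientNet-B5 encoder with the
        # proximity convolution module (here: a single strided conv).
        self.encoder = nn.Conv2d(5, fpn_channels, kernel_size=3, stride=2, padding=1)
        # Stand-in for the range-aware FPN aggregating multi-scale features.
        self.range_aware_fpn = nn.Conv2d(fpn_channels, fpn_channels, kernel_size=3, padding=1)
        # Scale-invariant semantic head: per-point class logits.
        self.semantic_head = nn.Conv2d(fpn_channels, num_classes, kernel_size=1)
        # Hybrid-task-cascade-style instance head (heavily simplified): the
        # feedback from the semantic head is modeled here by concatenating
        # the semantic logits to the shared features.
        self.instance_head = nn.Conv2d(fpn_channels + num_classes, 1, kernel_size=1)

    def forward(self, range_image: torch.Tensor):
        feats = self.range_aware_fpn(self.encoder(range_image))
        semantic_logits = self.semantic_head(feats)
        instance_logits = self.instance_head(torch.cat([feats, semantic_logits], dim=1))
        # The panoptic fusion module would adaptively combine both sets of
        # logits into the panoptic tracking output; here we simply return them.
        return semantic_logits, instance_logits


# Example: a projected LiDAR range image with 5 channels (x, y, z, range, intensity).
scan = torch.randn(1, 5, 32, 1024)
sem, inst = EfficientLPTSketch()(scan)
```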
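The tracking strategy, locally consistent IDs within each window of three accumulated scans and globally consistent IDs derived from the overlap between windows, can be made concrete with a small ID-propagation sketch. The greedy point-overlap voting rule and the helper `propagate_ids` below are assumptions used only for illustration, not the paper's exact procedure.

```python
# Hedged sketch of overlap-based ID propagation: local instance IDs predicted
# within each window of accumulated scans are remapped to global track IDs by
# matching instances on the points the consecutive windows share.
from collections import defaultdict

import numpy as np


def propagate_ids(prev_local, prev_global, curr_local, overlap_mask):
    """Remap local instance IDs of the current window to global track IDs.

    prev_local, curr_local: per-point local instance IDs (0 = stuff / no instance),
        defined on the points of the two overlapping accumulated-scan windows.
    prev_global: dict mapping previous-window local IDs to global track IDs.
    overlap_mask: boolean mask selecting the points both windows observe.
    """
    # Vote for the best-overlapping previous instance for every current instance.
    votes = defaultdict(lambda: defaultdict(int))
    for p, c in zip(prev_local[overlap_mask].tolist(), curr_local[overlap_mask].tolist()):
        if p > 0 and c > 0:
            votes[c][p] += 1

    next_id = max(prev_global.values(), default=0) + 1
    curr_global = {}
    for c_id in np.unique(curr_local).tolist():
        if c_id == 0:
            continue
        if c_id in votes:
            # Inherit the global ID of the previous instance it overlaps most.
            best_prev = max(votes[c_id], key=votes[c_id].get)
            curr_global[c_id] = prev_global[best_prev]
        else:
            # No overlap with any tracked instance: start a new global track.
            curr_global[c_id] = next_id
            next_id += 1
    return curr_global


# Toy example: local instance 5 overlaps previous instance 1 (global track 10),
# while local instance 7 appears only on points the previous window did not cover.
prev = np.array([1, 1, 2, 2, 0, 0])
curr = np.array([5, 5, 0, 0, 7, 7])
print(propagate_ids(prev, {1: 10, 2: 11}, curr, np.ones(6, dtype=bool)))
# -> {5: 10, 7: 12}
```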
Related papers
- Skeleton2vec: A Self-supervised Learning Framework with Contextualized
Target Representations for Skeleton Sequence [56.092059713922744]
We show that using high-level contextualized features as prediction targets can achieve superior performance.
Specifically, we propose Skeleton2vec, a simple and efficient self-supervised 3D action representation learning framework.
Our proposed Skeleton2vec outperforms previous methods and achieves state-of-the-art results.
arXiv Detail & Related papers (2024-01-01T12:08:35Z)
- SDTracker: Synthetic Data Based Multi-Object Tracking [8.43201092674197]
We present SDTracker, a method that harnesses the potential of synthetic data for multi-object tracking of real-world scenes.
We use the ImageNet dataset as an auxiliary to randomize the style of synthetic data.
We also adopt the pseudo-labeling method to effectively utilize the unlabeled MOT17 training data.
arXiv Detail & Related papers (2023-03-26T08:21:22Z)
- Self-Supervised Representation Learning from Temporal Ordering of
Automated Driving Sequences [49.91741677556553]
We propose TempO, a temporal ordering pretext task for pre-training region-level feature representations for perception tasks.
We embed each frame by an unordered set of proposal feature vectors, a representation that is natural for object detection or tracking systems.
Extensive evaluations on the BDD100K, nuImages, and MOT17 datasets show that our TempO pre-training approach outperforms single-frame self-supervised learning methods.
arXiv Detail & Related papers (2023-02-17T18:18:27Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D
Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- Panoptic-PHNet: Towards Real-Time and High-Precision LiDAR Panoptic
Segmentation via Clustering Pseudo Heatmap [9.770808277353128]
We propose a fast and high-performance LiDAR-based framework, referred to as Panoptic-PHNet.
We introduce a clustering pseudo heatmap as a new paradigm, which, followed by a center grouping module, yields instance centers for efficient clustering.
For backbone design, we fuse the fine-grained voxel features and the 2D Bird's Eye View (BEV) features with different receptive fields to utilize both detailed and global information.
arXiv Detail & Related papers (2022-05-14T08:16:13Z)
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks.
We extend DS-Net to 4D panoptic LiDAR segmentation by the temporally unified instance clustering on aligned LiDAR frames.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
- LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
- Robust Vision Challenge 2020 -- 1st Place Report for Panoptic
Segmentation [13.23676270963484]
Our network is a lightweight version of our state-of-the-art EfficientPS architecture.
It consists of our proposed shared backbone with a modified EfficientNet-B5 model as the encoder, followed by the 2-way FPN to learn semantically rich multi-scale features.
Our proposed panoptic fusion module adaptively fuses logits from each of the heads to yield the panoptic segmentation output.
arXiv Detail & Related papers (2020-08-23T21:41:43Z)
- Dense Scene Multiple Object Tracking with Box-Plane Matching [73.54369833671772]
Multiple Object Tracking (MOT) is an important task in computer vision.
We propose the Box-Plane Matching (BPM) method to improve the MOT performance in dense scenes.
With the effectiveness of the three modules, our team achieves the 1st place on the Track-1 leaderboard in the ACM MM Grand Challenge HiEve 2020.
arXiv Detail & Related papers (2020-07-30T16:39:22Z)
- MOPT: Multi-Object Panoptic Tracking [33.77171216778909]
We introduce a novel perception task denoted as multi-object panoptic tracking (MOPT).
MOPT allows for exploiting pixel-level semantic information of 'thing' and 'stuff' classes, temporal coherence, and pixel-level associations over time.
We present extensive quantitative and qualitative evaluations of both vision-based and LiDAR-based MOPT that demonstrate encouraging results.
arXiv Detail & Related papers (2020-04-17T11:45:28Z)