DG-Labeler and DGL-MOTS Dataset: Boost the Autonomous Driving Perception
- URL: http://arxiv.org/abs/2110.07790v1
- Date: Fri, 15 Oct 2021 01:04:31 GMT
- Title: DG-Labeler and DGL-MOTS Dataset: Boost the Autonomous Driving Perception
- Authors: Yiming Cui, Zhiwen Cao, Yixin Xie, Xingyu Jiang, Feng Tao, Yingjie
Chen, Lin Li, Dongfang Liu
- Abstract summary: We introduce the DG-Labeler and DGL-MOTS dataset to facilitate the training data annotation for the MOTS task.
Results on extensive cross-dataset evaluations indicate significant performance improvements for several state-of-the-art methods trained on our dataset.
- Score: 15.988493804970092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-object tracking and segmentation (MOTS) is a critical task for
autonomous driving applications. The existing MOTS studies face two critical
challenges: 1) the published datasets inadequately capture the real-world
complexity for network training to address various driving settings; 2) the
working pipelines of annotation tools are under-studied in the literature,
limiting the quality of MOTS learning examples. In this work, we introduce the
DG-Labeler and DGL-MOTS dataset to facilitate the training data annotation for
the MOTS task and accordingly improve network training accuracy and efficiency.
DG-Labeler uses the novel Depth-Granularity Module to depict the instance
spatial relations and produce fine-grained instance masks. Annotated by
DG-Labeler, our DGL-MOTS dataset exceeds prior efforts (i.e., KITTI MOTS and
BDD100K) in data diversity, annotation quality, and temporal representations.
Results on extensive cross-dataset evaluations indicate significant performance
improvements for several state-of-the-art methods trained on our DGL-MOTS
dataset. We believe our DGL-MOTS dataset and DG-Labeler hold valuable
potential to boost visual perception for future transportation.
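As a minimal illustration (not code from the paper), the fine-grained instance masks that DG-Labeler produces are typically compared against ground truth via per-pixel intersection-over-union (IoU), the basic building block of MOTS evaluation metrics:

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Per-pixel IoU between two boolean instance masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union else 0.0

# Toy 2x2 masks: one overlapping pixel, two pixels in the union -> IoU = 0.5
pred = np.array([[True, True], [False, False]])
gt = np.array([[True, False], [False, False]])
print(mask_iou(pred, gt))  # 0.5
```

In MOTS benchmarks, a predicted mask is usually counted as a match only when its IoU with a ground-truth mask exceeds a threshold (commonly 0.5).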
Related papers
- K-Link: Knowledge-Link Graph from LLMs for Enhanced Representation
Learning in Multivariate Time-Series Data [39.83677994033754]
We propose a novel framework named K-Link, leveraging Large Language Models (LLMs) to encode extensive general knowledge.
We propose a graph alignment module, facilitating the transfer of semantic knowledge within the knowledge-link graph into the MTS-derived graph.
arXiv Detail & Related papers (2024-03-06T12:08:14Z) - SeiT++: Masked Token Modeling Improves Storage-efficient Training [36.95646819348317]
Recent advancements in Deep Neural Network (DNN) models have significantly improved performance across computer vision tasks.
However, achieving highly generalizable and high-performing vision models requires expansive datasets, resulting in significant storage requirements.
A recent breakthrough, SeiT, proposed the use of Vector-Quantized (VQ) feature vectors (i.e., tokens) as network inputs for vision classification.
In this paper, we extend SeiT by integrating Masked Token Modeling (MTM) for self-supervised pre-training.
arXiv Detail & Related papers (2023-12-15T04:11:34Z) - Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z) - LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for
Autonomous Driving [34.119642131912485]
We present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS).
LWSIS uses the off-the-shelf 3D data, i.e., Point Cloud, together with the 3D boxes, as natural weak supervisions for training the 2D image instance segmentation models.
Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the cost of the dense 2D masks.
arXiv Detail & Related papers (2022-12-07T08:08:01Z) - Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation [51.21190751266442]
Domain adaptation (DA) tackles scenarios where the test data does not follow the same distribution as the training data.
By learning from large-scale unlabeled samples, self-supervised learning has now become a new trend in deep learning.
We propose a novel Self-Supervised Graph Neural Network (SSG) to enable more effective inter-task information exchange and knowledge sharing.
arXiv Detail & Related papers (2022-04-08T03:37:56Z) - Gram-SLD: Automatic Self-labeling and Detection for Instance Objects [6.512856940779818]
We propose a new co-training-based framework called Gram Self-Labeling and Detection (Gram-SLD).
Gram-SLD can automatically annotate a large amount of data with very limited manually labeled key data and achieve competitive performance.
arXiv Detail & Related papers (2021-12-07T11:34:55Z) - Unsupervised Domain Adaptive Learning via Synthetic Data for Person
Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z) - Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z) - DAGA: Data Augmentation with a Generation Approach for Low-resource
Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z) - Improving the Performance of Fine-Grain Image Classifiers via Generative
Data Augmentation [0.5161531917413706]
We develop Data Augmentation from Proficient Pre-Training of Robust Generative Adversarial Networks (DAPPER GAN).
DAPPER GAN is an ML analytics support tool that automatically generates novel views of training images.
We experimentally evaluate this technique on the Stanford Cars dataset, demonstrating improved vehicle make and model classification accuracy.
arXiv Detail & Related papers (2020-08-12T15:29:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all generated summaries) and is not responsible for any consequences.