Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and
Reasoning
- URL: http://arxiv.org/abs/2309.06597v2
- Date: Wed, 8 Nov 2023 09:12:01 GMT
- Title: Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and
Reasoning
- Authors: Enna Sachdeva, Nakul Agarwal, Suhas Chundi, Sean Roelofs, Jiachen Li,
Mykel Kochenderfer, Chiho Choi, Behzad Dariush
- Abstract summary: This paper introduces a novel dataset, Rank2Tell, a multi-modal ego-centric dataset for Ranking the importance level and Telling the reason for the importance.
Using closed- and open-ended visual question answering, the dataset provides dense annotations of semantic, spatial, temporal, and relational attributes of important objects in complex traffic scenarios.
- Score: 19.43430577960824
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The widespread adoption of commercial autonomous vehicles (AVs) and advanced
driver assistance systems (ADAS) may largely depend on their acceptance by
society, for which their perceived trustworthiness and interpretability to
riders are crucial. In general, earning this trust is challenging because modern
autonomous systems software relies heavily on black-box artificial intelligence
models. To that end, this paper introduces Rank2Tell, a
multi-modal ego-centric dataset for Ranking the importance level and Telling
the reason for the importance. Using closed- and open-ended visual
question answering, the dataset provides dense annotations of semantic,
spatial, temporal, and relational attributes of important objects in
complex traffic scenarios. The dense annotations and unique attributes of the
dataset make it a valuable resource for researchers working on visual scene
understanding and related fields. Furthermore, we introduce a joint model for
importance level ranking and natural language caption generation to
benchmark our dataset and demonstrate performance with quantitative
evaluations.
Related papers
- A Survey on Autonomous Driving Datasets: Statistics, Annotation Quality, and a Future Outlook [24.691922611156937]
We present an exhaustive study of 265 autonomous driving datasets from multiple perspectives.
We introduce a novel metric to evaluate the impact of datasets, which can also be a guide for creating new datasets.
We discuss the current challenges and future development trends of autonomous driving datasets.
arXiv Detail & Related papers (2024-01-02T22:35:33Z)
- Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future [130.87142103774752]
This review systematically assesses over seventy open-source autonomous driving datasets.
It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets.
It also delves into the scientific and technical challenges that warrant resolution.
arXiv Detail & Related papers (2023-12-06T10:46:53Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Important Object Identification with Semi-Supervised Learning for Autonomous Driving [37.654878298744855]
We propose a novel approach for important object identification in egocentric driving scenarios.
We present a semi-supervised learning pipeline to enable the model to learn from unlimited unlabeled data.
Our approach also outperforms rule-based baselines by a large margin.
arXiv Detail & Related papers (2022-03-05T01:23:13Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- A survey on datasets for fairness-aware machine learning [6.962333053044713]
A large variety of fairness-aware machine learning solutions have been proposed.
In this paper, we overview real-world datasets used for fairness-aware machine learning.
For a deeper understanding of bias and fairness in the datasets, we investigate the interesting relationships using exploratory analysis.
arXiv Detail & Related papers (2021-10-01T16:54:04Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantize interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
- The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements [14.707930573950787]
We present MuSe-CaR, a first-of-its-kind multimodal dataset.
The data is publicly available as it recently served as the testing bed for the 1st Multimodal Sentiment Analysis Challenge.
arXiv Detail & Related papers (2021-01-15T10:40:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.