Attribute Annotation and Bias Evaluation in Visual Datasets for
Autonomous Driving
- URL: http://arxiv.org/abs/2312.06306v1
- Date: Mon, 11 Dec 2023 11:27:01 GMT
- Title: Attribute Annotation and Bias Evaluation in Visual Datasets for
Autonomous Driving
- Authors: David Fernández Llorca, Pedro Frau, Ignacio Parra, Rubén Izquierdo, Emilia Gómez
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the often overlooked issue of fairness in the autonomous
driving domain, particularly in vision-based perception and prediction systems,
which play a pivotal role in the overall functioning of Autonomous Vehicles
(AVs). We focus our analysis on biases present in some of the most commonly
used visual datasets for training person and vehicle detection systems. We
introduce an annotation methodology and a specialised annotation tool, both
designed to annotate protected attributes of agents in visual datasets. We
validate our methodology through an inter-rater agreement analysis and provide
the distribution of attributes across all datasets. These include annotations
for the attributes age, sex, skin tone, group, and means of transport for more
than 90K people, as well as vehicle type, colour, and car type for over 50K
vehicles. Generally, diversity is very low for most attributes, with some
groups, such as children, wheelchair users, or personal mobility vehicle users,
being extremely underrepresented in the analysed datasets. The study
contributes significantly to efforts to consider fairness in the evaluation of
perception and prediction systems for AVs. This paper follows reproducibility
principles. The annotation tool, scripts and the annotated attributes can be
accessed publicly at https://github.com/ec-jrc/humaint_annotator.
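As an illustration of the kind of inter-rater agreement analysis the abstract mentions, the sketch below computes Cohen's kappa for two annotators labelling the same agents. This is a minimal, stdlib-only example; the specific agreement statistic, attribute categories, and labels used in the paper are not detailed here, and the data below is invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement under independent labelling.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical skin-tone annotations from two raters.
a = ["light", "light", "dark", "light", "dark", "light"]
b = ["light", "dark", "dark", "light", "dark", "light"]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

Kappa corrects raw percent agreement for agreement expected by chance, which matters for skewed attribute distributions like those the paper reports (e.g. very few children or wheelchair users).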
Related papers
- Learning Decomposable and Debiased Representations via Attribute-Centric Information Bottlenecks [21.813755593742858]
Biased attributes, spuriously correlated with target labels in a dataset, can lead neural networks to learn improper shortcuts for classification.
We propose a novel debiasing framework, Debiasing Global Workspace, introducing attention-based information bottlenecks for learning compositional representations of attributes.
We conduct comprehensive evaluations on biased datasets, along with both quantitative and qualitative analyses, to showcase our approach's efficacy.
arXiv Detail & Related papers (2024-03-21T05:33:49Z)
- Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning [19.43430577960824]
This paper introduces Rank2Tell, a novel multi-modal ego-centric dataset for ranking the importance level of objects and telling the reason for that importance.
Using various closed- and open-ended visual question answering tasks, the dataset provides dense annotations of semantic, spatial, temporal, and relational attributes of important objects in complex traffic scenarios.
arXiv Detail & Related papers (2023-09-12T20:51:07Z)
- SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [94.11868795445798]
We release a large-scale object detection benchmark for autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories.
To improve diversity, one frame is collected every ten seconds across 32 different cities, under different weather conditions, times of day, and location scenes.
We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z)
- Large Scale Autonomous Driving Scenarios Clustering with Self-supervised Feature Extraction [6.804209932400134]
This article proposes a comprehensive data clustering framework for a large set of vehicle driving data.
Our approach thoroughly considers the traffic elements, including both in-traffic agent objects and map information.
With newly designed, data-augmentation-based clustering evaluation metrics for driving data, the accuracy assessment does not require a human-labeled dataset.
arXiv Detail & Related papers (2021-03-30T06:22:40Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria to quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- Semi-Automatic Data Annotation guided by Feature Space Projection [117.9296191012968]
We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation.
We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities.
Our results demonstrate the added value of visual analytics tools that combine the complementary abilities of humans and machines for more effective machine learning.
arXiv Detail & Related papers (2020-07-27T17:03:50Z)
- VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification [116.1587709521173]
We propose to build a large-scale vehicle dataset (called VehicleNet) by harnessing four public vehicle datasets.
We design a simple yet effective two-stage progressive approach to learning more robust visual representation from VehicleNet.
We achieve state-of-the-art accuracy of 86.07% mAP on the private test set of the AICity Challenge.
arXiv Detail & Related papers (2020-04-14T05:06:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed above and is not responsible for any consequences arising from its use.