Reducing DNN Labelling Cost using Surprise Adequacy: An Industrial Case
Study for Autonomous Driving
- URL: http://arxiv.org/abs/2006.00894v2
- Date: Mon, 7 Sep 2020 05:43:23 GMT
- Title: Reducing DNN Labelling Cost using Surprise Adequacy: An Industrial Case
Study for Autonomous Driving
- Authors: Jinhan Kim, Jeongil Ju, Robert Feldt, Shin Yoo
- Abstract summary: Deep Neural Networks (DNNs) are rapidly being adopted by the automotive industry, due to their impressive performance in tasks that are essential for autonomous driving.
This paper shows how development of DNN-based object segmentation can be improved by exploiting the correlation between Surprise Adequacy (SA) and model performance.
In our industrial case study the technique allows cost savings of up to 50% with negligible evaluation inaccuracy.
- Score: 23.054842564447895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are rapidly being adopted by the automotive
industry, due to their impressive performance in tasks that are essential for
autonomous driving. Object segmentation is one such task: its aim is to
precisely locate boundaries of objects and classify the identified objects,
helping autonomous cars to recognise the road environment and the traffic
situation. Not only is this task safety critical, but developing a DNN based
object segmentation module presents a set of challenges that are significantly
different from traditional development of safety critical software. The
development process in use consists of multiple iterations of data collection,
labelling, training, and evaluation. Among these stages, training and
evaluation are computation intensive while data collection and labelling are
manual labour intensive. This paper shows how development of DNN based object
segmentation can be improved by exploiting the correlation between Surprise
Adequacy (SA) and model performance. The correlation allows us to predict model
performance for inputs without manually labelling them. This, in turn, enables
understanding of model performance, more guided data collection, and informed
decisions about further training. In our industrial case study the technique
allows cost savings of up to 50% with negligible evaluation inaccuracy.
Furthermore, engineers can trade off cost savings versus the tolerable level of
inaccuracy depending on different development phases and scenarios.
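The idea of predicting performance from SA, so that only some inputs need manual labels, can be illustrated with a minimal sketch. It is not the authors' implementation: the linear fit, the function names `fit_line` and `split_by_predicted_score`, the threshold, and all numbers are assumptions made up for illustration, and the abstract does not specify how SA or the performance score is computed.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def split_by_predicted_score(
    sa_values: Sequence[float], slope: float, intercept: float, threshold: float
) -> Tuple[List[int], List[int]]:
    """Return (indices to label manually, indices left unlabelled)."""
    to_label: List[int] = []
    skipped: List[int] = []
    for i, sa in enumerate(sa_values):
        predicted = slope * sa + intercept  # predicted performance score
        if predicted < threshold:
            to_label.append(i)  # model is expected to struggle: label it
        else:
            skipped.append(i)  # model is expected to do well: skip labelling
    return to_label, skipped


# Hypothetical data: SA and measured score for inputs that are already labelled.
labelled_sa = [0.1, 0.2, 0.4, 0.6, 0.8]
labelled_score = [0.95, 0.90, 0.80, 0.70, 0.60]
slope, intercept = fit_line(labelled_sa, labelled_score)

# Hypothetical SA values for new, unlabelled inputs.
new_sa = [0.15, 0.55, 0.9, 0.3]
to_label, skipped = split_by_predicted_score(new_sa, slope, intercept, threshold=0.75)
print("label manually:", to_label, "skip:", skipped)
```

Raising or lowering `threshold` in this sketch corresponds to the trade-off the abstract mentions: a lower threshold skips more labelling at the price of relying more heavily on predicted rather than measured performance.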
Related papers
- How Much Data are Enough? Investigating Dataset Requirements for Patch-Based Brain MRI Segmentation Tasks [74.21484375019334]
Training deep neural networks reliably requires access to large-scale datasets.
To mitigate both the time and financial costs associated with model development, a clear understanding of the amount of data required to train a satisfactory model is crucial.
This paper proposes a strategic framework for estimating the amount of annotated data required to train patch-based segmentation networks.
arXiv Detail & Related papers (2024-04-04T13:55:06Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- Evaluating the Robustness of Off-Road Autonomous Driving Segmentation
against Adversarial Attacks: A Dataset-Centric analysis [1.6538732383658392]
This study investigates the vulnerability of semantic segmentation models to adversarial input perturbations.
We compare the effects of adversarial attacks on different segmentation network architectures.
This work contributes to the safe navigation of autonomous robot Unimog U5023 in rough off-road unstructured environments.
arXiv Detail & Related papers (2024-02-03T13:48:57Z)
- ANNA: A Deep Learning Based Dataset in Heterogeneous Traffic for
Autonomous Vehicles [2.932123507260722]
This study discusses a custom-built dataset that includes some unidentified vehicles from the perspective of Bangladesh.
A dataset validity check was performed by evaluating models using the Intersection Over Union (IOU) metric.
The results demonstrated that the model trained on our custom dataset was more precise and efficient than the models trained on the KITTI or COCO dataset concerning Bangladeshi traffic.
arXiv Detail & Related papers (2024-01-21T01:14:04Z)
- BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous
Driving [24.123577277806135]
We pioneer a novel behavior-aware trajectory prediction model (BAT).
Our model consists of behavior-aware, interaction-aware, priority-aware, and position-aware modules.
We evaluate BAT's performance across the Next Generation Simulation (NGSIM), Highway Drone (HighD), Roundabout Drone (RounD), and Macao Connected Autonomous Driving (MoCAD) datasets.
arXiv Detail & Related papers (2023-12-11T13:27:51Z)
- Unsupervised Self-Driving Attention Prediction via Uncertainty Mining
and Knowledge Embedding [51.8579160500354]
We propose an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration.
Results show equivalent or even more impressive performance compared to fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2023-03-17T00:28:33Z)
- Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z)
- CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal
Relationships [8.679073301435265]
We construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data.
We use these labels to perturb the data by deleting non-causal agents from the scene.
Under non-causal perturbations, we observe a 25-38% relative change in minADE as compared to the original.
arXiv Detail & Related papers (2022-07-07T21:28:23Z)
- Important Object Identification with Semi-Supervised Learning for
Autonomous Driving [37.654878298744855]
We propose a novel approach for important object identification in egocentric driving scenarios.
We present a semi-supervised learning pipeline to enable the model to learn from unlimited unlabeled data.
Our approach also outperforms rule-based baselines by a large margin.
arXiv Detail & Related papers (2022-03-05T01:23:13Z)
- Just Label What You Need: Fine-Grained Active Selection for Perception
and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is both cost-aware and allows for fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantize interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)