Related papers: Reducing DNN Labelling Cost using Surprise Adequacy: An Industrial Case Study for Autonomous Driving

URL: http://arxiv.org/abs/2006.00894v2
Date: Mon, 7 Sep 2020 05:43:23 GMT
Title: Reducing DNN Labelling Cost using Surprise Adequacy: An Industrial Case Study for Autonomous Driving
Authors: Jinhan Kim, Jeongil Ju, Robert Feldt, Shin Yoo
Abstract summary: Deep Neural Networks (DNNs) are rapidly being adopted by the automotive industry, due to their impressive performance in tasks that are essential for autonomous driving. This paper shows how development of DNN based object segmentation can be improved by exploiting the correlation between Surprise Adequacy (SA) and model performance. In our industrial case study the technique allows cost savings of up to 50% with negligible evaluation inaccuracy.
Score: 23.054842564447895
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep Neural Networks (DNNs) are rapidly being adopted by the automotive industry, due to their impressive performance in tasks that are essential for autonomous driving. Object segmentation is one such task: its aim is to precisely locate boundaries of objects and classify the identified objects, helping autonomous cars to recognise the road environment and the traffic situation. Not only is this task safety critical, but developing a DNN based object segmentation module presents a set of challenges that are significantly different from traditional development of safety critical software. The development process in use consists of multiple iterations of data collection, labelling, training, and evaluation. Among these stages, training and evaluation are computation intensive while data collection and labelling are manual labour intensive. This paper shows how development of DNN based object segmentation can be improved by exploiting the correlation between Surprise Adequacy (SA) and model performance. The correlation allows us to predict model performance for inputs without manually labelling them. This, in turn, enables understanding of model performance, more guided data collection, and informed decisions about further training. In our industrial case study the technique allows cost savings of up to 50% with negligible evaluation inaccuracy. Furthermore, engineers can trade off cost savings versus the tolerable level of inaccuracy depending on different development phases and scenarios.

Related papers

Semantic Segmentation based Scene Understanding in Autonomous Vehicles [0.0]
We propose several efficient models to investigate scene understanding through semantic segmentation. The obtained results show that choosing the appropriate backbone has a great effect on the performance of the model. In the end, we analyze and evaluate the proposed models in terms of accuracy, mean IoU, and loss function, and the results show that these metrics are improved.
arXiv Detail & Related papers (2025-07-18T18:21:47Z)
CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking [22.817457688303513]
We propose CogDDN, a VLM-based framework that emulates the human cognitive and learning mechanisms. CogDDN identifies appropriate target objects by semantically aligning detected objects with the given instructions. It incorporates a dual-process decision-making module, comprising a Heuristic Process for rapid, efficient decisions and an Analytic Process that analyzes past errors.
arXiv Detail & Related papers (2025-07-15T14:06:24Z)
Data Scaling Laws for End-to-End Autonomous Driving [83.85463296830743]
We evaluate the performance of a simple end-to-end driving architecture on internal driving datasets ranging in size from 16 to 8192 hours. Specifically, we investigate how much additional training data is needed to achieve a target performance gain.
arXiv Detail & Related papers (2025-04-06T03:23:48Z)
On-device edge learning for IoT data streams: a survey [1.7186863539230333]
This literature review explores continual learning methods for on-device training in the context of neural networks (NNs) and decision trees (DTs). We highlight key constraints, such as data architecture (batch vs. stream) and network capacity (cloud vs. edge). The survey details the challenges of deploying deep learners on resource-constrained edge devices.
arXiv Detail & Related papers (2025-02-25T02:41:23Z)
NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics [72.95483148058378]
We propose to pre-train a general-purpose machine learning model to capture traffic dynamics with only traffic data from NetFlow records. We address challenges such as unifying network feature representations, learning from large unlabeled traffic data volume, and testing on real downstream tasks in DDoS attack detection.
arXiv Detail & Related papers (2024-12-30T00:47:49Z)
How Much Data are Enough? Investigating Dataset Requirements for Patch-Based Brain MRI Segmentation Tasks [74.21484375019334]
Training deep neural networks reliably requires access to large-scale datasets. To mitigate both the time and financial costs associated with model development, a clear understanding of the amount of data required to train a satisfactory model is crucial. This paper proposes a strategic framework for estimating the amount of annotated data required to train patch-based segmentation networks.
arXiv Detail & Related papers (2024-04-04T13:55:06Z)
AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios. We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
Evaluating the Robustness of Off-Road Autonomous Driving Segmentation against Adversarial Attacks: A Dataset-Centric analysis [1.6538732383658392]
This study investigates the vulnerability of semantic segmentation models to adversarial input perturbations. We compare the effects of adversarial attacks on different segmentation network architectures. This work contributes to the safe navigation of autonomous robot Unimog U5023 in rough off-road unstructured environments.
arXiv Detail & Related papers (2024-02-03T13:48:57Z)
ANNA: A Deep Learning Based Dataset in Heterogeneous Traffic for Autonomous Vehicles [2.932123507260722]
This study discusses a custom-built dataset that includes some unidentified vehicles in the perspective of Bangladesh. A dataset validity check was performed by evaluating models using the Intersection Over Union (IOU) metric. The results demonstrated that the model trained on our custom dataset was more precise and efficient than the models trained on the KITTI or COCO dataset concerning Bangladeshi traffic.
arXiv Detail & Related papers (2024-01-21T01:14:04Z)
BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving [24.123577277806135]
We pioneer a novel behavior-aware trajectory prediction model (BAT). Our model consists of behavior-aware, interaction-aware, priority-aware, and position-aware modules. We evaluate BAT's performance across the Next Generation Simulation (NGSIM), Highway Drone (HighD), Roundabout Drone (RounD), and Macao Connected Autonomous Driving (MoCAD) datasets.
arXiv Detail & Related papers (2023-12-11T13:27:51Z)
Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding [51.8579160500354]
We propose an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration. Results show equivalent or even more impressive performance compared to fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2023-03-17T00:28:33Z)
Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant. One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning. Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks. We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z)
CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships [8.679073301435265]
We construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data. We use these labels to perturb the data by deleting non-causal agents from the scene. Under non-causal perturbations, we observe a 25-38% relative change in minADE as compared to the original.
arXiv Detail & Related papers (2022-07-07T21:28:23Z)
Important Object Identification with Semi-Supervised Learning for Autonomous Driving [37.654878298744855]
We propose a novel approach for important object identification in egocentric driving scenarios. We present a semi-supervised learning pipeline to enable the model to learn from unlimited unlabeled data. Our approach also outperforms rule-based baselines by a large margin.
arXiv Detail & Related papers (2022-03-05T01:23:13Z)
Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is both cost-aware and allows for fine-grained selection of examples through partially labeled scenes. Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantize interestingness of traffic scenes. Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.