Related papers: Automotive Perception Software Development: An Empirical Investigation into Data, Annotation, and Ecosystem Challenges


URL: http://arxiv.org/abs/2303.05947v1
Date: Fri, 10 Mar 2023 14:29:06 GMT
Title: Automotive Perception Software Development: An Empirical Investigation into Data, Annotation, and Ecosystem Challenges
Authors: Hans-Martin Heyn, Khan Mohammad Habibullah, Eric Knauss, Jennifer Horkoff, Markus Borg, Alessia Knauss, Polly Jing Li
Abstract summary: Software that contains machine learning algorithms is an integral part of automotive perception. The development of such software, specifically the training and validation of the machine learning components, requires large annotated datasets. An industry of data and annotation services has emerged to serve the development of such data-intensive automotive software components.
Score: 10.649193588119985
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Software that contains machine learning algorithms is an integral part of automotive perception, for example, in driving automation systems. The development of such software, specifically the training and validation of the machine learning components, requires large annotated datasets. An industry of data and annotation services has emerged to serve the development of such data-intensive automotive software components. Widespread difficulties in specifying data and annotation needs challenge collaborations between OEMs (Original Equipment Manufacturers) and their suppliers of software components, data, and annotations. This paper investigates why practitioners in the Swedish automotive industry find it difficult to arrive at clear specifications for data and annotations. The results from an interview study show that a lack of effective metrics for data quality aspects, ambiguities in the way of working, unclear definitions of annotation quality, and deficits in the business ecosystems are causes for the difficulty in deriving the specifications. We provide a list of recommendations that can mitigate challenges when deriving specifications and we propose future research opportunities to overcome these challenges. Our work contributes towards the ongoing research on accountability of machine learning as applied to complex software systems, especially for high-stakes applications such as automated driving.

Related papers

Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development. Benchmarks like SWE-bench revealed this task as profoundly difficult for large language models. This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z)
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
Towards Effective Issue Assignment using Online Machine Learning [1.3749490831384266]
We propose an Online Machine Learning methodology that adapts to the evolving characteristics of software projects. Our system processes issues as a data stream, dynamically learning from new data and adjusting in real time to changes in team composition and project requirements.
arXiv Detail & Related papers (2025-05-05T08:05:13Z)
QualiTagger: Automating software quality detection in issue trackers [4.917423556150366]
This research uses cutting edge models like Transformers to identify what text is usually associated with different quality properties. We also study the distribution of such qualities in issue trackers from openly accessible software repositories.
arXiv Detail & Related papers (2025-04-15T10:40:40Z)
The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements. LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information. Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
Machine Unlearning: Taxonomy, Metrics, Applications, Challenges, and Prospects [17.502158848870426]
Data users have been granted the right to have their data forgotten. In machine learning (ML), this right requires a model provider to delete user data. Machine unlearning has emerged to address this need and has garnered ever-increasing attention from both industry and academia.
arXiv Detail & Related papers (2024-03-13T05:11:24Z)
Dealing with Data for RE: Mitigating Challenges while using NLP and Generative AI [2.9189409618561966]
This book chapter explores the evolving landscape of Software Engineering in general, and Requirements Engineering (RE) in particular. We discuss challenges that arise while integrating Natural Language Processing (NLP) and generative AI into enterprise-critical software systems. The book provides practical insights, solutions, and examples to equip readers with the knowledge and tools necessary.
arXiv Detail & Related papers (2024-02-26T19:19:47Z)
A Systematic Review of Available Datasets in Additive Manufacturing [56.684125592242445]
In-situ monitoring incorporating visual and other sensor technologies allows the collection of extensive datasets during the Additive Manufacturing process. These datasets have potential for determining the quality of the manufactured output and the detection of defects through the use of Machine Learning. This systematic review investigates the availability of open image-based datasets originating from AM processes that align with a number of pre-defined selection criteria.
arXiv Detail & Related papers (2024-01-27T16:13:32Z)
Machine Unlearning: A Survey [56.79152190680552]
A special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning. This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality. No study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios. The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.
arXiv Detail & Related papers (2023-06-06T10:18:36Z)
Modelling Concurrency Bugs Using Machine Learning [0.0]
This project aims to compare both common and recent machine learning approaches. We define a synthetic dataset that we generate with the scope of simulating real-life (concurrent) programs. We formulate hypotheses about fundamental limits of various machine learning model types.
arXiv Detail & Related papers (2023-05-08T17:30:24Z)
Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models. The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
Engineering an Intelligent Essay Scoring and Feedback System: An Experience Report [1.5168188294440734]
We describe an exploratory system for assessing the quality of essays supplied by customers of a specialized recruitment support service. The problem domain is challenging because the open-ended customer-supplied source text has considerable scope for ambiguity and error. There is also a need to incorporate specialized business domain knowledge into the intelligent processing systems.
arXiv Detail & Related papers (2021-03-25T03:46:05Z)
Automatic Feasibility Study via Data Quality Analysis for ML: A Case-Study on Label Noise [21.491392581672198]
We present Snoopy, with the goal of supporting data scientists and machine learning engineers performing a systematic and theoretically founded feasibility study. We approach this problem by estimating the irreducible error of the underlying task, also known as the Bayes error rate (BER). We demonstrate in end-to-end experiments how users are able to save substantial labeling time and monetary efforts.
arXiv Detail & Related papers (2020-10-16T14:21:19Z)
Data-Driven Aerospace Engineering: Reframing the Industry with Machine Learning [49.367020832638794]
The aerospace industry is poised to capitalize on big data and machine learning. Recent trends will be explored in context of critical challenges in design, manufacturing, verification and services.
arXiv Detail & Related papers (2020-08-24T22:40:26Z)
Machine Learning for Software Engineering: A Systematic Mapping [73.30245214374027]
The software development industry is rapidly adopting machine learning for transitioning modern day software systems towards highly intelligent and self-learning systems. No comprehensive study exists that explores the current state-of-the-art on the adoption of machine learning across software engineering life cycle stages. This study introduces a machine learning for software engineering (MLSE) taxonomy classifying the state-of-the-art machine learning techniques according to their applicability to various software engineering life cycle stages.
arXiv Detail & Related papers (2020-05-27T11:56:56Z)
