What About the Data? A Mapping Study on Data Engineering for AI Systems
- URL: http://arxiv.org/abs/2402.05156v1
- Date: Wed, 7 Feb 2024 16:31:58 GMT
- Title: What About the Data? A Mapping Study on Data Engineering for AI Systems
- Authors: Petra Heck
- Abstract summary: There is a growing need for data engineers that know how to prepare data for AI systems.
We found 25 relevant papers between January 2019 and June 2023, explaining AI data engineering activities.
This paper creates an overview of the body of knowledge on data engineering for AI.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI systems cannot exist without data. Now that AI models (data science and
AI) have matured and are readily available to apply in practice, most
organizations struggle with the data infrastructure to do so. There is a
growing need for data engineers that know how to prepare data for AI systems or
that can set up enterprise-wide data architectures for analytical projects. But
until now, the data engineering part of AI engineering has not been getting
much attention, in favor of discussing the modeling part. In this paper we aim
to change this by performing a mapping study on data engineering for AI systems,
i.e., AI data engineering. We found 25 relevant papers between January 2019 and
June 2023, explaining AI data engineering activities. We identify which life
cycle phases are covered, which technical solutions or architectures are
proposed and which lessons learned are presented. We end with an overall
discussion of the papers with implications for practitioners and researchers.
This paper creates an overview of the body of knowledge on data engineering for
AI. This overview is useful for practitioners to identify solutions and best
practices as well as for researchers to identify gaps.
Related papers
- Data Issues in Industrial AI System: A Meta-Review and Research Strategy [10.540603300770885]
Artificial intelligence (AI) is assuming an increasingly pivotal role within industrial systems.
Despite the recent trend within various industries to adopt AI, the actual adoption of AI is not as developed as perceived.
How to address these data issues stands as a significant concern confronting both industry and academia.
arXiv Detail & Related papers (2024-06-22T08:36:59Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- AI in Software Engineering: A Survey on Project Management Applications [3.156791351998142]
Machine Learning (ML) employs algorithms that undergo training on data sets, enabling them to carry out specific tasks autonomously.
AI holds immense potential in the field of software engineering, particularly in project management and planning.
arXiv Detail & Related papers (2023-07-27T23:02:24Z)
- AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges [60.56413461109281]
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big data generated by IT Operations processes.
We discuss in depth the key types of data emitted by IT Operations activities, the scale and challenges in analyzing them, and where they can be helpful.
We categorize the key AIOps tasks as incident detection, failure prediction, root cause analysis, and automated actions.
arXiv Detail & Related papers (2023-04-10T15:38:12Z)
- Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI.
In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals.
We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z)
- Enabling Automated Machine Learning for Model-Driven AI Engineering [60.09869520679979]
We propose a novel approach to enable Model-Driven Software Engineering and Model-Driven AI Engineering.
In particular, we support Automated ML, thus assisting software engineers without deep AI knowledge in developing AI-intensive systems.
arXiv Detail & Related papers (2022-03-06T10:12:56Z)
- Towards Productizing AI/ML Models: An Industry Perspective from Data Scientists [10.27276267081559]
The transition from AI/ML models to production-ready AI-based systems is a challenge for both data scientists and software engineers.
In this paper, we report the results of a workshop conducted in a consulting company to understand how this transition is perceived by practitioners.
arXiv Detail & Related papers (2021-03-18T22:25:44Z)
- A Methodology for Creating AI FactSheets [67.65802440158753]
This paper describes a methodology for creating the form of AI documentation we call FactSheets.
Within each step of the methodology, we describe the issues to consider and the questions to explore.
This methodology will accelerate the broader adoption of transparent AI documentation.
arXiv Detail & Related papers (2020-06-24T15:08:59Z)
- Sensor Artificial Intelligence and its Application to Space Systems -- A White Paper [35.78525324168878]
The goal of this white paper is to establish "Sensor AI" as a dedicated research topic.
A closer look at the sensors and their physical properties within AI approaches will lead to more robust and widely applicable algorithms.
Sensor AI will play a decisive role in autonomous driving as well as in areas of automated production, predictive maintenance or space research.
arXiv Detail & Related papers (2020-06-09T14:10:35Z)