Data-Centric Artificial Intelligence
- URL: http://arxiv.org/abs/2212.11854v4
- Date: Thu, 18 Jan 2024 11:52:08 GMT
- Title: Data-Centric Artificial Intelligence
- Authors: Johannes Jakubik, Michael V\"ossing, Niklas K\"uhl, Jannis Walk,
Gerhard Satzger
- Abstract summary: Data-centric artificial intelligence (data-centric AI) represents an emerging paradigm emphasizing that the systematic design and engineering of data is essential for building effective and efficient AI-based systems.
We define relevant terms, provide key characteristics to contrast the data-centric paradigm to the model-centric one, and introduce a framework for data-centric AI.
- Score: 2.5874041837241304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-centric artificial intelligence (data-centric AI) represents an emerging
paradigm emphasizing that the systematic design and engineering of data is
essential for building effective and efficient AI-based systems. The objective
of this article is to introduce practitioners and researchers from the field of
Information Systems (IS) to data-centric AI. We define relevant terms, provide
key characteristics to contrast the data-centric paradigm to the model-centric
one, and introduce a framework for data-centric AI. We distinguish data-centric
AI from related concepts and discuss its longer-term implications for the IS
community.
Related papers
- Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting [36.31269406067809]
We argue that data-centric AI is essential for training AI models, particularly for transformer-based TSF models efficiently.
We review the previous research works from a data-centric AI perspective and we intend to lay the foundation work for the future development of transformer-based architecture and data-centric AI.
arXiv Detail & Related papers (2024-07-29T08:27:21Z) - AI Data Readiness Inspector (AIDRIN) for Quantitative Assessment of Data Readiness for AI [0.8553254686016967]
"Garbage in Garbage Out" is a universally agreed quote by computer scientists from various domains, including Artificial Intelligence (AI)
There are no standard methods or frameworks for assessing the "readiness" of data for AI.
AIDRIN is a framework covering a broad range of readiness dimensions available in the literature.
arXiv Detail & Related papers (2024-06-27T15:26:39Z) - Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A
Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative in training machine learning models.
ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task.
This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z) - AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z) - Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI.
In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals.
We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z) - Human-Centric Multimodal Machine Learning: Recent Advances and Testbed
on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z) - Data-centric AI: Perspectives and Challenges [51.70828802140165]
Data-centric AI (DCAI) advocates a fundamental shift from model advancements to ensuring data quality and reliability.
We bring together three general missions: training data development, inference data development, and data maintenance.
arXiv Detail & Related papers (2023-01-12T05:28:59Z) - The Principles of Data-Centric AI (DCAI) [9.211953610948862]
Data-centric AI (DCAI) as an emerging concept brings data, its quality and its dynamism to the forefront.
This article brings together data-centric perspectives and concepts to outline the foundations of DCAI.
arXiv Detail & Related papers (2022-11-26T16:43:40Z) - Data-Centric AI Requires Rethinking Data Notion [12.595006823256687]
This work proposes unifying principles offered by categorical and cochain notions of data.
In the categorical notion, data is viewed as a mathematical structure that we act upon via morphisms to preserve this structure.
As for cochain notion, data can be viewed as a function defined in a discrete domain of interest and acted upon via operators.
arXiv Detail & Related papers (2021-10-06T04:00:38Z) - A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.