The Principles of Data-Centric AI (DCAI)
- URL: http://arxiv.org/abs/2211.14611v2
- Date: Tue, 12 Mar 2024 16:07:12 GMT
- Title: The Principles of Data-Centric AI (DCAI)
- Authors: Mohammad Hossein Jarrahi, Ali Memariani, Shion Guha
- Abstract summary: Data-centric AI (DCAI) as an emerging concept brings data, its quality and its dynamism to the forefront.
This article brings together data-centric perspectives and concepts to outline the foundations of DCAI.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data is a crucial infrastructure to how artificial intelligence (AI) systems
learn. However, these systems to date have been largely model-centric, putting
a premium on the model at the expense of the data quality. Data quality issues
beset the performance of AI systems, particularly in downstream deployments and
in real-world applications. Data-centric AI (DCAI) as an emerging concept
brings data, its quality and its dynamism to the forefront in considerations of
AI systems through an iterative and systematic approach. As one of the first
overviews, this article brings together data-centric perspectives and concepts
to outline the foundations of DCAI. It specifically formulates six guiding
principles for researchers and practitioners and gives direction for future
advancement of DCAI.
Related papers
- AI-Aided Kalman Filters [65.35350122917914]
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing.
Recent developments illustrate the possibility of fusing deep neural networks (DNNs) with classic Kalman-type filtering.
This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms.
arXiv Detail & Related papers (2024-10-16T06:47:53Z)
- Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting [36.31269406067809]
We argue that data-centric AI is essential for training AI models efficiently, particularly transformer-based time series forecasting (TSF) models.
We review previous research from a data-centric AI perspective and aim to lay the groundwork for the future development of transformer-based architectures and data-centric AI.
arXiv Detail & Related papers (2024-07-29T08:27:21Z)
- NeurDB: An AI-powered Autonomous Data System [44.14807794638682]
We present NeurDB, an AI-powered autonomous data system designed to fully embrace AI design in each major system component.
We give a conceptual and architectural overview of NeurDB, discuss its design choices and key components, and report its current development and future plans.
arXiv Detail & Related papers (2024-05-07T00:51:48Z)
- Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future [130.87142103774752]
This review systematically assesses over seventy open-source autonomous driving datasets.
It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets.
It also delves into the scientific and technical challenges that warrant resolution.
arXiv Detail & Related papers (2023-12-06T10:46:53Z)
- Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI.
In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals.
We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Data-centric AI: Perspectives and Challenges [51.70828802140165]
Data-centric AI (DCAI) advocates a fundamental shift from model advancements to ensuring data quality and reliability.
We bring together three general missions: training data development, inference data development, and data maintenance.
arXiv Detail & Related papers (2023-01-12T05:28:59Z)
- Data-Centric Artificial Intelligence [2.5874041837241304]
Data-centric artificial intelligence (data-centric AI) represents an emerging paradigm emphasizing that the systematic design and engineering of data is essential for building effective and efficient AI-based systems.
We define relevant terms, provide key characteristics to contrast the data-centric paradigm to the model-centric one, and introduce a framework for data-centric AI.
arXiv Detail & Related papers (2022-12-22T16:41:03Z)
- DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems [81.21462458089142]
Data-centric AI is emerging as a unifying paradigm that could enable reliable end-to-end pipelines.
We propose DC-Check, an actionable checklist-style framework to elicit data-centric considerations.
This data-centric lens on development aims to promote thoughtfulness and transparency prior to system development.
arXiv Detail & Related papers (2022-11-09T17:32:09Z)