Bridging the Data Gap in AI Reliability Research and Establishing DR-AIR, a Comprehensive Data Repository for AI Reliability
- URL: http://arxiv.org/abs/2502.12386v1
- Date: Mon, 17 Feb 2025 23:50:36 GMT
- Title: Bridging the Data Gap in AI Reliability Research and Establishing DR-AIR, a Comprehensive Data Repository for AI Reliability
- Authors: Simin Zheng, Jared M. Clark, Fatemeh Salboukh, Priscila Silva, Karen da Mata, Fenglian Pan, Jie Min, Jiayi Lian, Caleb B. King, Lance Fiondella, Jian Liu, Xinwei Deng, Yili Hong
- Abstract summary: A major challenge in AI reliability research, particularly for those in academia, is the lack of readily available AI reliability data.
This paper conducts a comprehensive review of available AI reliability data and establishes DR-AIR: a data repository for AI reliability.
- Abstract: Artificial intelligence (AI) technology and systems have been advancing rapidly. However, ensuring the reliability of these systems is crucial for fostering public confidence in their use. This necessitates the modeling and analysis of reliability data specific to AI systems. A major challenge in AI reliability research, particularly for those in academia, is the lack of readily available AI reliability data. To address this gap, this paper focuses on conducting a comprehensive review of available AI reliability data and establishing DR-AIR: a data repository for AI reliability. Specifically, we introduce key measurements and data types for assessing AI reliability, along with the methodologies used to collect these data. We also provide a detailed description of the currently available datasets with illustrative examples. Furthermore, we outline the setup of the DR-AIR repository and demonstrate its practical applications. This repository provides easy access to datasets specifically curated for AI reliability research. We believe these efforts will significantly benefit the AI research community by facilitating access to valuable reliability data and promoting collaboration across various academic domains within AI. We conclude our paper with a call to action, encouraging the research community to contribute and share AI reliability data to further advance this critical field of study.
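To make the notion of "key measurements for assessing AI reliability" concrete, the sketch below estimates two common reliability measures from a log of failure events. This is a minimal, illustrative example; the function names and the sample data are assumptions for illustration and are not drawn from the DR-AIR repository itself.

```python
# Minimal sketch of two basic reliability measurements computed from
# a log of failure timestamps (hours of operation). Illustrative only;
# not an interface of the DR-AIR repository.

def mtbf(failure_times):
    """Mean time between consecutive failures, given sorted event times."""
    if len(failure_times) < 2:
        raise ValueError("need at least two failure events")
    gaps = [b - a for a, b in zip(failure_times, failure_times[1:])]
    return sum(gaps) / len(gaps)

def failure_rate(failure_times, total_hours):
    """Average failures per hour over the observation window."""
    return len(failure_times) / total_hours

events = [12.0, 30.5, 55.0, 80.5]  # hypothetical failure times in hours
print(mtbf(events))                 # mean gap between failures
print(failure_rate(events, 100.0))  # failures per operating hour
```

Event-time data of this shape is one of the simplest data types a reliability repository can curate; richer formats add covariates such as workload, input distribution, or hardware configuration.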
Related papers
- Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective [111.58315434849047]
The robustness of neural information retrieval (IR) models has garnered significant attention.
We view the robustness of IR to be a multifaceted concept, emphasizing its necessity against adversarial attacks, out-of-distribution (OOD) scenarios and performance variance.
We provide an in-depth discussion of existing methods, datasets, and evaluation metrics, shedding light on challenges and future directions in the era of large language models.
arXiv Detail & Related papers (2024-07-09T16:07:01Z) - AI Data Readiness Inspector (AIDRIN) for Quantitative Assessment of Data Readiness for AI [0.8553254686016967]
"Garbage in Garbage Out" is a universally agreed quote by computer scientists from various domains, including Artificial Intelligence (AI)
There are no standard methods or frameworks for assessing the "readiness" of data for AI.
AIDRIN is a framework covering a broad range of readiness dimensions available in the literature.
arXiv Detail & Related papers (2024-06-27T15:26:39Z) - Generative AI for Secure and Privacy-Preserving Mobile Crowdsensing [74.58071278710896]
Generative AI has attracted much attention from both academic and industrial fields.
Secure and privacy-preserving mobile crowdsensing (SPPMCS) has been widely applied in data collection and acquisition.
arXiv Detail & Related papers (2024-05-17T04:00:58Z) - Open-Source AI-based SE Tools: Opportunities and Challenges of Collaborative Software Learning [23.395624804517034]
Large Language Models (LLMs) have become instrumental in advancing software engineering (SE) tasks.
The collaboration of these AI-based SE models hinges on maximising the sources of high-quality data.
Data, especially high-quality data, often holds commercial or sensitive value, making it less accessible for open-source AI-based SE projects.
arXiv Detail & Related papers (2024-04-09T10:47:02Z) - On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z) - Collect, Measure, Repeat: Reliability Factors for Responsible AI Data Collection [8.12993269922936]
We argue that data collection for AI should be performed in a responsible manner.
We propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics.
arXiv Detail & Related papers (2023-08-22T18:01:27Z) - Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI.
In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals.
We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z) - Data-centric AI: Perspectives and Challenges [51.70828802140165]
Data-centric AI (DCAI) advocates a fundamental shift from model advancements to ensuring data quality and reliability.
We bring together three general missions: training data development, inference data development, and data maintenance.
arXiv Detail & Related papers (2023-01-12T05:28:59Z) - Statistical Perspectives on Reliability of Artificial Intelligence Systems [6.284088451820049]
We provide statistical perspectives on the reliability of AI systems.
We introduce a so-called SMART statistical framework for AI reliability research.
We discuss recent developments in modeling and analysis of AI reliability.
arXiv Detail & Related papers (2021-11-09T20:00:14Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in input data, the inability to explain decisions, and bias in training data are among the most prominent limitations.
We propose a tutorial on Trustworthy AI that addresses six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.