System Cards for AI-Based Decision-Making for Public Policy
- URL: http://arxiv.org/abs/2203.04754v2
- Date: Wed, 31 Aug 2022 19:57:55 GMT
- Title: System Cards for AI-Based Decision-Making for Public Policy
- Authors: Furkan Gursoy and Ioannis A. Kakadiaris
- Abstract summary: This work proposes a system accountability benchmark for formal audits of artificial intelligence-based decision-aiding systems.
It consists of 56 criteria organized within a four-by-four matrix composed of rows focused on (i) data, (ii) model, (iii) code, (iv) system, and columns focused on (a) development, (b) assessment, (c) mitigation, and (d) assurance.
- Score: 5.076419064097733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decisions impacting human lives are increasingly being made or assisted by
automated decision-making algorithms. Many of these algorithms process personal
data for predicting recidivism, analyzing credit risk, identifying individuals
via face recognition, and more. While potentially improving efficiency and
effectiveness, such algorithms are not inherently free from bias, opacity,
lack of explainability, maleficence, and the like. Given that the outcomes of
these algorithms have a significant impact on individuals and society and are
open to analysis and contestation after deployment, such issues must be
accounted for before deployment. Formal audits are a way of ensuring algorithms
meet the appropriate accountability standards. This work, based on an extensive
analysis of the literature and an expert focus group study, proposes a unifying
framework for a system accountability benchmark for formal audits of artificial
intelligence-based decision-aiding systems. This work also proposes system
cards to serve as scorecards presenting the outcomes of such audits. The
benchmark consists of 56 criteria organized within a four-by-four matrix composed of rows
focused on (i) data, (ii) model, (iii) code, (iv) system, and columns focused
on (a) development, (b) assessment, (c) mitigation, and (d) assurance. The
proposed system accountability benchmark reflects state-of-the-art
developments in accountable systems, serves as a checklist for algorithm
audits, and paves the way for subsequent research.
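To make the proposed structure concrete, here is a minimal sketch of a system card as a four-by-four scorecard, assuming each cell of the (data, model, code, system) by (development, assessment, mitigation, assurance) matrix holds pass/fail outcomes for its criteria. The SystemCard class, criterion names, and scoring rule are illustrative assumptions; the paper's actual 56 criteria are not reproduced here.

```python
from dataclasses import dataclass, field

# Hypothetical scorecard structure: row and column names come from the
# abstract, but the criteria and scoring scheme below are assumptions.
ROWS = ("data", "model", "code", "system")
COLUMNS = ("development", "assessment", "mitigation", "assurance")

@dataclass
class SystemCard:
    # Maps (row, column) -> {criterion name: audit outcome (passed or not)}
    cells: dict = field(default_factory=lambda: {(r, c): {} for r in ROWS for c in COLUMNS})

    def record(self, row: str, column: str, criterion: str, passed: bool) -> None:
        """Record the audit outcome for one criterion in one cell."""
        self.cells[(row, column)][criterion] = passed

    def cell_score(self, row: str, column: str) -> float:
        """Fraction of criteria satisfied in a cell (1.0 for an empty cell)."""
        results = self.cells[(row, column)]
        return sum(results.values()) / len(results) if results else 1.0

# Usage: an auditor records outcomes, then reads off the scorecard.
card = SystemCard()
card.record("data", "assessment", "representativeness checked", True)
card.record("model", "mitigation", "bias mitigation applied", False)
for r in ROWS:
    print(r, [round(card.cell_score(r, c), 2) for c in COLUMNS])
```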
Related papers
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Who Audits the Auditors? Recommendations from a field scan of the algorithmic auditing ecosystem [0.971392598996499]
We provide the first comprehensive field scan of the AI audit ecosystem.
We identify emerging best practices as well as methods and tools that are becoming commonplace.
We outline policy recommendations to improve the quality and impact of these audits.
arXiv Detail & Related papers (2023-10-04T01:40:03Z)
- Influence of the algorithm's reliability and transparency in the user's decision-making process [0.0]
We conduct an online empirical study with 61 participants to find out how the change in transparency and reliability of an algorithm could impact users' decision-making process.
The results indicate that people show at least moderate confidence in the algorithm's decisions even when its reliability is low.
arXiv Detail & Related papers (2023-07-13T03:13:49Z)
- Equal Confusion Fairness: Measuring Group-Based Disparities in Automated Decision Systems [5.076419064097733]
This paper proposes a new equal confusion fairness test to check an automated decision system for fairness and a new confusion parity error to quantify the extent of any unfairness.
Overall, the methods and metrics provided here may assess automated decision systems' fairness as part of a more extensive accountability assessment.
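As a rough illustration of the idea above, the sketch below compares row-normalized confusion matrices across groups and reports the largest elementwise gap. The paper's actual test statistic and confusion parity error definition may differ; the function names and toy data here are assumptions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Row-normalized confusion matrix: estimated P(prediction | true class)."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    row_sums = cm.sum(axis=1, keepdims=True)
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)

def confusion_parity_error(y_true, y_pred, groups, n_classes=2):
    """Worst-case elementwise gap between any two groups' confusion matrices."""
    mats = [confusion_matrix(y_true[groups == g], y_pred[groups == g], n_classes)
            for g in np.unique(groups)]
    return max(np.abs(a - b).max() for a in mats for b in mats)

# Usage with toy data: two groups, binary decisions.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(confusion_parity_error(y_true, y_pred, groups))  # larger values mean more unequal confusion
```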
arXiv Detail & Related papers (2023-07-02T04:44:19Z)
- A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
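The pairwise-ordering task above can be made concrete with a short sketch: given a reviewer's self-reported expertise for a set of papers and an algorithm's similarity scores, the error rate is the fraction of paper pairs that the scores order opposite to the reviewer. The function and data below are hypothetical, not the paper's evaluation code.

```python
def pairwise_error_rate(expertise, scores):
    """Fraction of paper pairs where scores invert the reviewer's own ranking."""
    n = len(expertise)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if expertise[i] != expertise[j]]  # skip ties: no ground-truth order
    errors = sum((expertise[i] - expertise[j]) * (scores[i] - scores[j]) < 0
                 for i, j in pairs)
    return errors / len(pairs) if pairs else 0.0

# Usage: self-reported expertise on a 1-5 scale vs. computed similarity scores.
print(pairwise_error_rate([5, 3, 1], [0.9, 0.4, 0.6]))  # one inverted pair -> 1/3
```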
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Causal Fairness Analysis [68.12191782657437]
We introduce a framework for understanding, modeling, and possibly solving issues of fairness in decision-making settings.
The main insight of our approach will be to link the quantification of the disparities present on the observed data with the underlying, and often unobserved, collection of causal mechanisms.
Our effort culminates in the Fairness Map, which is the first systematic attempt to organize and explain the relationship between different criteria found in the literature.
arXiv Detail & Related papers (2022-07-23T01:06:34Z)
- Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance [3.8997087223115634]
We discuss the challenges of third party oversight in the current AI landscape.
We show that the institutional design of such audits is far from monolithic.
We conclude that the turn toward audits alone is unlikely to achieve actual algorithmic accountability.
arXiv Detail & Related papers (2022-06-09T19:18:47Z)
- Towards a multi-stakeholder value-based assessment framework for algorithmic systems [76.79703106646967]
We develop a value-based assessment framework that visualizes closeness and tensions between values.
We give guidelines on how to operationalize them, while opening up the evaluation and deliberation process to a wide range of stakeholders.
arXiv Detail & Related papers (2022-05-09T19:28:32Z)
- Bias in Multimodal AI: Testbed for Fair Automatic Recruitment [73.85525896663371]
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
We train automatic recruitment algorithms using a set of multimodal synthetic profiles consciously scored with gender and racial biases.
Our methodology and results show how to generate fairer AI-based tools in general, and in particular fairer automated recruitment systems.
arXiv Detail & Related papers (2020-04-15T15:58:05Z)
- Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing [8.155332346712424]
We introduce a framework for algorithmic auditing that supports artificial intelligence system development end-to-end.
The proposed auditing framework is intended to close the accountability gap in the development and deployment of large-scale artificial intelligence systems.
arXiv Detail & Related papers (2020-01-03T20:19:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.