PADTHAI-MM: A Principled Approach for Designing Trustable,
Human-centered AI systems using the MAST Methodology
- URL: http://arxiv.org/abs/2401.13850v1
- Date: Wed, 24 Jan 2024 23:15:44 GMT
- Title: PADTHAI-MM: A Principled Approach for Designing Trustable,
Human-centered AI systems using the MAST Methodology
- Authors: Nayoung Kim, Myke C. Cohen, Yang Ba, Anna Pan, Shawaiz Bhatti, Pouria
Salehi, James Sung, Erik Blasch, Michelle V. Mancenido, Erin K. Chiou
- Abstract summary: The Multisource AI Scorecard Table (MAST), a checklist rating system, addresses this gap in designing and evaluating AI-enabled decision support systems.
We propose the Principled Approach for Designing Trustable Human-centered AI systems using MAST methodology.
We show that MAST-guided design can improve trust perceptions, and that MAST criteria can be linked to performance, process, and purpose information.
- Score: 5.38932801848643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing for AI trustworthiness is challenging, with a lack of practical
guidance despite extensive literature on trust. The Multisource AI Scorecard
Table (MAST), a checklist rating system, addresses this gap in designing and
evaluating AI-enabled decision support systems. We propose the Principled
Approach for Designing Trustable Human-centered AI systems using MAST
Methodology (PADTHAI-MM), a nine-step framework what we demonstrate through the
iterative design of a text analysis platform called the REporting Assistant for
Defense and Intelligence Tasks (READIT). We designed two versions of READIT,
high-MAST including AI context and explanations, and low-MAST resembling a
"black box" type system. Participant feedback and state-of-the-art AI knowledge
was integrated in the design process, leading to a redesigned prototype tested
by participants in an intelligence reporting task. Results show that
MAST-guided design can improve trust perceptions, and that MAST criteria can be
linked to performance, process, and purpose information, providing a practical
and theory-informed basis for AI system design.
Related papers
- Found in Translation: semantic approaches for enhancing AI interpretability in face verification [0.4222205362654437]
This study extends previous work by integrating semantic concepts into XAI frameworks to bridge the comprehension gap between model outputs and human understanding.
We propose a novel approach combining global and local explanations, using semantic features defined by user-selected facial landmarks.
Results indicate that our semantic-based approach, particularly the most detailed set, offers a more nuanced understanding of model decisions than traditional methods.
arXiv Detail & Related papers (2025-01-06T08:34:53Z) - Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z) - Trustworthy Artificial Intelligence in the Context of Metrology [3.2873782624127834]
We review research at the National Physical Laboratory in the area of trustworthy artificial intelligence (TAI)
We describe three broad themes of TAI: technical, socio-technical and social, which play key roles in ensuring that the developed models are trustworthy and can be relied upon to make responsible decisions.
We discuss three research areas within TAI that we are working on at NPL, and examine the certification of AI systems in terms of adherence to the characteristics of TAI.
arXiv Detail & Related papers (2024-06-14T15:23:27Z) - An In-depth Survey of Large Language Model-based Artificial Intelligence
Agents [11.774961923192478]
We have explored the core differences and characteristics between LLM-based AI agents and traditional AI agents.
We conducted an in-depth analysis of the key components of AI agents, including planning, memory, and tool use.
arXiv Detail & Related papers (2023-09-23T11:25:45Z) - A Study of Situational Reasoning for Traffic Understanding [63.45021731775964]
We devise three novel text-based tasks for situational reasoning in the traffic domain.
We adopt four knowledge-enhanced methods that have shown generalization capability across language reasoning tasks in prior work.
We provide in-depth analyses of model performance on data partitions and examine model predictions categorically.
arXiv Detail & Related papers (2023-06-05T01:01:12Z) - Connecting Algorithmic Research and Usage Contexts: A Perspective of
Contextualized Evaluation for Explainable AI [65.44737844681256]
A lack of consensus on how to evaluate explainable AI (XAI) hinders the advancement of the field.
We argue that one way to close the gap is to develop evaluation methods that account for different user requirements.
arXiv Detail & Related papers (2022-06-22T05:17:33Z) - A Comparative Approach to Explainable Artificial Intelligence Methods in
Application to High-Dimensional Electronic Health Records: Examining the
Usability of XAI [0.0]
XAI aims to produce a demonstrative factor of trust, which for human subjects is achieved through communicative means.
The ideology behind trusting a machine to tend towards the livelihood of a human poses an ethical conundrum.
XAI methods produce visualization of the feature contribution towards a given models output on both a local and global level.
arXiv Detail & Related papers (2021-03-08T18:15:52Z) - Multisource AI Scorecard Table for System Evaluation [3.74397577716445]
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist.
The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system.
arXiv Detail & Related papers (2021-02-08T03:37:40Z) - Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z) - A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.