PADTHAI-MM: Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology
- URL: http://arxiv.org/abs/2401.13850v2
- Date: Wed, 22 Jan 2025 20:52:56 GMT
- Title: PADTHAI-MM: Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology
- Authors: Myke C. Cohen, Nayoung Kim, Yang Ba, Anna Pan, Shawaiz Bhatti, Pouria Salehi, James Sung, Erik Blasch, Michelle V. Mancenido, Erin K. Chiou
- Abstract summary: The Multisource AI Scorecard Table (MAST) was designed to bridge the gap by offering a systematic, tradecraft-centered approach to evaluating AI-enabled decision support systems.
We introduce an iterative design framework called \textit{Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology} (PADTHAI-MM).
We demonstrate this framework in our development of the Reporting Assistant for Defense and Intelligence Tasks (READIT).
- Score: 5.215782336985273
- License:
- Abstract: Despite an extensive body of literature on trust in technology, designing trustworthy AI systems for high-stakes decision domains remains a significant challenge, further compounded by the lack of actionable design and evaluation tools. The Multisource AI Scorecard Table (MAST) was designed to bridge this gap by offering a systematic, tradecraft-centered approach to evaluating AI-enabled decision support systems. Expanding on MAST, we introduce an iterative design framework called \textit{Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology} (PADTHAI-MM). We demonstrate this framework in our development of the Reporting Assistant for Defense and Intelligence Tasks (READIT), a research platform that leverages data visualizations and natural language processing-based text analysis, emulating an AI-enabled system supporting intelligence reporting work. To empirically assess the efficacy of MAST on trust in AI, we developed two distinct iterations of READIT for comparison: a High-MAST version, which incorporates AI contextual information and explanations, and a Low-MAST version, akin to a ``black box'' system. This iterative design process, guided by stakeholder feedback and contemporary AI architectures, culminated in a prototype that was evaluated through its use in an intelligence reporting task. We further discuss the potential benefits of employing the MAST-inspired design framework to address context-specific needs. We also explore the relationship between stakeholder evaluators' MAST ratings and three categories of information known to impact trust: \textit{process}, \textit{purpose}, and \textit{performance}. Overall, our study supports the practical benefits and theoretical validity of PADTHAI-MM as a viable method for designing trustable, context-specific AI systems.
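To make the scorecard idea concrete, the sketch below aggregates hypothetical stakeholder MAST-style ratings for the two READIT versions. The criterion names (drawn loosely from the ICD 203 tradecraft themes MAST is based on), the 1-3 rating scale, and the aggregation by simple means are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: aggregate hypothetical stakeholder ratings for two
# prototype versions (High-MAST vs. Low-MAST READIT). Criterion names and the
# 1-3 scale are assumptions inspired by ICD 203 tradecraft themes, not the
# paper's actual rubric or data.
from statistics import mean

# Hypothetical 1-3 ratings (1 = low, 3 = high adherence) from three evaluators.
ratings = {
    "high_mast": {
        "sourcing": [3, 3, 2],
        "uncertainty": [3, 2, 3],
        "logical_argumentation": [3, 3, 3],
        "visualization": [2, 3, 3],
    },
    "low_mast": {
        "sourcing": [1, 2, 1],
        "uncertainty": [1, 1, 2],
        "logical_argumentation": [2, 1, 1],
        "visualization": [1, 1, 1],
    },
}

def mast_profile(version: str) -> dict[str, float]:
    """Return the mean rating per criterion for one prototype version."""
    return {criterion: mean(scores) for criterion, scores in ratings[version].items()}

for version in ("high_mast", "low_mast"):
    profile = mast_profile(version)
    overall = mean(profile.values())
    print(f"{version}: overall={overall:.2f}, per-criterion={profile}")
```

The study itself relates stakeholder MAST ratings to process, purpose, and performance trust information; since the listing does not specify how those ratings were collected or analyzed, the aggregation above is only a schematic stand-in.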
Related papers
- Found in Translation: semantic approaches for enhancing AI interpretability in face verification [0.4222205362654437]
This study extends previous work by integrating semantic concepts into XAI frameworks to bridge the comprehension gap between model outputs and human understanding.
We propose a novel approach combining global and local explanations, using semantic features defined by user-selected facial landmarks.
Results indicate that our semantic-based approach, particularly the most detailed set, offers a more nuanced understanding of model decisions than traditional methods.
arXiv Detail & Related papers (2025-01-06T08:34:53Z)
- Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Trustworthy Artificial Intelligence in the Context of Metrology [3.2873782624127834]
We review research at the National Physical Laboratory in the area of trustworthy artificial intelligence (TAI).
We describe three broad themes of TAI: technical, socio-technical and social, which play key roles in ensuring that the developed models are trustworthy and can be relied upon to make responsible decisions.
We discuss three research areas within TAI that we are working on at NPL, and examine the certification of AI systems in terms of adherence to the characteristics of TAI.
arXiv Detail & Related papers (2024-06-14T15:23:27Z)
- An In-depth Survey of Large Language Model-based Artificial Intelligence Agents [11.774961923192478]
We have explored the core differences and characteristics between LLM-based AI agents and traditional AI agents.
We conducted an in-depth analysis of the key components of AI agents, including planning, memory, and tool use.
arXiv Detail & Related papers (2023-09-23T11:25:45Z)
- A Study of Situational Reasoning for Traffic Understanding [63.45021731775964]
We devise three novel text-based tasks for situational reasoning in the traffic domain.
We adopt four knowledge-enhanced methods that have shown generalization capability across language reasoning tasks in prior work.
We provide in-depth analyses of model performance on data partitions and examine model predictions categorically.
arXiv Detail & Related papers (2023-06-05T01:01:12Z)
- Connecting Algorithmic Research and Usage Contexts: A Perspective of Contextualized Evaluation for Explainable AI [65.44737844681256]
A lack of consensus on how to evaluate explainable AI (XAI) hinders the advancement of the field.
We argue that one way to close the gap is to develop evaluation methods that account for different user requirements.
arXiv Detail & Related papers (2022-06-22T05:17:33Z)
- A Comparative Approach to Explainable Artificial Intelligence Methods in Application to High-Dimensional Electronic Health Records: Examining the Usability of XAI [0.0]
XAI aims to produce a demonstrable basis for trust, which for human subjects is achieved through communicative means.
Trusting a machine to act in the interest of human well-being poses an ethical conundrum.
XAI methods produce visualizations of feature contributions to a given model's output at both a local and global level.
arXiv Detail & Related papers (2021-03-08T18:15:52Z)
- Multisource AI Scorecard Table for System Evaluation [3.74397577716445]
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist.
The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system.
arXiv Detail & Related papers (2021-02-08T03:37:40Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of the structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)