Should We Trust (X)AI? Design Dimensions for Structured Experimental
Evaluations
- URL: http://arxiv.org/abs/2009.06433v1
- Date: Mon, 14 Sep 2020 13:40:51 GMT
- Title: Should We Trust (X)AI? Design Dimensions for Structured Experimental
Evaluations
- Authors: Fabian Sperrle, Mennatallah El-Assady, Grace Guo, Duen Horng Chau,
Alex Endert, Daniel Keim
- Abstract summary: This paper systematically derives design dimensions for the structured evaluation of explainable artificial intelligence (XAI) approaches.
They enable a descriptive characterization, facilitating comparisons between different study designs.
They further structure the design space of XAI, converging towards a precise terminology required for a rigorous study of XAI.
- Score: 19.68184991543289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper systematically derives design dimensions for the structured
evaluation of explainable artificial intelligence (XAI) approaches. These
dimensions enable a descriptive characterization, facilitating comparisons
between different study designs. They further structure the design space of
XAI, converging towards a precise terminology required for a rigorous study of
XAI. Our literature review differentiates between comparative studies and
application papers, revealing methodological differences between the fields of
machine learning, human-computer interaction, and visual analytics. Generally,
each of these disciplines targets specific parts of the XAI process. Bridging
the resulting gaps enables a holistic evaluation of XAI in real-world
scenarios, as proposed by our conceptual model characterizing bias sources and
trust-building. Furthermore, we identify and discuss the potential for future
work based on observed research gaps that should lead to better coverage of the
proposed model.
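As a rough illustration of what such a descriptive characterization could enable, here is a minimal Python sketch. The discipline and study-type values are taken directly from the abstract; the record itself and the `xai_process_stage` field are hypothetical placeholders, not the paper's actual dimensions:

```python
from dataclasses import dataclass
from enum import Enum

class Discipline(Enum):
    ML = "machine learning"
    HCI = "human-computer interaction"
    VA = "visual analytics"

class StudyType(Enum):
    COMPARATIVE = "comparative study"
    APPLICATION = "application paper"

@dataclass
class StudyCharacterization:
    """Descriptive record for one XAI evaluation study.

    Discipline and study type come from the abstract; xai_process_stage
    is a hypothetical stand-in for the part of the XAI process a study targets.
    """
    title: str
    discipline: Discipline
    study_type: StudyType
    xai_process_stage: str  # hypothetical free-text field

def differing_dimensions(a: StudyCharacterization, b: StudyCharacterization) -> list[str]:
    """List the dimensions on which two study designs differ."""
    return [
        dim for dim in ("discipline", "study_type", "xai_process_stage")
        if getattr(a, dim) != getattr(b, dim)
    ]
```

Tagging studies this way is what makes the comparisons the abstract mentions mechanical: two designs are comparable exactly where their records agree.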
Related papers
- Dimensions of Generative AI Evaluation Design [51.541816010127256]
We propose a set of general dimensions that capture critical choices involved in GenAI evaluation design.
These dimensions include the evaluation setting, the task type, the input source, the interaction style, the duration, the metric type, and the scoring method.
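To make the seven dimensions concrete, here is a hedged sketch of one design pinned down along each of them; the dimension names follow the abstract, while all example values are invented:

```python
from dataclasses import dataclass

@dataclass
class GenAIEvaluationDesign:
    """One choice per dimension named in the abstract; example values
    in the comments are illustrative, not taken from the paper."""
    evaluation_setting: str   # e.g. "lab study" vs. "deployment"
    task_type: str            # e.g. "summarization"
    input_source: str         # e.g. "curated benchmark" vs. "user-generated"
    interaction_style: str    # e.g. "single-turn" vs. "multi-turn"
    duration: str             # e.g. "one-off" vs. "longitudinal"
    metric_type: str          # e.g. "human rating" vs. "automatic metric"
    scoring_method: str       # e.g. "Likert scale" vs. "pairwise preference"

design = GenAIEvaluationDesign(
    evaluation_setting="lab study",
    task_type="summarization",
    input_source="curated benchmark",
    interaction_style="single-turn",
    duration="one-off",
    metric_type="human rating",
    scoring_method="Likert scale",
)
```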
arXiv Detail & Related papers (2024-11-19T18:25:30Z)
- User-centric evaluation of explainability of AI with and for humans: a comprehensive empirical study [5.775094401949666]
This study is situated in the field of Human-Centered Artificial Intelligence (HCAI).
It focuses on the results of a user-centered assessment of commonly used eXplainable Artificial Intelligence (XAI) algorithms.
arXiv Detail & Related papers (2024-10-21T12:32:39Z)
- SIDU-TXT: An XAI Algorithm for NLP with a Holistic Assessment Approach [14.928572140620245]
The 'Similarity Difference and Uniqueness' (SIDU) XAI method, recognized for its superior capability in localizing entire salient regions in image-based classification, is extended to textual data.
The extended method, SIDU-TXT, utilizes feature activation maps from 'black-box' models to generate heatmaps at a granular, word-based level.
We find that, on the sentiment analysis task of a movie-review dataset, SIDU-TXT excels in both functional and human-grounded evaluations.
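The abstract does not spell out how SIDU-TXT derives its heatmaps beyond using feature activation maps, so the following is only a generic word-level saliency sketch (leave-one-word-out occlusion), not the SIDU algorithm; `predict_prob` and `mask_token` are assumed stand-ins for a black-box classifier:

```python
from typing import Callable, Sequence

def word_saliency(
    predict_prob: Callable[[str], float],  # black-box model: text -> P(positive)
    words: Sequence[str],
    mask_token: str = "[UNK]",
) -> list[float]:
    """Generic occlusion-based word saliency, for illustration only:
    SIDU-TXT itself works from feature activation maps, whereas this
    sketch just shows what a granular, word-level heatmap looks like."""
    base = predict_prob(" ".join(words))
    scores = []
    for i in range(len(words)):
        masked = list(words)
        masked[i] = mask_token
        # Words whose masking changes the prediction most get high saliency.
        scores.append(base - predict_prob(" ".join(masked)))
    return scores
```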
arXiv Detail & Related papers (2024-02-05T14:29:54Z)
- Explainable artificial intelligence approaches for brain-computer interfaces: a review and design space [6.786321327136925]
This review paper provides an integrated perspective of Explainable Artificial Intelligence techniques applied to Brain-Computer Interfaces.
Brain-Computer Interfaces use predictive models to interpret brain signals in various high-stakes applications.
The XAI-for-BCI literature currently lacks such an integrated perspective.
arXiv Detail & Related papers (2023-12-20T13:56:31Z)
- Syntax-Informed Interactive Model for Comprehensive Aspect-Based Sentiment Analysis [0.0]
We introduce an innovative model, the Syntactic Dependency Enhanced Multi-Task Interaction Architecture (SDEMTIA), for comprehensive ABSA.
Our approach exploits syntactic knowledge (dependency relations and types) via a specialized Syntactic Dependency Embedded Interactive Network (SDEIN).
We also incorporate a novel and efficient message-passing mechanism within a multi-task learning framework to bolster learning efficacy.
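For context on the message-passing idea, here is a minimal, generic sketch of neighbor aggregation over dependency arcs; it is not the SDEMTIA/SDEIN architecture (which also embeds dependency types and couples multiple tasks), and every name in it is illustrative:

```python
import numpy as np

def dependency_message_pass(
    word_vecs: np.ndarray,           # (n_words, dim) initial word embeddings
    edges: list[tuple[int, int]],    # dependency arcs as (head, dependent) indices
    rounds: int = 2,
) -> np.ndarray:
    """One generic flavor of message passing over a dependency graph:
    each word repeatedly averages its neighbors' vectors with its own."""
    h = word_vecs.copy()
    n = len(h)
    neighbors: list[list[int]] = [[] for _ in range(n)]
    for head, dep in edges:
        # Treat dependency arcs as undirected for this sketch.
        neighbors[head].append(dep)
        neighbors[dep].append(head)
    for _ in range(rounds):
        h = np.stack([
            (h[i] + sum(h[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
            for i in range(n)
        ])
    return h
```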
arXiv Detail & Related papers (2023-11-28T16:03:22Z)
- Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis [3.231170156689185]
Document AI aims to automatically analyze documents by leveraging natural language processing and computer vision techniques.
One of the major tasks of Document AI is document layout analysis, which structures document pages by interpreting the content and spatial relationships of layout, image, and text.
arXiv Detail & Related papers (2023-08-29T16:58:03Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study of fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, author prestige, and institutional prestige.
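A first-pass disparity check of this kind can be pictured as a group-wise rate comparison; the sketch below is a hypothetical illustration, not the paper's LM-enhanced method, and the record fields are invented:

```python
from collections import defaultdict

def acceptance_rate_by_group(reviews: list[dict], attribute: str) -> dict[str, float]:
    """Acceptance rate per value of a protective attribute.
    Fields ('accepted' plus the chosen attribute) are hypothetical;
    a real analysis would also control for submission quality."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for r in reviews:
        totals[r[attribute]][0] += int(r["accepted"])  # accepted count
        totals[r[attribute]][1] += 1                   # total count
    return {group: acc / n for group, (acc, n) in totals.items()}

# Toy usage with invented records:
rates = acceptance_rate_by_group(
    [{"accepted": True, "author_gender": "F"},
     {"accepted": False, "author_gender": "M"}],
    attribute="author_gender",
)
```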
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Connecting Algorithmic Research and Usage Contexts: A Perspective of Contextualized Evaluation for Explainable AI [65.44737844681256]
A lack of consensus on how to evaluate explainable AI (XAI) hinders the advancement of the field.
We argue that one way to close the gap is to develop evaluation methods that account for different user requirements.
arXiv Detail & Related papers (2022-06-22T05:17:33Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
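In the spirit of such interpretable evaluation, one common ingredient is bucketing test items by an attribute and scoring each bucket separately; the sketch below is a hedged illustration with an invented attribute (entity length), not the paper's tool:

```python
from statistics import mean

def bucketed_accuracy(entities: list[dict]) -> dict[str, float]:
    """Partition test entities by a hypothetical attribute (entity length)
    and report per-bucket accuracy, so model differences can be traced
    to specific data properties rather than a single aggregate score."""
    buckets: dict[str, list[int]] = {"short (1 token)": [], "long (2+ tokens)": []}
    for e in entities:  # each record: {"length": int, "correct": bool}
        key = "short (1 token)" if e["length"] == 1 else "long (2+ tokens)"
        buckets[key].append(int(e["correct"]))
    return {k: mean(v) for k, v in buckets.items() if v}
```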
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning [89.64620296557177]
We propose to incorporate the syntactic structures of the sentences into the deep learning models for targeted opinion word extraction.
We also introduce a novel regularization technique to improve the performance of the deep learning models.
The proposed model is extensively analyzed and achieves the state-of-the-art performance on four benchmark datasets.
arXiv Detail & Related papers (2020-10-26T07:13:17Z)
- A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
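A minimal version of such a saliency-versus-human comparison is a top-k overlap score; the sketch below is a simple stand-in for the paper's diagnostic properties, which use more refined measures:

```python
def rationale_agreement(saliency: list[float], human_mask: list[int], k: int) -> float:
    """Fraction of the k most salient tokens that humans also marked
    as salient (human_mask holds 0/1 per token). Assumes k <= len(saliency)."""
    top_k = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)[:k]
    return sum(human_mask[i] for i in top_k) / k
```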
arXiv Detail & Related papers (2020-09-25T12:01:53Z)