Evaluating the Effectiveness of XAI Techniques for Encoder-Based Language Models
- URL: http://arxiv.org/abs/2501.15374v1
- Date: Sun, 26 Jan 2025 03:08:34 GMT
- Title: Evaluating the Effectiveness of XAI Techniques for Encoder-Based Language Models
- Authors: Melkamu Abay Mersha, Mesay Gemeda Yigezu, Jugal Kalita
- Abstract summary: This study presents a general evaluation framework using four key metrics: Human-reasoning Agreement (HA), Robustness, Consistency, and Contrastivity.
We assess the effectiveness of six explainability techniques from five different XAI categories.
Our findings show that the model simplification-based XAI method (LIME) consistently outperforms the other techniques across multiple metrics and models.
- Abstract: The black-box nature of large language models (LLMs) necessitates the development of eXplainable AI (XAI) techniques for transparency and trustworthiness. However, evaluating these techniques remains a challenge. This study presents a general evaluation framework using four key metrics: Human-reasoning Agreement (HA), Robustness, Consistency, and Contrastivity. We assess the effectiveness of six explainability techniques from five XAI categories: model simplification (LIME), perturbation-based methods (SHAP), gradient-based approaches (InputXGradient, Grad-CAM), Layer-wise Relevance Propagation (LRP), and attention-based methods (Attention Mechanism Visualization, AMV), across five encoder-based language models: TinyBERT, BERTbase, BERTlarge, XLM-R large, and DeBERTa-xlarge, using the IMDB Movie Reviews and Tweet Sentiment Extraction (TSE) datasets. Our findings show that the model simplification-based method (LIME) consistently outperforms the other techniques across multiple metrics and models, excelling in HA (a score of 0.9685 on DeBERTa-xlarge), Robustness, and Consistency as model complexity increases. AMV achieves the best Robustness, with scores as low as 0.0020, and excels in Consistency with near-perfect scores of 0.9999 across all models. For Contrastivity, LRP performs best, particularly on more complex models, with scores up to 0.9371.
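As a concrete illustration of the kind of setup this framework evaluates, the sketch below applies LIME to an encoder-based sentiment classifier and extracts token-level attributions. The checkpoint name and hyperparameters are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the paper's code): LIME attributions for an
# encoder-based sentiment model. Requires the `transformers` and `lime`
# packages; the checkpoint below is an illustrative stand-in for BERTbase.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from lime.lime_text import LimeTextExplainer

MODEL = "textattack/bert-base-uncased-imdb"  # assumed checkpoint, not from the paper
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

def predict_proba(texts):
    """LIME expects a function mapping a list of strings to class probabilities."""
    enc = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance(
    "The plot was thin but the performances were outstanding.",
    predict_proba,
    num_features=8,   # top tokens by attribution
    num_samples=500,  # perturbed samples used to fit the local surrogate
)
print(exp.as_list())  # [(token, weight), ...] token-level attributions
```

Roughly speaking, the resulting (token, weight) pairs are what a metric like HA would score against human rationales, and what Robustness would compare across perturbed inputs.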
Related papers
- IMPACTX: Improving Model Performance by Appropriately predicting CorrecT eXplanations [0.0]
IMPACTX is a novel approach that leverages XAI as a fully automated attention mechanism.
It provides proper feature attribution maps for the model's decisions, without relying on external XAI methods.
arXiv Detail & Related papers (2025-02-17T14:15:20Z)
- A Statistical Framework for Ranking LLM-Based Chatbots [57.59268154690763]
We propose a statistical framework that incorporates key advancements to address specific challenges in pairwise comparison analysis.
First, we introduce a factored tie model that enhances the ability to handle ties, which are common in human-judged comparisons.
Second, we extend the framework to model covariance between competitors, enabling deeper insights into performance relationships and intuitive groupings into performance tiers.
Third, we resolve optimization challenges arising from parameter non-uniqueness by introducing novel constraints.
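As a minimal sketch of the kind of pairwise-comparison model such a framework builds on, the code below fits Bradley-Terry skills with a Rao-Kupper tie parameter by maximum likelihood, pinning one skill to zero to resolve the non-uniqueness noted above. This is a generic illustration on synthetic records, not the authors' factored tie or covariance model.

```python
# Generic Bradley-Terry ranking with a Rao-Kupper tie parameter,
# fit by maximum likelihood with scipy. Synthetic comparison records.
import numpy as np
from scipy.optimize import minimize

# (i, j, tied): chatbot i vs chatbot j; if not tied, i won.
comparisons = [(0, 1, False), (0, 2, False), (1, 2, True), (2, 1, False), (0, 1, True)]
n = 3  # number of chatbots

def neg_log_lik(free):
    s = np.concatenate(([0.0], free[:n - 1]))  # pin skill 0 = 0 (removes non-uniqueness)
    theta = 1.0 + np.exp(free[-1])             # tie parameter theta > 1
    p = np.exp(s)
    nll = 0.0
    for i, j, tied in comparisons:
        d_ij, d_ji = p[i] + theta * p[j], p[j] + theta * p[i]
        if tied:  # Rao-Kupper: P(tie) = (theta^2 - 1) p_i p_j / (d_ij * d_ji)
            nll -= np.log((theta**2 - 1) * p[i] * p[j] / (d_ij * d_ji))
        else:     # P(i beats j) = p_i / (p_i + theta * p_j)
            nll -= np.log(p[i] / d_ij)
    return nll

res = minimize(neg_log_lik, np.zeros(n))  # n-1 free skills + 1 tie parameter
skills = np.concatenate(([0.0], res.x[:n - 1]))
print("skills:", skills, "tie parameter:", 1.0 + np.exp(res.x[-1]))
```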
arXiv Detail & Related papers (2024-12-24T12:54:19Z)
- Linear Discriminant Analysis in Credit Scoring: A Transparent Hybrid Model Approach [9.88281854509076]
We implement Linear Discriminant Analysis (LDA) as a feature-reduction technique, which reduces the model's complexity burden.
Our hybrid model, XG-DNN, outperformed other models with the highest accuracy of 99.45% and a 99% F1 score with LDA.
To interpret model decisions, we apply two explainable AI techniques: LIME (local) and Morris Sensitivity Analysis (global).
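The overall pipeline shape is sketched below on synthetic data: LDA as a supervised feature-reduction step feeding a classifier, followed by LIME for local explanations. The gradient-boosting head is a stand-in assumption; the paper's XG-DNN hybrid and the Morris analysis are not reproduced.

```python
# Minimal sketch: LDA feature reduction -> classifier -> LIME explanations.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=1000, n_features=20, n_informative=8, random_state=0)

# With 2 classes, LDA can project onto at most 1 discriminant component.
pipe = make_pipeline(LinearDiscriminantAnalysis(n_components=1),
                     GradientBoostingClassifier(random_state=0))
pipe.fit(X, y)

explainer = LimeTabularExplainer(X, mode="classification",
                                 feature_names=[f"f{i}" for i in range(20)])
exp = explainer.explain_instance(X[0], pipe.predict_proba, num_features=5)
print(exp.as_list())  # local feature weights for one applicant-like record
```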
arXiv Detail & Related papers (2024-12-05T14:21:18Z)
- Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability [53.51560766150442]
Critical tokens are elements within reasoning trajectories that significantly influence incorrect outcomes.
We present a novel framework for identifying these tokens through rollout sampling.
We show that identifying and replacing critical tokens significantly improves model accuracy.
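A hedged toy of the rollout idea: estimate the success rate of continuations sampled from each prefix of a trajectory, and flag the token whose inclusion causes the sharpest drop. Here sample_continuation is a hypothetical stand-in for actual LLM decoding plus answer checking.

```python
# Toy sketch of rollout sampling for critical-token identification.
import random

trajectory = ["12", "+", "7", "=", "21", "so", "answer", "21"]  # error at "21"

def sample_continuation(prefix):
    """Hypothetical stand-in for an LLM rollout; returns True if the rollout
    reaches the correct answer. Real use: decode from the prefix and check."""
    # Toy behavior: once the wrong partial result "21" is committed, recovery is rare.
    return random.random() < (0.05 if "21" in prefix else 0.8)

def success_rate(prefix, n_rollouts=200):
    return sum(sample_continuation(prefix) for _ in range(n_rollouts)) / n_rollouts

rates = [success_rate(trajectory[:t]) for t in range(len(trajectory) + 1)]
drops = [(trajectory[t], rates[t] - rates[t + 1]) for t in range(len(trajectory))]
critical = max(drops, key=lambda kv: kv[1])
print("per-token success-rate drops:", drops)
print("most critical token:", critical)  # expected: the erroneous '21'
```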
arXiv Detail & Related papers (2024-11-29T18:58:22Z)
- Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark [62.58869921806019]
We propose a task decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset.
We design innovative training strategies to effectively distill GPT-4o's evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6.
Experimental results demonstrate that our distilled open-source MLLM significantly outperforms the current state-of-the-art GPT-4o-base baseline.
arXiv Detail & Related papers (2024-11-23T08:06:06Z)
- Can Large Language Model Predict Employee Attrition? [0.0]
This study compares the predictive accuracy and interpretability of a fine-tuned GPT-3.5 model against traditional machine learning (ML) models.
Our findings show that the fine-tuned GPT-3.5 model outperforms traditional methods with a precision of 0.91, recall of 0.94, and an F1-score of 0.92, while the best traditional model, SVM, achieved an F1-score of 0.82, with Random Forest and XGBoost reaching 0.80.
arXiv Detail & Related papers (2024-11-02T19:50:39Z)
- Enhancing Authorship Attribution through Embedding Fusion: A Novel Approach with Masked and Encoder-Decoder Language Models [0.0]
We propose a novel framework with textual embeddings from Pre-trained Language Models to distinguish AI-generated and human-authored text.
Our approach utilizes Embedding Fusion to integrate semantic information from multiple Language Models, harnessing their complementary strengths to enhance performance.
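A minimal sketch of the fusion step, under assumptions: concatenate sentence embeddings from two pretrained encoders and train a simple head on the fused vectors. The encoder names and the logistic-regression head are illustrative choices, not the paper's configuration.

```python
# Embedding fusion sketch: concatenate embeddings from two encoders,
# then train a lightweight classifier on the fused representation.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoders = [SentenceTransformer("all-MiniLM-L6-v2"),
            SentenceTransformer("all-mpnet-base-v2")]

def fuse(texts):
    # Concatenate each encoder's embedding so complementary features coexist.
    return np.concatenate([enc.encode(texts) for enc in encoders], axis=1)

texts = ["Sample human-written paragraph ...", "Sample AI-generated paragraph ..."]
labels = [0, 1]  # 0 = human-authored, 1 = AI-generated
clf = LogisticRegression(max_iter=1000).fit(fuse(texts), labels)
print(clf.predict(fuse(["A new paragraph to attribute ..."])))
```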
arXiv Detail & Related papers (2024-11-01T07:18:27Z)
- SynthTree: Co-supervised Local Model Synthesis for Explainable Prediction [15.832975722301011]
We propose a novel method to enhance explainability with minimal accuracy loss.
We develop novel methods for estimating the local node models by leveraging AI techniques.
Our findings highlight the critical role that statistical methodologies can play in advancing explainable AI.
arXiv Detail & Related papers (2024-06-16T14:43:01Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15 percentage points.
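As a loose illustration of LP-based selection (not the paper's solver), the sketch below relaxes a budgeted insight-selection problem to a linear program with scipy; the relevance matrix is synthetic.

```python
# Hedged sketch: LP relaxation of selecting a budgeted, coverage-maximizing
# subset of candidate insights. Synthetic scores; NOT QualEval's actual solver.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
relevance = rng.random((6, 4))  # 6 candidate insights x 4 failure categories
budget = 3                      # surface at most 3 insights to the developer

# Maximize total relevance covered => minimize its negation, subject to
# sum(x) <= budget and 0 <= x <= 1 (LP relaxation of subset selection).
res = linprog(c=-relevance.sum(axis=1),
              A_ub=np.ones((1, 6)), b_ub=[budget],
              bounds=[(0, 1)] * 6)
chosen = np.argsort(res.x)[::-1][:budget]
print("selected insight indices:", sorted(chosen))
```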
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Plex: Towards Reliability using Pretrained Large Model Extensions [69.13326436826227]
We develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively.
Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol.
We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples.
arXiv Detail & Related papers (2022-07-15T11:39:37Z)
- Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E³), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
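For reference, a minimal sparse-MoE layer with top-k routing, the building block that E³ combines with ensembling. Toy dimensions and generic routing; this is not the E³ architecture itself.

```python
# Generic sparse mixture-of-experts layer with top-k routing (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=32, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                           nn.Linear(4 * d_model, d_model)) for _ in range(n_experts)])

    def forward(self, x):                                  # x: (batch, d_model)
        gate_logits = self.router(x)
        weights, idx = gate_logits.topk(self.k, dim=-1)    # route each input to k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                         # only k of n_experts run per input
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](x[mask])
        return out

x = torch.randn(5, 32)
print(SparseMoE()(x).shape)  # torch.Size([5, 32])
```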
arXiv Detail & Related papers (2021-10-07T11:58:35Z)