Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models
- URL: http://arxiv.org/abs/2412.03886v1
- Date: Thu, 05 Dec 2024 05:39:03 GMT
- Title: Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models
- Authors: Swarnava Sinha Roy, Ayan Kundu
- Abstract summary: Integrated Gradients is a well-known technique for explaining deep learning models.
In this paper, we propose a method called Uniform Discretized Integrated Gradients (UDIG).
We evaluate our method on two types of NLP tasks, Sentiment Classification and Question Answering, against three metrics: Log-odds, Comprehensiveness and Sufficiency.
- Abstract: Integrated Gradients is a well-known technique for explaining deep learning models. It calculates feature importance scores with a gradient-based approach, computing gradients of the model output with respect to the input features and accumulating them along a linear path. While this works well for continuous feature spaces, it may not be optimal for discrete spaces such as word embeddings. For interpreting LLMs (Large Language Models), there is a need for a non-linear path whose intermediate points, at which gradients are computed, lie close to actual words in the embedding space. In this paper, we propose a method called Uniform Discretized Integrated Gradients (UDIG), based on a new interpolation strategy that chooses a favorable non-linear path for computing attribution scores suitable for predictive language models. We evaluate our method on two types of NLP tasks, Sentiment Classification and Question Answering, against three metrics: Log-odds, Comprehensiveness and Sufficiency. For sentiment classification, we use the SST2, IMDb and Rotten Tomatoes datasets for benchmarking; for Question Answering, we use a BERT model fine-tuned on the SQuAD dataset. Our approach outperforms existing methods on almost all metrics.
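To make the baseline concrete, here is a minimal sketch of the standard Integrated Gradients computation described in the abstract, assuming a PyTorch classifier whose forward pass accepts an embedded input of shape (seq, dim) batched to (1, seq, dim) and returns class logits; the function and parameter names are illustrative, not taken from the paper.

```python
import torch

def integrated_gradients(model, x, baseline, target, steps=50):
    """Standard IG: average gradients of the target logit along the
    straight line from `baseline` to the input embeddings `x`."""
    alphas = torch.linspace(1.0 / steps, 1.0, steps)  # right-endpoint Riemann sum
    total_grad = torch.zeros_like(x)
    for alpha in alphas:
        point = (baseline + alpha * (x - baseline)).detach()  # point on the linear path
        point.requires_grad_(True)
        score = model(point.unsqueeze(0))[0, target]  # logit of the class being explained
        grad, = torch.autograd.grad(score, point)
        total_grad += grad
    # Attribution = (input - baseline) * average gradient along the path.
    return (x - baseline) * total_grad / steps
```

The paper's objection is to the `baseline + alpha * (x - baseline)` line: the straight-line path passes through regions of embedding space that correspond to no actual word, so gradients are taken at points the language model never sees in practice. UDIG instead chooses a non-linear path whose intermediate points lie close to real word embeddings (a sketch of that discretized-path idea follows the related-papers list below).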
Related papers
- Class Gradient Projection For Continual Learning [99.105266615448]
Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).
We propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than tasks.
arXiv Detail & Related papers (2023-11-25T02:45:56Z) - GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data [9.107782510356989]
We propose a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent.
GRANDE is based on a dense representation of tree ensembles, which allows the use of backpropagation with a straight-through operator.
We demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets.
arXiv Detail & Related papers (2023-09-29T10:49:14Z) - The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z) - Neural Gradient Learning and Optimization for Oriented Point Normal Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently performs global gradient approximation while achieving better accuracy and generalization ability in local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z) - Geometrically Guided Integrated Gradients [0.3867363075280543]
We introduce an interpretability method called "geometrically-guided integrated gradients".
Our method explores the model's dynamic behavior from multiple scaled versions of the input and captures the best possible attribution for each input.
We also propose a "model perturbation" sanity check to complement the traditionally used "model randomization" test.
arXiv Detail & Related papers (2022-06-13T05:05:43Z) - Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial-learning-based method that is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z) - Locally Aggregated Feature Attribution on Natural Language Model Understanding [12.233103741197334]
Locally Aggregated Feature Attribution (LAFA) is a novel gradient-based feature attribution method for NLP models.
Instead of relying on obscure reference tokens, it smooths gradients by aggregating similar reference texts derived from language model embeddings.
For evaluation purposes, we also design experiments on different NLP tasks, including Entity Recognition and Sentiment Analysis, on public datasets.
arXiv Detail & Related papers (2022-04-22T18:59:27Z) - Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text using an external datastore.
We show how to achieve up to a 6x inference speed-up while retaining comparable performance (a sketch of the underlying kNN-LM interpolation follows this list).
arXiv Detail & Related papers (2021-09-09T12:32:28Z) - Discretized Integrated Gradients for Explaining Language Models [43.2877233809206]
Integrated Gradients (IG) is a prominent attribution-based explanation algorithm.
We propose Discretized Integrated Gradients (DIG), which enables effective attribution along non-linear paths (see the discretized-path sketch after this list).
arXiv Detail & Related papers (2021-08-31T07:36:34Z) - Differentiable Segmentation of Sequences [2.1485350418225244]
We build on advances in learning continuous warping functions and propose a novel family of warping functions based on the two-sided power (TSP) distribution.
Our formulation includes the important class of segmented generalized linear models as a special case.
We use our approach to model the spread of COVID-19 with Poisson regression, apply it on a change point detection task, and learn classification models with concept drift.
arXiv Detail & Related papers (2020-06-23T15:51:48Z) - Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution to the semantic segmentation task and propose an improved Laplacian.
The graph reasoning is directly performed in the original feature space organized as a spatial pyramid.
We achieve comparable performance with lower computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)
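As referenced in the Discretized Integrated Gradients entry above, here is a minimal sketch of the discretized-path idea shared by DIG and UDIG: intermediate interpolation points are anchored to real word embeddings rather than left on the straight line. The nearest-neighbour snapping below is an illustrative heuristic, not the exact anchoring strategy of either paper, and it operates on a single token embedding for simplicity.

```python
import torch

def discretized_path(x, baseline, vocab_emb, steps=30):
    """Build a non-linear path from `baseline` to `x` (one token embedding,
    shape (d,)) whose intermediate points are snapped to real words.
    `vocab_emb`: (V, d) matrix of all vocabulary embeddings."""
    path = [baseline]
    for k in range(1, steps):
        proposal = baseline + (k / steps) * (x - baseline)  # linear proposal
        # Snap the proposal to its nearest real word embedding; DIG/UDIG use
        # more careful selection rules to keep the path well-behaved.
        dists = torch.cdist(proposal.unsqueeze(0), vocab_emb).squeeze(0)
        path.append(vocab_emb[dists.argmin()])
    path.append(x)
    return torch.stack(path)  # (steps + 1, d) points at which to take gradients
```

Attribution then accumulates gradients over consecutive points of this piecewise-linear path (a Riemann sum, as in the IG sketch above), so every gradient is evaluated near an actual word in the embedding space.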
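The Efficient Nearest Neighbor Language Models entry builds on the kNN-LM formulation, in which the parametric model's next-token distribution is interpolated with a distribution induced by nearest-neighbour lookups in an external datastore. Here is a minimal sketch of that interpolation, assuming precomputed datastore keys and values; all names and the default hyperparameters are illustrative.

```python
import torch

def knn_lm_probs(lm_logits, query, keys, values, vocab_size,
                 k=8, lam=0.25, temp=1.0):
    """Interpolate a parametric LM with a kNN distribution over a datastore.
    `keys`: (N, d) stored context vectors; `values`: (N,) next-token ids (long).
    `lm_logits`: (vocab_size,) logits for the current step; `query`: (d,)."""
    # kNN distribution: softmax over negative distances to the k nearest keys.
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)  # (N,)
    knn_dists, knn_idx = dists.topk(k, largest=False)
    weights = torch.softmax(-knn_dists / temp, dim=0)         # (k,)
    p_knn = torch.zeros(vocab_size)
    p_knn.index_add_(0, values[knn_idx], weights)  # mass on retrieved tokens
    # Final distribution: lam * p_kNN + (1 - lam) * p_LM.
    return lam * p_knn + (1 - lam) * torch.softmax(lm_logits, dim=0)
```

The listed paper's contribution, as summarized above, is making the retrieval step cheap enough for practical inference while keeping this interpolation intact.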