Accelerating the Global Aggregation of Local Explanations
- URL: http://arxiv.org/abs/2312.07991v3
- Date: Fri, 12 Jan 2024 14:18:57 GMT
- Title: Accelerating the Global Aggregation of Local Explanations
- Authors: Alon Mor, Yonatan Belinkov, Benny Kimelfeld
- Abstract summary: We devise techniques for accelerating the global aggregation of the Anchor algorithm.
We show that for a very mild loss of quality, we are able to accelerate the computation by up to 30$\times$, reducing the computation from hours to minutes.
- Score: 43.787092409977724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Local explanation methods highlight the input tokens that have a considerable
impact on the outcome of classifying the document at hand. For example, the
Anchor algorithm applies a statistical analysis of the sensitivity of the
classifier to changes in the token. Aggregating local explanations over a
dataset provides a global explanation of the model. Such aggregation aims to
detect words with the most impact, giving valuable insights about the model,
like what it has learned in training and which adversarial examples expose its
weaknesses. However, standard aggregation methods bear a high computational
cost: a naïve implementation applies a costly algorithm to each token of each
document, and hence, it is infeasible for a simple user running in the scope of
a short analysis session. We devise techniques for accelerating the global
aggregation of the Anchor algorithm. Specifically, our goal is to compute a set
of top-$k$ words with the highest global impact according to different
aggregation functions. Some of our techniques are lossless and some are lossy.
We show that for a very mild loss of quality, we are able to accelerate the
computation by up to 30$\times$, reducing the computation from hours to
minutes. We also devise and study a probabilistic model that accounts for noise
in the Anchor algorithm and diminishes the bias toward words that are frequent
yet low in impact.
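The pipeline the abstract describes, running a local explainer on every token and aggregating over the corpus, can be sketched as follows. The `local_impact` callable is a hypothetical stand-in for the Anchor call, and mean impact is used as one simple stand-in for the paper's frequency-debiasing model; this is a sketch, not the paper's implementation:

```python
import heapq
from collections import defaultdict
from typing import Callable, Iterable

def global_top_k(
    docs: Iterable[list[str]],                        # each document as a list of tokens
    local_impact: Callable[[list[str], int], float],  # hypothetical local explainer (e.g. Anchor)
    k: int = 10,
) -> list[tuple[str, float]]:
    """Aggregate per-token local impact scores into a global top-k word list."""
    total = defaultdict(float)  # summed impact per word
    count = defaultdict(int)    # occurrences per word
    for tokens in docs:
        for i, tok in enumerate(tokens):
            total[tok] += local_impact(tokens, i)  # one costly explainer call per token
            count[tok] += 1
    # Aggregating by mean rather than sum diminishes the bias toward words
    # that are frequent yet low in impact.
    return heapq.nlargest(k, ((w, total[w] / count[w]) for w in total),
                          key=lambda pair: pair[1])
```

The double loop above is exactly the costly naïve baseline; the paper's lossless and lossy techniques reduce how many of these per-token calls are needed.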
Related papers
- Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning [49.417414031031264]
This paper studies learning fair encoders in a self-supervised learning setting.
All data are unlabeled and only a small portion of them are annotated with sensitive attributes.
arXiv Detail & Related papers (2024-06-09T08:11:12Z)
- A hierarchical decomposition for explaining ML performance discrepancies [6.603088808962966]
Machine learning algorithms can often differ in performance across domains. Understanding $\textit{why}$ their performance differs is crucial for determining what types of interventions are most effective at closing the performance gaps.
We introduce a nonparametric hierarchical framework that provides both aggregate and detailed decompositions for explaining why the performance of an ML algorithm differs across domains without requiring causal knowledge.
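To make the aggregate level of such a decomposition concrete, the sketch below splits an accuracy gap between two domains into a covariate-shift term and a residual conditional term via importance reweighting. The density-ratio weights are assumed given; this is a generic decomposition in the same spirit, not necessarily the paper's estimator:

```python
import numpy as np

def decompose_accuracy_gap(correct_src, correct_tgt, weights_src):
    """Split acc(src) - acc(tgt) into a covariate-shift term and a residual.

    correct_src, correct_tgt: 0/1 arrays of per-example correctness per domain.
    weights_src: assumed density-ratio weights p_tgt(x) / p_src(x) on source examples.
    """
    acc_src = correct_src.mean()
    acc_tgt = correct_tgt.mean()
    # Source accuracy reweighted to the target covariate distribution.
    acc_src_on_tgt_x = np.average(correct_src, weights=weights_src)
    covariate_term = acc_src - acc_src_on_tgt_x    # gap explained by shift in X
    conditional_term = acc_src_on_tgt_x - acc_tgt  # residual gap from P(Y|X) shift
    return covariate_term, conditional_term        # the two terms sum to acc_src - acc_tgt
```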
arXiv Detail & Related papers (2024-02-22T03:41:05Z)
- Controlling Federated Learning for Covertness [15.878313629774269]
A learner aims to minimize a function $f$ by repeatedly querying a distributed oracle that provides noisy gradient evaluations.
At the same time, the learner seeks to hide $\arg\min f$ from a malicious eavesdropper that observes the learner's queries.
This paper considers the problem of \textit{covert} or \textit{learner-private} optimization, where the learner has to dynamically choose between learning and obfuscation.
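A toy sketch of that learning-versus-obfuscation choice: each round, the learner either queries the noisy oracle at its true iterate or at a random decoy, so an eavesdropper watching the query stream gets a blurred view of $\arg\min f$. The fixed mixing probability and Gaussian decoys are illustrative assumptions, not the paper's policy:

```python
import random

def covert_sgd(noisy_grad, x0, steps=1000, lr=0.01, p_learn=0.6, decoy_scale=5.0):
    """Gradient descent that hides its iterates by mixing in decoy queries.

    noisy_grad(x): distributed oracle returning a noisy gradient evaluation at x.
    p_learn: probability of spending a round on learning rather than obfuscation.
    """
    x = x0
    for _ in range(steps):
        if random.random() < p_learn:
            x -= lr * noisy_grad(x)          # learning query: real progress
        else:
            decoy = x + random.gauss(0.0, decoy_scale)
            noisy_grad(decoy)                # obfuscation query: issued only to mislead
    return x
```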
arXiv Detail & Related papers (2023-08-17T07:16:41Z)
- Less is More: Focus Attention for Efficient DETR [23.81282650112188]
We propose Focus-DETR, which focuses attention on more informative tokens for a better trade-off between computation efficiency and model accuracy.
Specifically, we reconstruct the encoder with dual attention, which includes a token scoring mechanism.
Compared with state-of-the-art sparse DETR-like detectors under the same setting, our Focus-DETR attains comparable complexity while achieving 50.4 AP (+2.2) on COCO.
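One plausible reading of the token scoring mechanism, scoring encoder tokens and keeping only the top fraction for attention, is sketched below. The linear scoring head and keep ratio are assumptions for illustration, not Focus-DETR's exact dual-attention design:

```python
import torch
import torch.nn as nn

class TokenScorer(nn.Module):
    """Score tokens and keep only the most informative ones for attention."""
    def __init__(self, dim: int, keep_ratio: float = 0.3):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # hypothetical informativeness head
        self.keep_ratio = keep_ratio

    def forward(self, tokens: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # tokens: (batch, n, dim); returns the kept tokens and their indices.
        s = self.score(tokens).squeeze(-1)             # (batch, n) informativeness scores
        k = max(1, int(tokens.shape[1] * self.keep_ratio))
        idx = s.topk(k, dim=1).indices                 # top-k most informative tokens
        kept = torch.gather(
            tokens, 1, idx.unsqueeze(-1).expand(-1, -1, tokens.shape[2]))
        return kept, idx
```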
arXiv Detail & Related papers (2023-07-24T08:39:11Z)
- Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis [69.07674653828565]
Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language), to mitigate spurious correlations in text classification.
arXiv Detail & Related papers (2023-05-23T03:55:50Z)
- Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition [7.226094340165499]
The named entity recognition (NER) task aims to identify entities in a piece of text that belong to predefined semantic types.
State-of-the-art solutions for flat-entity NER commonly struggle to capture the fine-grained semantic information in the underlying text.
We propose a novel span-based NER framework, namely Global Pointer (GP), that leverages the relative positions through a multiplicative attention mechanism.
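The multiplicative span scoring reduces to projecting token representations into query and key spaces and scoring each candidate span $(i, j)$ as $q_i^\top k_j$. A minimal sketch per entity type, omitting the rotary position embedding the full method uses to inject the relative positions:

```python
import numpy as np

def span_scores(h: np.ndarray, w_q: np.ndarray, w_k: np.ndarray) -> np.ndarray:
    """Score every span (i, j) with i <= j as q_i . k_j (one matrix per entity type).

    h: (n, d) token representations; w_q, w_k: (d, d_head) learned projections.
    """
    q = h @ w_q       # (n, d_head) span-start queries
    k = h @ w_k       # (n, d_head) span-end keys
    scores = q @ k.T  # (n, n); scores[i, j] scores the span from token i to token j
    mask = np.triu(np.ones_like(scores, dtype=bool))  # keep only valid spans i <= j
    return np.where(mask, scores, -np.inf)
```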
arXiv Detail & Related papers (2022-08-05T09:19:46Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Multi-granularity Relabeled Under-sampling Algorithm for Imbalanced Data [15.030895782548576]
Imbalanced classification is one of the most important and challenging problems in data mining and machine learning.
The Tomek-Link sampling algorithm can effectively reduce class overlap in the data, remove majority-class instances that are difficult to distinguish, and improve classification accuracy.
However, the Tomek-Link under-sampling algorithm only considers boundary instances that are globally nearest neighbors of each other and ignores potentially overlapping local instances.
This paper proposes a multi-granularity relabeled under-sampling algorithm (MGRU) that fully considers the local information of the data set.
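For reference, the global Tomek-Link rule that MGRU generalizes is compact in code: a Tomek link is a pair of opposite-class points that are each other's nearest neighbors, and under-sampling drops the majority-class member of each link. A brute-force sketch (not the paper's multi-granularity extension):

```python
import numpy as np

def tomek_link_undersample(X: np.ndarray, y: np.ndarray, majority_label) -> np.ndarray:
    """Return a boolean keep-mask after removing majority points in Tomek links."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nn = dist.argmin(axis=1)        # index of each point's nearest neighbor
    keep = np.ones(len(X), dtype=bool)
    for i, j in enumerate(nn):
        # Tomek link: mutual nearest neighbors carrying different labels.
        if nn[j] == i and y[i] != y[j] and y[i] == majority_label:
            keep[i] = False         # drop the majority-class member of the link
    return keep
```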
arXiv Detail & Related papers (2022-01-11T14:07:55Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Learning a Unified Sample Weighting Network for Object Detection [113.98404690619982]
Region sampling and weighting are critically important to the success of modern region-based object detectors.
We argue that sample weighting should be data-dependent and task-dependent.
We propose a unified sample weighting network to predict a sample's task weights.
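A data-dependent and task-dependent weighting in that spirit can be sketched as a small head that predicts per-sample weights for each task loss. The feature input, two-task setup, and softplus activation are illustrative assumptions rather than the paper's architecture:

```python
import torch
import torch.nn as nn

class SampleWeightNet(nn.Module):
    """Predict per-sample weights for classification and regression losses."""
    def __init__(self, feat_dim: int, n_tasks: int = 2):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                  nn.Linear(64, n_tasks), nn.Softplus())

    def forward(self, feats, cls_loss, reg_loss):
        # feats: (n, feat_dim) per-sample features; losses: (n,) unreduced losses.
        w = self.head(feats)  # (n, 2) data- and task-dependent weights
        return (w[:, 0] * cls_loss + w[:, 1] * reg_loss).mean()
```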
arXiv Detail & Related papers (2020-06-11T16:19:16Z)