Accelerating the Global Aggregation of Local Explanations
- URL: http://arxiv.org/abs/2312.07991v3
- Date: Fri, 12 Jan 2024 14:18:57 GMT
- Title: Accelerating the Global Aggregation of Local Explanations
- Authors: Alon Mor, Yonatan Belinkov, Benny Kimelfeld
- Abstract summary: We devise techniques for accelerating the global aggregation of the Anchor algorithm.
We show that for a very mild loss of quality, we are able to accelerate the computation by up to 30$\times$, reducing the computation from hours to minutes.
- Score: 43.787092409977724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Local explanation methods highlight the input tokens that have a considerable
impact on the outcome of classifying the document at hand. For example, the
Anchor algorithm applies a statistical analysis of the sensitivity of the
classifier to changes in the token. Aggregating local explanations over a
dataset provides a global explanation of the model. Such aggregation aims to
detect words with the most impact, giving valuable insights about the model,
like what it has learned in training and which adversarial examples expose its
weaknesses. However, standard aggregation methods bear a high computational
cost: a naïve implementation applies a costly algorithm to each token of each
document, and hence, it is infeasible for a simple user running in the scope of
a short analysis session. We devise techniques for accelerating the global
aggregation of the Anchor algorithm. Specifically, our goal is to compute a set
of top-$k$ words with the highest global impact according to different
aggregation functions. Some of our techniques are lossless and some are lossy.
We show that for a very mild loss of quality, we are able to accelerate the
computation by up to 30$\times$, reducing the computation from hours to
minutes. We also devise and study a probabilistic model that accounts for noise
in the Anchor algorithm and diminishes the bias toward words that are frequent
yet low in impact.
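The pipeline the abstract describes, running a local explainer on every token and aggregating over the corpus, can be sketched as follows. The `local_impact` callable is a hypothetical stand-in for the Anchor call, and mean impact is used as one simple stand-in for the paper's frequency-debiasing model; this is a sketch, not the paper's implementation:

```python
import heapq
from collections import defaultdict
from typing import Callable, Iterable

def global_top_k(
    docs: Iterable[list[str]],                        # each document as a list of tokens
    local_impact: Callable[[list[str], int], float],  # hypothetical local explainer (e.g. Anchor)
    k: int = 10,
) -> list[tuple[str, float]]:
    """Aggregate per-token local impact scores into a global top-k word list."""
    total = defaultdict(float)  # summed impact per word
    count = defaultdict(int)    # occurrences per word
    for tokens in docs:
        for i, tok in enumerate(tokens):
            total[tok] += local_impact(tokens, i)  # one costly explainer call per token
            count[tok] += 1
    # Aggregating by mean rather than sum diminishes the bias toward words
    # that are frequent yet low in impact.
    return heapq.nlargest(k, ((w, total[w] / count[w]) for w in total),
                          key=lambda pair: pair[1])
```

The double loop above is exactly the costly naïve baseline; the paper's lossless and lossy techniques reduce how many of these per-token calls are needed.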
Related papers
- Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning [49.417414031031264]
This paper studies learning fair encoders in a self-supervised learning setting.
All data are unlabeled and only a small portion of them are annotated with sensitive attributes.
arXiv Detail & Related papers (2024-06-09T08:11:12Z)
- A hierarchical decomposition for explaining ML performance discrepancies [6.603088808962966]
Machine learning algorithms can often differ in performance across domains. Understanding $\textit{why}$ their performance differs is crucial for determining what types of interventions are most effective at closing the performance gaps.
We introduce a nonparametric hierarchical framework that provides both aggregate and detailed decompositions for explaining why the performance of an ML algorithm differs across domains without requiring causal knowledge.
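To make the aggregate level of such a decomposition concrete, the sketch below splits an accuracy gap between two domains into a covariate-shift term and a residual conditional term via importance reweighting. The density-ratio weights are assumed given; this is a generic decomposition in the same spirit, not necessarily the paper's estimator:

```python
import numpy as np

def decompose_accuracy_gap(correct_src, correct_tgt, weights_src):
    """Split acc(src) - acc(tgt) into a covariate-shift term and a residual.

    correct_src, correct_tgt: 0/1 arrays of per-example correctness per domain.
    weights_src: assumed density-ratio weights p_tgt(x) / p_src(x) on source examples.
    """
    acc_src = correct_src.mean()
    acc_tgt = correct_tgt.mean()
    # Source accuracy reweighted to the target covariate distribution.
    acc_src_on_tgt_x = np.average(correct_src, weights=weights_src)
    covariate_term = acc_src - acc_src_on_tgt_x    # gap explained by shift in X
    conditional_term = acc_src_on_tgt_x - acc_tgt  # residual gap from P(Y|X) shift
    return covariate_term, conditional_term        # the two terms sum to acc_src - acc_tgt
```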
arXiv Detail & Related papers (2024-02-22T03:41:05Z)
- Controlling Federated Learning for Covertness [15.878313629774269]
A learner aims to minimize a function $f$ by repeatedly querying a distributed oracle that provides noisy gradient evaluations.
At the same time, the learner seeks to hide $\arg\min f$ from a malicious eavesdropper that observes the learner's queries.
This paper considers the problem of \textit{covert} or \textit{learner-private} optimization, where the learner has to dynamically choose between learning and obfuscation.
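A toy sketch of that learning-versus-obfuscation choice: each round, the learner either queries the noisy oracle at its true iterate or at a random decoy, so an eavesdropper watching the query stream gets a blurred view of $\arg\min f$. The fixed mixing probability and Gaussian decoys are illustrative assumptions, not the paper's policy:

```python
import random

def covert_sgd(noisy_grad, x0, steps=1000, lr=0.01, p_learn=0.6, decoy_scale=5.0):
    """Gradient descent that hides its iterates by mixing in decoy queries.

    noisy_grad(x): distributed oracle returning a noisy gradient evaluation at x.
    p_learn: probability of spending a round on learning rather than obfuscation.
    """
    x = x0
    for _ in range(steps):
        if random.random() < p_learn:
            x -= lr * noisy_grad(x)          # learning query: real progress
        else:
            decoy = x + random.gauss(0.0, decoy_scale)
            noisy_grad(decoy)                # obfuscation query: issued only to mislead
    return x
```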
arXiv Detail & Related papers (2023-08-17T07:16:41Z)
- Less is More: Focus Attention for Efficient DETR [23.81282650112188]
We propose Focus-DETR, which focuses attention on more informative tokens for a better trade-off between computation efficiency and model accuracy.
Specifically, we reconstruct the encoder with dual attention, which includes a token scoring mechanism.
Compared with state-of-the-art sparse DETR-like detectors under the same setting, our Focus-DETR attains comparable complexity while achieving 50.4 AP (+2.2) on COCO.
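One plausible reading of the token scoring mechanism, scoring encoder tokens and keeping only the top fraction for attention, is sketched below. The linear scoring head and keep ratio are assumptions for illustration, not Focus-DETR's exact dual-attention design:

```python
import torch
import torch.nn as nn

class TokenScorer(nn.Module):
    """Score tokens and keep only the most informative ones for attention."""
    def __init__(self, dim: int, keep_ratio: float = 0.3):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # hypothetical informativeness head
        self.keep_ratio = keep_ratio

    def forward(self, tokens: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # tokens: (batch, n, dim); returns the kept tokens and their indices.
        s = self.score(tokens).squeeze(-1)             # (batch, n) informativeness scores
        k = max(1, int(tokens.shape[1] * self.keep_ratio))
        idx = s.topk(k, dim=1).indices                 # top-k most informative tokens
        kept = torch.gather(
            tokens, 1, idx.unsqueeze(-1).expand(-1, -1, tokens.shape[2]))
        return kept, idx
```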
arXiv Detail & Related papers (2023-07-24T08:39:11Z)
- Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis [69.07674653828565]
Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language), to mitigate spurious correlations in text classification.
arXiv Detail & Related papers (2023-05-23T03:55:50Z)
- Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition [7.226094340165499]
The named entity recognition (NER) task aims to identify entities in a piece of text that belong to predefined semantic types.
State-of-the-art solutions for flat-entity NER commonly struggle to capture the fine-grained semantic information in the underlying text.
We propose a novel span-based NER framework, namely Global Pointer (GP), that leverages the relative positions through a multiplicative attention mechanism.
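The multiplicative span scoring reduces to projecting token representations into query and key spaces and scoring each candidate span $(i, j)$ as $q_i^\top k_j$. A minimal sketch per entity type, omitting the rotary position embedding the full method uses to inject the relative positions:

```python
import numpy as np

def span_scores(h: np.ndarray, w_q: np.ndarray, w_k: np.ndarray) -> np.ndarray:
    """Score every span (i, j) with i <= j as q_i . k_j (one matrix per entity type).

    h: (n, d) token representations; w_q, w_k: (d, d_head) learned projections.
    """
    q = h @ w_q       # (n, d_head) span-start queries
    k = h @ w_k       # (n, d_head) span-end keys
    scores = q @ k.T  # (n, n); scores[i, j] scores the span from token i to token j
    mask = np.triu(np.ones_like(scores, dtype=bool))  # keep only valid spans i <= j
    return np.where(mask, scores, -np.inf)
```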
arXiv Detail & Related papers (2022-08-05T09:19:46Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Multi-granularity Relabeled Under-sampling Algorithm for Imbalanced Data [15.030895782548576]
Imbalanced classification is one of the most important and challenging problems in data mining and machine learning.
The Tomek-Link sampling algorithm can effectively reduce class overlap in the data, remove majority-class instances that are difficult to distinguish, and improve classification accuracy.
However, the Tomek-Link under-sampling algorithm only considers boundary instances that are globally nearest neighbors of each other and ignores potentially overlapping local instances.
This paper proposes a multi-granularity relabeled under-sampling algorithm (MGRU) that fully considers the local information of the data set.
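For reference, the global Tomek-Link rule that MGRU generalizes is compact in code: a Tomek link is a pair of opposite-class points that are each other's nearest neighbors, and under-sampling drops the majority-class member of each link. A brute-force sketch (not the paper's multi-granularity extension):

```python
import numpy as np

def tomek_link_undersample(X: np.ndarray, y: np.ndarray, majority_label) -> np.ndarray:
    """Return a boolean keep-mask after removing majority points in Tomek links."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nn = dist.argmin(axis=1)        # index of each point's nearest neighbor
    keep = np.ones(len(X), dtype=bool)
    for i, j in enumerate(nn):
        # Tomek link: mutual nearest neighbors carrying different labels.
        if nn[j] == i and y[i] != y[j] and y[i] == majority_label:
            keep[i] = False         # drop the majority-class member of the link
    return keep
```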
arXiv Detail & Related papers (2022-01-11T14:07:55Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Learning a Unified Sample Weighting Network for Object Detection [113.98404690619982]
Region sampling and weighting are critically important to the success of modern region-based object detectors.
We argue that sample weighting should be data-dependent and task-dependent.
We propose a unified sample weighting network to predict a sample's task weights.
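A data-dependent and task-dependent weighting in that spirit can be sketched as a small head that predicts per-sample weights for each task loss. The feature input, two-task setup, and softplus activation are illustrative assumptions rather than the paper's architecture:

```python
import torch
import torch.nn as nn

class SampleWeightNet(nn.Module):
    """Predict per-sample weights for classification and regression losses."""
    def __init__(self, feat_dim: int, n_tasks: int = 2):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                  nn.Linear(64, n_tasks), nn.Softplus())

    def forward(self, feats, cls_loss, reg_loss):
        # feats: (n, feat_dim) per-sample features; losses: (n,) unreduced losses.
        w = self.head(feats)  # (n, 2) data- and task-dependent weights
        return (w[:, 0] * cls_loss + w[:, 1] * reg_loss).mean()
```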
arXiv Detail & Related papers (2020-06-11T16:19:16Z)