X-Pruner: eXplainable Pruning for Vision Transformers
- URL: http://arxiv.org/abs/2303.04935v2
- Date: Mon, 5 Jun 2023 04:33:07 GMT
- Title: X-Pruner: eXplainable Pruning for Vision Transformers
- Authors: Lu Yu, Wei Xiang
- Abstract summary: Vision transformer models usually suffer from intensive computational costs and heavy memory requirements.
Recent studies have proposed pruning transformers in an unexplainable manner, overlooking the relationship between the model's internal units and the target class.
We propose a novel explainable pruning framework dubbed X-Pruner, which is designed by considering the explainability of the pruning criterion.
- Score: 12.296223124178102
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, vision transformer models have become prominent for a range of tasks. These models, however, usually suffer from intensive computational costs and heavy memory requirements, making them impractical for deployment on edge platforms. Recent studies have proposed pruning transformers in an unexplainable manner, overlooking the relationship between the model's internal units and the target class and thereby leading to inferior performance. To
alleviate this problem, we propose a novel explainable pruning framework dubbed
X-Pruner, which is designed by considering the explainability of the pruning
criterion. Specifically, to measure each prunable unit's contribution to
predicting each target class, a novel explainability-aware mask is proposed and
learned in an end-to-end manner. Then, to preserve the most informative units
and learn the layer-wise pruning rate, we adaptively search the layer-wise
threshold that differentiates between unpruned and pruned units based on their
explainability-aware mask values. To verify and evaluate our method, we apply X-Pruner to representative transformer models including DeiT and the Swin
Transformer. Comprehensive simulation results demonstrate that the proposed
X-Pruner outperforms the state-of-the-art black-box methods with significantly
reduced computational costs and slight performance degradation.
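The abstract describes two mechanisms: a per-class, explainability-aware mask learned end-to-end, and an adaptively searched layer-wise threshold separating kept units from pruned ones. The PyTorch sketch below is a minimal illustration of how such a gated mask could look; the class name, tensor shapes, and sigmoid gating are assumptions made here for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ExplainabilityMask(nn.Module):
    """Illustrative per-class mask over a layer's prunable units (e.g., heads)."""

    def __init__(self, num_classes: int, num_units: int):
        super().__init__()
        # One learnable mask value per (class, unit) pair, trained end-to-end.
        self.mask = nn.Parameter(torch.ones(num_classes, num_units))
        # Learnable layer-wise threshold separating kept and pruned units.
        self.threshold = nn.Parameter(torch.zeros(1))

    def forward(self, unit_outputs: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # unit_outputs: (batch, num_units, dim); labels: (batch,)
        class_mask = self.mask[labels]                     # (batch, num_units)
        # Differentiable gate: units whose mask value falls below the
        # layer threshold are softly suppressed during training.
        gate = torch.sigmoid(class_mask - self.threshold)
        return unit_outputs * gate.unsqueeze(-1)
```

After training, units whose mask values sit below the searched threshold across all classes become removal candidates, which is what yields a per-layer pruning rate instead of a single hand-set global ratio.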
Related papers
- Convexity-based Pruning of Speech Representation Models [1.3873323883842132]
Recent work has shown that there is significant redundancy in transformer models for NLP.
In this paper, we investigate layer pruning in audio models.
We find a massive reduction in computational effort with no loss of performance, and in certain cases even improvements.
arXiv Detail & Related papers (2024-08-16T09:04:54Z)
- Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach [87.8330887605381]
We show how to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters.
We synthesize a task-specific query with a learnable and lightweight module, which is independent of the pre-trained model.
Our method achieves state-of-the-art performance under memory constraints, showcasing its applicability in real-world situations.
arXiv Detail & Related papers (2024-07-09T15:45:04Z)
- Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis [63.66763657191476]
We show that efficient numerical training and inference algorithms such as low-rank computation achieve impressive performance for learning Transformer-based adaptation.
We analyze how magnitude-based pruning affects generalization while improving adaptation.
We conclude that proper magnitude-based pruning has only a slight effect on testing performance.
arXiv Detail & Related papers (2024-06-24T23:00:58Z)
- ExpPoint-MAE: Better interpretability and performance for self-supervised point cloud transformers [7.725095281624494]
We evaluate the effectiveness of Masked Autoencoding as a pretraining scheme, and explore Momentum Contrast as an alternative.
We observe that the transformer learns to attend to semantically meaningful regions, indicating that pretraining leads to a better understanding of the underlying geometry.
arXiv Detail & Related papers (2023-06-19T09:38:21Z)
- Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers [29.319666323947708]
We present a novel approach that dynamically prunes contextual information while preserving the model's expressiveness.
Our method employs a learnable mechanism that determines which uninformative tokens can be dropped from the context.
Our reference implementation achieves up to a $2\times$ increase in inference throughput and even greater memory savings.
arXiv Detail & Related papers (2023-05-25T07:39:41Z)
- VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods to make local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet is able both to generate predictions and to generate counterfactual explanations without having to solve another minimisation problem.
arXiv Detail & Related papers (2022-12-21T08:45:32Z)
- Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps [85.49020931411825]
Compression of Convolutional Neural Networks (CNNs) is crucial to deploying these models on edge devices with limited resources.
We propose to address the channel pruning problem from a novel perspective by leveraging the interpretations of a model to steer the pruning process.
We tackle this challenge by introducing a selector model that predicts real-time smooth saliency masks for pruned models.
arXiv Detail & Related papers (2022-09-07T01:12:11Z)
- PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance [114.1541203743303]
We propose PLATON, which captures the uncertainty of importance scores via an upper confidence bound (UCB) on importance estimation; a minimal sketch of this scoring idea appears after this list.
We conduct extensive experiments with several Transformer-based models on natural language understanding, question answering and image classification.
arXiv Detail & Related papers (2022-06-25T05:38:39Z)
- IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers [81.31885548824926]
The self-attention-based transformer model has recently become the leading backbone in the field of computer vision.
We present an Interpretability-Aware REDundancy REDuction framework (IA-RED$^2$).
We include extensive experiments on both image and video tasks, where our method can deliver up to a 1.4$\times$ speed-up.
arXiv Detail & Related papers (2021-06-23T18:29:23Z)
- A Modified Perturbed Sampling Method for Local Interpretable Model-agnostic Explanation [35.281127405430674]
Local Interpretable Model-agnostic Explanation (LIME) is a technique that explains the predictions of any classifier faithfully.
This paper proposes a novel Modified Perturbed Sampling operation for LIME (MPS-LIME).
In image classification, MPS-LIME converts the superpixel image into an undirected graph.
arXiv Detail & Related papers (2020-02-18T09:03:10Z)
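The PLATON entry above scores weights by an upper confidence bound on importance. The sketch below illustrates that general idea only; the sensitivity measure, the exponential-moving-average smoothing, and the product combination are assumptions made here for illustration, not necessarily the paper's exact update rules.

```python
import torch

def ucb_importance(weight: torch.Tensor, grad: torch.Tensor,
                   imp_ema: torch.Tensor, unc_ema: torch.Tensor,
                   beta1: float = 0.85, beta2: float = 0.95):
    """UCB-style pruning score: smoothed importance combined with its uncertainty."""
    imp = (weight * grad).abs()                    # instantaneous sensitivity
    imp_ema = beta1 * imp_ema + (1 - beta1) * imp  # smoothed importance
    unc = (imp - imp_ema).abs()                    # deviation from the mean = uncertainty
    unc_ema = beta2 * unc_ema + (1 - beta2) * unc  # smoothed uncertainty
    score = imp_ema * unc_ema                      # low score = pruning candidate
    return score, imp_ema, unc_ema
```

Keeping the uncertainty term in the score prevents weights whose importance is merely noisy at a single training step from being pruned prematurely.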
This list is automatically generated from the titles and abstracts of the papers on this site.