Investigating the Effect of Network Pruning on Performance and Interpretability
- URL: http://arxiv.org/abs/2409.19727v1
- Date: Sun, 29 Sep 2024 14:57:45 GMT
- Title: Investigating the Effect of Network Pruning on Performance and Interpretability
- Authors: Jonathan von Rad, Florian Seuffert
- Abstract summary: We investigate the impact of different pruning techniques on the classification performance and interpretability of GoogLeNet.
We compare different retraining strategies, such as iterative pruning and one-shot pruning.
We find that with sufficient retraining epochs, the performance of the networks can approximate the performance of the default GoogLeNet.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) are often over-parameterized for their tasks and can be compressed quite drastically by removing weights, a process called pruning. We investigate the impact of different pruning techniques on the classification performance and interpretability of GoogLeNet. We systematically apply unstructured and structured pruning, as well as connection sparsity (pruning of input weights) methods to the network and analyze the outcomes regarding the network's performance on the validation set of ImageNet. We also compare different retraining strategies, such as iterative pruning and one-shot pruning. We find that with sufficient retraining epochs, the performance of the networks can approximate the performance of the default GoogLeNet, and even surpass it in some cases. To assess interpretability, we employ the Mechanistic Interpretability Score (MIS) developed by Zimmermann et al. Our experiments reveal that there is no significant relationship between interpretability and pruning rate when using MIS as a measure. Additionally, we observe that networks with extremely low accuracy can still achieve high MIS scores, suggesting that the MIS may not always align with intuitive notions of interpretability, such as understanding the basis of correct decisions.
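The difference between the unstructured and structured pruning named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the magnitude and L1-norm criteria, the function names (`unstructured_prune`, `structured_prune`, `per_round_rate`), and the toy weight matrix are all assumptions made for illustration. The last helper shows one common way to pick a per-round rate for iterative pruning so that several rounds reach the same overall sparsity as a single one-shot pass; retraining between rounds is omitted.

```python
from typing import List

Matrix = List[List[float]]


def unstructured_prune(weights: Matrix, rate: float) -> Matrix:
    """Zero the `rate` fraction of individual weights with the smallest magnitude."""
    # Rank every single weight by absolute value (assumed criterion).
    ranked = sorted(
        (abs(w), i, j) for i, row in enumerate(weights) for j, w in enumerate(row)
    )
    k = int(rate * len(ranked))  # number of single weights to remove
    drop = {(i, j) for _, i, j in ranked[:k]}
    return [
        [0.0 if (i, j) in drop else w for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


def structured_prune(weights: Matrix, rate: float) -> Matrix:
    """Zero the `rate` fraction of whole rows (units) with the smallest L1 norm."""
    # Rank rows by the sum of absolute weights (assumed criterion).
    ranked = sorted((sum(abs(w) for w in row), i) for i, row in enumerate(weights))
    k = int(rate * len(ranked))  # number of whole rows to remove
    drop = {i for _, i in ranked[:k]}
    return [
        [0.0] * len(row) if i in drop else list(row)
        for i, row in enumerate(weights)
    ]


def per_round_rate(target_sparsity: float, rounds: int) -> float:
    """Fraction of the remaining weights to prune per round so that
    `rounds` rounds together reach `target_sparsity`."""
    return 1.0 - (1.0 - target_sparsity) ** (1.0 / rounds)


# Hypothetical 2x3 weight matrix, one row per output unit.
W = [[0.1, -2.0, 0.3], [1.5, -0.05, 0.8]]

print(unstructured_prune(W, 0.5))  # [[0.0, -2.0, 0.0], [1.5, 0.0, 0.8]]
print(structured_prune(W, 0.5))    # [[0.1, -2.0, 0.3], [0.0, 0.0, 0.0]]
print(per_round_rate(0.75, 2))     # 0.5
```

At the same 50% rate, the unstructured variant scatters zeros across both rows, while the structured variant removes one entire unit. Reaching 75% sparsity in two iterative rounds means pruning half of the remaining weights each round.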
Related papers
- Beyond Pruning Criteria: The Dominant Role of Fine-Tuning and Adaptive Ratios in Neural Network Robustness [7.742297876120561]
Deep neural networks (DNNs) excel in tasks like image recognition and natural language processing.
Traditional pruning methods compromise the network's ability to withstand subtle perturbations.
This paper challenges the conventional emphasis on weight importance scoring as the primary determinant of a pruned network's performance.
arXiv Detail & Related papers (2024-10-19T18:35:52Z)
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases the model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Neural Network Pruning by Gradient Descent [7.427858344638741]
We introduce a novel and straightforward neural network pruning framework that incorporates the Gumbel-Softmax technique.
We demonstrate its exceptional compression capability, maintaining high accuracy on the MNIST dataset with only 0.15% of the original network parameters.
We believe our method opens a promising new avenue for deep learning pruning and the creation of interpretable machine learning systems.
arXiv Detail & Related papers (2023-11-21T11:12:03Z)
- Influencer Detection with Dynamic Graph Neural Networks [56.1837101824783]
We investigate different dynamic Graph Neural Networks (GNNs) configurations for influencer detection.
We show that using deep multi-head attention in GNN and encoding temporal attributes significantly improves performance.
arXiv Detail & Related papers (2022-11-15T13:00:25Z)
- Mitigating Performance Saturation in Neural Marked Point Processes: Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time, and that the likelihood ratio loss with interarrival time probability assumptions can greatly improve model performance.
arXiv Detail & Related papers (2021-07-07T16:59:14Z)
- Bridging the Gap Between Target Networks and Functional Regularization [61.051716530459586]
We show that Target Networks act as an implicit regularizer which can be beneficial in some cases, but also have disadvantages.
We propose an explicit Functional Regularization alternative that is flexible and a convex regularizer in function space.
Our findings emphasize that Functional Regularization can be used as a drop-in replacement for Target Networks and result in performance improvement.
arXiv Detail & Related papers (2021-06-04T17:21:07Z)
- Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy [42.15969584135412]
Neural network pruning is a popular technique used to reduce the inference costs of modern networks.
We evaluate whether the use of test accuracy alone in the terminating condition is sufficient to ensure that the resulting model performs well.
We find that pruned networks effectively approximate the unpruned model; however, the prune ratio at which pruned networks achieve commensurate performance varies significantly across tasks.
arXiv Detail & Related papers (2021-03-04T13:22:16Z)
- Cascade Network with Guided Loss and Hybrid Attention for Finding Good Correspondences [33.65360396430535]
Given a putative correspondence set of an image pair, we propose a neural network which finds correct correspondences by a binary-class classifier.
We propose a new Guided Loss that can directly use the evaluation criterion (Fn-measure) as guidance to dynamically adjust the objective function.
We then propose a hybrid attention block to extract features, which integrates Bayesian context normalization (BACN) and channel-wise attention (CA).
arXiv Detail & Related papers (2021-01-31T08:33:20Z)
- Adversarial Training Reduces Information and Improves Transferability [81.59364510580738]
Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.
We show that adversarial training can improve linear transferability to new tasks, which gives rise to a new trade-off between transferability of representations and accuracy on the source task.
arXiv Detail & Related papers (2020-07-22T08:30:16Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)