Exploring the Lottery Ticket Hypothesis with Explainability Methods:
Insights into Sparse Network Performance
- URL: http://arxiv.org/abs/2307.13698v1
- Date: Fri, 7 Jul 2023 18:33:52 GMT
- Title: Exploring the Lottery Ticket Hypothesis with Explainability Methods:
Insights into Sparse Network Performance
- Authors: Shantanu Ghosh, Kayhan Batmanghelich
- Abstract summary: The Lottery Ticket Hypothesis (LTH) finds a subnetwork within a deep network whose performance is comparable or superior to that of the original model.
In this work, we examine why the performance of the pruned networks gradually increases or decreases.
- Score: 13.773050123620592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discovering a high-performing sparse network within a massive neural network
is advantageous for deploying such models on devices with limited storage, such as
mobile phones. Additionally, model explainability is essential to fostering
trust in AI. The Lottery Ticket Hypothesis (LTH) finds a network within a deep
network with comparable or superior performance to the original model. However,
little work has examined the success or failure of LTH in terms of
explainability. In this work, we examine why the performance of the pruned
networks gradually increases or decreases. Using Grad-CAM and Post-hoc concept
bottleneck models (PCBMs), respectively, we investigate the explainability of
pruned networks in terms of pixels and high-level concepts. We perform
extensive experiments across vision and medical imaging datasets. As more
weights are pruned, the performance of the network degrades. The discovered
concepts and pixels from the pruned networks are inconsistent with those of the
original network, which is a possible reason for the drop in performance.
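To make the experimental setup concrete, here is a minimal sketch, assuming a PyTorch image classifier: it uses standard global magnitude pruning to obtain LTH-style sparse subnetworks and compares Grad-CAM heatmaps of the dense and pruned models on the same image. The helper names (`magnitude_prune`, `gradcam_heatmap`, `lth_gradcam_study`), the 20% per-round pruning ratio, and the MSE-based drift score are illustrative placeholders rather than the authors' pipeline; the concept-level (PCBM) analysis would follow the same compare-against-the-dense-model pattern.

```python
# Minimal sketch (assumed PyTorch setup): iterative magnitude pruning in the
# spirit of LTH, plus a Grad-CAM consistency check between the dense and the
# pruned network. Helper names, the per-round pruning ratio, and the MSE
# "drift" score are illustrative, not the authors' exact pipeline.
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune


def magnitude_prune(model, amount=0.2):
    """Globally remove the smallest-magnitude conv/linear weights."""
    params = [(m, "weight") for m in model.modules()
              if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear))]
    prune.global_unstructured(params, pruning_method=prune.L1Unstructured,
                              amount=amount)
    return model


def gradcam_heatmap(model, image, target_layer, class_idx):
    """Plain Grad-CAM: weight the target layer's activations by the
    spatially averaged gradients of the chosen class score."""
    acts, grads = {}, {}
    fwd = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    bwd = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    score = model(image.unsqueeze(0))[0, class_idx]
    model.zero_grad()
    score.backward()
    fwd.remove()
    bwd.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)        # GAP over gradients
    cam = F.relu((weights * acts["a"]).sum(dim=1)).squeeze(0)   # weighted activation maps
    return (cam / (cam.max() + 1e-8)).detach()


def lth_gradcam_study(model, image, layer_name, class_idx, rounds=5):
    """Prune iteratively and track how far the explanation drifts from the
    dense network's heatmap (the pixel-level consistency question)."""
    layer = dict(model.named_modules())[layer_name]
    reference = gradcam_heatmap(model, image, layer, class_idx)  # dense model
    for r in range(rounds):
        magnitude_prune(model, amount=0.2)
        # A full LTH pipeline would rewind the surviving weights and retrain here.
        cam = gradcam_heatmap(model, image, layer, class_idx)
        drift = F.mse_loss(cam, reference).item()
        print(f"round {r}: heatmap drift vs. dense model = {drift:.4f}")
```

In an LTH pipeline, each pruning round is followed by weight rewinding and retraining, and the paper's pixel- and concept-level comparisons are aggregated over whole vision and medical imaging datasets rather than a single image; the sketch only illustrates where those comparisons plug in.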
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes model learning by paying closer attention to training samples with large differences in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Impact of Disentanglement on Pruning Neural Networks [16.077795265753917]
Disentangled latent representations produced by variational autoencoder (VAE) networks are a promising approach for achieving model compression.
We make use of the Beta-VAE framework combined with a standard criterion for pruning to investigate the impact of forcing the network to learn disentangled representations.
arXiv Detail & Related papers (2023-07-19T13:58:01Z)
- Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions [3.04585143845864]
We show that as networks get deeper, they become more susceptible to degeneracy.
We use a simple algorithm that can accurately predict the level of degeneracy for any given fully connected ReLU network architecture.
arXiv Detail & Related papers (2023-06-02T13:02:52Z)
- The Lottery Ticket Hypothesis for Self-attention in Convolutional Neural Network [69.54809052377189]
Recently, many plug-and-play self-attention modules (SAMs) have been proposed to enhance model generalization by exploiting the internal information of deep convolutional neural networks (CNNs).
We empirically find and verify some counterintuitive phenomena: (a) connecting the SAMs to all the blocks may not always bring the largest performance boost, and connecting them to only some blocks can be even better; (b) adding SAMs to a CNN may not always bring a performance boost, and may instead even harm the performance of the original CNN backbone.
arXiv Detail & Related papers (2022-07-16T07:08:59Z)
- Self-Compression in Bayesian Neural Networks [0.9176056742068814]
We propose a new insight into network compression through the Bayesian framework.
We show that Bayesian neural networks automatically discover redundancy in model parameters, thus enabling self-compression.
Our experimental results show that the network architecture can be successfully compressed by deleting parameters identified by the network itself.
arXiv Detail & Related papers (2021-11-10T21:19:40Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- Leveraging Sparse Linear Layers for Debuggable Deep Networks [86.94586860037049]
We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks (a minimal sketch of this idea appears after this list).
The resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.
arXiv Detail & Related papers (2021-05-11T08:15:25Z)
- Prior knowledge distillation based on financial time series [0.8756822885568589]
We propose to use neural networks to represent indicators and train a large network constructed of smaller networks as feature layers.
In numerical experiments, we find that our algorithm is faster and more accurate than traditional methods on real financial datasets.
arXiv Detail & Related papers (2020-06-16T15:26:06Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
- Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection [21.48875255723581]
A mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression.
Experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results.
arXiv Detail & Related papers (2019-12-29T14:11:33Z)
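For the "Leveraging Sparse Linear Layers for Debuggable Deep Networks" entry above, the core idea can be illustrated with a short sketch, assuming features extracted from a frozen backbone and an L1-regularized logistic-regression probe from scikit-learn; the function names and the regularization strength are placeholders, not that paper's implementation.

```python
# Minimal sketch (assumed setup): fit an L1-regularized linear probe on frozen
# deep features and inspect which features each class actually relies on.
# The backbone, the feature matrix, and the regularization strength are
# placeholders for whatever the surrounding pipeline provides.
import numpy as np
from sklearn.linear_model import LogisticRegression


def fit_sparse_probe(features, labels, l1_strength=0.1):
    """features: (n_samples, n_features) activations from a frozen backbone."""
    probe = LogisticRegression(penalty="l1", solver="saga",
                               C=1.0 / l1_strength, max_iter=2000)
    probe.fit(features, labels)
    return probe


def top_features(probe, class_idx, k=10):
    """The handful of feature directions the sparse probe uses for one class
    (multiclass coef_ layout assumed); inspecting them is what makes the
    predictions easier to debug."""
    w = probe.coef_[class_idx]
    order = np.argsort(-np.abs(w))[:k]
    return [(int(i), float(w[i])) for i in order if w[i] != 0.0]
```

Because most coefficients are driven to zero, each prediction depends on only a few feature directions, which is what makes spurious correlations and biases easier to spot.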