Related papers: Composition of Saliency Metrics for Channel Pruning with a Myopic Oracle

URL: http://arxiv.org/abs/2004.03376v2
Date: Thu, 24 Jun 2021 12:56:50 GMT
Title: Composition of Saliency Metrics for Channel Pruning with a Myopic Oracle
Authors: Kaveena Persand, Andrew Anderson, David Gregg
Abstract summary: Pruning is guided by a pruning saliency, which approximates the change in the loss function associated with the removal of specific weights. Many pruning signals have been proposed, but the performance of each depends on the particular trained network. We propose a method to compose several primitive pruning saliencies, to exploit the cases where each saliency measure does well.
Score: 0.8043754868448141
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The computation and memory needed for Convolutional Neural Network (CNN) inference can be reduced by pruning weights from the trained network. Pruning is guided by a pruning saliency, which heuristically approximates the change in the loss function associated with the removal of specific weights. Many pruning signals have been proposed, but the performance of each heuristic depends on the particular trained network. This leaves the data scientist with a difficult choice. When using any one saliency metric for the entire pruning process, we run the risk of the metric assumptions being invalidated, leading to poor decisions being made by the metric. Ideally we could combine the best aspects of different saliency metrics. However, despite an extensive literature review, we are unable to find any prior work on composing different saliency metrics. The chief difficulty lies in combining the numerical output of different saliency metrics, which are not directly comparable. We propose a method to compose several primitive pruning saliencies, to exploit the cases where each saliency measure does well. Our experiments show that the composition of saliencies avoids many poor pruning choices identified by individual saliencies. In most cases our method finds better selections than even the best individual pruning saliency.
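Illustrative sketch: the abstract does not spell out how the saliencies are composed, so the following is only one plausible reading of a "myopic oracle", not the authors' implementation. It assumes that each primitive saliency nominates the channel it scores lowest, and that the oracle then measures the loss after removing each nominee and prunes the one that hurts least. This sidesteps the problem that raw saliency scores are not directly comparable, since only measured losses are compared. All names (`l1_saliency`, `taylor_saliency`, `myopic_oracle_step`, `prune`, `toy_loss`) and the toy numbers are hypothetical.

```python
from typing import Callable, List, Sequence, Set

# A saliency metric maps per-channel statistics to one score per channel.
# A lower score means "cheaper to prune" according to that metric.
SaliencyMetric = Callable[[Sequence[float], Sequence[float]], List[float]]
LossFn = Callable[[Set[int]], float]


def l1_saliency(weights: Sequence[float], grads: Sequence[float]) -> List[float]:
    """Magnitude-style saliency: channels with small weights look unimportant."""
    return [abs(w) for w in weights]


def taylor_saliency(weights: Sequence[float], grads: Sequence[float]) -> List[float]:
    """First-order Taylor-style saliency: |weight * gradient| per channel."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def myopic_oracle_step(
    weights: Sequence[float],
    grads: Sequence[float],
    alive: Set[int],
    metrics: Sequence[SaliencyMetric],
    loss_fn: LossFn,
) -> int:
    """Pick one channel to prune by looking a single step ahead."""
    nominees = set()
    for metric in metrics:
        scores = metric(weights, grads)
        # Each metric nominates the remaining channel it scores lowest.
        nominees.add(min(sorted(alive), key=lambda c: scores[c]))
    # Oracle: evaluate the loss after removing each nominee, keep the cheapest.
    return min(sorted(nominees), key=lambda c: loss_fn(alive - {c}))


def prune(
    weights: Sequence[float],
    grads: Sequence[float],
    metrics: Sequence[SaliencyMetric],
    loss_fn: LossFn,
    n_remove: int,
) -> List[int]:
    """Greedily remove n_remove channels; return them in removal order."""
    alive = set(range(len(weights)))
    removed = []
    for _ in range(n_remove):
        victim = myopic_oracle_step(weights, grads, alive, metrics, loss_fn)
        alive.remove(victim)
        removed.append(victim)
    return removed


if __name__ == "__main__":
    # Toy data (assumed): channel 0 has a tiny weight but matters a lot.
    weights = [0.1, 0.9, 0.5, 0.3]
    grads = [2.0, 0.01, 0.5, 1.0]
    importance = [0.8, 0.05, 0.3, 0.4]

    def toy_loss(alive: Set[int]) -> float:
        # Stand-in for a validation loss: cost of every removed channel.
        return sum(v for c, v in enumerate(importance) if c not in alive)

    print(prune(weights, grads, [l1_saliency, taylor_saliency], toy_loss, 2))
```

In this toy setting the magnitude metric alone would remove channel 0 first, which is the costliest choice; the oracle instead follows the Taylor-style nominee. In a real network the loss evaluation would be a forward pass over held-out data, which is the main cost of this kind of lookahead.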

Related papers

Revisiting Large Language Model Pruning using Neuron Semantic Attribution [63.62836612864512]
We conduct evaluations on 24 datasets and 4 tasks using popular pruning methods. Surprisingly, we find a significant performance drop for existing pruning methods on sentiment classification tasks. We propose Neuron Semantic Attribution, which learns to associate each neuron with specific semantics.
arXiv Detail & Related papers (2025-03-03T13:52:17Z)
Identifying General Mechanism Shifts in Linear Causal Representations [58.6238439611389]
We consider the linear causal representation learning setting where we observe a linear mixing of $d$ unknown latent factors. Recent work has shown that it is possible to recover the latent factors as well as the underlying structural causal model over them. We provide a surprising identifiability result that it is indeed possible, under some very mild standard assumptions, to identify the set of shifted nodes.
arXiv Detail & Related papers (2024-10-31T15:56:50Z)
Graspness Discovery in Clutters for Fast and Accurate Grasp Detection [57.81325062171676]
"Graspness" is a quality based on geometric cues that distinguishes graspable areas in cluttered scenes. We develop a neural network named cascaded graspness model to approximate the searching process. Experiments on a large-scale benchmark, GraspNet-1Billion, show that our method outperforms prior art by a large margin.
arXiv Detail & Related papers (2024-06-17T02:06:47Z)
QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input [17.017127559393398]
We propose a differentiable soft quantizer, which better simulates the gradient of the round function during backpropagation. This enables the network to learn from subtle input perturbations. We further refine the training strategy to ensure convergence while simulating quantization errors.
arXiv Detail & Related papers (2024-05-22T17:34:18Z)
Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs [71.56345106591789]
It has been believed that weights in large language models (LLMs) contain significant redundancy. This paper presents a counter-argument: small-magnitude weights in pre-trained models encode vital knowledge essential for tackling difficult downstream tasks.
arXiv Detail & Related papers (2023-09-29T22:55:06Z)
Choosing a Proxy Metric from Past Experiments [54.338884612982405]
In many randomized experiments, the treatment effect of the long-term metric is often difficult or infeasible to measure. A common alternative is to measure several short-term proxy metrics in the hope they closely track the long-term metric. We introduce a new statistical framework to both define and construct an optimal proxy metric for use in a homogeneous population of randomized experiments.
arXiv Detail & Related papers (2023-09-14T17:43:02Z)
A Maximum Log-Likelihood Method for Imbalanced Few-Shot Learning Tasks [3.2895195535353308]
We propose a new maximum log-likelihood metric for few-shot architectures. We demonstrate that the proposed metric achieves higher accuracy than conventional similarity metrics. We also show that our algorithm achieves state-of-the-art transductive few-shot performance when the evaluation data is imbalanced.
arXiv Detail & Related papers (2022-11-26T21:31:00Z)
Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks [7.214681039134488]
We propose a weight soft-regularization method based on the oblique manifold. We evaluate our method on the popular CIFAR-10, CIFAR-100 and ImageNet 2012 datasets.
arXiv Detail & Related papers (2021-03-11T10:24:49Z)
Multi-Loss Weighting with Coefficient of Variations [19.37721431024278]
We propose a weighting scheme based on the coefficient of variations and set the weights based on properties observed while training the model. The proposed method incorporates a measure of uncertainty to balance the losses, and as a result the loss weights evolve during training without requiring another (learning based) optimisation. The validity of the approach is shown empirically for depth estimation and semantic segmentation on multiple datasets.
arXiv Detail & Related papers (2020-09-03T14:51:19Z)
Rethinking preventing class-collapsing in metric learning with margin-based losses [81.22825616879936]
Metric learning seeks embeddings where visually similar instances are close and dissimilar instances are apart. Margin-based losses tend to project all samples of a class onto a single point in the embedding space. We propose a simple modification to the embedding losses such that each sample selects its nearest same-class counterpart in a batch.
arXiv Detail & Related papers (2020-06-09T09:59:25Z)
What is the State of Neural Network Pruning? [12.50128492336137]
We provide a meta-analysis of the literature, including an overview of approaches to pruning. We find that the community suffers from a lack of standardized benchmarks and metrics. We introduce ShrinkBench, an open-source framework to facilitate standardized evaluations of pruning methods.
arXiv Detail & Related papers (2020-03-06T05:06:12Z)
Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant. Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers. We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity [83.81297078039836]
We consider incentivized exploration: a version of multi-armed bandits where the choice of arms is controlled by self-interested agents. We focus on the price of incentives: the loss in performance, broadly construed, incurred for the sake of incentive-compatibility.
arXiv Detail & Related papers (2020-02-03T04:58:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.