Compact Neural Networks via Stacking Designed Basic Units
- URL: http://arxiv.org/abs/2205.01508v1
- Date: Tue, 3 May 2022 14:04:49 GMT
- Title: Compact Neural Networks via Stacking Designed Basic Units
- Authors: Weichao Lan, Yiu-ming Cheung, Juyong Jiang
- Abstract summary: This paper presents a new method termed TissueNet, which directly constructs compact neural networks with fewer weight parameters.
We formulate TissueNet on diverse popular backbones and compare it with state-of-the-art pruning methods on different benchmark datasets.
Experimental results show that TissueNet achieves comparable classification accuracy while saving up to around 80% of FLOPs and 89.7% of parameters.
- Score: 38.10212043168065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unstructured pruning is limited by the sparse and irregular
weights it produces. By contrast, structured pruning avoids this drawback,
but it requires complex criteria to determine which components should be
pruned. To this end, this paper presents a new method termed TissueNet,
which directly constructs compact neural networks with fewer weight
parameters by independently stacking designed basic units, without
requiring any additional judgement criteria. Given basic units of various
architectures, TissueNet combines and stacks them in a certain form to
build up compact neural networks. We formulate TissueNet on diverse popular
backbones and compare it with state-of-the-art pruning methods on different
benchmark datasets. Moreover, two new metrics are proposed to evaluate
compression performance. Experimental results show that TissueNet achieves
comparable classification accuracy while saving up to around 80% of FLOPs
and 89.7% of parameters. Stacking basic units thus provides a promising new
direction for network compression.
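The abstract does not specify the actual basic-unit designs used in TissueNet, so the following is only a minimal sketch of the general idea, assuming a hypothetical lightweight unit (a grouped 3x3 convolution followed by a pointwise 1x1 convolution) in PyTorch: a compact network is built directly by stacking small predesigned units, with no post-hoc pruning criterion. The names `BasicUnit` and `build_stacked_net` are illustrative, not from the paper.

```python
# Minimal sketch of the stacking idea; NOT the paper's actual TissueNet units.
import torch
import torch.nn as nn


class BasicUnit(nn.Module):
    """Hypothetical lightweight unit: grouped 3x3 conv followed by a 1x1 conv."""

    def __init__(self, in_ch, out_ch, stride=1, groups=4):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                      groups=groups, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


def build_stacked_net(stage_channels=(32, 64, 128), units_per_stage=2, num_classes=10):
    """Stack basic units stage by stage to form a compact classifier."""
    layers = [
        nn.Conv2d(3, stage_channels[0], 3, padding=1, bias=False),
        nn.BatchNorm2d(stage_channels[0]),
        nn.ReLU(inplace=True),
    ]
    in_ch = stage_channels[0]
    for out_ch in stage_channels:
        # the first unit of each stage changes width and downsamples
        layers.append(BasicUnit(in_ch, out_ch, stride=2))
        for _ in range(units_per_stage - 1):
            layers.append(BasicUnit(out_ch, out_ch))
        in_ch = out_ch
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, num_classes)]
    return nn.Sequential(*layers)


if __name__ == "__main__":
    net = build_stacked_net()
    n_params = sum(p.numel() for p in net.parameters())
    print(net(torch.randn(1, 3, 32, 32)).shape, f"{n_params:,} parameters")
```

The point of the sketch is the construction pattern: the network is small by design because every building block is small, rather than being made small afterwards by removing parts of a large model.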
Related papers
- Isomorphic Pruning for Vision Models [56.286064975443026]
Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures.
We present Isomorphic Pruning, a simple approach that demonstrates effectiveness across a range of network architectures.
arXiv Detail & Related papers (2024-07-05T16:14:53Z) - ThinResNet: A New Baseline for Structured Convolutional Networks Pruning [1.90298817989995]
Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters.
In this work, we verify how results in the recent pruning literature hold up against networks that underwent both state-of-the-art training methods and trivial model scaling.
arXiv Detail & Related papers (2023-09-22T13:28:18Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Neural Network Compression by Joint Sparsity Promotion and Redundancy
Reduction [4.9613162734482215]
This paper presents a novel training scheme based on composite constraints that prune redundant filters and minimize their effect on overall network learning via sparsity promotion.
Our tests on several pixel-wise segmentation benchmarks show that the number of neurons and the memory footprint of networks in the test phase are significantly reduced without affecting performance.
arXiv Detail & Related papers (2022-10-14T01:34:49Z) - Compact representations of convolutional neural networks via weight
pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural
Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods can hardly deliver real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, showing better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z) - Paying more attention to snapshots of Iterative Pruning: Improving Model
Compression via Ensemble Distillation [4.254099382808598]
Existing methods often iteratively prune networks to attain a high compression ratio without incurring a significant loss in performance.
We show that strong ensembles can be constructed from snapshots of iterative pruning, which achieve competitive performance and vary in network structure.
On standard image classification benchmarks such as CIFAR and Tiny-ImageNet, we advance the state-of-the-art pruning ratio of structured pruning by integrating simple l1-norm filter pruning into our pipeline.
arXiv Detail & Related papers (2020-06-20T03:59:46Z)