Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing
- URL: http://arxiv.org/abs/2309.17211v1
- Date: Fri, 29 Sep 2023 13:09:40 GMT
- Title: Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing
- Authors: Lukas Meiner, Jens Mehnert, Alexandru Paul Condurache
- Abstract summary: We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
- Score: 50.79602839359522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To reduce the computational cost of convolutional neural networks (CNNs) for
usage on resource-constrained devices, structured pruning approaches have shown
promising results, drastically reducing floating-point operations (FLOPs)
without substantial drops in accuracy. However, most recent methods require
fine-tuning or specific training procedures to achieve a reasonable trade-off
between retained accuracy and reduction in FLOPs. This introduces additional
cost in the form of computational overhead and requires training data to be
available. To this end, we propose HASTE (Hashing for Tractable Efficiency), a
parameter-free and data-free module that acts as a plug-and-play replacement
for any regular convolution module. It instantly reduces the network's
test-time inference cost without requiring any training or fine-tuning. We are
able to drastically compress latent feature maps without sacrificing much
accuracy by using locality-sensitive hashing (LSH) to detect redundancies in
the channel dimension. Similar channels are aggregated to reduce the input and
filter depth simultaneously, allowing for cheaper convolutions. We demonstrate
our approach on the popular vision benchmarks CIFAR-10 and ImageNet. In
particular, we are able to instantly drop 46.72% of FLOPs while only losing
1.25% accuracy by just swapping the convolution modules in a ResNet34 on
CIFAR-10 for our HASTE module.
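The abstract's mechanism can be illustrated with a short, hypothetical sketch: hash each channel of a latent feature map with random LSH hyperplanes, average the channels that land in the same bucket, and sum the matching filter slices so the convolution runs over fewer input channels. The hyperplane count, the per-channel descriptors, and the merge-by-mean rule below are assumptions for illustration, not the authors' HASTE implementation.

```python
import torch
import torch.nn.functional as F

def haste_like_conv(x, weight, bias=None, num_hyperplanes=8, stride=1, padding=1):
    """Rough LSH-based channel-merging convolution for a single image (sketch).

    x:      latent feature map of shape (1, C, H, W)
    weight: convolution weight of shape (O, C, kH, kW)
    Channels whose sign pattern under random hyperplane projections collides are
    treated as redundant: they are averaged on the input side and their filter
    slices are summed, so sum_c w_c * x_c is approximately preserved with fewer
    input channels and therefore fewer FLOPs.
    """
    _, C, H, W = x.shape
    descriptors = x[0].reshape(C, -1)                    # one H*W vector per channel
    planes = torch.randn(num_hyperplanes, H * W)         # random LSH hyperplanes
    codes = (descriptors @ planes.t() > 0).int()         # (C, L) hash bits
    buckets = {}                                         # hash code -> channel indices
    for c in range(C):
        buckets.setdefault(tuple(codes[c].tolist()), []).append(c)
    merged_x, merged_w = [], []
    for idx in buckets.values():
        merged_x.append(x[:, idx].mean(dim=1, keepdim=True))      # average similar channels
        merged_w.append(weight[:, idx].sum(dim=1, keepdim=True))  # sum matching filter slices
    x_red = torch.cat(merged_x, dim=1)                   # (1, C', H, W), C' <= C
    w_red = torch.cat(merged_w, dim=1)                   # (O, C', kH, kW)
    return F.conv2d(x_red, w_red, bias=bias, stride=stride, padding=padding)

out = haste_like_conv(torch.randn(1, 64, 32, 32), torch.randn(128, 64, 3, 3))
print(out.shape)  # torch.Size([1, 128, 32, 32])
```

Because sum_c w_c * x_c is close to (sum_{c in G} w_c) * mean_{c in G} x_c whenever the channels in a bucket are nearly identical, the reduced convolution approximates the original output while spending fewer FLOPs.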
Related papers
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
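For the AECNN entry above, the general autoencoder-bottleneck idea for device-edge co-inference can be sketched as follows; the 1x1 convolutions, channel counts, and compression ratio are hypothetical choices, not details taken from the paper.

```python
import torch
import torch.nn as nn

class FeatureCompressor(nn.Module):
    """Toy autoencoder bottleneck for device-edge co-inference (hypothetical sizes).

    The encoder runs on the device and shrinks an intermediate feature map before
    transmission; the decoder runs on the edge server and reconstructs it so the
    remaining CNN layers can continue. Channel counts are illustrative only.
    """
    def __init__(self, channels=256, bottleneck=16):
        super().__init__()
        self.encoder = nn.Conv2d(channels, bottleneck, kernel_size=1)   # device side
        self.decoder = nn.Conv2d(bottleneck, channels, kernel_size=1)   # edge side

    def forward(self, feat):
        compressed = self.encoder(feat)      # transmit this (much smaller) tensor
        return self.decoder(compressed)      # reconstruct before the remaining layers

feat = torch.randn(1, 256, 28, 28)           # intermediate feature map from the backbone
restored = FeatureCompressor()(feat)
print(restored.shape)                        # torch.Size([1, 256, 28, 28])
```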
- Neural Network Pruning by Cooperative Coevolution [16.0753044050118]
We propose CCEP, a new filter pruning algorithm based on cooperative coevolution.
CCEP reduces the pruning space by a divide-and-conquer strategy.
Experiments show that CCEP can achieve a competitive performance with the state-of-the-art pruning methods.
arXiv Detail & Related papers (2022-04-12T09:06:38Z)
- GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization [84.57695474130273]
Gate-based or importance-based pruning methods aim to remove the least important channels.
GDP can be plugged in before convolutional layers, without bells and whistles, to control the on-and-off state of each channel.
Experiments conducted on the CIFAR-10 and ImageNet datasets show that the proposed GDP achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-06T03:17:10Z)
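For the GDP entry above, the generic gate-before-convolution idea looks roughly like the sketch below: a learnable per-channel scale whose near-zero entries mark prunable channels. The differentiable polarization regularizer that GDP actually trains with is not reproduced here.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Generic per-channel gate plugged in before a convolution (sketch only).

    Each input channel is scaled by a learnable gate; after training with a
    sparsity-inducing regularizer that pushes gates toward 0 or 1, channels
    whose gate is (near) zero can be pruned away together with the filter
    slices that consume them. The polarization regularizer itself is omitted.
    """
    def __init__(self, num_channels):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_channels))

    def forward(self, x):                       # x: (N, C, H, W)
        return x * self.gate.view(1, -1, 1, 1)

    def active_channels(self, threshold=1e-3):
        return (self.gate.abs() > threshold).nonzero(as_tuple=True)[0]

gate = ChannelGate(64)
y = gate(torch.randn(2, 64, 32, 32))
print(len(gate.active_channels()), "of 64 channels kept")
```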
- Structured Directional Pruning via Perturbation Orthogonal Projection [13.704348351073147]
A more reasonable approach is to find a sparse minimizer along the flat minimum valley found by the optimizer.
We propose the structured directional pruning based on projecting the perturbations onto the flat minimum valley.
Experiments show that our method obtains the state-of-the-art pruned accuracy (i.e. 93.97% on VGG16, CIFAR-10 task) without retraining.
arXiv Detail & Related papers (2021-07-12T11:35:47Z)
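For the directional-pruning entry above, the projection step alone can be illustrated generically: given some basis for the flat directions (how the paper obtains them is not shown here), a pruning perturbation is restricted to that subspace by an orthogonal projection. The toy weights, random basis, and magnitude-based perturbation below are all hypothetical.

```python
import numpy as np

def project_perturbation(delta, basis):
    """Orthogonally project a pruning perturbation onto a given subspace (sketch).

    delta: flattened weight perturbation (e.g., setting small weights to zero).
    basis: matrix whose columns span directions along which the loss is
           (approximately) flat; finding these directions is the hard part
           and is not shown here.
    """
    # P = B (B^T B)^{-1} B^T applied to delta, via least squares
    coeffs = np.linalg.lstsq(basis, delta, rcond=None)[0]
    return basis @ coeffs

w = np.random.randn(6)                    # toy weight vector
delta = -w * (np.abs(w) < 0.5)            # naive pruning: zero out small weights
basis = np.random.randn(6, 2)             # stand-in for flat-valley directions
safe_delta = project_perturbation(delta, basis)
print(np.round(w + safe_delta, 3))        # pruned update restricted to the flat subspace
```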
- Non-Parametric Adaptive Network Pruning [125.4414216272874]
We introduce non-parametric modeling to simplify the algorithm design.
Inspired by the face recognition community, we use a message passing algorithm to obtain an adaptive number of exemplars.
EPruner breaks the dependency on the training data in determining the "important" filters.
arXiv Detail & Related papers (2021-01-20T06:18:38Z)
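For the EPruner entry above, a message-passing clustering such as affinity propagation chooses its own number of exemplars, which is the property the summary refers to. The sketch below runs scikit-learn's AffinityPropagation on flattened filter weights as a stand-in; the similarity measure and its default preferences are assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def exemplar_filters(weight, random_state=0):
    """Pick exemplar filters of one conv layer via affinity propagation (sketch).

    weight: array of shape (out_channels, in_channels, k, k).
    Message passing determines the number of exemplars automatically, so no
    per-layer pruning ratio has to be chosen; non-exemplar filters would be
    dropped. This mirrors the idea only, not the paper's exact similarity.
    """
    flat = weight.reshape(weight.shape[0], -1)                        # one row per filter
    ap = AffinityPropagation(max_iter=500, random_state=random_state).fit(flat)
    return ap.cluster_centers_indices_                                # filters to keep

w = np.random.randn(64, 32, 3, 3)
keep = exemplar_filters(w)
print(f"keeping {len(keep)} of 64 filters")
```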
- ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
Convolutional neural networks (CNNs) have become deeper and wider in recent years.
Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable.
We propose a novel automatic channel pruning method (ACP).
ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
- UCP: Uniform Channel Pruning for Deep Convolutional Neural Networks Compression and Acceleration [24.42067007684169]
We propose a novel uniform channel pruning (UCP) method to prune deep CNNs.
Unimportant channels, together with the convolutional kernels related to them, are pruned directly.
We verify our method on CIFAR-10, CIFAR-100 and ILSVRC-2012 for image classification.
arXiv Detail & Related papers (2020-10-03T01:51:06Z)
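For the UCP entry above, plain uniform channel pruning can be sketched as follows: score each output channel of a layer, keep a fixed fraction, and drop the corresponding kernels from the next layer as well. The L1-norm score and the single keep ratio are illustrative assumptions rather than UCP's exact criterion.

```python
import torch
import torch.nn as nn

def prune_channels(conv, next_conv, keep_ratio=0.5):
    """Uniformly prune output channels of `conv` by L1 weight norm (sketch).

    The pruned channels are also removed from the input side of `next_conv`,
    so the kernels related to them disappear from both layers.
    """
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))     # one score per output channel
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep = scores.topk(n_keep).indices.sort().values           # channels to retain

    slim = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                     stride=conv.stride, padding=conv.padding,
                     bias=conv.bias is not None)
    slim.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        slim.bias.data = conv.bias.data[keep].clone()

    slim_next = nn.Conv2d(n_keep, next_conv.out_channels, next_conv.kernel_size,
                          stride=next_conv.stride, padding=next_conv.padding,
                          bias=next_conv.bias is not None)
    slim_next.weight.data = next_conv.weight.data[:, keep].clone()
    if next_conv.bias is not None:
        slim_next.bias.data = next_conv.bias.data.clone()
    return slim, slim_next

c1, c2 = nn.Conv2d(16, 32, 3, padding=1), nn.Conv2d(32, 64, 3, padding=1)
p1, p2 = prune_channels(c1, c2)
print(p1.weight.shape, p2.weight.shape)   # 16 of the 32 intermediate channels remain
```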
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Enabling On-Device CNN Training by Self-Supervised Instance Filtering and Error Map Pruning [17.272561332310303]
This work aims to enable on-device training of convolutional neural networks (CNNs) by reducing the computation cost at training time.
CNN models are usually trained on high-performance computers and only the trained models are deployed to edge devices.
arXiv Detail & Related papers (2020-07-07T05:52:37Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding floating-point networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
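For the integer-arithmetic entry above, a bounded ReLU caps activations at a fixed value so they map onto a low-bit integer range with one static scale. The bound of 6 and the 8-bit mapping below are illustrative choices, not necessarily the paper's settings.

```python
import torch
import torch.nn as nn

class BoundedReLU(nn.Module):
    """ReLU clipped at an upper bound so activations fit a fixed integer range (sketch).

    With a known upper bound, an activation can be mapped to 8-bit integers with a
    single fixed scale, avoiding floating-point rescaling at inference time.
    """
    def __init__(self, bound=6.0):
        super().__init__()
        self.bound = bound

    def forward(self, x):
        return torch.clamp(x, min=0.0, max=self.bound)

    def quantize(self, x, bits=8):
        scale = (2 ** bits - 1) / self.bound               # fixed scale known in advance
        return torch.round(self.forward(x) * scale).to(torch.uint8)

act = BoundedReLU()
x = torch.randn(4) * 10
print(act(x), act.quantize(x))
```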