Related papers: Efficient CNN Compression via Multi-method Low Rank Factorization and Feature Map Similarity

Efficient CNN Compression via Multi-method Low Rank Factorization and Feature Map Similarity

URL: http://arxiv.org/abs/2510.00062v1
Date: Mon, 29 Sep 2025 08:44:02 GMT
Title: Efficient CNN Compression via Multi-method Low Rank Factorization and Feature Map Similarity
Authors: M. Kokhazadeh, G. Keramidas, V. Kelefouras,
Abstract summary: Low-Rank Factorization (LRF) is a widely adopted technique for compressing deep neural networks (DNNs)<n>It faces several challenges, including optimal rank selection, a vast design space, long fine-tuning times, and limited compatibility with different layer types and decomposition methods.<n>This paper presents an end-to-end Design Space Exploration methodology and framework for compressing CNNs.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Low-Rank Factorization (LRF) is a widely adopted technique for compressing deep neural networks (DNNs). However, it faces several challenges, including optimal rank selection, a vast design space, long fine-tuning times, and limited compatibility with different layer types and decomposition methods. This paper presents an end-to-end Design Space Exploration (DSE) methodology and framework for compressing convolutional neural networks (CNNs) that addresses all these issues. We introduce a novel rank selection strategy based on feature map similarity, which captures non-linear interactions between layer outputs more effectively than traditional weight-based approaches. Unlike prior works, our method uses a one-shot fine-tuning process, significantly reducing the overall fine-tuning time. The proposed framework is fully compatible with all types of convolutional (Conv) and fully connected (FC) layers. To further improve compression, the framework integrates three different LRF techniques for Conv layers and three for FC layers, applying them selectively on a per-layer basis. We demonstrate that combining multiple LRF methods within a single model yields better compression results than using a single method uniformly across all layers. Finally, we provide a comprehensive evaluation and comparison of the six LRF techniques, offering practical insights into their effectiveness across different scenarios. The proposed work is integrated into TensorFlow 2.x, ensuring compatibility with widely used deep learning workflows. Experimental results on 14 CNN models across eight datasets demonstrate that the proposed methodology achieves substantial compression with minimal accuracy loss, outperforming several state-of-the-art techniques.

Related papers

Rethinking Autoregressive Models for Lossless Image Compression via Hierarchical Parallelism and Progressive Adaptation [75.58269386927076]
Autoregressive (AR) models are often dismissed as impractical due to prohibitive computational cost.<n>This work re-thinks this paradigm, introducing a framework built on hierarchical parallelism and progressive adaptation.<n> Experiments on diverse datasets (natural, satellite, medical) validate that our method achieves new state-of-the-art compression.
arXiv Detail & Related papers (2025-11-14T06:27:58Z)
COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability [28.270200666064163]
We propose a novel NN-based spatially scalable image compression method, called Compass, which supports arbitrary-scale spatial scalability. Our proposed Compass has a very flexible structure where the number of layers and their respective scale factors can be arbitrarily determined during inference. Experimental results show that our Compass achieves BD-rate gain of -58.33% and -47.17% at maximum compared to SHVC and the state-of-the-art NN-based spatially scalable image compression method, respectively.
arXiv Detail & Related papers (2023-09-11T14:05:18Z)
Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive method for the knowledge of two independent models. It is most prominently used in federated learning. We analyse the performance of the models that result from averaging single, or groups.
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
ConvBLS: An Effective and Efficient Incremental Convolutional Broad Learning System for Image Classification [63.49762079000726]
We propose a convolutional broad learning system (ConvBLS) based on the spherical K-means (SKM) algorithm and two-stage multi-scale (TSMS) feature fusion. Our proposed ConvBLS method is unprecedentedly efficient and effective.
arXiv Detail & Related papers (2023-04-01T04:16:12Z)
Towards Optimal Compression: Joint Pruning and Quantization [1.191194620421783]
This paper introduces FITCompress, a novel method integrating layer-wise mixed-precision quantization and unstructured pruning. Experiments on computer vision and natural language processing benchmarks demonstrate that our proposed approach achieves a superior compression-performance trade-off.
arXiv Detail & Related papers (2023-02-15T12:02:30Z)
Boosting the Efficiency of Parametric Detection with Hierarchical Neural Networks [4.1410005218338695]
We propose Hierarchical Detection Network (HDN), a novel approach to efficient detection. The network is trained using a novel loss function, which encodes simultaneously the goals of statistical accuracy and efficiency. We show how training a three-layer HDN using two-layer model can further boost both accuracy and efficiency.
arXiv Detail & Related papers (2022-07-23T19:23:00Z)
STD-NET: Search of Image Steganalytic Deep-learning Architecture via Hierarchical Tensor Decomposition [40.997546601209145]
STD-NET is an unsupervised deep-learning architecture search approach via hierarchical tensor decomposition for image steganalysis. Our proposed strategy is more efficient and can remove more redundancy compared with previous steganalytic network compression methods.
arXiv Detail & Related papers (2022-06-12T03:46:08Z)
Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition. A new regularization method, called funnel function, is proposed to suppress the unimportant factors during the compression. For ResNet18 with ImageNet2012, our reduced model can reach more than twi times speed up in terms of GMAC with merely 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks. It automatically analyzes each layer to identify the optimal per-layer compression ratio. Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks. The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification [58.20132466198622]
We propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix. In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor. Our proposed method is simple yet effective, easy to implement and can boost the baseline significantly.
arXiv Detail & Related papers (2020-03-29T15:01:05Z)
Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods. The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.