Bonsai: A Framework for Convolutional Neural Network Acceleration Using Criterion-Based Pruning
- URL: http://arxiv.org/abs/2602.17145v1
- Date: Thu, 19 Feb 2026 07:46:08 GMT
- Title: Bonsai: A Framework for Convolutional Neural Network Acceleration Using Criterion-Based Pruning
- Authors: Joseph Bingham, Sam Helmich
- Abstract summary: We introduce Combine, a criterion-based pruning solution for CNNs. We demonstrate that it is a fast and effective framework for iterative pruning. We show the capacity of these criterion functions and the framework on VGG-inspired models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the need for more accurate and powerful Convolutional Neural Networks (CNNs) increases, so too do their size, execution time, memory footprint, and power consumption. To overcome this, solutions such as pruning have been proposed, each with its own metrics and methodologies, or criteria, for how weights should be removed. These solutions do not share a common implementation and are difficult to implement and compare. In this work, we introduce Combine, a criterion-based pruning solution; demonstrate that it is a fast and effective framework for iterative pruning; demonstrate that criteria have differing effects on different models; create a standard language for comparing criterion functions; and propose a few novel criterion functions. We show the capacity of these criterion functions and the framework on VGG-inspired models, pruning up to 79% of filters while retaining or improving accuracy, and reducing the computations needed by the network by up to 68%.
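What follows is a minimal, hypothetical sketch of criterion-based iterative filter pruning in PyTorch. The abstract does not spell out Combine's actual criterion functions or API, so the l1_criterion and prune_filters helpers below are illustrative stand-ins (an L1-norm criterion is one commonly used choice):

```python
import torch
import torch.nn as nn

def l1_criterion(conv: nn.Conv2d) -> torch.Tensor:
    """Score each output filter by the L1 norm of its weights (a stand-in criterion)."""
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def prune_filters(conv: nn.Conv2d, criterion, ratio: float) -> nn.Conv2d:
    """Rebuild the layer, keeping the highest-scoring (1 - ratio) fraction of filters."""
    scores = criterion(conv)
    n_keep = max(1, int(conv.out_channels * (1.0 - ratio)))
    keep = torch.topk(scores, n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

# Iterative pruning: in a real pipeline each round would be followed by
# fine-tuning, and the next layer's input channels shrunk to match.
conv = nn.Conv2d(64, 128, 3, padding=1)
for _ in range(3):
    conv = prune_filters(conv, l1_criterion, ratio=0.2)
print(conv.out_channels)  # 128 -> 102 -> 81 -> 64
```

Swapping l1_criterion for another scoring function is all it takes to compare criteria under this interface, which is the kind of plug-in comparison the abstract describes.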
Related papers
- Application-Specific Component-Aware Structured Pruning of Deep Neural Networks via Soft Coefficient Optimization [1.6874375111244326]
It remains critical to ensure that application-specific performance characteristics are preserved during compression. In structured pruning, where groups of structurally coherent elements are removed, conventional importance metrics frequently fail to maintain these essential performance attributes. We propose an enhanced importance-metric framework that not only reduces model size but also explicitly accounts for application-specific performance constraints.
arXiv Detail & Related papers (2025-07-20T09:50:04Z) - Structure-Aware Automatic Channel Pruning by Searching with Graph Embedding [28.03880549472142]
Channel pruning is a powerful technique to reduce the computational overhead of deep neural networks. We propose a novel structure-aware automatic channel pruning (SACP) framework to model the network topology and learn the global importance of each channel. SACP outperforms state-of-the-art pruning methods in compression efficiency and is competitive in accuracy retention.
arXiv Detail & Related papers (2025-06-13T05:05:35Z) - Less is KEN: a Universal and Simple Non-Parametric Pruning Algorithm for Large Language Models [1.5807079236265718]
KEN is a straightforward, universal, and unstructured pruning algorithm based on Kernel Density Estimation (KDE).
KEN aims to construct optimized transformers by selectively preserving the most significant parameters while restoring others to their pre-training state.
KEN achieves equal or better performance than the original unpruned models, with a minimum parameter reduction of 25%.
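A hedged sketch of the KDE idea follows. The selection rule used here (keep the densest fraction of fine-tuned weight values, restore the rest to their pre-trained values) is an illustrative assumption, not necessarily KEN's exact rule:

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_select(finetuned: np.ndarray, pretrained: np.ndarray, k: float) -> np.ndarray:
    """Keep the fraction k of weights with the highest KDE density; restore the rest."""
    flat = finetuned.ravel()
    density = gaussian_kde(flat)(flat)       # estimated density at each weight value
    cutoff = np.quantile(density, 1.0 - k)   # threshold for the top-k densest
    keep = density >= cutoff
    out = pretrained.ravel().copy()
    out[keep] = flat[keep]                   # preserve the "significant" parameters
    return out.reshape(finetuned.shape)

w_ft = np.random.randn(64, 64)               # toy fine-tuned weights
w_pt = np.random.randn(64, 64)               # toy pre-trained weights
w_new = kde_select(w_ft, w_pt, k=0.75)       # ~25% of parameters restored
```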
arXiv Detail & Related papers (2024-02-05T16:11:43Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise, and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
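That generalization can be illustrated by viewing those layers as a dense convolution multiplied by a fixed structured binary mask; the sketch below captures the intuition under that assumption and is not SSC's actual parameterization:

```python
import torch
import torch.nn.functional as F

def structured_mask(out_ch, in_ch, kh, kw, kind, groups=1):
    """Binary masks that recover common efficient layers from a dense convolution."""
    m = torch.zeros(out_ch, in_ch, kh, kw)
    if kind == "pointwise":                  # 1x1 conv: only the kernel center
        m[:, :, kh // 2, kw // 2] = 1.0
    elif kind == "groupwise":                # block-diagonal channel connectivity
        go, gi = out_ch // groups, in_ch // groups
        for g in range(groups):
            m[g * go:(g + 1) * go, g * gi:(g + 1) * gi] = 1.0
    elif kind == "depthwise":                # each filter sees one input channel
        for i in range(out_ch):
            m[i, i % in_ch] = 1.0
    return m

w = torch.randn(32, 32, 3, 3)
x = torch.randn(1, 32, 16, 16)
y = F.conv2d(x, w * structured_mask(32, 32, 3, 3, "groupwise", groups=4), padding=1)
```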
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - PRUNIX: Non-Ideality Aware Convolutional Neural Network Pruning for Memristive Accelerators [0.36832029288386126]
PRUNIX is a framework for training and pruning convolutional neural networks, proposed for deployment on memristor-crossbar-based accelerators.
arXiv Detail & Related papers (2022-02-03T18:32:03Z) - Learning from Images: Proactive Caching with Parallel Convolutional Neural Networks [94.85780721466816]
A novel framework for proactive caching is proposed in this paper.
It combines model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image.
Numerical results show that the proposed scheme can reduce computation time by 71.6% with only a 0.8% additional performance cost.
arXiv Detail & Related papers (2021-08-15T21:32:47Z) - Blending Pruning Criteria for Convolutional Neural Networks [13.259106518678474]
Network pruning is a popular and effective method to reduce the redundancy of models.
A filter can be important according to one criterion yet unnecessary according to another, which indicates that each criterion captures only a partial view of a filter's comprehensive "importance".
We propose a novel framework to integrate the existing filter pruning criteria by exploring the criteria diversity.
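A minimal sketch of blending two standard filter criteria (L1 weight norm and batch-norm scale) follows; the paper's actual integration scheme, which explores criteria diversity, is more elaborate than this fixed convex combination:

```python
import torch
import torch.nn as nn

def l1_scores(conv: nn.Conv2d) -> torch.Tensor:
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def bn_scores(bn: nn.BatchNorm2d) -> torch.Tensor:
    return bn.weight.detach().abs()          # batch-norm scaling factors

def normalize(s: torch.Tensor) -> torch.Tensor:
    return (s - s.min()) / (s.max() - s.min() + 1e-8)

def blended_scores(conv, bn, alpha=0.5):
    # Each criterion is only a partial view of "importance"; blend both views.
    return alpha * normalize(l1_scores(conv)) + (1 - alpha) * normalize(bn_scores(bn))

conv, bn = nn.Conv2d(16, 32, 3), nn.BatchNorm2d(32)
prune_idx = torch.argsort(blended_scores(conv, bn))[:8]   # 8 least important filters
```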
arXiv Detail & Related papers (2021-07-11T12:34:19Z) - Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization gives neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate the essential mathematical functions that describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods can hardly deliver real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
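As a hedged illustration of what micro-structured unification might look like, the block rule below forces each small weight tile to share one magnitude while keeping its sign pattern; the paper's actual unification rule and block geometry are not specified in this summary:

```python
import torch

def unify_blocks(w: torch.Tensor, block: int = 4) -> torch.Tensor:
    """Within each block x block tile, replace weights by a shared magnitude times their signs."""
    rows, cols = w.shape
    out = w.clone()
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = out[r:r + block, c:c + block]
            shared = tile.abs().mean()        # one magnitude stored per tile
            tile.copy_(shared * tile.sign())  # only the sign pattern remains per weight
    return out

w_unified = unify_blocks(torch.randn(8, 8), block=4)   # four tiles, one magnitude each
```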
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z) - Slimming Neural Networks using Adaptive Connectivity Scores [28.872080203221934]
We propose a new single-shot, fully automated pruning algorithm called Slimming Neural networks using Adaptive Connectivity Scores (SNACS).
Our proposed approach combines a probabilistic pruning framework with constraints on the underlying weight matrices.
SNACS is over 17x faster than the nearest comparable method.
arXiv Detail & Related papers (2020-06-22T17:45:16Z) - Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
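One way to read "dynamically control the sparsity-inducing regularization" is as a feedback loop on the L1 penalty applied to batch-norm scales; the proportional controller below is an assumption for illustration, not the paper's mechanism:

```python
import torch.nn as nn

bn = nn.BatchNorm2d(64)
lam, target = 1e-4, 0.5                       # penalty weight, desired filter sparsity

def current_sparsity(bn: nn.BatchNorm2d, eps: float = 1e-3) -> float:
    """Fraction of batch-norm scales already driven close to zero."""
    return (bn.weight.detach().abs() < eps).float().mean().item()

def sparsity_penalty(bn: nn.BatchNorm2d):
    return lam * bn.weight.abs().sum()        # L1 on the scaling factors

# Per training step: total_loss = task_loss + sparsity_penalty(bn),
# then adapt the penalty toward the desired sparsity level.
lam *= 1.1 if current_sparsity(bn) < target else 0.9
```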
arXiv Detail & Related papers (2020-05-06T07:41:22Z)