AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
- URL: http://arxiv.org/abs/2412.18091v1
- Date: Tue, 24 Dec 2024 02:05:51 GMT
- Title: AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
- Authors: Lixian Jing, Jianpeng Qi, Junyu Dong, Yanwei Yu,
- Abstract summary: AutoSculpt is a pattern-based automated pruning framework for deep neural networks (DNNs)
It automatically identifies and prunes regular patterns within DNN architectures that can be recognized by existing inference engines.
It achieves pruning rates of up to 90% and nearly 18% improvement in FLOPs reduction, outperforming all baselines.
- Score: 32.10443611442628
- License:
- Abstract: As deep neural networks (DNNs) are increasingly deployed on edge devices, optimizing models for constrained computational resources is critical. Existing auto-pruning methods face challenges due to the diversity of DNN models, various operators (e.g., filters), and the difficulty in balancing pruning granularity with model accuracy. To address these limitations, we introduce AutoSculpt, a pattern-based automated pruning framework designed to enhance efficiency and accuracy by leveraging graph learning and deep reinforcement learning (DRL). AutoSculpt automatically identifies and prunes regular patterns within DNN architectures that can be recognized by existing inference engines, enabling runtime acceleration. Three key steps in AutoSculpt include: (1) Constructing DNNs as graphs to encode their topology and parameter dependencies, (2) embedding computationally efficient pruning patterns, and (3) utilizing DRL to iteratively refine auto-pruning strategies until the optimal balance between compression and accuracy is achieved. Experimental results demonstrate the effectiveness of AutoSculpt across various architectures, including ResNet, MobileNet, VGG, and Vision Transformer, achieving pruning rates of up to 90% and nearly 18% improvement in FLOPs reduction, outperforming all baselines. The codes can be available at https://anonymous.4open.science/r/AutoSculpt-DDA0
Related papers
- RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration [0.0]
We propose RL-Pruner, which uses reinforcement learning to learn the optimal pruning distribution.
RL-Pruner can automatically extract dependencies between filters in the input model and perform pruning, without requiring model-specific pruning implementations.
arXiv Detail & Related papers (2024-11-10T13:35:10Z) - Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - OTOv3: Automatic Architecture-Agnostic Neural Network Training and
Compression from Structured Pruning to Erasing Operators [57.145175475579315]
This topic spans various techniques, from structured pruning to neural architecture search, encompassing both pruning and erasing operators perspectives.
We introduce the third-generation Only-Train-Once (OTOv3), which first automatically trains and compresses a general DNN through pruning and erasing operations.
Our empirical results demonstrate the efficacy of OTOv3 across various benchmarks in structured pruning and neural architecture search.
arXiv Detail & Related papers (2023-12-15T00:22:55Z) - Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time
Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z) - Boosting the Convergence of Reinforcement Learning-based Auto-pruning
Using Historical Data [35.36703623383735]
Reinforcement learning (RL)-based auto-pruning has been proposed to automate the pruning process to avoid expensive hand-crafted work.
However, the RL-based pruner involves a time-consuming training process and the high expense of each sample further exacerbates this problem.
We propose an efficient auto-pruning framework which solves this problem by taking advantage of the historical data from the previous auto-pruning process.
arXiv Detail & Related papers (2021-07-16T07:17:26Z) - DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator
Search [55.164053971213576]
convolutional neural network has achieved great success in fulfilling computer vision tasks despite large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z) - Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and
Robust AutoDL [53.40030379661183]
Auto-PyTorch is a framework to enable fully automated deep learning (AutoDL)
It combines multi-fidelity optimization with portfolio construction for warmstarting and ensembling of deep neural networks (DNNs)
We show that Auto-PyTorch performs better than several state-of-the-art competitors on average.
arXiv Detail & Related papers (2020-06-24T15:15:17Z) - An Image Enhancing Pattern-based Sparsity for Real-time Inference on
Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and becoming both highly accurate and hardware friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.