HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning
- URL: http://arxiv.org/abs/2409.09085v1
- Date: Wed, 11 Sep 2024 05:28:52 GMT
- Title: HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning
- Authors: Tianyi Chen, Xiaoyi Qu, David Aponte, Colby Banbury, Jongwoo Ko, Tianyu Ding, Yong Ma, Vladimir Lyapunov, Ilya Zharkov, Luming Liang
- Abstract summary: The Only-Train-Once (OTO) series has recently been proposed to resolve many of the pain points of existing structured pruning methods by streamlining the workflow.
We numerically demonstrate the efficacy of HESSO and its enhanced version HESSO-CRIC on a variety of applications.
- Score: 38.01465387364115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structured pruning is one of the most popular approaches to compressing heavy deep neural networks (DNNs) into compact sub-networks while retaining performance. Existing methods suffer from multi-stage procedures that demand significant engineering effort and human expertise. The Only-Train-Once (OTO) series has recently been proposed to resolve many of these pain points by streamlining the workflow: it automatically conducts (i) search space generation, (ii) structured sparse optimization, and (iii) sub-network construction. However, the built-in sparse optimizers in the OTO series, i.e., the Half-Space Projected Gradient (HSPG) family, have limitations: they require hyper-parameter tuning and offer only implicit control over sparsity exploration, which consequently demands intervention by human expertise. To address these limitations, we propose the Hybrid Efficient Structured Sparse Optimizer (HESSO). HESSO can automatically and efficiently train a DNN to produce a high-performing sub-network. Meanwhile, it is almost tuning-free and enjoys user-friendly integration into generic training applications. To address another common issue in pruning DNNs, irreversible performance collapse, we further propose a Corrective Redundant Identification Cycle (CRIC) for reliably identifying indispensable structures. We numerically demonstrate the efficacy of HESSO and its enhanced version HESSO-CRIC on a variety of applications ranging from computer vision to natural language processing, including large language models. The numerical results show that HESSO achieves performance competitive with, and in some cases superior to, various state-of-the-art methods, and supports most DNN architectures. Meanwhile, CRIC can effectively prevent irreversible performance collapse and further enhance the performance of HESSO on certain applications. The code is available at https://github.com/microsoft/only_train_once.
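The workflow described in the abstract (automatic search space generation, structured sparse training with HESSO, and sub-network construction) maps naturally onto a short training script. The sketch below illustrates how such a pipeline could look with the linked only_train_once package; the entry point OTO, the oto.hesso(...) and oto.construct_subnet(...) calls, and arguments such as variant and target_group_sparsity are assumptions made for illustration and may differ from the released API, so consult the repository for the authoritative interface.

```python
# Illustrative sketch of the three-step OTO/HESSO workflow from the abstract.
# API names and arguments are assumptions; see
# https://github.com/microsoft/only_train_once for the actual interface.
import torch
import torchvision

from only_train_once import OTO  # assumed package entry point

model = torchvision.models.resnet18(num_classes=10)
dummy_input = torch.rand(1, 3, 224, 224)

# Synthetic stand-in data so the sketch is self-contained.
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.rand(64, 3, 224, 224), torch.randint(0, 10, (64,))
    ),
    batch_size=16,
)

# (i) Search space generation: trace the model and automatically partition
#     its parameters into prunable (zero-invariant) groups.
oto = OTO(model=model, dummy_input=dummy_input)

# (ii) Structured sparse optimization with HESSO: intended to be almost
#      tuning-free; the main knob is the target group sparsity (assumed name).
optimizer = oto.hesso(variant="sgd", lr=0.1, target_group_sparsity=0.7)

# Ordinary training loop: the optimizer progressively identifies redundant
# groups and drives them exactly to zero while training the remaining weights.
for epoch in range(3):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()

# (iii) Sub-network construction: strip the zeroed groups and export a
#       compact model whose outputs match the sparse full model.
oto.construct_subnet(out_dir="./compressed_model")
```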
Related papers
- Enhancing GNNs Performance on Combinatorial Optimization by Recurrent Feature Update [0.09986418756990156]
We introduce a novel algorithm, denoted hereafter as QRF-GNN, leveraging the power of GNNs to efficiently solve combinatorial optimization (CO) problems.
It relies on unsupervised learning by minimizing the loss function derived from QUBO relaxation.
Results of experiments show that QRF-GNN drastically surpasses existing learning-based approaches and is comparable to state-of-the-art conventional solvers.
arXiv Detail & Related papers (2024-07-23T13:34:35Z)
- OTOv3: Automatic Architecture-Agnostic Neural Network Training and Compression from Structured Pruning to Erasing Operators [57.145175475579315]
This topic spans various techniques, from structured pruning to neural architecture search, encompassing both the pruning and erasing operator perspectives.
We introduce the third-generation Only-Train-Once (OTOv3), which first automatically trains and compresses a general DNN through pruning and erasing operations.
Our empirical results demonstrate the efficacy of OTOv3 across various benchmarks in structured pruning and neural architecture search.
arXiv Detail & Related papers (2023-12-15T00:22:55Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization [3.460496851517031]
PLiNIO is an open-source library implementing a comprehensive set of state-of-the-art DNN design automation techniques.
We show that PLiNIO achieves up to 94.34% memory reduction for a 1% accuracy drop compared to a baseline architecture.
arXiv Detail & Related papers (2023-07-18T07:11:14Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) Soft Actor-Critic for discrete (SAC-d), which generates the exit point and the compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a computation framework can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z)
- Only Train Once: A One-Shot Neural Network Training And Pruning Framework [31.959625731943675]
Structured pruning is a commonly used technique in deploying deep neural networks (DNNs) onto resource-constrained devices.
We propose Only-Train-Once (OTO), a framework that produces slimmer DNNs with competitive performance and significant FLOPs reductions.
OTO contains two keys: (i) we partition the parameters of DNNs into zero-invariant groups, enabling us to prune zero groups without affecting the output; and (ii) to promote zero groups, we formulate a structured-sparsity optimization problem and solve it with a novel algorithm, Half-Space Projected Gradient (HSPG); an illustrative sketch of this half-space projection step follows the list below.
To demonstrate the effectiveness of OTO, we train and
arXiv Detail & Related papers (2021-07-15T17:15:20Z)
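The Only Train Once entry above credits the Half-Space Projected Gradient (HSPG) family, the predecessor that HESSO improves upon, with promoting group sparsity via a half-space projection. As a rough illustration (not the authors' implementation), a parameter group's trial point, e.g., the result of a gradient step, is projected to exactly zero only when it leaves a half-space anchored at the current iterate; the flat grouping and the control parameter epsilon below are simplifications made for this sketch.

```python
import torch

def half_space_project(current_groups, trial_groups, epsilon=0.0):
    """Simplified half-space projection over parameter groups.

    A trial point x_hat_g is zeroed out when it falls outside the half-space
    {z : <z, x_g> >= epsilon * ||x_g||^2} defined by the current iterate x_g;
    otherwise it is kept. This sketches the mechanism only and omits the
    group partitioning and scheduling used in the actual HSPG optimizers.
    """
    projected = []
    for x_g, x_hat_g in zip(current_groups, trial_groups):
        if torch.dot(x_hat_g.flatten(), x_g.flatten()) < epsilon * x_g.norm() ** 2:
            projected.append(torch.zeros_like(x_hat_g))  # prune the whole group
        else:
            projected.append(x_hat_g)                    # keep the group intact
    return projected

# Tiny usage example: a near-redundant group gets pruned, a useful one survives.
current = [torch.tensor([0.01, -0.02]), torch.tensor([0.80, 1.20])]
trial = [torch.tensor([-0.01, 0.02]), torch.tensor([0.79, 1.19])]
print(half_space_project(current, trial))  # [tensor([0., 0.]), tensor([0.7900, 1.1900])]
```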
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.