OTOv3: Automatic Architecture-Agnostic Neural Network Training and
Compression from Structured Pruning to Erasing Operators
- URL: http://arxiv.org/abs/2312.09411v1
- Date: Fri, 15 Dec 2023 00:22:55 GMT
- Title: OTOv3: Automatic Architecture-Agnostic Neural Network Training and
Compression from Structured Pruning to Erasing Operators
- Authors: Tianyi Chen, Tianyu Ding, Zhihui Zhu, Zeyu Chen, HsiangTao Wu, Ilya
Zharkov, Luming Liang
- Abstract summary: This topic spans various techniques, from structured pruning to neural architecture search, encompassing both the pruning and erasing operator perspectives.
We introduce the third-generation Only-Train-Once (OTOv3), which first automatically trains and compresses a general DNN through pruning and erasing operations.
Our empirical results demonstrate the efficacy of OTOv3 across various benchmarks in structured pruning and neural architecture search.
- Score: 57.145175475579315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compressing a predefined deep neural network (DNN) into a compact sub-network
with competitive performance is crucial in the efficient machine learning
realm. This topic spans various techniques, from structured pruning to neural
architecture search, encompassing both the pruning and erasing operator
perspectives. Despite advancements, existing methods suffer from complex,
multi-stage processes that demand substantial engineering and domain knowledge,
limiting their broader applications. We introduce the third-generation
Only-Train-Once (OTOv3), which first automatically trains and compresses a
general DNN through pruning and erasing operations, creating a compact and
competitive sub-network without the need for fine-tuning. OTOv3 simplifies and
automates the training and compression process, minimizing the engineering
effort required from users. It offers three key technological advancements: (i)
automatic search space construction for general DNNs based on dependency graph
analysis; (ii) Dual Half-Space Projected Gradient (DHSPG) and its enhanced
version with hierarchical search (H2SPG) to reliably solve (hierarchical)
structured sparsity problems and ensure sub-network validity; and (iii)
automated sub-network construction using solutions from DHSPG/H2SPG and
dependency graphs. Our empirical results demonstrate the efficacy of OTOv3
across various benchmarks in structured pruning and neural architecture search.
OTOv3 produces sub-networks that match or exceed the state of the art. The
source code will be available at https://github.com/tianyic/only_train_once.
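The end-to-end workflow described above, from dependency-graph construction through structured-sparsity optimization to sub-network extraction, can be pictured with a short PyTorch sketch. The helper names below (build_dependency_graph, group_norms, construct_subnet) and the magnitude-based top-k selection standing in for the DHSPG/H2SPG solution are illustrative assumptions, not the actual only_train_once API.

```python
# Hypothetical sketch of the OTOv3-style workflow; names and the top-k rule
# are illustrative stand-ins, not the library's real interface or algorithm.
import torch
import torch.nn as nn

def build_dependency_graph(model: nn.Module):
    """Group parameters whose structures must be pruned jointly.

    A real dependency analysis traces the computation graph; here we simply
    couple each Conv2d's output channels with the BatchNorm2d that follows it.
    """
    groups, pending_conv = [], None
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            pending_conv = module
        elif isinstance(module, nn.BatchNorm2d) and pending_conv is not None:
            groups.append((pending_conv, module))  # pruned together
            pending_conv = None
    return groups

def group_norms(groups):
    """Per-output-channel l2 norm over each coupled (Conv, BN) group."""
    norms = []
    for conv, bn in groups:
        flat = conv.weight.flatten(1)  # [out_channels, in*k*k]
        joint = torch.cat([flat, bn.weight.unsqueeze(1), bn.bias.unsqueeze(1)], dim=1)
        norms.append(joint.norm(dim=1))
    return norms

def construct_subnet(groups, norms, keep_ratio=0.5):
    """Keep the strongest channels per group (a stand-in for the DHSPG/H2SPG
    solution combined with dependency-graph-guided construction)."""
    kept = []
    for (conv, _), n in zip(groups, norms):
        k = max(1, int(keep_ratio * n.numel()))
        kept.append(torch.topk(n, k).indices)
    return kept  # indices of structures retained in each group

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU(),
                      nn.Conv2d(16, 32, 3), nn.BatchNorm2d(32))
groups = build_dependency_graph(model)
kept = construct_subnet(groups, group_norms(groups))
print([idx.numel() for idx in kept])  # structures surviving pruning per group
```

The design point mirrored here is that pruning decisions are made per dependency group, so coupled structures (a convolution's output channels and the following batch-norm parameters) are always removed together, which is what keeps the extracted sub-network structurally valid.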
Related papers
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
- HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning [38.01465387364115]
The Only-Train-Once (OTO) series has recently been proposed to resolve many of these pain points by streamlining the workflow.
We numerically demonstrate the efficacy of HESSO and its enhanced version HESSO-CRIC on a variety of applications.
arXiv Detail & Related papers (2024-09-11T05:28:52Z)
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z)
- Automated Search-Space Generation Neural Architecture Search [45.902445271519596]
ASGNAS produces high-performing sub-networks in a one-shot manner.
ASGNAS delivers three notable contributions to minimize human effort.
The library will be released at https://github.com/tianyic/only_train_once.
arXiv Detail & Related papers (2023-05-25T19:41:40Z)
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image and Time Series Classification [8.190723030003804]
This article presents the third version of a sequential model-based NAS algorithm targeting different hardware environments and multiple classification tasks.
Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks.
The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
arXiv Detail & Related papers (2022-12-13T17:14:14Z)
- Complexity-Driven CNN Compression for Resource-constrained Edge AI [1.6114012813668934]
We propose a novel and computationally efficient pruning pipeline by exploiting the inherent layer-level complexities of CNNs.
We define three modes of pruning, namely parameter-aware (PA), FLOPs-aware (FA), and memory-aware (MA), to introduce versatile compression of CNNs (see the hedged sketch after this list).
arXiv Detail & Related papers (2022-08-26T16:01:23Z)
- Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method that searches out the desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z)
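As a rough illustration of the complexity-driven pruning modes mentioned in the Complexity-Driven CNN Compression entry above, the sketch below scores each convolution by parameter count (PA), multiply-accumulates (FA), or activation memory (MA), and allocates per-layer pruning ratios in proportion to those scores. The scoring and allocation rules here are assumptions for exposition, not the paper's actual formulas.

```python
# Hedged illustration of parameter-aware (PA), FLOPs-aware (FA), and
# memory-aware (MA) layer scoring; the allocation rule is an assumption.
import torch
import torch.nn as nn

def layer_complexities(model, input_size=(1, 3, 32, 32)):
    """Return per-Conv2d costs: parameters, multiply-accumulates, activation memory."""
    stats, hooks = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            params = module.weight.numel()
            macs = params * output.shape[-2] * output.shape[-1]  # per-pixel MACs * spatial size
            act_mem = output.numel() * output.element_size()     # bytes of output activations
            stats[name] = {"PA": params, "FA": macs, "MA": act_mem}
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(torch.zeros(input_size))
    for h in hooks:
        h.remove()
    return stats

def allocate_ratios(stats, mode="FA", global_ratio=0.5):
    """Assumed rule: prune each layer in proportion to its share of the
    chosen complexity measure, scaled to a global budget and capped at 0.9."""
    total = sum(s[mode] for s in stats.values())
    return {name: min(0.9, global_ratio * len(stats) * s[mode] / total)
            for name, s in stats.items()}

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 64, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(64, 64, 3, padding=1))
print(allocate_ratios(layer_complexities(model), mode="PA"))
```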