OTOV2: Automatic, Generic, User-Friendly
- URL: http://arxiv.org/abs/2303.06862v2
- Date: Fri, 23 Jun 2023 05:41:26 GMT
- Title: OTOV2: Automatic, Generic, User-Friendly
- Authors: Tianyi Chen, Luming Liang, Tianyu Ding, Zhihui Zhu, Ilya Zharkov
- Abstract summary: We propose the second generation of Only-Train-Once (OTOv2), which first automatically trains and compresses a general DNN only once from scratch.
OTOv2 is automatic and pluggable into various deep learning applications, and requires minimal engineering effort from users.
Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and StackedUnets.
- Score: 39.828644638174225
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The existing model compression methods via structured pruning typically
require complicated multi-stage procedures. Each stage demands substantial
engineering effort and domain knowledge from end-users, which prevents wider
application of these methods to broader scenarios. We propose the second
generation of Only-Train-Once (OTOv2), which first automatically trains and
compresses a general DNN only once from scratch to produce a more compact model
with competitive performance without fine-tuning. OTOv2 is automatic,
pluggable into various deep learning applications, and requires minimal
engineering effort from users. Methodologically, OTOv2 introduces two major
improvements: (i) Autonomy: automatically exploits the dependency of general
DNNs, partitions the trainable variables into Zero-Invariant Groups (ZIGs), and
constructs the compressed model; and (ii) Dual Half-Space Projected Gradient
(DHSPG): a novel optimizer to more reliably solve structured-sparsity problems.
Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety
of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and
StackedUnets, the majority of which cannot be handled by other methods without
extensive handcrafting effort. On benchmark datasets including CIFAR10/100,
DIV2K, Fashion-MNIST, SVHN and ImageNet, OTOv2 performs competitively with or
better than the state of the art. The source code is available at
https://github.com/tianyic/only_train_once.
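The abstract names two mechanisms, Zero-Invariant Groups (ZIGs) and the DHSPG optimizer, but only at a high level. The snippet below is a minimal, hypothetical sketch, not the only_train_once API, of why ZIGs enable fine-tuning-free compression for a single Conv2d-BatchNorm2d block: each output channel's filter, bias, and batch-norm scale/shift form one group, and a group that is entirely zero produces a zero channel regardless of the input, so it can be sliced out of both the producing and the consuming layer without changing the network's output. The function names and the single-block, groups=1 setup are assumptions made for illustration.

```python
# Minimal, hypothetical sketch of the Zero-Invariant Group (ZIG) idea for a
# Conv2d -> BatchNorm2d block. This is NOT the OTOv2/only_train_once API;
# it only illustrates why an all-zero group can be removed without changing
# the network's output.
import torch
import torch.nn as nn

def zero_invariant_groups(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """One group per output channel: conv filter, conv bias, BN scale, BN shift."""
    groups = []
    for c in range(conv.out_channels):
        group = [conv.weight[c]]                  # filter producing channel c
        if conv.bias is not None:
            group.append(conv.bias[c:c + 1])
        group.append(bn.weight[c:c + 1])          # gamma
        group.append(bn.bias[c:c + 1])            # beta
        groups.append(group)
    return groups

@torch.no_grad()
def prune_zero_groups(conv: nn.Conv2d, bn: nn.BatchNorm2d, next_conv: nn.Conv2d):
    """Rebuild slimmer layers keeping only channels whose ZIG is non-zero (groups=1 assumed)."""
    keep = [c for c, g in enumerate(zero_invariant_groups(conv, bn))
            if any(p.abs().sum() > 0 for p in g)]
    idx = torch.tensor(keep, dtype=torch.long)

    # Producer: keep only the surviving filters.
    slim_conv = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                          conv.stride, conv.padding, bias=conv.bias is not None)
    slim_conv.weight.copy_(conv.weight[idx])
    if conv.bias is not None:
        slim_conv.bias.copy_(conv.bias[idx])

    slim_bn = nn.BatchNorm2d(len(keep))
    slim_bn.weight.copy_(bn.weight[idx])
    slim_bn.bias.copy_(bn.bias[idx])
    slim_bn.running_mean.copy_(bn.running_mean[idx])
    slim_bn.running_var.copy_(bn.running_var[idx])

    # Consumer: drop the input slices that read the erased channels.
    slim_next = nn.Conv2d(len(keep), next_conv.out_channels, next_conv.kernel_size,
                          next_conv.stride, next_conv.padding,
                          bias=next_conv.bias is not None)
    slim_next.weight.copy_(next_conv.weight[:, idx])
    if next_conv.bias is not None:
        slim_next.bias.copy_(next_conv.bias)
    return slim_conv, slim_bn, slim_next
```

In OTOv2 itself, the grouping is derived automatically from the dependency graph of a general DNN, and DHSPG drives entire ZIGs to zero during the single training run; the sketch above only shows the final slicing step that such a zero group permits.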
Related papers
- Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition [24.71497121634708]
Varying-size models are often required to deploy ASR systems under different hardware and/or application constraints.
We present the dynamic encoder size approach, which jointly trains multiple performant models within one supernet from scratch.
arXiv Detail & Related papers (2024-07-10T08:35:21Z) - Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration [100.54419875604721]
All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation.
We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks.
Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment.
arXiv Detail & Related papers (2024-04-02T17:58:49Z) - Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language
Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - OTOv3: Automatic Architecture-Agnostic Neural Network Training and
Compression from Structured Pruning to Erasing Operators [57.145175475579315]
This topic spans various techniques, from structured pruning to neural architecture search, encompassing both pruning and erasing operators perspectives.
We introduce the third-generation Only-Train-Once (OTOv3), which first automatically trains and compresses a general DNN through pruning and erasing operations.
Our empirical results demonstrate the efficacy of OTOv3 across various benchmarks in structured pruning and neural architecture search.
arXiv Detail & Related papers (2023-12-15T00:22:55Z) - Stochastic Configuration Machines: FPGA Implementation [4.57421617811378]
configuration networks (SCNs) are a prime choice in industrial applications due to their merits and feasibility for data modelling.
This paper aims to implement SCM models on a field programmable gate array (FPGA) and introduce binary-coded inputs to improve learning performance.
arXiv Detail & Related papers (2023-10-30T02:04:20Z) - You Only Compress Once: Towards Effective and Elastic BERT Compression
via Exploit-Explore Stochastic Nature Gradient [88.58536093633167]
Existing model compression approaches require re-compression or fine-tuning across diverse constraints to accommodate various hardware deployments.
We propose a novel approach, YOCO-BERT, to achieve compress once and deploy everywhere.
Compared with state-of-the-art algorithms, YOCO-BERT provides more compact models while achieving a 2.1%-4.5% average accuracy improvement on the GLUE benchmark.
arXiv Detail & Related papers (2021-06-04T12:17:44Z) - NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural
Architecture Search [22.848528877480796]
We propose an efficient NAS algorithm for generating task-specific models that are competitive under multiple competing objectives.
It comprises two surrogates, one at the architecture level to improve sample efficiency and one at the weights level, through a supernet, to improve gradient descent training efficiency.
We demonstrate the effectiveness and versatility of the proposed method on six diverse non-standard datasets.
arXiv Detail & Related papers (2020-07-20T18:30:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.