OTOV2: Automatic, Generic, User-Friendly
- URL: http://arxiv.org/abs/2303.06862v2
- Date: Fri, 23 Jun 2023 05:41:26 GMT
- Title: OTOV2: Automatic, Generic, User-Friendly
- Authors: Tianyi Chen, Luming Liang, Tianyu Ding, Zhihui Zhu, Ilya Zharkov
- Abstract summary: We propose the second generation of Only-Train-Once (OTOv2), which first automatically trains and compresses a general DNN only once from scratch.
OTOv2 is automatic and pluggable into various deep learning applications, and requires minimal engineering effort from users.
Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and StackedUnets.
- Score: 39.828644638174225
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The existing model compression methods via structured pruning typically
require complicated multi-stage procedures. Each stage demands substantial
engineering effort and domain knowledge from end-users, which prevents wider
application of these methods to broader scenarios. We propose the second
generation of Only-Train-Once (OTOv2), which first automatically trains and
compresses a general DNN only once from scratch to produce a more compact model
with competitive performance without fine-tuning. OTOv2 is automatic,
pluggable into various deep learning applications, and requires minimal
engineering effort from users. Methodologically, OTOv2 introduces two major
improvements: (i) Autonomy: automatically exploits the dependency of general
DNNs, partitions the trainable variables into Zero-Invariant Groups (ZIGs), and
constructs the compressed model; and (ii) Dual Half-Space Projected Gradient
(DHSPG): a novel optimizer to more reliably solve structured-sparsity problems.
Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety
of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and
StackedUnets, the majority of which cannot be handled by other methods without
extensive handcrafting effort. On benchmark datasets including CIFAR10/100,
DIV2K, Fashion-MNIST, SVHN and ImageNet, OTOv2 performs competitively with or
better than the state of the art. The source code is available at
https://github.com/tianyic/only_train_once.
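The abstract names two mechanisms, Zero-Invariant Groups (ZIGs) and the DHSPG optimizer, but only at a high level. The snippet below is a minimal, hypothetical sketch, not the only_train_once API, of why ZIGs enable fine-tuning-free compression for a single Conv2d-BatchNorm2d block: each output channel's filter, bias, and batch-norm scale/shift form one group, and a group that is entirely zero produces a zero channel regardless of the input, so it can be sliced out of both the producing and the consuming layer without changing the network's output. The function names and the single-block, groups=1 setup are assumptions made for illustration.

```python
# Minimal, hypothetical sketch of the Zero-Invariant Group (ZIG) idea for a
# Conv2d -> BatchNorm2d block. This is NOT the OTOv2/only_train_once API;
# it only illustrates why an all-zero group can be removed without changing
# the network's output.
import torch
import torch.nn as nn

def zero_invariant_groups(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """One group per output channel: conv filter, conv bias, BN scale, BN shift."""
    groups = []
    for c in range(conv.out_channels):
        group = [conv.weight[c]]                  # filter producing channel c
        if conv.bias is not None:
            group.append(conv.bias[c:c + 1])
        group.append(bn.weight[c:c + 1])          # gamma
        group.append(bn.bias[c:c + 1])            # beta
        groups.append(group)
    return groups

@torch.no_grad()
def prune_zero_groups(conv: nn.Conv2d, bn: nn.BatchNorm2d, next_conv: nn.Conv2d):
    """Rebuild slimmer layers keeping only channels whose ZIG is non-zero (groups=1 assumed)."""
    keep = [c for c, g in enumerate(zero_invariant_groups(conv, bn))
            if any(p.abs().sum() > 0 for p in g)]
    idx = torch.tensor(keep, dtype=torch.long)

    # Producer: keep only the surviving filters.
    slim_conv = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                          conv.stride, conv.padding, bias=conv.bias is not None)
    slim_conv.weight.copy_(conv.weight[idx])
    if conv.bias is not None:
        slim_conv.bias.copy_(conv.bias[idx])

    slim_bn = nn.BatchNorm2d(len(keep))
    slim_bn.weight.copy_(bn.weight[idx])
    slim_bn.bias.copy_(bn.bias[idx])
    slim_bn.running_mean.copy_(bn.running_mean[idx])
    slim_bn.running_var.copy_(bn.running_var[idx])

    # Consumer: drop the input slices that read the erased channels.
    slim_next = nn.Conv2d(len(keep), next_conv.out_channels, next_conv.kernel_size,
                          next_conv.stride, next_conv.padding,
                          bias=next_conv.bias is not None)
    slim_next.weight.copy_(next_conv.weight[:, idx])
    if next_conv.bias is not None:
        slim_next.bias.copy_(next_conv.bias)
    return slim_conv, slim_bn, slim_next
```

In OTOv2 itself, the grouping is derived automatically from the dependency graph of a general DNN, and DHSPG drives entire ZIGs to zero during the single training run; the sketch above only shows the final slicing step that such a zero group permits.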
Related papers
- Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition [24.71497121634708]
Varying-size models are often required to deploy ASR systems under different hardware and/or application constraints.
We present the dynamic encoder size approach, which jointly trains multiple performant models within one supernet from scratch.
arXiv Detail & Related papers (2024-07-10T08:35:21Z) - Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration [100.54419875604721]
All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation.
We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks.
Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment.
arXiv Detail & Related papers (2024-04-02T17:58:49Z) - Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language
Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - OTOv3: Automatic Architecture-Agnostic Neural Network Training and
Compression from Structured Pruning to Erasing Operators [57.145175475579315]
This topic spans various techniques, from structured pruning to neural architecture search, encompassing both pruning and erasing operators perspectives.
We introduce the third-generation Only-Train-Once (OTOv3), which first automatically trains and compresses a general DNN through pruning and erasing operations.
Our empirical results demonstrate the efficacy of OTOv3 across various benchmarks in structured pruning and neural architecture search.
arXiv Detail & Related papers (2023-12-15T00:22:55Z) - Stochastic Configuration Machines: FPGA Implementation [4.57421617811378]
configuration networks (SCNs) are a prime choice in industrial applications due to their merits and feasibility for data modelling.
This paper aims to implement SCM models on a field programmable gate array (FPGA) and introduce binary-coded inputs to improve learning performance.
arXiv Detail & Related papers (2023-10-30T02:04:20Z) - You Only Compress Once: Towards Effective and Elastic BERT Compression
via Exploit-Explore Stochastic Nature Gradient [88.58536093633167]
Existing model compression approaches require re-compression or fine-tuning across diverse constraints to accommodate various hardware deployments.
We propose a novel approach, YOCO-BERT, to achieve compress once and deploy everywhere.
Compared with state-of-the-art algorithms, YOCO-BERT provides more compact models while achieving a 2.1%-4.5% average accuracy improvement on the GLUE benchmark.
arXiv Detail & Related papers (2021-06-04T12:17:44Z) - NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural
Architecture Search [22.848528877480796]
We propose an efficient NAS algorithm for generating task-specific models that are competitive under multiple competing objectives.
It comprises two surrogates, one at the architecture level to improve sample efficiency and one at the weights level, through a supernet, to improve gradient descent training efficiency.
We demonstrate the effectiveness and versatility of the proposed method on six diverse non-standard datasets.
arXiv Detail & Related papers (2020-07-20T18:30:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.