Transfer-Once-For-All: AI Model Optimization for Edge
- URL: http://arxiv.org/abs/2303.15485v2
- Date: Sun, 2 Jul 2023 17:21:51 GMT
- Title: Transfer-Once-For-All: AI Model Optimization for Edge
- Authors: Achintya Kundu (IBM Research), Laura Wynter (IBM Research), Rhui Dih
Lee (IBM Research), Luis Angel Bathen (IBM Research)
- Abstract summary: We propose Transfer-Once-For-All (TOFA) for supernet-style training on small data sets with constant computational training cost.
To overcome the challenges arising from small data, TOFA utilizes a unified semi-supervised training loss to simultaneously train all subnets within the supernet.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Weight-sharing neural architecture search aims to optimize a configurable
neural network model (supernet) for a variety of deployment scenarios across
many devices with different resource constraints. Existing approaches use
evolutionary search to extract models of different sizes from a supernet
trained on a very large data set, and then fine-tune the extracted models on
the typically small, real-world data set of interest. The computational cost of
training thus grows linearly with the number of different model deployment
scenarios. Hence, we propose Transfer-Once-For-All (TOFA) for supernet-style
training on small data sets with constant computational training cost over any
number of edge deployment scenarios. Given a task, TOFA obtains custom neural
networks, both the topology and the weights, optimized for any number of edge
deployment scenarios. To overcome the challenges arising from small data, TOFA
utilizes a unified semi-supervised training loss to simultaneously train all
subnets within the supernet, coupled with on-the-fly architecture selection at
deployment time.
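To make the training recipe concrete, below is a minimal, hypothetical sketch of weight-sharing supernet training with a unified semi-supervised objective: a labeled cross-entropy term for the full supernet and a randomly sampled subnet, plus an unlabeled consistency term that distills the largest subnet into the sampled one. The elastic-width MLP, the candidate widths, and the exact loss form are illustrative assumptions, not TOFA's actual architecture or loss.

```python
# Hypothetical sketch of weight-sharing supernet training with a unified
# semi-supervised loss; the model, widths, and loss terms are illustrative,
# not TOFA's actual design.
import random
import torch
import torch.nn.functional as F

class ElasticMLP(torch.nn.Module):
    """Tiny supernet: the hidden width can be sliced at run time (weight sharing)."""
    def __init__(self, in_dim=32, max_hidden=64, n_classes=10):
        super().__init__()
        self.fc1 = torch.nn.Linear(in_dim, max_hidden)
        self.fc2 = torch.nn.Linear(max_hidden, n_classes)

    def forward(self, x, width):
        # Keep only the first `width` hidden units -> a smaller subnet
        h = F.relu(F.linear(x, self.fc1.weight[:width], self.fc1.bias[:width]))
        return F.linear(h, self.fc2.weight[:, :width], self.fc2.bias)

widths = [16, 32, 48, 64]                       # candidate subnet widths
net = ElasticMLP()
opt = torch.optim.SGD(net.parameters(), lr=0.1)

for step in range(100):
    x_l, y_l = torch.randn(8, 32), torch.randint(0, 10, (8,))  # toy labeled batch
    x_u = torch.randn(8, 32)                                    # toy unlabeled batch

    with torch.no_grad():                        # largest subnet acts as the teacher
        teacher = F.softmax(net(x_u, widths[-1]), dim=-1)

    w = random.choice(widths[:-1])               # sample a smaller subnet this step
    loss = F.cross_entropy(net(x_l, widths[-1]), y_l)           # supervised, full supernet
    loss = loss + F.cross_entropy(net(x_l, w), y_l)             # supervised, sampled subnet
    loss = loss + F.kl_div(F.log_softmax(net(x_u, w), dim=-1),  # unlabeled consistency:
                           teacher, reduction="batchmean")      # subnet mimics the teacher

    opt.zero_grad()
    loss.backward()
    opt.step()
```

On-the-fly architecture selection at deployment time could then amount to picking, among the trained widths, the largest subnet that fits the target device's resource budget, without any further training.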
Related papers
- Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition [24.71497121634708]
Varying-size models are often required to deploy ASR systems under different hardware and/or application constraints.
We present the dynamic encoder size approach, which jointly trains multiple performant models within one supernet from scratch.
arXiv Detail & Related papers (2024-07-10T08:35:21Z)
Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z)
SortedNet: A Scalable and Generalized Framework for Training Modular Deep Neural Networks [30.069353400127046]
We propose SortedNet to harness the inherent modularity of deep neural networks (DNNs).
SortedNet enables the training of sub-models simultaneously along with the training of the main model.
It is able to train 160 sub-models at once, achieving at least 96% of the original model's performance.
arXiv Detail & Related papers (2023-09-01T05:12:25Z)
Training Deep Surrogate Models with Large Scale Online Learning [48.7576911714538]
Deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs.
Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training.
This paper proposes an open-source online training framework for deep surrogate models.
arXiv Detail & Related papers (2023-06-28T12:02:27Z)
OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching efficient architectures for devices with different resource constraints.
We aim to go one step further in the search for efficiency by explicitly conceiving the search stage as a multi-objective optimization problem (a minimal selection sketch follows this list).
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
Supernet Training for Federated Image Classification under System Heterogeneity [15.2292571922932]
In this work, we propose a novel framework to consider both scenarios, namely Federation of Supernet Training (FedSup).
It is inspired by how averaging parameters in the model aggregation stage of Federated Learning (FL) is similar to weight-sharing in supernet training.
Under our framework, we present an efficient algorithm (E-FedSup) that sends sub-models to clients in the broadcast stage to reduce communication costs and training overhead.
arXiv Detail & Related papers (2022-06-03T02:21:01Z)
ADDS: Adaptive Differentiable Sampling for Robust Multi-Party Learning [24.288233074516455]
We propose a novel adaptive differentiable sampling framework (ADDS) for robust and communication-efficient multi-party learning.
The proposed framework significantly reduces local computation and communication costs while speeding up the central model convergence.
arXiv Detail & Related papers (2021-10-29T03:35:15Z)
Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
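As a companion to the multi-objective view in the OFA$^2$ entry above, and to the on-the-fly selection step in the TOFA abstract, here is a small, self-contained sketch that keeps only Pareto-optimal (accuracy, latency) subnet candidates and then picks one under a device latency budget. The candidate names, accuracies, and latencies are made-up placeholders, not measurements from any of the papers.

```python
# Hypothetical multi-objective subnet selection: Pareto-filter candidates by
# (accuracy, latency), then choose the best one under a latency budget.
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (subnet id, accuracy, latency in ms)

def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep a candidate unless another is at least as accurate and strictly faster
    (or strictly more accurate and at least as fast)."""
    front = []
    for name, acc, lat in cands:
        dominated = any((a >= acc and l < lat) or (a > acc and l <= lat)
                        for n, a, l in cands if n != name)
        if not dominated:
            front.append((name, acc, lat))
    return front

def pick_for_budget(front: List[Candidate], budget_ms: float) -> Candidate:
    """On-the-fly selection: the most accurate Pareto point within the budget,
    falling back to the fastest candidate if nothing fits."""
    feasible = [c for c in front if c[2] <= budget_ms]
    return max(feasible, key=lambda c: c[1]) if feasible else min(front, key=lambda c: c[2])

# Placeholder candidates; ids, accuracies, and latencies are invented for illustration.
candidates = [("w16", 0.71, 3.0), ("w32", 0.78, 5.5), ("w48", 0.80, 9.0), ("w64", 0.81, 14.0)]
print(pick_for_budget(pareto_front(candidates), budget_ms=10.0))  # picks "w48" (0.80 acc, 9.0 ms)
```

Each new deployment scenario only changes the budget passed to the selection step, which is what keeps the per-scenario cost constant once the supernet is trained.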
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.