Deep Model Reassembly
- URL: http://arxiv.org/abs/2210.17409v2
- Date: Wed, 2 Nov 2022 16:16:28 GMT
- Title: Deep Model Reassembly
- Authors: Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang
- Abstract summary: We explore a novel knowledge-transfer task, termed Deep Model Reassembly (DeRy).
The goal of DeRy is to first dissect each model into distinctive building blocks, and then selectively reassemble the derived blocks to produce customized networks.
We demonstrate that on ImageNet, the best reassembled model achieves 78.6% top-1 accuracy without fine-tuning.
- Score: 60.6531819328247
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we explore a novel knowledge-transfer task, termed Deep
Model Reassembly (DeRy), for general-purpose model reuse. Given a collection of
heterogeneous models pre-trained from distinct sources and with diverse
architectures, the goal of DeRy, as its name implies, is to first dissect each
model into distinctive building blocks, and then selectively reassemble the
derived blocks to produce customized networks under both hardware-resource
and performance constraints. The ambitious nature of DeRy inevitably poses
significant challenges, first and foremost the feasibility of its solution. We
show that, through the dedicated paradigm proposed in this paper, DeRy can be
made not only possible but also practically efficient. Specifically, we
partition all pre-trained networks jointly via a cover set optimization and
derive a number of equivalence sets, within each of which the network blocks
are treated as functionally equivalent and hence interchangeable. The
equivalence sets learned in this way, in turn, enable picking and assembling
blocks to customize networks subject to certain constraints; this is achieved
by solving an integer program backed by a training-free proxy that estimates
task performance. The reassembled models yield gratifying performance while
satisfying the user-specified constraints. We demonstrate that on ImageNet,
the best reassembled model achieves 78.6% top-1 accuracy without fine-tuning,
which can be further elevated to 83.2% with end-to-end training. Our code is
available at https://github.com/Adamdad/DeRy
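To make the pipeline concrete, below is a minimal, self-contained sketch of the "pick and assemble" step described in the abstract. Everything in it is illustrative: the equivalence sets are toy hand-built lists rather than the output of the paper's cover set optimization, exhaustive enumeration stands in for the integer program, and the feature-diversity score is a stand-in for the paper's actual training-free proxy (see the official repository for the real implementation).

```python
# Illustrative sketch only: toy equivalence sets, exhaustive search in place
# of the paper's integer program, and a toy proxy in place of theirs.
import itertools

import torch
import torch.nn as nn

# Hypothetical equivalence sets: one list of interchangeable blocks per depth
# position, imagined as the output of partitioning several pre-trained models.
equivalence_sets = [
    [nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()),
     nn.Sequential(nn.Conv2d(3, 64, 5, padding=2), nn.ReLU())],
    [nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU()),
     nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.GELU())],
]

def num_params(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters())

def training_free_proxy(model: nn.Module, x: torch.Tensor) -> float:
    # Toy zero-cost score: log-det of the output feature Gram matrix, a crude
    # measure of feature diversity (NOT the proxy used in the paper).
    with torch.no_grad():
        feats = model(x).flatten(1)
    gram = feats @ feats.t() + 1e-4 * torch.eye(feats.shape[0])
    return torch.logdet(gram).item()

def reassemble(budget_params: int, x: torch.Tensor) -> nn.Module:
    # Try one block per position; keep the best-scoring model under budget.
    best_score, best_model = float("-inf"), None
    for blocks in itertools.product(*equivalence_sets):
        candidate = nn.Sequential(*blocks)
        if num_params(candidate) > budget_params:
            continue
        score = training_free_proxy(candidate, x)
        if score > best_score:
            best_score, best_model = score, candidate
    return best_model

model = reassemble(budget_params=1_000_000, x=torch.randn(8, 3, 32, 32))
```

With large model zoos the combination space explodes, which is why the paper resorts to an integer program rather than the brute-force loop above.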
Related papers
- Any Image Restoration with Efficient Automatic Degradation Adaptation [132.81912195537433]
We propose a unified approach that achieves joint embedding by leveraging the inherent similarities across various degradations, enabling efficient and comprehensive restoration.
Our network sets new SOTA records while reducing model complexity by approximately 82% in trainable parameters and 85% in FLOPs.
arXiv Detail & Related papers (2024-07-18T10:26:53Z)
- Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration [100.54419875604721]
All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation.
We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks.
Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment.
arXiv Detail & Related papers (2024-04-02T17:58:49Z)
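The switchable bulky/lightweight behavior described for DyNet can be pictured with weight sharing across a variable number of block reuses. A minimal sketch under that assumption; DyNet's actual weight-sharing and encoder-decoder design may differ:

```python
# Hypothetical depth-switchable block in the spirit of DyNet's bulky vs.
# lightweight variants (the paper's actual scheme may differ).
import torch
import torch.nn as nn

class SwitchableEncoder(nn.Module):
    def __init__(self, channels: int = 64, max_reuse: int = 4):
        super().__init__()
        # One shared residual block, reused a variable number of times.
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.max_reuse = max_reuse

    def forward(self, x: torch.Tensor, reuse: int) -> torch.Tensor:
        # reuse=1 -> lightweight variant; reuse=max_reuse -> bulky variant.
        for _ in range(min(reuse, self.max_reuse)):
            x = x + self.block(x)
        return x

enc = SwitchableEncoder()
x = torch.randn(1, 64, 32, 32)
light, heavy = enc(x, reuse=1), enc(x, reuse=4)
```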
- Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach [20.86345962679122]
Estimating the transferability of publicly available pretrained models to a target task has become an important problem in transfer learning.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
arXiv Detail & Related papers (2023-09-05T17:57:31Z)
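Submodularity matters here because it admits a simple greedy selection with a (1 - 1/e) approximation guarantee for monotone objectives. A sketch of that generic greedy step, with a toy coverage-style score standing in for OSBORN's optimal-transport-based metric:

```python
# Generic greedy maximization of a submodular set function under a
# cardinality constraint; the placeholder score is NOT the OSBORN metric.
from typing import Callable, List, Set

def greedy_select(candidates: List[str],
                  score: Callable[[Set[str]], float],
                  k: int) -> Set[str]:
    """Pick k models, each time adding the one with the largest marginal gain."""
    selected: Set[str] = set()
    for _ in range(k):
        gains = {m: score(selected | {m}) - score(selected)
                 for m in candidates if m not in selected}
        if not gains:
            break
        selected.add(max(gains, key=gains.get))
    return selected

# Toy usage: a coverage function (which is submodular) over made-up models.
models = ["resnet50", "vit_b16", "swin_t", "convnext_t"]
covers = {"resnet50": {1, 2}, "vit_b16": {2, 3, 4},
          "swin_t": {4, 5}, "convnext_t": {1, 5}}
ensemble = greedy_select(
    models, lambda s: len(set().union(*[covers[m] for m in s])), k=2)
```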
- SortedNet: A Scalable and Generalized Framework for Training Modular Deep Neural Networks [30.069353400127046]
We propose SortedNet to harness the inherent modularity of deep neural networks (DNNs).
SortedNet enables the training of sub-models simultaneously along with the training of the main model.
It is able to train 160 sub-models at once, achieving at least 96% of the original model's performance.
arXiv Detail & Related papers (2023-09-01T05:12:25Z)
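Training many nested sub-models at once can be sketched by sampling one shared-weight sub-model per optimization step. The architecture and sampling scheme below are illustrative assumptions, not SortedNet's exact procedure:

```python
# Illustrative nested sub-model training: depth-d sub-models share the first
# d layers of the full model, and one depth is sampled per step.
import random

import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(32, 32) for _ in range(8)])
head = nn.Linear(32, 10)
opt = torch.optim.SGD([*layers.parameters(), *head.parameters()], lr=0.1)

def forward_depth(x: torch.Tensor, depth: int) -> torch.Tensor:
    for layer in layers[:depth]:
        x = torch.relu(layer(x))
    return head(x)

for step in range(100):
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
    depth = random.randint(1, len(layers))   # sample one sub-model per step
    loss = nn.functional.cross_entropy(forward_depth(x, depth), y)
    opt.zero_grad(); loss.backward(); opt.step()
```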
- Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints [28.983470365172057]
NN-Factory is a one-for-all framework to generate customized lightweight models for diverse edge scenarios.
The main components of NN-Factory include a modular supernet with pretrained modules that can be conditionally activated to accomplish different tasks.
NN-Factory is able to generate high-quality task- and resource-specific models within a few seconds, faster than conventional model customization approaches by orders of magnitude.
arXiv Detail & Related papers (2023-08-29T03:28:14Z)
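Conditional activation of pretrained modules in a supernet can be pictured as applying a binary mask over a shared module list; all names and the masking scheme below are invented for illustration and do not reflect NN-Factory's internals:

```python
# Hypothetical modular supernet: a task/resource-specific model is the
# supernet with a subset of modules switched on.
import torch
import torch.nn as nn

class ModularSupernet(nn.Module):
    def __init__(self, width: int = 64, n_modules: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(width, width), nn.ReLU())
             for _ in range(n_modules)])
        self.head = nn.Linear(width, 10)

    def forward(self, x: torch.Tensor, mask: list) -> torch.Tensor:
        # Skipped modules cost no compute, yielding lightweight variants.
        for block, on in zip(self.blocks, mask):
            if on:
                x = x + block(x)
        return self.head(x)

net = ModularSupernet()
light = net(torch.randn(2, 64), mask=[True, False, False, True])
```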
- Chain-of-Skills: A Configurable Model for Open-domain Question Answering [79.8644260578301]
The retrieval model is an indispensable component for real-world knowledge-intensive tasks.
Recent work focuses on customized methods, limiting the model transferability and scalability.
We propose a modular retriever where individual modules correspond to key skills that can be reused across datasets.
arXiv Detail & Related papers (2023-05-04T20:19:39Z)
- Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing [110.4684789199555]
We introduce scenario-based semantic parsing: a variant of the original task which first requires disambiguating an utterance's "scenario".
This formulation enables us to isolate coarse-grained and fine-grained aspects of the task, each of which we solve with off-the-shelf neural modules.
Our model is modular, differentiable, interpretable, and allows us to garner extra supervision from scenarios.
arXiv Detail & Related papers (2022-02-02T08:00:21Z)
- Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach to learning from massive datasets is to operate on small subsets of the data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z)
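For reference, the cardinality-constrained bilevel problem described in that summary can be written as follows (notation ours):

```latex
% Bilevel coreset selection: choose at most k points S from the dataset D
% whose trained parameters perform well on all of D.
\[
\begin{aligned}
  \min_{S \subseteq D,\ |S| \le k} \quad & \mathcal{L}\bigl(\theta^{*}(S),\, D\bigr) \\
  \text{subject to} \quad & \theta^{*}(S) \in \operatorname*{arg\,min}_{\theta} \ \mathcal{L}(\theta,\, S)
\end{aligned}
\]
```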