Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight
- URL: http://arxiv.org/abs/2105.08994v1
- Date: Wed, 19 May 2021 08:58:04 GMT
- Title: Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight
- Authors: Ming Sun, Haoxuan Dou, Junjie Yan
- Abstract summary: Recent works in neural architecture search (NAS) can aid transfer learning by establishing a sufficient network search space.
We propose a novel framework consisting of two modules: the neural architecture search module for architecture transfer and the neural weight search module for weight transfer.
These two modules conduct search on the target task based on a reduced super-network, so we only need to train once on the source task.
- Score: 66.8543732597723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning can boost the performance on the target task by leveraging the knowledge of the source domain. Recent works in neural architecture search (NAS), especially one-shot NAS, can aid transfer learning by establishing a sufficient network search space. However, existing NAS methods tend to approximate huge search spaces by explicitly building giant super-networks with multiple sub-paths, and discard super-network weights after a child structure is found. Both characteristics of existing approaches cause repetitive network training on source tasks in transfer learning. To remedy the above issues, we reduce the super-network size by randomly dropping connections between network blocks while embedding a larger search space. Moreover, we reuse super-network weights to avoid redundant training by proposing a novel framework consisting of two modules: the neural architecture search module for architecture transfer and the neural weight search module for weight transfer. These two modules conduct search on the target task based on a reduced super-network, so we only need to train once on the source task. We evaluate our framework on both MS-COCO and CUB-200 for the object detection and fine-grained image classification tasks, respectively, and show promising improvements with only O(CN) super-network complexity.
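The abstract describes the method only at a high level. As a rough illustration, the sketch below (hypothetical names throughout, not the authors' code) builds a single-path super-network that stores C candidate operations for each of N blocks, so weight storage is O(CN) rather than one copy per path, and randomly drops inter-block connections while training on the source task; the drop mechanism is one reading of "randomly dropping connections between network blocks".

```python
import random
import torch
import torch.nn as nn

def candidate_ops(ch):
    # C = 3 candidate operations per block.
    return nn.ModuleList([
        nn.Conv2d(ch, ch, 3, padding=1),   # candidate 0: 3x3 conv
        nn.Conv2d(ch, ch, 5, padding=2),   # candidate 1: 5x5 conv
        nn.Identity(),                     # candidate 2: skip
    ])

class ReducedSuperNet(nn.Module):
    """N blocks x C candidates: O(C*N) stored weights, not O(C^N) paths."""

    def __init__(self, num_blocks=4, channels=16, drop_prob=0.5):
        super().__init__()
        self.blocks = nn.ModuleList(candidate_ops(channels) for _ in range(num_blocks))
        self.drop_prob = drop_prob

    def forward(self, x, arch=None):
        # arch: one candidate index per block; sample uniformly if absent.
        if arch is None:
            arch = [random.randrange(len(ops)) for ops in self.blocks]
        for ops, idx in zip(self.blocks, arch):
            y = ops[idx](x)
            # Randomly drop the residual connection between blocks during
            # source-task training (an assumed interpretation).
            if self.training and random.random() < self.drop_prob:
                x = y
            else:
                x = x + y
        return x

net = ReducedSuperNet()
out = net(torch.randn(2, 16, 8, 8))   # one randomly sampled sub-network
```

In a one-shot regime, one would sample a fresh candidate index per block for each source-task batch; the trained candidate weights can then be reused when the architecture and weight search modules run on the target task, which is the reuse the abstract refers to.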
Related papers
- Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost [73.28626942658022]
We aim at exploiting additional auxiliary labels from an independent (auxiliary) task to boost the primary task performance.
Our method is architecture-based with a flexible asymmetric structure for the primary and auxiliary tasks.
Experiments with six tasks on the NYU v2, CityScapes, and Taskonomy datasets using VGG, ResNet, and ViT backbones demonstrate promising performance.
arXiv Detail & Related papers (2024-05-09T11:50:19Z)
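As a toy illustration of the Aux-NAS entry above, assuming a shared trunk and a made-up auxiliary loss weight of 0.4 (the paper's actual asymmetric architecture is not reproduced here): the auxiliary head only participates in training and is discarded at inference, so the extra inference cost is negligible.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # shared trunk
primary_head = nn.Linear(64, 10)   # primary task head
aux_head = nn.Linear(64, 5)        # auxiliary task head, training-time only

params = (list(backbone.parameters()) + list(primary_head.parameters())
          + list(aux_head.parameters()))
opt = torch.optim.SGD(params, lr=0.1)

x = torch.randn(8, 32)
y_primary = torch.randint(0, 10, (8,))
y_aux = torch.randint(0, 5, (8,))

feats = backbone(x)
loss = (F.cross_entropy(primary_head(feats), y_primary)
        + 0.4 * F.cross_entropy(aux_head(feats), y_aux))  # assumed aux weight
opt.zero_grad()
loss.backward()
opt.step()

# Inference: only the primary path runs; aux_head is simply dropped.
with torch.no_grad():
    pred = primary_head(backbone(x)).argmax(dim=1)
```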
- OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching efficient architectures for devices with different resource constraints.
We take one step further in the search for efficiency by explicitly framing the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
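A minimal sketch of the multi-objective view in the OFA$^2$ entry above: instead of returning one best sub-network, keep the whole Pareto front over (error, cost). The candidate tuples below are made up for illustration.

```python
def pareto_front(candidates):
    """candidates: list of (arch, error, cost); return the non-dominated set."""
    front = []
    for a in candidates:
        dominated = any(
            b[1] <= a[1] and b[2] <= a[2] and (b[1] < a[1] or b[2] < a[2])
            for b in candidates)
        if not dominated:
            front.append(a)
    return front

# (name, validation error, cost in MFLOPs), all values invented
subnets = [("A", 0.24, 150), ("B", 0.22, 300), ("C", 0.25, 140), ("D", 0.22, 500)]
print(pareto_front(subnets))  # D is dropped: B has the same error at lower cost
```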
- Evolutionary Neural Cascade Search across Supernetworks [68.8204255655161]
We introduce ENCAS, Evolutionary Neural Cascade Search.
ENCAS can be used to search over multiple pretrained supernetworks.
We test ENCAS on common computer vision benchmarks.
arXiv Detail & Related papers (2022-03-08T11:06:01Z)
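The ENCAS entry above searches over cascades assembled from several pretrained supernetworks. The sketch below shows only cascade inference with a fixed confidence threshold; the stand-in models, the threshold, and all names are assumptions (ENCAS itself evolves the model choices and thresholds).

```python
import numpy as np

def cascade_predict(x, small, large, threshold=0.9):
    """Two-stage cascade: the cheap model answers when confident,
    otherwise the input is deferred to the stronger model."""
    probs = small(x)                      # cheap model's class probabilities
    if probs.max() >= threshold:          # confident enough: early exit
        return int(probs.argmax())
    return int(large(x).argmax())         # fall back to the expensive model

# Stand-in "models" that return fixed probability vectors.
small = lambda x: np.array([0.95, 0.05]) if x > 0 else np.array([0.6, 0.4])
large = lambda x: np.array([0.3, 0.7])

print(cascade_predict(1.0, small, large))   # 0: early exit at the small model
print(cascade_predict(-1.0, small, large))  # 1: deferred to the large model
```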
- A Hardware-Aware System for Accelerating Deep Neural Network Optimization [7.189421078452572]
We propose a comprehensive system that automatically and efficiently finds sub-networks from a pre-trained super-network.
By combining novel search tactics and algorithms with intelligent use of predictors, we significantly decrease the time needed to find optimal sub-networks.
arXiv Detail & Related papers (2022-02-25T20:07:29Z)
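A hypothetical sketch in the spirit of the entry above: rank randomly sampled sub-networks with a cheap accuracy predictor plus a per-operation latency table, rather than fully evaluating each candidate. The predictor, cost table, and latency budget are all invented.

```python
import random

LATENCY_MS = {"3x3": 1.0, "5x5": 2.2, "skip": 0.1}   # invented per-op costs

def predict_accuracy(arch):
    # Stand-in for a learned accuracy predictor over architecture encodings.
    return 0.7 + 0.02 * arch.count("5x5") - 0.03 * arch.count("skip")

def latency(arch):
    return sum(LATENCY_MS[op] for op in arch)

ops = list(LATENCY_MS)
pool = [[random.choice(ops) for _ in range(6)] for _ in range(200)]
feasible = [a for a in pool if latency(a) <= 8.0]      # hardware budget
best = max(feasible, key=predict_accuracy)
print(best, round(predict_accuracy(best), 3), latency(best))
```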
- Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS).
NAS has been explosively studied to automate the discovery of top-performing neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS by disentangling the "TEG" (Trainability, Expressivity, Generalization) characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z)
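As a hedged illustration of training-free scoring (not the paper's exact TEG metrics): score an untrained network by how distinct its ReLU activation patterns are across a mini-batch, a common expressivity-style proxy that needs no training.

```python
import torch
import torch.nn as nn

def activation_pattern_score(model, x):
    """Mean pairwise Hamming distance between the binary ReLU activation
    codes of a mini-batch; larger means inputs land in more distinct
    linear regions of the untrained network."""
    acts = []
    hooks = [m.register_forward_hook(lambda _, __, out: acts.append(out > 0))
             for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    codes = torch.cat([a.flatten(1).float() for a in acts], dim=1)
    return torch.cdist(codes, codes, p=1).mean().item()

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
print(activation_pattern_score(net, torch.randn(8, 16)))
```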
- FNA++: Fast Network Adaptation via Parameter Remapping and Architecture Search [35.61441231491448]
We propose a Fast Network Adaptation (FNA++) method, which can adapt both the architecture and parameters of a seed network.
In our experiments, we apply FNA++ on MobileNetV2 to obtain new networks for semantic segmentation, object detection, and human pose estimation.
The total computation cost of FNA++ is significantly less than SOTA segmentation and detection NAS approaches.
arXiv Detail & Related papers (2020-06-21T10:03:34Z)
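A simplified, assumed sketch of the parameter-remapping idea behind FNA++, shown on layer width only: a resized layer is initialised by slicing (when shrinking) or tiling (when expanding) the seed layer's weights, so the adapted network starts from pretrained values instead of random ones.

```python
import torch
import torch.nn as nn

def remap_linear(seed: nn.Linear, new_out: int, new_in: int) -> nn.Linear:
    """Initialise a resized linear layer from a pretrained seed layer."""
    new = nn.Linear(new_in, new_out)
    with torch.no_grad():
        rows = torch.arange(new_out) % seed.weight.shape[0]  # slice or tile rows
        cols = torch.arange(new_in) % seed.weight.shape[1]   # slice or tile cols
        new.weight.copy_(seed.weight[rows][:, cols])
        new.bias.copy_(seed.bias[rows])
    return new

seed = nn.Linear(64, 128)                        # pretrained seed layer
narrow = remap_linear(seed, new_out=96, new_in=48)
print(narrow.weight.shape)                       # torch.Size([96, 48])
```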
- Adjoined Networks: A Training Paradigm with Applications to Network Compression [3.995047443480282]
We introduce Adjoined Networks, or AN, a learning paradigm that trains both the original base network and the smaller compressed network together.
Using ResNet-50 as the base network, AN achieves 71.8% top-1 accuracy with only 1.8M parameters and 1.6 GFLOPs on the ImageNet dataset.
We propose Differentiable Adjoined Networks (DAN), a training paradigm that augments AN by using neural architecture search to jointly learn both the width and the weights for each layer of the smaller network.
arXiv Detail & Related papers (2020-06-10T02:48:16Z)
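A toy sketch of training the base and compressed networks together, as in the Adjoined Networks entry above. Note the coupling here is a distillation-style KL term standing in for the paper's actual mechanism; Adjoined Networks shares parameters between the two networks, which this sketch does not reproduce.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

base = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
small = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
opt = torch.optim.SGD(list(base.parameters()) + list(small.parameters()), lr=0.1)

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
zb, zs = base(x), small(x)
loss = (F.cross_entropy(zb, y)                  # supervised loss, base
        + F.cross_entropy(zs, y)                # supervised loss, small
        + F.kl_div(F.log_softmax(zs, dim=1),    # small mimics the base
                   F.softmax(zb.detach(), dim=1), reduction="batchmean"))
opt.zero_grad()
loss.backward()
opt.step()
```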
- From Federated to Fog Learning: Distributed Machine Learning over Heterogeneous Wireless Networks [71.23327876898816]
Federated learning has emerged as a technique for training ML models at the network edge by leveraging processing capabilities across the nodes that collect the data.
We advocate a new learning paradigm called fog learning which will intelligently distribute ML model training across the continuum of nodes from edge devices to cloud servers.
arXiv Detail & Related papers (2020-06-07T05:11:18Z)
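For the entry above, a minimal federated-averaging round (assuming synchronous clients and equal weighting; all names invented). Fog learning generalises this pattern from a flat server-client topology to a hierarchy spanning edge devices, intermediate fog nodes, and cloud servers.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

global_model = nn.Linear(10, 2)

def local_update(model, data, target, steps=5, lr=0.1):
    """One client's local training pass, starting from the global weights."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(data), target).backward()
        opt.step()
    return model.state_dict()

clients = [(torch.randn(16, 10), torch.randint(0, 2, (16,))) for _ in range(4)]
states = [local_update(global_model, x, y) for x, y in clients]

# Server: average the client weights parameter by parameter.
avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
global_model.load_state_dict(avg)
```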
- Fast Neural Network Adaptation via Parameter Remapping and Architecture Search [35.61441231491448]
Deep neural networks achieve remarkable performance in many computer vision tasks.
Most state-of-the-art (SOTA) semantic segmentation and object detection approaches reuse neural network architectures designed for image classification as the backbone.
One major challenge though, is that ImageNet pre-training of the search space representation incurs huge computational cost.
In this paper, we propose a Fast Neural Network Adaptation (FNA) method, which can adapt both the architecture and parameters of a seed network.
arXiv Detail & Related papers (2020-01-08T13:45:15Z)