Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight
- URL: http://arxiv.org/abs/2105.08994v1
- Date: Wed, 19 May 2021 08:58:04 GMT
- Title: Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight
- Authors: Ming Sun, Haoxuan Dou, Junjie Yan
- Abstract summary: Recent works in neural architecture search (NAS) can aid transfer learning by establishing a sufficient network search space.
We propose a novel framework consisting of two modules: the neural architecture search module for architecture transfer and the neural weight search module for weight transfer.
These two modules conduct search on the target task based on a reduced super-network, so we only need to train once on the source task.
- Score: 66.8543732597723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning can boost the performance on the target task by leveraging the knowledge of the source domain. Recent works in neural architecture search (NAS), especially one-shot NAS, can aid transfer learning by establishing a sufficient network search space. However, existing NAS methods tend to approximate huge search spaces by explicitly building giant super-networks with multiple sub-paths, and discard super-network weights after a child structure is found. Both characteristics of existing approaches cause repetitive network training on source tasks in transfer learning. To remedy the above issues, we reduce the super-network size by randomly dropping connections between network blocks while embedding a larger search space. Moreover, we reuse super-network weights to avoid redundant training by proposing a novel framework consisting of two modules: the neural architecture search module for architecture transfer and the neural weight search module for weight transfer. These two modules conduct search on the target task based on a reduced super-network, so we only need to train once on the source task. We evaluate our framework on both MS-COCO and CUB-200 for the object detection and fine-grained image classification tasks, respectively, and show promising improvements with only O(CN) super-network complexity.
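The abstract describes the method only at a high level. As a rough illustration, the sketch below (hypothetical names throughout, not the authors' code) builds a single-path super-network that stores C candidate operations for each of N blocks, so weight storage is O(CN) rather than one copy per path, and randomly drops inter-block connections while training on the source task; the drop mechanism is one reading of "randomly dropping connections between network blocks".

```python
import random
import torch
import torch.nn as nn

def candidate_ops(ch):
    # C = 3 candidate operations per block.
    return nn.ModuleList([
        nn.Conv2d(ch, ch, 3, padding=1),   # candidate 0: 3x3 conv
        nn.Conv2d(ch, ch, 5, padding=2),   # candidate 1: 5x5 conv
        nn.Identity(),                     # candidate 2: skip
    ])

class ReducedSuperNet(nn.Module):
    """N blocks x C candidates: O(C*N) stored weights, not O(C^N) paths."""

    def __init__(self, num_blocks=4, channels=16, drop_prob=0.5):
        super().__init__()
        self.blocks = nn.ModuleList(candidate_ops(channels) for _ in range(num_blocks))
        self.drop_prob = drop_prob

    def forward(self, x, arch=None):
        # arch: one candidate index per block; sample uniformly if absent.
        if arch is None:
            arch = [random.randrange(len(ops)) for ops in self.blocks]
        for ops, idx in zip(self.blocks, arch):
            y = ops[idx](x)
            # Randomly drop the residual connection between blocks during
            # source-task training (an assumed interpretation).
            if self.training and random.random() < self.drop_prob:
                x = y
            else:
                x = x + y
        return x

net = ReducedSuperNet()
out = net(torch.randn(2, 16, 8, 8))   # one randomly sampled sub-network
```

In a one-shot regime, one would sample a fresh candidate index per block for each source-task batch; the trained candidate weights can then be reused when the architecture and weight search modules run on the target task, which is the reuse the abstract refers to.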
Related papers
- Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost [73.28626942658022]
We aim at exploiting additional auxiliary labels from an independent (auxiliary) task to boost the primary task performance.
Our method is architecture-based with a flexible asymmetric structure for the primary and auxiliary tasks.
Experiments with six tasks on the NYU v2, CityScapes, and Taskonomy datasets using VGG, ResNet, and ViT backbones demonstrate promising performance.
arXiv Detail & Related papers (2024-05-09T11:50:19Z)
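As a toy illustration of the Aux-NAS entry above, assuming a shared trunk and a made-up auxiliary loss weight of 0.4 (the paper's actual asymmetric architecture is not reproduced here): the auxiliary head only participates in training and is discarded at inference, so the extra inference cost is negligible.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # shared trunk
primary_head = nn.Linear(64, 10)   # primary task head
aux_head = nn.Linear(64, 5)        # auxiliary task head, training-time only

params = (list(backbone.parameters()) + list(primary_head.parameters())
          + list(aux_head.parameters()))
opt = torch.optim.SGD(params, lr=0.1)

x = torch.randn(8, 32)
y_primary = torch.randint(0, 10, (8,))
y_aux = torch.randint(0, 5, (8,))

feats = backbone(x)
loss = (F.cross_entropy(primary_head(feats), y_primary)
        + 0.4 * F.cross_entropy(aux_head(feats), y_aux))  # assumed aux weight
opt.zero_grad()
loss.backward()
opt.step()

# Inference: only the primary path runs; aux_head is simply dropped.
with torch.no_grad():
    pred = primary_head(backbone(x)).argmax(dim=1)
```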
- OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching efficient architectures for devices with different resource constraints.
We take one step further in the search for efficiency by explicitly framing the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
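A minimal sketch of the multi-objective view in the OFA$^2$ entry above: instead of returning one best sub-network, keep the whole Pareto front over (error, cost). The candidate tuples below are made up for illustration.

```python
def pareto_front(candidates):
    """candidates: list of (arch, error, cost); return the non-dominated set."""
    front = []
    for a in candidates:
        dominated = any(
            b[1] <= a[1] and b[2] <= a[2] and (b[1] < a[1] or b[2] < a[2])
            for b in candidates)
        if not dominated:
            front.append(a)
    return front

# (name, validation error, cost in MFLOPs), all values invented
subnets = [("A", 0.24, 150), ("B", 0.22, 300), ("C", 0.25, 140), ("D", 0.22, 500)]
print(pareto_front(subnets))  # D is dropped: B has the same error at lower cost
```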
- Evolutionary Neural Cascade Search across Supernetworks [68.8204255655161]
We introduce ENCAS, Evolutionary Neural Cascade Search.
ENCAS can be used to search over multiple pretrained supernetworks.
We test ENCAS on common computer vision benchmarks.
arXiv Detail & Related papers (2022-03-08T11:06:01Z)
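The ENCAS entry above searches over cascades assembled from several pretrained supernetworks. The sketch below shows only cascade inference with a fixed confidence threshold; the stand-in models, the threshold, and all names are assumptions (ENCAS itself evolves the model choices and thresholds).

```python
import numpy as np

def cascade_predict(x, small, large, threshold=0.9):
    """Two-stage cascade: the cheap model answers when confident,
    otherwise the input is deferred to the stronger model."""
    probs = small(x)                      # cheap model's class probabilities
    if probs.max() >= threshold:          # confident enough: early exit
        return int(probs.argmax())
    return int(large(x).argmax())         # fall back to the expensive model

# Stand-in "models" that return fixed probability vectors.
small = lambda x: np.array([0.95, 0.05]) if x > 0 else np.array([0.6, 0.4])
large = lambda x: np.array([0.3, 0.7])

print(cascade_predict(1.0, small, large))   # 0: early exit at the small model
print(cascade_predict(-1.0, small, large))  # 1: deferred to the large model
```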
- A Hardware-Aware System for Accelerating Deep Neural Network Optimization [7.189421078452572]
We propose a comprehensive system that automatically and efficiently finds sub-networks from a pre-trained super-network.
By combining novel search tactics and algorithms with intelligent use of predictors, we significantly decrease the time needed to find optimal sub-networks.
arXiv Detail & Related papers (2022-02-25T20:07:29Z)
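A hypothetical sketch in the spirit of the entry above: rank randomly sampled sub-networks with a cheap accuracy predictor plus a per-operation latency table, rather than fully evaluating each candidate. The predictor, cost table, and latency budget are all invented.

```python
import random

LATENCY_MS = {"3x3": 1.0, "5x5": 2.2, "skip": 0.1}   # invented per-op costs

def predict_accuracy(arch):
    # Stand-in for a learned accuracy predictor over architecture encodings.
    return 0.7 + 0.02 * arch.count("5x5") - 0.03 * arch.count("skip")

def latency(arch):
    return sum(LATENCY_MS[op] for op in arch)

ops = list(LATENCY_MS)
pool = [[random.choice(ops) for _ in range(6)] for _ in range(200)]
feasible = [a for a in pool if latency(a) <= 8.0]      # hardware budget
best = max(feasible, key=predict_accuracy)
print(best, round(predict_accuracy(best), 3), latency(best))
```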
- Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS).
NAS has been explosively studied to automate the discovery of top-performing neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS by disentangling the "TEG" (Trainability, Expressivity, Generalization) characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z)
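As a hedged illustration of training-free scoring (not the paper's exact TEG metrics): score an untrained network by how distinct its ReLU activation patterns are across a mini-batch, a common expressivity-style proxy that needs no training.

```python
import torch
import torch.nn as nn

def activation_pattern_score(model, x):
    """Mean pairwise Hamming distance between the binary ReLU activation
    codes of a mini-batch; larger means inputs land in more distinct
    linear regions of the untrained network."""
    acts = []
    hooks = [m.register_forward_hook(lambda _, __, out: acts.append(out > 0))
             for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    codes = torch.cat([a.flatten(1).float() for a in acts], dim=1)
    return torch.cdist(codes, codes, p=1).mean().item()

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
print(activation_pattern_score(net, torch.randn(8, 16)))
```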
- FNA++: Fast Network Adaptation via Parameter Remapping and Architecture Search [35.61441231491448]
We propose a Fast Network Adaptation (FNA++) method, which can adapt both the architecture and parameters of a seed network.
In our experiments, we apply FNA++ on MobileNetV2 to obtain new networks for semantic segmentation, object detection, and human pose estimation.
The total computation cost of FNA++ is significantly less than SOTA segmentation and detection NAS approaches.
arXiv Detail & Related papers (2020-06-21T10:03:34Z)
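A simplified, assumed sketch of the parameter-remapping idea behind FNA++, shown on layer width only: a resized layer is initialised by slicing (when shrinking) or tiling (when expanding) the seed layer's weights, so the adapted network starts from pretrained values instead of random ones.

```python
import torch
import torch.nn as nn

def remap_linear(seed: nn.Linear, new_out: int, new_in: int) -> nn.Linear:
    """Initialise a resized linear layer from a pretrained seed layer."""
    new = nn.Linear(new_in, new_out)
    with torch.no_grad():
        rows = torch.arange(new_out) % seed.weight.shape[0]  # slice or tile rows
        cols = torch.arange(new_in) % seed.weight.shape[1]   # slice or tile cols
        new.weight.copy_(seed.weight[rows][:, cols])
        new.bias.copy_(seed.bias[rows])
    return new

seed = nn.Linear(64, 128)                        # pretrained seed layer
narrow = remap_linear(seed, new_out=96, new_in=48)
print(narrow.weight.shape)                       # torch.Size([96, 48])
```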
- Adjoined Networks: A Training Paradigm with Applications to Network Compression [3.995047443480282]
We introduce Adjoined Networks, or AN, a learning paradigm that trains both the original base network and the smaller compressed network together.
Using ResNet-50 as the base network, AN achieves 71.8% top-1 accuracy with only 1.8M parameters and 1.6 GFLOPs on the ImageNet dataset.
We propose Differentiable Adjoined Networks (DAN), a training paradigm that augments AN by using neural architecture search to jointly learn both the width and the weights for each layer of the smaller network.
arXiv Detail & Related papers (2020-06-10T02:48:16Z)
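A toy sketch of training the base and compressed networks together, as in the Adjoined Networks entry above. Note the coupling here is a distillation-style KL term standing in for the paper's actual mechanism; Adjoined Networks shares parameters between the two networks, which this sketch does not reproduce.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

base = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
small = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
opt = torch.optim.SGD(list(base.parameters()) + list(small.parameters()), lr=0.1)

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
zb, zs = base(x), small(x)
loss = (F.cross_entropy(zb, y)                  # supervised loss, base
        + F.cross_entropy(zs, y)                # supervised loss, small
        + F.kl_div(F.log_softmax(zs, dim=1),    # small mimics the base
                   F.softmax(zb.detach(), dim=1), reduction="batchmean"))
opt.zero_grad()
loss.backward()
opt.step()
```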
- From Federated to Fog Learning: Distributed Machine Learning over Heterogeneous Wireless Networks [71.23327876898816]
Federated learning has emerged as a technique for training ML models at the network edge by leveraging processing capabilities across the nodes that collect the data.
We advocate a new learning paradigm called fog learning which will intelligently distribute ML model training across the continuum of nodes from edge devices to cloud servers.
arXiv Detail & Related papers (2020-06-07T05:11:18Z)
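For the entry above, a minimal federated-averaging round (assuming synchronous clients and equal weighting; all names invented). Fog learning generalises this pattern from a flat server-client topology to a hierarchy spanning edge devices, intermediate fog nodes, and cloud servers.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

global_model = nn.Linear(10, 2)

def local_update(model, data, target, steps=5, lr=0.1):
    """One client's local training pass, starting from the global weights."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(data), target).backward()
        opt.step()
    return model.state_dict()

clients = [(torch.randn(16, 10), torch.randint(0, 2, (16,))) for _ in range(4)]
states = [local_update(global_model, x, y) for x, y in clients]

# Server: average the client weights parameter by parameter.
avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
global_model.load_state_dict(avg)
```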
- Fast Neural Network Adaptation via Parameter Remapping and Architecture Search [35.61441231491448]
Deep neural networks achieve remarkable performance in many computer vision tasks.
Most state-of-the-art (SOTA) semantic segmentation and object detection approaches reuse neural network architectures designed for image classification as the backbone.
One major challenge though, is that ImageNet pre-training of the search space representation incurs huge computational cost.
In this paper, we propose a Fast Neural Network Adaptation (FNA) method, which can adapt both the architecture and parameters of a seed network.
arXiv Detail & Related papers (2020-01-08T13:45:15Z)