Kernel Modulation: A Parameter-Efficient Method for Training
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2203.15297v1
- Date: Tue, 29 Mar 2022 07:28:50 GMT
- Title: Kernel Modulation: A Parameter-Efficient Method for Training
Convolutional Neural Networks
- Authors: Yuhuang Hu, Shih-Chii Liu
- Abstract summary: This work proposes a novel parameter-efficient kernel modulation (KM) method that adapts all parameters of a base network instead of a subset of layers.
KM uses lightweight task-specialized kernel modulators that require only an additional 1.4% of the base network parameters.
Our results show that KM delivers up to 9% higher accuracy than other parameter-efficient methods on the Transfer Learning benchmark.
- Score: 19.56633207984127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks, particularly Convolutional Neural Networks (ConvNets),
have achieved incredible success in many vision tasks, but they usually require
millions of parameters for good accuracy performance. With increasing
applications that use ConvNets, updating hundreds of networks for multiple
tasks on an embedded device can be costly in terms of memory, bandwidth, and
energy. Approaches to reduce this cost include model compression and
parameter-efficient models that adapt a subset of network layers for each new
task. This work proposes a novel parameter-efficient kernel modulation (KM)
method that adapts all parameters of a base network instead of a subset of
layers. KM uses lightweight task-specialized kernel modulators that require
only an additional 1.4% of the base network parameters. With multiple tasks,
only the task-specialized KM weights are communicated and stored on the
end-user device. We applied this method in training ConvNets for Transfer
Learning and Meta-Learning scenarios. Our results show that KM delivers up to
9% higher accuracy than other parameter-efficient methods on the Transfer
Learning benchmark.
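The core mechanism lends itself to a short sketch. Below is a minimal, hypothetical PyTorch illustration of the kernel-modulation idea: a frozen base convolution whose kernel is rescaled elementwise by a lightweight, task-specialized modulator. The modulator here is assumed to be a rank-1 map built from two small per-channel vectors; the paper's exact modulator parameterization may differ, and the class name `KernelModulatedConv2d` is ours.

```python
# Minimal sketch of kernel modulation for one frozen Conv2d layer.
# Assumption (not taken from the paper text): the modulator is a rank-1 scaling
# map formed from an output-channel vector and an input-channel vector,
# broadcast over the kernel's spatial dimensions. Only the modulator is trained.
import torch
import torch.nn as nn
import torch.nn.functional as F


class KernelModulatedConv2d(nn.Module):
    def __init__(self, base_conv: nn.Conv2d):
        super().__init__()
        self.base = base_conv
        for p in self.base.parameters():      # freeze the shared base weights
            p.requires_grad_(False)
        out_ch, in_ch = self.base.weight.shape[:2]
        # Task-specialized modulator: out_ch + in_ch parameters per layer,
        # a small fraction of the out_ch * in_ch * k * k base kernel weights.
        self.mod_out = nn.Parameter(torch.ones(out_ch))
        self.mod_in = nn.Parameter(torch.ones(in_ch))

    def forward(self, x):
        # Rank-1 modulation map, broadcast over the spatial kernel dimensions,
        # so every base kernel weight is adapted without being updated itself.
        scale = self.mod_out[:, None, None, None] * self.mod_in[None, :, None, None]
        w = self.base.weight * scale
        return F.conv2d(x, w, self.base.bias,
                        stride=self.base.stride, padding=self.base.padding,
                        dilation=self.base.dilation, groups=self.base.groups)
```

Under this assumption, training a new task updates only the modulator vectors of each layer, so per task only those few weights need to be stored or communicated to the end-user device, while the frozen base network is shared across all tasks.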
Related papers
- Learning Compact Neural Networks with Deep Overparameterised Multitask
Learning [0.0]
We present a simple, efficient, and effective multitask learning design based on deep overparameterised neural networks.
Experiments on two challenging multitask datasets (NYUv2 and COCO) demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-08-25T10:51:02Z) - Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning [19.978542231976636]
This paper proposes a novel method to reduce the parameters and FLOPs for computational efficiency in deep learning models.
We introduce accuracy and efficiency coefficients to control the trade-off between the accuracy of the network and its computing efficiency.
arXiv Detail & Related papers (2023-01-26T12:32:01Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z) - An Experimental Study of the Impact of Pre-training on the Pruning of a
Convolutional Neural Network [0.0]
In recent years, deep neural networks have achieved wide success in various application domains.
Deep neural networks usually involve a large number of parameters, which correspond to the weights of the network.
Pruning methods attempt to reduce the size of this parameter set by identifying and removing irrelevant weights.
arXiv Detail & Related papers (2021-12-15T16:02:15Z) - Efficient Feature Transformations for Discriminative and Generative
Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Channel Planting for Deep Neural Networks using Knowledge Distillation [3.0165431987188245]
We present a novel incremental training algorithm for deep neural networks called planting.
Planting searches for the optimal network architecture with a smaller number of parameters while improving network performance.
We evaluate the effectiveness of the proposed method on different datasets such as CIFAR-10/100 and STL-10.
arXiv Detail & Related papers (2020-11-04T16:29:59Z) - Neural Parameter Allocation Search [57.190693718951316]
Training neural networks requires increasing amounts of memory.
Existing methods assume networks have many identical layers and utilize hand-crafted sharing strategies that fail to generalize.
We introduce Neural Parameter Allocation Search (NPAS), a novel task where the goal is to train a neural network given an arbitrary, fixed parameter budget.
NPAS covers both low-budget regimes, which produce compact networks, as well as a novel high-budget regime, where additional capacity can be added to boost performance without increasing inference FLOPs.
arXiv Detail & Related papers (2020-06-18T15:01:00Z) - Adjoined Networks: A Training Paradigm with Applications to Network
Compression [3.995047443480282]
We introduce Adjoined Networks, or AN, a learning paradigm that trains both the original base network and the smaller compressed network together.
Using ResNet-50 as the base network, AN achieves 71.8% top-1 accuracy with only 1.8M parameters and 1.6 GFLOPs on the ImageNet dataset.
We propose Differentiable Adjoined Networks (DAN), a training paradigm that augments AN by using neural architecture search to jointly learn both the width and the weights for each layer of the smaller network.
arXiv Detail & Related papers (2020-06-10T02:48:16Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)