Decomposing Convolutional Neural Networks into Reusable and Replaceable
Modules
- URL: http://arxiv.org/abs/2110.07720v1
- Date: Mon, 11 Oct 2021 20:41:50 GMT
- Title: Decomposing Convolutional Neural Networks into Reusable and Replaceable
Modules
- Authors: Rangeet Pan and Hridesh Rajan
- Abstract summary: We propose to decompose a CNN model used for image classification problems into modules for each output class.
These modules can further be reused or replaced to build a new model.
We have evaluated our approach with CIFAR-10, CIFAR-100, and ImageNet tiny datasets with three variations of ResNet models.
- Score: 15.729284470106826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training from scratch is the most common way to build a Convolutional Neural
Network (CNN) based model. What if we can build new CNN models by reusing parts
from previously built CNN models? What if we can improve a CNN model by
replacing (possibly faulty) parts with other parts? In both cases, instead of
training, can we identify the part responsible for each output class (module)
in the model(s) and reuse or replace only the desired output classes to build a
model? Prior work has proposed decomposing dense-based networks into modules
(one for each output class) to enable reusability and replaceability in various
scenarios. However, that work is limited to dense layers and relies on the
one-to-one relationship between nodes in consecutive layers. Due to the
weight-sharing architecture of CNN models, the prior approach cannot be adapted directly. In
this paper, we propose to decompose a CNN model used for image classification
problems into modules for each output class. These modules can further be
reused or replaced to build a new model. We have evaluated our approach with
CIFAR-10, CIFAR-100, and ImageNet tiny datasets with three variations of ResNet
models and found that enabling decomposition comes with a small cost (2.38% and
0.81% for top-1 and top-5 accuracy, respectively). Also, building a model by
reusing or replacing modules can be done with a 2.3% and 0.5% average loss of
accuracy. Furthermore, reusing and replacing these modules reduces CO2e
emissions by a factor of ~37 compared to training the model from scratch.
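To make the reuse/replace idea concrete, here is a minimal PyTorch sketch. It is not the paper's decomposition algorithm (which operates on the convolutional layers themselves); it only approximates each per-class "module" as the shared convolutional trunk plus the slice of the final fully connected layer belonging to that class, and composes selected modules into a new classifier without retraining. The names PerClassModule, decompose, and Composed are illustrative.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class PerClassModule(nn.Module):
    """One output class of a trained classifier, usable as a standalone scorer."""
    def __init__(self, trunk: nn.Module, weight: torch.Tensor, bias: torch.Tensor):
        super().__init__()
        self.trunk = trunk                      # feature extractor shared by all modules of one model
        self.head = nn.Linear(weight.shape[0], 1)
        with torch.no_grad():
            self.head.weight.copy_(weight.unsqueeze(0))
            self.head.bias.copy_(bias.unsqueeze(0))

    def forward(self, x):
        return self.head(self.trunk(x))         # one logit: evidence for this class

def decompose(model, num_classes):
    """Split a trained classifier into one module per output class."""
    trunk = nn.Sequential(*list(model.children())[:-1], nn.Flatten())
    fc = model.fc
    return [PerClassModule(trunk, fc.weight[c], fc.bias[c]) for c in range(num_classes)]

class Composed(nn.Module):
    """A new classifier built by stacking the selected modules' logits."""
    def __init__(self, modules_to_keep):
        super().__init__()
        self.mods = nn.ModuleList(modules_to_keep)
    def forward(self, x):
        return torch.cat([m(x) for m in self.mods], dim=1)

# Reuse class 3 from model_a and take class 5 from model_b (the "replace" scenario).
model_a, model_b = resnet18(num_classes=10), resnet18(num_classes=10)
mods_a, mods_b = decompose(model_a, 10), decompose(model_b, 10)
new_model = Composed([mods_a[3], mods_b[5]])    # a 2-class model, built without any training
print(new_model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 2])
```

A real decomposition would also prune or gate the convolutional filters that are irrelevant to the selected classes; the sketch keeps the full trunk for simplicity.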
Related papers
- Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration [100.54419875604721]
All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation.
We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks.
Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment.
arXiv Detail & Related papers (2024-04-02T17:58:49Z)
- Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment [69.33930972652594]
We propose a novel structural pruning approach to jointly learn the weights and structurally prune architectures of CNN models.
The core element of our method is a Reinforcement Learning (RL) agent whose actions determine the pruning ratios of the CNN model's layers.
We conduct the joint training and pruning by iteratively training the model's weights and the agent's policy.
arXiv Detail & Related papers (2024-03-28T15:22:29Z)
- Reusing Convolutional Neural Network Models through Modularization and Composition [22.823870645316397]
We propose two modularization approaches named CNNSplitter and GradSplitter.
CNNSplitter decomposes a trained convolutional neural network (CNN) model into $N$ small reusable modules.
The resulting modules can be reused to patch existing CNN models or build new CNN models through composition.
arXiv Detail & Related papers (2023-11-08T03:18:49Z)
- Modularizing while Training: A New Paradigm for Modularizing DNN Models [20.892788625187702]
We propose a novel approach that incorporates modularization into the model training process, i.e., modularizing-while-training (MwT).
The accuracy loss caused by MwT is only 1.13 percentage points, which is 1.76 percentage points less than that of the baseline.
The total time cost required for training and modularizing is only 108 minutes, half of the baseline.
arXiv Detail & Related papers (2023-06-15T07:45:43Z)
- Decomposing a Recurrent Neural Network into Modules for Enabling Reusability and Replacement [11.591247347259317]
We propose the first approach to decompose an RNN into modules.
We study different types of RNNs, i.e., Vanilla, LSTM, and GRU.
We show how such RNN modules can be reused and replaced in various scenarios.
arXiv Detail & Related papers (2022-12-09T03:29:38Z)
- Deep Model Assembling [31.88606253639418]
This paper studies a divide-and-conquer strategy to train large models.
It divides a large model into smaller modules, trains them independently, and reassembles the trained modules to obtain the target model.
We introduce a global, shared meta model to implicitly link all the modules together.
This enables us to train highly compatible modules that collaborate effectively when they are assembled together.
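A rough sketch of that divide-and-conquer recipe, under strong simplifying assumptions: the "large" model is a three-stage MLP with fixed boundary widths, the shared meta model is just a thinner MLP with the same boundaries (in the paper it is pre-trained), and the data is random. make_stage, train_stage, and DIMS are illustrative names, not the paper's implementation.

```python
import torch
import torch.nn as nn

DIMS = [32, 64, 64, 10]   # assumed boundary widths between the three stages

def make_stage(i, hidden):
    return nn.Sequential(nn.Linear(DIMS[i], hidden), nn.ReLU(), nn.Linear(hidden, DIMS[i + 1]))

target = [make_stage(i, hidden=256) for i in range(3)]  # modules of the "large" model
meta = [make_stage(i, hidden=16) for i in range(3)]     # thin shared meta model
for m in meta:
    m.requires_grad_(False)                              # meta model stays frozen here

def train_stage(i, steps=100):
    """Train target stage i in isolation by plugging it into the meta model."""
    net = nn.Sequential(*[target[j] if j == i else meta[j] for j in range(3)])
    opt = torch.optim.Adam(target[i].parameters(), lr=1e-3)  # only stage i is updated
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        x = torch.randn(64, DIMS[0])                          # stand-in data
        y = torch.randint(0, DIMS[-1], (64,))
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()

for i in range(3):                      # stages can be trained independently (even in parallel)
    train_stage(i)

assembled = nn.Sequential(*target)      # reassemble the trained modules into the target model
print(assembled(torch.randn(2, DIMS[0])).shape)   # torch.Size([2, 10])
```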
arXiv Detail & Related papers (2022-12-08T08:04:06Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to first segment objects into parts and then classify them.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Patching Weak Convolutional Neural Network Models through Modularization and Composition [19.986199290508925]
A convolutional neural network (CNN) model for classification tasks often performs unsatisfactorily.
We propose a compressed modularization approach, CNNSplitter, which decomposes a strong CNN model for $N$-class classification into $N$ smaller CNN modules.
We show that CNNSplitter can patch a weak CNN model through modularization and composition, thus providing a new solution for developing robust CNN models.
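A hedged sketch of the patching idea only, not CNNSplitter's actual composition strategy: consult a binary module taken from a stronger model for the weak class and override the weak model's logit when that module is confident. PatchedModel and the threshold are illustrative.

```python
import torch
import torch.nn as nn

class PatchedModel(nn.Module):
    """Weak N-class model whose target class is backed up by a donor module."""
    def __init__(self, weak_model, donor_module, target_class, threshold=0.5):
        super().__init__()
        self.weak = weak_model          # original (weak) N-class classifier
        self.donor = donor_module       # binary scorer for the weak class, from a stronger model
        self.target_class = target_class
        self.threshold = threshold

    def forward(self, x):
        logits = self.weak(x).clone()
        p = torch.sigmoid(self.donor(x)).squeeze(-1)        # donor's confidence for the target class
        override = logits.max(dim=1).values + 1.0            # large enough to win the argmax
        logits[:, self.target_class] = torch.where(p > self.threshold, override,
                                                   logits[:, self.target_class])
        return logits

# Toy usage with stand-in models (any nn.Module pair with matching shapes would do).
weak = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
donor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))
patched = PatchedModel(weak, donor, target_class=3)
print(patched(torch.randn(4, 3, 32, 32)).shape)   # torch.Size([4, 10])
```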
arXiv Detail & Related papers (2022-09-11T15:26:16Z)
- A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP [121.35904748477421]
Convolutional neural networks (CNNs) are the dominant deep neural network (DNN) architecture for computer vision.
Transformer- and multi-layer perceptron (MLP)-based models, such as Vision Transformer and MLP-Mixer, have started to lead new trends.
In this paper, we conduct empirical studies on these DNN structures and try to understand their respective pros and cons.
arXiv Detail & Related papers (2021-08-30T06:09:02Z)
- MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations [51.85020143716815]
We propose MutualNet to train a single network that can run at a diverse set of resource constraints.
Our method trains a cohort of model configurations with various network widths and input resolutions.
MutualNet is a general training methodology that can be applied to various network structures.
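An illustrative sketch of the width/resolution mutual-learning recipe, assuming a slimmable-style layer that slices its weights; the width set, resolutions, and loss weighting are assumptions, not the paper's exact configuration, and SlimmableConv and SlimNet are illustrative names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableConv(nn.Conv2d):
    """Conv layer that can run on a sliced fraction of its channels."""
    def forward(self, x, w_in=1.0, w_out=1.0):
        o, i = int(self.out_channels * w_out), int(self.in_channels * w_in)
        return F.conv2d(x, self.weight[:o, :i], self.bias[:o], padding=self.padding)

class SlimNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = SlimmableConv(3, 32, 3, padding=1)
        self.conv2 = SlimmableConv(32, 64, 3, padding=1)
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x, width=1.0):
        h = F.relu(self.conv1(x, w_out=width))
        h = F.relu(self.conv2(h, w_in=width))              # last conv returns to full width
        return self.fc(F.adaptive_avg_pool2d(h, 1).flatten(1))

net = SlimNet()
opt = torch.optim.SGD(net.parameters(), lr=0.01)
widths, resolutions = [0.5, 0.75], [16, 24]                # assumed sub-configurations

for _ in range(5):                                         # stand-in training loop
    x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
    opt.zero_grad()
    full = net(x, width=1.0)
    loss = F.cross_entropy(full, y)                        # full network learns from labels
    soft = F.softmax(full.detach(), dim=1)                 # ...and teaches the sub-networks
    for w, r in zip(widths, resolutions):
        sub = net(F.interpolate(x, size=r), width=w)       # narrower width, lower resolution
        loss = loss + F.kl_div(F.log_softmax(sub, dim=1), soft, reduction="batchmean")
    loss.backward()
    opt.step()
```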
arXiv Detail & Related papers (2021-05-14T22:30:13Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
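A simplified sketch of layer-wise fusion for one-hidden-layer MLPs: it aligns the second model's hidden units to the first model's with a Hungarian assignment on weight similarity (a stand-in for the paper's optimal-transport formulation) and then averages the aligned weights. The function fuse and the similarity cost are illustrative choices.

```python
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

def fuse(a: nn.Sequential, b: nn.Sequential) -> nn.Sequential:
    """Average two one-hidden-layer MLPs after aligning b's hidden units to a's."""
    wa1, wb1 = a[0].weight.data, b[0].weight.data           # [hidden, d_in] each
    cost = -(wa1 @ wb1.T).numpy()                            # negated similarity = assignment cost
    _, col = linear_sum_assignment(cost)                     # col[i]: b-unit matched to a-unit i
    col = torch.as_tensor(col)

    fused = nn.Sequential(nn.Linear(a[0].in_features, a[0].out_features),
                          nn.ReLU(),
                          nn.Linear(a[2].in_features, a[2].out_features))
    with torch.no_grad():
        fused[0].weight.copy_((wa1 + wb1[col]) / 2)
        fused[0].bias.copy_((a[0].bias + b[0].bias[col]) / 2)
        fused[2].weight.copy_((a[2].weight + b[2].weight[:, col]) / 2)  # permute columns to match
        fused[2].bias.copy_((a[2].bias + b[2].bias) / 2)
    return fused

# Toy usage: fuse two independently initialized (in practice, independently trained) MLPs.
a = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
b = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
print(fuse(a, b)(torch.randn(2, 20)).shape)   # torch.Size([2, 10])
```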
arXiv Detail & Related papers (2019-10-12T22:07:15Z)