Neural Network Module Decomposition and Recomposition
- URL: http://arxiv.org/abs/2112.13208v1
- Date: Sat, 25 Dec 2021 08:36:47 GMT
- Title: Neural Network Module Decomposition and Recomposition
- Authors: Hiroaki Kingetsu, Kenichi Kobayashi, Taiji Suzuki
- Abstract summary: We propose a modularization method that decomposes a deep neural network (DNN) into small modules from a functionality perspective.
We demonstrate that the proposed method can decompose and recompose DNNs with high compression ratio and high accuracy.
- Score: 35.21448933547118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a modularization method that decomposes a deep neural network
(DNN) into small modules from a functionality perspective and recomposes them
into a new model for some other task. Decomposed modules are expected to have
the advantages of interpretability and verifiability due to their small size.
In contrast to existing studies based on reusing models that involve
retraining, such as a transfer learning model, the proposed method does not
require retraining and has wide applicability as it can be easily combined with
existing functional modules. The proposed method extracts modules using weight
masks and can be applied to arbitrary DNNs. Unlike existing studies, it
requires no assumption about the network architecture. To extract modules, we
designed a learning method and a loss function to maximize shared weights among
modules. As a result, the extracted modules can be recomposed without a large
increase in size. We demonstrate that the proposed method can decompose and
recompose DNNs with a high compression ratio and high accuracy, and that it
outperforms the existing method by sharing weights between modules.
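The weight-mask idea in the abstract can be illustrated with a minimal sketch (hypothetical names and toy data, not the authors' implementation): each module is a binary mask over the parent DNN's weights, recomposition keeps the union of the masks, and the training loss encourages a high shared-weight ratio so that the union stays small.

```python
# Sketch of weight-mask module extraction and recomposition.
# All names and data here are illustrative, not the paper's code.

def apply_mask(weights, mask):
    """Zero out the weights a module's binary mask does not select."""
    return [w if m else 0.0 for w, m in zip(weights, mask)]

def recompose(masks):
    """Union of module masks: a weight survives if any module uses it."""
    return [any(bits) for bits in zip(*masks)]

def shared_weight_ratio(masks):
    """Fraction of kept weights shared by all modules -- the quantity the
    paper's loss term encourages to be large."""
    union = sum(any(bits) for bits in zip(*masks))
    inter = sum(all(bits) for bits in zip(*masks))
    return inter / union if union else 0.0

weights = [0.5, -1.2, 0.3, 0.8, -0.7, 0.1]
mask_a = [1, 1, 0, 1, 0, 0]  # module A (e.g. one subtask)
mask_b = [1, 1, 1, 0, 0, 0]  # module B (e.g. another subtask)

print(apply_mask(weights, mask_a))            # [0.5, -1.2, 0.0, 0.8, 0.0, 0.0]
print(sum(recompose([mask_a, mask_b])))       # 4 weights kept after recomposition
print(shared_weight_ratio([mask_a, mask_b]))  # 0.5
```

With a higher shared-weight ratio, the recomposed model retains fewer distinct weights, which is why the paper's loss maximizes sharing among modules.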
Related papers
- Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models [31.960749305728488]
We introduce a novel concept dubbed the modular neural tangent kernel (mNTK).
We show that the quality of a module's learning is tightly associated with its mNTK's principal eigenvalue $\lambda_{\max}$.
We propose a novel training strategy termed Modular Adaptive Training (MAT) to update only those modules whose $\lambda_{\max}$ exceeds a dynamic threshold.
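The MAT selection rule summarized above can be sketched as follows (the threshold choice and all names are illustrative assumptions, not the authors' code): modules whose principal eigenvalue exceeds a dynamic threshold are updated, the rest are frozen for the step.

```python
# Sketch of the Modular Adaptive Training (MAT) selection rule.
# The mean-eigenvalue threshold is an illustrative choice only.

def dynamic_threshold(lambda_max):
    """One simple dynamic threshold: the mean principal eigenvalue."""
    return sum(lambda_max) / len(lambda_max)

def select_modules(lambda_max, threshold):
    """Indices of modules to update this step; the rest stay frozen."""
    return [i for i, lam in enumerate(lambda_max) if lam > threshold]

lams = [4.0, 0.5, 2.5, 0.2]        # per-module principal eigenvalues
thr = dynamic_threshold(lams)      # 1.8
print(select_modules(lams, thr))   # [0, 2]
```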
arXiv Detail & Related papers (2024-05-13T07:46:48Z)
- Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation [59.37775534633868]
We present an extremely straightforward approach to transferring pre-trained, task-specific PEFT modules between same-family PLMs.
We also propose a method that allows the transfer of modules between incompatible PLMs without any change in the inference complexity.
arXiv Detail & Related papers (2024-03-27T17:50:00Z)
- GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs [64.49176353858792]
We propose generative neuro-symbolic visual reasoning by growing and reusing modules.
The proposed model performs competitively on standard tasks like visual question answering and referring expression comprehension.
It is able to adapt to new visual reasoning tasks by observing a few training examples and reusing modules.
arXiv Detail & Related papers (2023-11-08T18:59:05Z)
- Module-wise Adaptive Distillation for Multimodality Foundation Models [125.42414892566843]
Multimodal foundation models have demonstrated remarkable generalizability but pose challenges for deployment due to their large sizes.
One effective approach to reducing their sizes is layerwise distillation, wherein small student models are trained to match the hidden representations of large teacher models at each layer.
Motivated by our observation that certain architecture components, referred to as modules, contribute more significantly to the student's performance than others, we propose to track the contributions of individual modules by recording the loss decrement after distilling each module, and to distill the modules with greater contributions more frequently.
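The contribution-tracking heuristic described in that summary can be sketched as follows (all names, the contribution update, and the sampling rule are illustrative assumptions, not the authors' code): record each module's loss decrement, then sample modules to distill with probability proportional to their recorded contribution.

```python
import random

# Sketch of module-wise adaptive distillation: modules with larger recorded
# loss decrements are distilled more frequently. Illustrative only.

def update_contribution(contrib, module, loss_before, loss_after):
    """A larger loss decrement means a larger recorded contribution."""
    contrib[module] = max(loss_before - loss_after, 0.0)
    return contrib

def pick_module(contrib, rng):
    """Sample a module with probability proportional to its contribution."""
    modules = list(contrib)
    weights = [contrib[m] for m in modules]
    return rng.choices(modules, weights=weights, k=1)[0]

contrib = {"attention": 0.0, "ffn": 0.0}
update_contribution(contrib, "attention", 2.0, 1.2)  # decrement 0.8
update_contribution(contrib, "ffn", 2.0, 1.9)        # decrement 0.1
rng = random.Random(0)
picks = [pick_module(contrib, rng) for _ in range(1000)]
print(picks.count("attention") > picks.count("ffn"))  # True
```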
arXiv Detail & Related papers (2023-10-06T19:24:00Z)
- Modularizing while Training: A New Paradigm for Modularizing DNN Models [20.892788625187702]
We propose a novel approach that incorporates modularization into the model training process, i.e., modularizing-while-training (MwT)
The accuracy loss caused by MwT is only 1.13 percentage points, which is 1.76 percentage points less than that of the baseline.
The total time cost required for training and modularizing is only 108 minutes, half that of the baseline.
arXiv Detail & Related papers (2023-06-15T07:45:43Z)
- ModuleFormer: Modularity Emerges from Mixture-of-Experts [60.6148988099284]
This paper proposes a new neural network architecture, ModuleFormer, to improve the efficiency and flexibility of large language models.
Unlike the previous SMoE-based modular language model, ModuleFormer can induce modularity from uncurated data.
arXiv Detail & Related papers (2023-06-07T17:59:57Z)
- Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
- Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aim to introduce the same structural capability into deep learning.
Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems.
The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z)
- Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks [10.0444013205203]
Understanding if and how NNs are modular could provide insights into how to improve them.
Current inspection methods, however, fail to link modules to their functionality.
arXiv Detail & Related papers (2020-10-05T15:04:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.