Partial Network Cloning
- URL: http://arxiv.org/abs/2303.10597v1
- Date: Sun, 19 Mar 2023 08:20:31 GMT
- Title: Partial Network Cloning
- Authors: Jingwen Ye, Songhua Liu, Xinchao Wang
- Abstract summary: PNC conducts partial parametric "cloning" from a source network and then injects the cloned module to the target.
Our method yields a significant improvement of 5% in accuracy and 50% in locality when compared with parameter-tuning based methods.
- Score: 58.83278629019384
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we study a novel task that enables partial knowledge transfer
from pre-trained models, which we term as Partial Network Cloning (PNC). Unlike
prior methods that update all or at least part of the parameters in the target
network throughout the knowledge transfer process, PNC conducts partial
parametric "cloning" from a source network and then injects the cloned module
to the target, without modifying its parameters. Thanks to the transferred
module, the target network is expected to gain additional functionality, such
as inference on new classes; whenever needed, the cloned module can be readily
removed from the target, with its original parameters and competence kept
intact. Specifically, we introduce an innovative learning scheme that allows us
to identify simultaneously the component to be cloned from the source and the
position to be inserted within the target network, so as to ensure the optimal
performance. Experimental results on several datasets demonstrate that our
method yields a significant improvement of 5% in accuracy and 50% in locality
when compared with parameter-tuning based methods. Our code is available at
https://github.com/JngwenYe/PNCloning.
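For intuition, here is a minimal sketch of the cloning-and-injection pattern the abstract describes. It is not the authors' implementation (see the linked repository for that); the class name, the parallel-branch wiring, and the assumption that the cloned module maps features to features are all illustrative.

```python
# Minimal illustrative sketch of partial network cloning: copy a sub-module from a
# source network and attach it to a frozen target network as a removable branch
# that contributes logits for new classes. Not the authors' code (see repo above).
import copy
import torch
import torch.nn as nn

class ClonedInjection(nn.Module):
    def __init__(self, target_backbone, target_head, source_module,
                 feat_dim, num_new_classes):
        super().__init__()
        self.backbone = target_backbone                    # target feature extractor (frozen)
        self.head = target_head                            # target classifier head (frozen)
        self.cloned = copy.deepcopy(source_module)         # module "cloned" from the source
        self.new_head = nn.Linear(feat_dim, num_new_classes)  # small head for the new classes
        for p in list(self.backbone.parameters()) + list(self.head.parameters()):
            p.requires_grad_(False)                        # target parameters are never modified

    def forward(self, x):
        feat = self.backbone(x)
        old_logits = self.head(feat)                       # original competence, untouched
        new_logits = self.new_head(self.cloned(feat))      # added functionality via the clone
        return torch.cat([old_logits, new_logits], dim=-1)
```

Because the target's own weights stay frozen, dropping `self.cloned` and `self.new_head` recovers the original network exactly. What the paper additionally learns, and this sketch does not show, is which source component to clone and where in the target to insert it.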
Related papers
- MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning [7.262751938473306]
Pruning is a well-established technique that reduces the size of neural networks while mathematically guaranteeing accuracy preservation.
We develop a new pruning algorithm, MPruner, that leverages mutual information through vector similarity.
MPruner achieved up to a 50% reduction in parameters and memory usage for CNN and transformer-based models, with minimal to no loss in accuracy.
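The title mentions CKA-based similarity; as background only, a linear CKA score between two layers' activations can be computed as below. This is the generic formula, not MPruner's implementation, and how such scores drive pruning decisions is left out here.

```python
# Generic linear CKA between two layers' activations, offered only as background
# for the "CKA-based" similarity in the title; not MPruner's code.
import torch

def linear_cka(X: torch.Tensor, Y: torch.Tensor) -> torch.Tensor:
    # X: (n, d1) and Y: (n, d2) activations collected from two layers on the same inputs
    X = X - X.mean(dim=0, keepdim=True)
    Y = Y - Y.mean(dim=0, keepdim=True)
    cross = torch.norm(X.t() @ Y) ** 2            # ||X^T Y||_F^2
    return cross / (torch.norm(X.t() @ X) * torch.norm(Y.t() @ Y))
```

Layers whose activations score near 1 against a neighbor carry largely redundant information, which is the kind of signal a similarity-based pruner could act on.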
arXiv Detail & Related papers (2024-08-24T05:54:47Z)
- Conditional Information Gain Trellis [1.290382979353427]
Conditional computing processes an input using only part of the neural network's computational units.
We use a Trellis-based approach for generating specific execution paths in a deep convolutional neural network.
We show that our conditional execution mechanism achieves comparable or better model performance compared to unconditional baselines.
arXiv Detail & Related papers (2024-02-13T10:23:45Z)
- CMFDFormer: Transformer-based Copy-Move Forgery Detection with Continual Learning [52.72888626663642]
Copy-move forgery detection aims at detecting duplicated regions in a suspected forged image.
Deep learning based copy-move forgery detection methods are in the ascendant.
We propose a Transformer-style copy-move forgery detection network named CMFDFormer.
We also provide a novel PCSD continual learning framework to help CMFDFormer handle new tasks.
arXiv Detail & Related papers (2023-11-22T09:27:46Z)
- On-Device Learning with Binary Neural Networks [2.7040098749051635]
We propose a continual learning (CL) solution that embraces recent advancements in the CL field and the efficiency of Binary Neural Networks (BNNs).
The choice of a binary network as backbone is essential to meet the constraints of low-power devices.
arXiv Detail & Related papers (2023-08-29T13:48:35Z)
- Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal, and that the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
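As a point of reference, the standard frozen-feature-extractor recipe and the partial-freezing variants being compared can both be expressed by choosing how many leading blocks to freeze. The sketch below assumes the pre-trained backbone is an `nn.Sequential` of blocks; it is illustrative, not the paper's code.

```python
# Illustrative sketch: freeze only the first `k` blocks of a pre-trained network
# before fine-tuning, so the frozen fraction becomes a tunable choice rather than
# "freeze the whole feature extractor". Assumes an nn.Sequential backbone.
import torch.nn as nn

def freeze_first_k_blocks(backbone: nn.Sequential, k: int) -> None:
    for i, block in enumerate(backbone):
        trainable = i >= k                    # blocks 0..k-1 are frozen, the rest fine-tune
        for p in block.parameters():
            p.requires_grad_(trainable)
```

Sweeping `k` between 0 (full fine-tuning) and the number of blocks (a fully frozen extractor) covers the range of protocols being compared.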
arXiv Detail & Related papers (2023-03-02T17:32:11Z)
- Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets.
We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes.
We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
arXiv Detail & Related papers (2022-11-16T21:55:05Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Kernel Modulation: A Parameter-Efficient Method for Training Convolutional Neural Networks [19.56633207984127]
This work proposes a novel parameter-efficient kernel modulation (KM) method that adapts all parameters of a base network instead of a subset of layers.
KM uses lightweight task-specialized kernel modulators that require only an additional 1.4% of the base network parameters.
Our results show that KM delivers up to 9% higher accuracy than other parameter-efficient methods on the Transfer Learning benchmark.
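A rough reading of this summary, sketched below: keep the base convolution frozen and let a tiny task-specific modulator rescale its kernel, so every base weight is adapted through very few new parameters. The class name and the per-output-channel granularity are assumptions, not details taken from the paper.

```python
# Hedged sketch of the kernel-modulation idea: a frozen base convolution whose
# kernel is rescaled by a small learnable modulator. Names and granularity are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModulatedConv2d(nn.Module):
    def __init__(self, base: nn.Conv2d):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                         # base network stays frozen
        # one scalar per output channel: a tiny, task-specialized set of parameters
        self.modulator = nn.Parameter(torch.ones(base.out_channels, 1, 1, 1))

    def forward(self, x):
        w = self.base.weight * self.modulator               # every base kernel weight is adapted
        return F.conv2d(x, w, self.base.bias, stride=self.base.stride,
                        padding=self.base.padding, dilation=self.base.dilation,
                        groups=self.base.groups)
```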
arXiv Detail & Related papers (2022-03-29T07:28:50Z)
- Meta-learning Transferable Representations with a Single Target Domain [46.83481356352768]
Fine-tuning and joint training do not always improve accuracy on downstream tasks.
We propose Meta Representation Learning (MeRLin) to learn transferable features.
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
arXiv Detail & Related papers (2020-11-03T01:57:37Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
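The "inject small re-learned networks while pre-trained parameters stay unaltered" description matches the familiar adapter pattern; the sketch below shows that pattern in generic form. The names and residual bottleneck design are assumptions for illustration, not PeterRec's exact model patch.

```python
# Generic adapter-style sketch of injecting a small learnable network next to a
# frozen pre-trained layer; names and bottleneck design are illustrative, not PeterRec's code.
import torch.nn as nn

class AdapterPatch(nn.Module):
    def __init__(self, frozen_layer: nn.Module, dim: int, bottleneck: int = 16):
        super().__init__()
        self.frozen = frozen_layer
        for p in self.frozen.parameters():
            p.requires_grad_(False)           # pre-trained parameters remain unaltered
        self.adapter = nn.Sequential(         # small injected network, learned during fine-tuning
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim))

    def forward(self, x):
        h = self.frozen(x)
        return h + self.adapter(h)            # residual injection preserves the base mapping
```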
arXiv Detail & Related papers (2020-01-13T14:09:54Z)