Network-Agnostic Knowledge Transfer for Medical Image Segmentation
- URL: http://arxiv.org/abs/2101.09560v1
- Date: Sat, 23 Jan 2021 19:06:14 GMT
- Title: Network-Agnostic Knowledge Transfer for Medical Image Segmentation
- Authors: Shuhang Wang, Vivek Kumar Singh, Alex Benjamin, Mercy Asiedu, Elham
Yousef Kalafi, Eugene Cheah, Viksit Kumar, Anthony Samir
- Abstract summary: We propose a knowledge transfer approach from a teacher to a student network wherein we train the student on an independent transferal dataset.
We studied knowledge transfer from a single teacher, the combination of knowledge transfer and fine-tuning, and knowledge transfer from multiple teachers.
The proposed algorithm is effective for knowledge transfer and easily tunable.
- Score: 2.25146058725705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional transfer learning leverages weights of pre-trained networks, but
mandates the need for similar neural architectures. Alternatively, knowledge
distillation can transfer knowledge between heterogeneous networks but often
requires access to the original training data or additional generative
networks. Knowledge transfer between networks can be improved by being agnostic
to the choice of network architecture and reducing the dependence on original
training data. We propose a knowledge transfer approach from a teacher to a
student network wherein we train the student on an independent transferal
dataset, whose annotations are generated by the teacher. Experiments were
conducted on five state-of-the-art networks for semantic segmentation and seven
datasets across three imaging modalities. We studied knowledge transfer from a
single teacher, the combination of knowledge transfer and fine-tuning, and
knowledge transfer from multiple teachers. The student model with a single
teacher achieved performance similar to that of the teacher, and the student
model with multiple teachers outperformed the individual teachers. The salient
features of our algorithm include: 1) no need for original training data or
generative networks, 2) knowledge transfer between different architectures, 3)
ease of implementation for downstream tasks by using the downstream task
dataset as the transferal dataset, 4) knowledge transfer of an ensemble of
models, trained independently, into one student model. Extensive experiments
demonstrate that the proposed algorithm is effective for knowledge transfer and
easily tunable.
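The transfer procedure described in the abstract reduces to two steps: the teacher annotates an unlabeled transferal dataset, and the student, which may have a completely different architecture, is trained on those annotations. Below is a minimal sketch of that idea in PyTorch; the names (`teacher_nets`, `student_net`, `generate_transferal_labels`) are illustrative assumptions rather than identifiers from the paper, and averaging teacher probability maps is only one plausible way to combine multiple teachers.

```python
# Minimal sketch, assuming PyTorch; model and function names are illustrative only.
import torch
import torch.nn.functional as F


def generate_transferal_labels(teacher_nets, images):
    """Annotate unlabeled transferal images with one or more trained teachers.

    With a single teacher the ensemble reduces to that network alone; with
    several independently trained teachers, their softmax probability maps
    are averaged before the per-pixel argmax (one plausible combination rule).
    """
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(net(images), dim=1) for net in teacher_nets]
        )
        return probs.mean(dim=0).argmax(dim=1)  # per-pixel hard labels


def train_student_step(student_net, optimizer, images, teacher_nets):
    """One optimization step of the student on teacher-generated annotations."""
    pseudo_labels = generate_transferal_labels(teacher_nets, images)
    logits = student_net(images)          # student architecture is unconstrained
    loss = F.cross_entropy(logits, pseudo_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

For a downstream task, the same loop can be run with the downstream dataset playing the role of the transferal dataset, optionally followed by fine-tuning on whatever labeled data is available.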
Related papers
- Direct Distillation between Different Domains [97.39470334253163]
We propose a new one-stage method dubbed "Direct Distillation between Different Domains" (4Ds).
We first design a learnable adapter based on the Fourier transform to separate the domain-invariant knowledge from the domain-specific knowledge.
We then build a fusion-activation mechanism to transfer the valuable domain-invariant knowledge to the student network.
arXiv Detail & Related papers (2024-01-12T02:48:51Z)
- Distribution Shift Matters for Knowledge Distillation with Webly Collected Images [91.66661969598755]
We propose a novel method dubbed "Knowledge Distillation between Different Distributions" (KD$^3$).
We first dynamically select useful training instances from the webly collected data according to the combined predictions of teacher network and student network.
We also build a new contrastive learning block called MixDistribution to generate perturbed data with a new distribution for instance alignment.
arXiv Detail & Related papers (2023-07-21T10:08:58Z)
- Learning Knowledge Representation with Meta Knowledge Distillation for Single Image Super-Resolution [82.89021683451432]
We propose a model-agnostic meta knowledge distillation method under the teacher-student architecture for the single image super-resolution task.
Experiments conducted on various single image super-resolution datasets demonstrate that our proposed method outperforms existing distillation methods that rely on predefined knowledge representations.
arXiv Detail & Related papers (2022-07-18T02:41:04Z)
- Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary Tasks [24.041268664220294]
Knowledge transfer using convolutional neural networks (CNNs) can help efficiently train a CNN with fewer parameters or maximize generalization performance under limited supervision.
We propose a simple yet powerful knowledge transfer methodology without any restrictions regarding the network structure or dataset used.
We devise a training methodology that transfers previously learned knowledge to the current training process as an auxiliary task for the target task through self-supervision using a soft label.
arXiv Detail & Related papers (2021-10-25T07:18:26Z)
- Point Adversarial Self Mining: A Simple Method for Facial Expression Recognition [79.75964372862279]
We propose Point Adversarial Self Mining (PASM) to improve the recognition accuracy in facial expression recognition.
PASM uses a point adversarial attack method and a trained teacher network to locate the most informative position related to the target task.
Adaptive learning-material generation and teacher/student updates can be repeated multiple times, iteratively improving the network's capability.
arXiv Detail & Related papers (2020-08-26T06:39:24Z)
- Representation Transfer by Optimal Transport [34.77292648424614]
We use optimal transport to quantify the match between two representations.
This distance defines a regularizer promoting the similarity of the student's representation with that of the teacher.
arXiv Detail & Related papers (2020-07-13T23:42:06Z)
- Efficient Crowd Counting via Structured Knowledge Transfer [122.30417437707759]
Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.
We propose a novel Structured Knowledge Transfer framework to generate a lightweight but still highly effective student network.
Our models obtain at least a 6.5$\times$ speed-up on an Nvidia 1080 GPU and even achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-03-23T08:05:41Z)
- Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN [80.17705319689139]
We propose a data-free knowledge amalgamation strategy to craft a well-behaved multi-task student network from multiple single-task or multi-task teachers.
Without any training data, the proposed method achieves surprisingly competitive results, even compared with some fully supervised methods.
arXiv Detail & Related papers (2020-03-20T03:20:52Z)
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)