MoMA: Momentum Contrastive Learning with Multi-head Attention-based
Knowledge Distillation for Histopathology Image Analysis
- URL: http://arxiv.org/abs/2308.16561v1
- Date: Thu, 31 Aug 2023 08:54:59 GMT
- Title: MoMA: Momentum Contrastive Learning with Multi-head Attention-based
Knowledge Distillation for Histopathology Image Analysis
- Authors: Trinh Thi Le Vuong and Jin Tae Kwak
- Abstract summary: A lack of quality data is a common issue when it comes to a specific task in computational pathology.
We propose to exploit knowledge distillation, i.e., utilizing an existing model to learn a new, target model.
We employ a student-teacher framework to learn a target model from a pre-trained, teacher model without direct access to source data.
- Score: 5.396167537615578
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: There is no doubt that advanced artificial intelligence models and
high-quality data are the keys to success in developing computational pathology
tools. Although the overall volume of pathology data keeps increasing, a lack
of quality data is a common issue for any specific task due to several reasons,
including privacy and ethical issues with patient data. In this work, we
propose to exploit knowledge distillation, i.e., utilizing an existing model to
learn a new, target model, to overcome such issues in computational pathology.
Specifically, we employ a student-teacher framework to learn a target model
from a pre-trained teacher model without direct access to the source data and
distill relevant knowledge via momentum contrastive learning with a multi-head
attention mechanism, which provides consistent and context-aware feature
representations. This enables the target model to assimilate informative
representations of the teacher model while seamlessly adapting to the unique
nuances of the target data. The proposed method is rigorously evaluated across
different scenarios in which the teacher model was trained on the same, a
relevant, or an irrelevant classification task with respect to the target
model. Experimental results demonstrate the accuracy and robustness of our
approach in transferring knowledge to different domains and tasks,
outperforming other related methods. Moreover, the results provide a guideline
on the learning strategy for different types of tasks and scenarios in
computational pathology. Code is available at: https://github.com/trinhvg/MoMA.
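The abstract outlines the core mechanism: a trainable student (target) encoder, a momentum-updated copy of it that supplies stable keys, a multi-head attention step that lets student features attend to the frozen teacher's features, and a contrastive (InfoNCE-style) objective computed against a queue of past keys. The sketch below is a minimal, hedged reading of that recipe in PyTorch; the class name, feature dimension, queue size, and the exact placement of the attention block are illustrative assumptions, not the authors' implementation (the repository linked above is the authoritative reference).

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class MomentumAttentionDistiller(nn.Module):
    """Illustrative sketch (not the official MoMA code): distill a frozen
    teacher's features into a student encoder via a momentum-updated key
    encoder, multi-head attention over teacher features, and an InfoNCE loss
    against a queue of past keys."""

    def __init__(self, student, feat_dim=512, num_heads=4,
                 queue_size=4096, temperature=0.07, momentum=0.999):
        super().__init__()
        self.student = student                      # trainable target encoder
        self.key_encoder = copy.deepcopy(student)   # momentum copy, no gradients
        for p in self.key_encoder.parameters():
            p.requires_grad = False
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.t, self.m = temperature, momentum
        # queue of normalized key features used as negatives
        self.register_buffer("queue", F.normalize(torch.randn(queue_size, feat_dim), dim=1))
        self.register_buffer("ptr", torch.zeros(1, dtype=torch.long))

    @torch.no_grad()
    def _momentum_update(self):
        # exponential moving average of the student parameters
        for q, k in zip(self.student.parameters(), self.key_encoder.parameters()):
            k.data.mul_(self.m).add_(q.data, alpha=1.0 - self.m)

    @torch.no_grad()
    def _enqueue(self, keys):
        n, size = keys.shape[0], self.queue.shape[0]
        idx = (torch.arange(n, device=keys.device) + self.ptr) % size
        self.queue[idx] = keys
        self.ptr[0] = (int(self.ptr) + n) % size

    def forward(self, images, teacher_feats):
        """images: a batch for the student; teacher_feats: (B, feat_dim)
        features from the frozen, pre-trained teacher for the same batch."""
        q = F.normalize(self.student(images), dim=1)             # student queries
        with torch.no_grad():
            self._momentum_update()
            k = F.normalize(self.key_encoder(images), dim=1)     # momentum keys

        # context-aware step: each student feature attends to all teacher
        # features in the batch (the batch is treated as a length-B sequence)
        ctx, _ = self.attn(q.unsqueeze(0), teacher_feats.unsqueeze(0),
                           teacher_feats.unsqueeze(0))
        q = F.normalize(q + ctx.squeeze(0), dim=1)

        # InfoNCE: positive = momentum key of the same image, negatives = queue
        l_pos = (q * k).sum(dim=1, keepdim=True)                 # (B, 1)
        l_neg = q @ self.queue.t()                               # (B, K)
        logits = torch.cat([l_pos, l_neg], dim=1) / self.t
        labels = torch.zeros(q.shape[0], dtype=torch.long, device=q.device)
        loss = F.cross_entropy(logits, labels)

        self._enqueue(k.detach())
        return loss
```

In practice this distillation term would be combined with the usual supervised loss on the target-task labels. Note that the momentum copy must share the student's architecture (hence the deepcopy) so the parameter-wise update is well defined, and the student and teacher feature dimensions are assumed to match the attention embedding size.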
Related papers
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes to learn the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- TIDo: Source-free Task Incremental Learning in Non-stationary Environments [0.0]
Updating a model-based agent to learn new target tasks requires us to store past training data.
Few-shot task incremental learning methods overcome the limitation of labeled target datasets.
We propose a one-shot task incremental learning approach that can adapt to non-stationary source and target tasks.
arXiv Detail & Related papers (2023-01-28T02:19:45Z)
- Prototype-guided Cross-task Knowledge Distillation for Large-scale Models [103.04711721343278]
Cross-task knowledge distillation helps train a small student model to obtain competitive performance.
We propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.
arXiv Detail & Related papers (2022-12-26T15:00:42Z)
- An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems [4.675744559395732]
Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer.
State-of-the-art ML models rely on heavy customization for each task and leverage model size and data scale rather than scaling the number of tasks.
We propose an evolutionary method that can generate a large scale multitask model and can support the dynamic and continuous addition of new tasks.
arXiv Detail & Related papers (2022-05-25T13:10:47Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage the substitute model to learn better and faster from the target model.
We introduce a task-driven, graph-based structure information learning constraint to improve the quality of the generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Lifelong Infinite Mixture Model Based on Knowledge-Driven Dirichlet Process [15.350366047108103]
Recent research efforts in lifelong learning propose to grow a mixture of models to adapt to an increasing number of tasks.
We perform a theoretical analysis of lifelong learning models by deriving risk bounds based on the discrepancy distance between probabilistic representations of the data.
Inspired by the theoretical analysis, we introduce a new lifelong learning approach, namely the Lifelong Infinite Mixture (LIMix) model.
arXiv Detail & Related papers (2021-08-25T21:06:20Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)
- MED-TEX: Transferring and Explaining Knowledge with Less Data from Pretrained Medical Imaging Models [38.12462659279648]
A small student model is learned with less data by distilling knowledge from a cumbersome pretrained teacher model.
An explainer module is introduced to highlight the regions of an input that are important for the predictions of the teacher model.
Our framework outperforms state-of-the-art methods on the knowledge distillation and model interpretation tasks on a fundus dataset.
arXiv Detail & Related papers (2020-08-06T11:50:32Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.