Related papers: All Centers Are at most a Few Tokens Apart: Knowledge Distillation with Domain Invariant Prompt Tuning

URL: http://arxiv.org/abs/2511.22739v1
Date: Thu, 27 Nov 2025 20:18:04 GMT
Title: All Centers Are at most a Few Tokens Apart: Knowledge Distillation with Domain Invariant Prompt Tuning
Authors: Amir Mohammad Ezzati, Alireza Malekhosseini, Armin Khosravi, Mohammad Hossein Rohban
Abstract summary: Domain generalization is critical in computational pathology (CPath). We propose Domain Invariant Prompt Tuning (DIPT) for the knowledge distillation process. Added to existing state-of-the-art knowledge distillation approaches, our method significantly improves their average F1-score.
Score: 6.706482416007361
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Domain generalization is critical in computational pathology (CPath) due to inherent domain shifts caused by variations in staining protocols, scanner devices, and imaging settings across clinical centers. Vision-language models (VLMs) such as PLIP, a pathology-tuned CLIP trained on image-text pairs across diverse domains, serve as strong knowledge distillation sources. However, their zero-shot performance with predefined prompts remains limited due to sensitivity to prompt variations. Moreover, unlike natural images, histopathology centers lack semantic descriptors (e.g., 'sketch'), making it difficult to define domain-specific prompts for clinical centers. This requires a data-driven approach for learning domain-specific and ultimately class-generic continuous prompts. We propose Domain Invariant Prompt Tuning (DIPT) for the knowledge distillation process, a novel step that learns multiple input tokens for each domain. These tokens are trained separately for each domain and then averaged across domains, yielding domain-invariant prompts. Our student model then distills knowledge from PLIP's text encoder by leveraging the prompts learned by DIPT. This aligns visual features with domain-invariant embeddings, enhancing generalization by training on multiple domains. Added to existing state-of-the-art (SOTA) knowledge distillation approaches, our method significantly improves their average F1-score in domain generalization on histopathology datasets. This work paves the way for deploying robust CPath models in real-world clinical problems with heterogeneous data sources.
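The averaging and alignment steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, array shapes, and the cosine-based loss are assumptions made for illustration, and the step where the averaged prompt is passed through PLIP's text encoder is omitted (the teacher embeddings are taken as given).

```python
import numpy as np

def average_domain_prompts(domain_prompts: np.ndarray) -> np.ndarray:
    """Average per-domain prompt tokens into one shared prompt.

    domain_prompts: assumed shape (num_domains, num_tokens, dim),
    one set of learned tokens per clinical center.
    Returns an array of shape (num_tokens, dim).
    """
    if domain_prompts.ndim != 3:
        raise ValueError("expected shape (num_domains, num_tokens, dim)")
    return domain_prompts.mean(axis=0)

def cosine_alignment_loss(visual_feats: np.ndarray,
                          text_embeds: np.ndarray,
                          labels: np.ndarray) -> float:
    """Mean (1 - cosine similarity) between each student feature and the
    teacher text embedding of its class. An assumed, simplified stand-in
    for a distillation objective, not the loss used in the paper.
    """
    v = visual_feats / np.linalg.norm(visual_feats, axis=1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=1, keepdims=True)
    cos = np.sum(v * t[labels], axis=1)
    return float(np.mean(1.0 - cos))

# Toy sizes (hypothetical): 3 centers, 4 prompt tokens, embedding dim 8.
rng = np.random.default_rng(0)
domain_prompts = rng.normal(size=(3, 4, 8))
invariant_prompt = average_domain_prompts(domain_prompts)
```

In this sketch the loss is zero when a student feature points in the same direction as its class embedding and grows as the two diverge, which is one simple way to express "alignment of visual features with domain-invariant embeddings".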

Related papers

Domain-invariant Mixed-domain Semi-supervised Medical Image Segmentation with Clustered Maximum Mean Discrepancy Alignment [11.298724831730675]
We propose a domain-invariant mixed-domain semi-supervised segmentation framework. A Copy-Paste Mechanism (CPM) augments the training set by transferring informative regions across domains. A Cluster Maximum Mean Discrepancy (CMMD) block clusters unlabeled features and aligns them with labeled anchors.
arXiv Detail & Related papers (2026-01-23T18:23:03Z)
DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning [20.51179258856028]
Federated learning (FL) has emerged as a powerful paradigm for learning from decentralized data. Most existing FL methods assume that domain labels are provided during training, and their evaluation imposes explicit constraints on the number of domains. We propose Disentangled Prompt Tuning (DiPrompT), a method that tackles the above restrictions by learning adaptive prompts for domain generalization in a distributed manner.
arXiv Detail & Related papers (2024-03-11T15:58:15Z)
Prompt-driven Latent Domain Generalization for Medical Image Classification [23.914889221925552]
We propose a novel framework for medical image classification without relying on domain labels. PLDG consists of unsupervised domain discovery and prompt learning. Our method can achieve performance comparable or even superior to conventional DG algorithms.
arXiv Detail & Related papers (2024-01-05T05:24:07Z)
Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms. We propose Domain-Controlled Prompt Learning for specific domains. Our method achieves state-of-the-art performance on specific-domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
EPVT: Environment-aware Prompt Vision Transformer for Domain Generalization in Skin Lesion Recognition [12.91556412209546]
Skin lesion recognition using deep learning has made remarkable progress, and there is an increasing need for deploying these systems in real-world scenarios. Recent research has revealed that deep neural networks for skin lesion recognition may overly depend on disease-irrelevant image artifacts. We propose a novel domain generalization method called EPVT, which involves embedding prompts into the vision transformer to collaboratively learn knowledge from diverse domains.
arXiv Detail & Related papers (2023-04-04T03:36:14Z)
Generalizing through Forgetting -- Domain Generalization for Symptom Event Extraction in Clinical Notes [0.0]
We present domain generalization for symptom extraction using pretraining and fine-tuning data. We propose a domain generalization method that dynamically masks frequent symptom words in the source domain. Our experiments indicate that masking and adaptive pretraining methods can significantly improve performance when the source domain is more distant from the target domain.
arXiv Detail & Related papers (2022-09-20T05:53:22Z)
Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains [73.54897096088149]
We propose a Domain-invariant Masked AutoEncoder (DiMAE) for self-supervised learning from multi-domains. The core idea is to augment the input image with style noise from different domains and then reconstruct the image from the embedding of the augmented image. Experiments on PACS and DomainNet illustrate that DiMAE achieves considerable gains compared with recent state-of-the-art methods.
arXiv Detail & Related papers (2022-05-10T09:49:40Z)
Dynamic Instance Domain Adaptation [109.53575039217094]
Most studies on unsupervised domain adaptation assume that each domain's training samples come with domain labels. We develop a dynamic neural network with adaptive convolutional kernels to generate instance-adaptive residuals to adapt domain-agnostic deep features to each individual instance. Our model, dubbed DIDA-Net, achieves state-of-the-art performance on several commonly used single-source and multi-source UDA datasets.
arXiv Detail & Related papers (2022-03-09T20:05:54Z)
Self-Rule to Adapt: Generalized Multi-source Feature Learning Using Unsupervised Domain Adaptation for Colorectal Cancer Tissue Detection [9.074125289002911]
Supervised learning is constrained by the availability of labeled data. We propose SRA, which takes advantage of self-supervised learning to perform domain adaptation.
arXiv Detail & Related papers (2021-08-20T13:52:33Z)
Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains. Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
Cross-domain Contrastive Learning for Unsupervised Domain Adaptation [108.63914324182984]
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain. We build upon contrastive self-supervised learning to align features so as to reduce the domain discrepancy between training and testing sets.
arXiv Detail & Related papers (2021-06-10T06:32:30Z)
Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources [68.31273535702256]
We propose a novel instance-level MDA framework, named curriculum cycle-consistent generative adversarial network (C-CycleGAN). C-CycleGAN consists of three components: (1) a pre-trained text encoder that encodes textual input from different domains into a continuous representation space, (2) an intermediate domain generator with curriculum instance-level adaptation that bridges the gap across source and target domains, and (3) a task classifier trained on the intermediate domain for final sentiment classification. We conduct extensive experiments on three benchmark datasets and achieve substantial gains over state-of-the-art DA approaches.
arXiv Detail & Related papers (2020-11-17T14:50:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.