Holistic White-light Polyp Classification via Alignment-free Dense Distillation of Auxiliary Optical Chromoendoscopy
- URL: http://arxiv.org/abs/2505.19319v3
- Date: Sat, 12 Jul 2025 14:17:27 GMT
- Title: Holistic White-light Polyp Classification via Alignment-free Dense Distillation of Auxiliary Optical Chromoendoscopy
- Authors: Qiang Hu, Qimei Wang, Jia Chen, Xuantao Ji, Mei Liu, Qiang Li, Zhiwei Wang,
- Abstract summary: This paper proposes a novel holistic classification framework that leverages full-image diagnosis without requiring polyp localization.<n>The key innovation lies in the Alignment-free Dense Distillation (ADD) module, which enables fine-grained cross-domain knowledge distillation.<n>Our method achieves state-of-the-art performance, relatively outperforming the other approaches by at least 2.5% and 16.2% in AUC.
- Score: 14.917217801444794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: White Light Imaging (WLI) and Narrow Band Imaging (NBI) are the two main colonoscopic modalities for polyp classification. While NBI, as optical chromoendoscopy, offers valuable vascular details, WLI remains the most common and often the only available modality in resource-limited settings. However, WLI-based methods typically underperform, limiting their clinical applicability. Existing approaches transfer knowledge from NBI to WLI through global feature alignment but often rely on cropped lesion regions, which are susceptible to detection errors and neglect contextual and subtle diagnostic cues. To address this, this paper proposes a novel holistic classification framework that leverages full-image diagnosis without requiring polyp localization. The key innovation lies in the Alignment-free Dense Distillation (ADD) module, which enables fine-grained cross-domain knowledge distillation regardless of misalignment between WLI and NBI images. Without resorting to explicit image alignment, ADD learns pixel-wise cross-domain affinities to establish correspondences between feature maps, guiding the distillation along the most relevant pixel connections. To further enhance distillation reliability, ADD incorporates Class Activation Mapping (CAM) to filter cross-domain affinities, ensuring the distillation path connects only those semantically consistent regions with equal contributions to polyp diagnosis. Extensive results on public and in-house datasets show that our method achieves state-of-the-art performance, relatively outperforming the other approaches by at least 2.5% and 16.2% in AUC, respectively. Code is available at: https://github.com/Huster-Hq/ADD.
Related papers
- Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections [50.343419243749054]
Anomaly Detection (AD) involves identifying deviations from normal data distributions.<n>We propose a novel approach that conditions the prompts of the text encoder based on image context extracted from the vision encoder.<n>Our method achieves state-of-the-art performance, improving performance by 2% to 29% across different metrics on 14 datasets.
arXiv Detail & Related papers (2025-04-15T10:42:25Z) - Domain Generalization for Endoscopic Image Segmentation by Disentangling Style-Content Information and SuperPixel Consistency [1.4991956341367338]
We propose an approach for style-content disentanglement using instance normalization and instance selective whitening (ISW) for improved domain generalization.
We evaluate our approach on two datasets: EndoUDA Barrett's Esophagus and EndoUDA polyps, and compare its performance with three state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2024-09-19T04:10:04Z) - Source-Free Domain Adaptation for Medical Image Segmentation via
Prototype-Anchored Feature Alignment and Contrastive Learning [57.43322536718131]
We present a two-stage source-free domain adaptation (SFDA) framework for medical image segmentation.
In the prototype-anchored feature alignment stage, we first utilize the weights of the pre-trained pixel-wise classifier as source prototypes.
Then, we introduce the bi-directional transport to align the target features with class prototypes by minimizing its expected cost.
arXiv Detail & Related papers (2023-07-19T06:07:12Z) - Semi-supervised GAN for Bladder Tissue Classification in Multi-Domain
Endoscopic Images [10.48945682277992]
We propose a semi-surprised Generative Adrial Network (GAN)-based method composed of three main components.
The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively.
arXiv Detail & Related papers (2022-12-21T21:32:36Z) - SUPRA: Superpixel Guided Loss for Improved Multi-modal Segmentation in
Endoscopy [1.1470070927586016]
Domain shift is a well-known problem in the medical imaging community.
In this paper, we explore the domain generalisation technique to enable DL methods to be used in such scenarios.
We show that our method yields an improvement of nearly 20% in the target domain set compared to the baseline.
arXiv Detail & Related papers (2022-11-09T03:13:59Z) - Towards Effective Image Manipulation Detection with Proposal Contrastive
Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL consists of a two-stream architecture by extracting two types of global features from RGB and noise views respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z) - Radiomics-Guided Global-Local Transformer for Weakly Supervised
Pathology Localization in Chest X-Rays [65.88435151891369]
Radiomics-Guided Transformer (RGT) fuses textitglobal image information with textitlocal knowledge-guided radiomics information.
RGT consists of an image Transformer branch, a radiomics Transformer branch, and fusion layers that aggregate image and radiomic information.
arXiv Detail & Related papers (2022-07-10T06:32:56Z) - Toward Clinically Assisted Colorectal Polyp Recognition via Structured
Cross-modal Representation Consistency [16.225144477302923]
Most computer-aided diagnosis algorithms recognize colorectal polyps by adopting Narrow-Band Imaging (NBI)
NBI usually suffers from missing utilization in real clinic scenarios since the acquisition of this specific image requires manual switching of the light mode.
We propose a novel method to directly achieve accurate white-light colonoscopy image classification by conducting structured cross-modal representation consistency.
arXiv Detail & Related papers (2022-06-23T16:51:28Z) - Colorectal Polyp Classification from White-light Colonoscopy Images via
Domain Alignment [57.419727894848485]
A computer-aided diagnosis system is required to assist accurate diagnosis from colonoscopy images.
Most previous studies at-tempt to develop models for polyp differentiation using Narrow-Band Imaging (NBI) or other enhanced images.
We propose a novel framework based on a teacher-student architecture for the accurate colorectal polyp classification.
arXiv Detail & Related papers (2021-08-05T09:31:46Z) - Cross-Modal Contrastive Learning for Abnormality Classification and
Localization in Chest X-rays with Radiomics using a Feedback Loop [63.81818077092879]
We propose an end-to-end semi-supervised cross-modal contrastive learning framework for medical images.
We first apply an image encoder to classify the chest X-rays and to generate the image features.
The radiomic features are then passed through another dedicated encoder to act as the positive sample for the image features generated from the same chest X-ray.
arXiv Detail & Related papers (2021-04-11T09:16:29Z) - Consistent Posterior Distributions under Vessel-Mixing: A Regularization
for Cross-Domain Retinal Artery/Vein Classification [30.30848090813239]
We propose a vessel-mixing based consistency regularization framework, for cross-domain learning in retinal A/V classification.
Our method achieves the state-of-the-art cross-domain performance, which is also close to the upper bound obtained by fully supervised learning on target domain.
arXiv Detail & Related papers (2021-03-16T14:18:35Z) - High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms state-of-the-art by6.5%mAP scores on Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.