Toward Clinically Assisted Colorectal Polyp Recognition via Structured
Cross-modal Representation Consistency
- URL: http://arxiv.org/abs/2206.11826v2
- Date: Fri, 24 Jun 2022 15:43:23 GMT
- Title: Toward Clinically Assisted Colorectal Polyp Recognition via Structured
Cross-modal Representation Consistency
- Authors: Weijie Ma, Ye Zhu, Ruimao Zhang, Jie Yang, Yiwen Hu, Zhen Li, Li Xiang
- Abstract summary: Most computer-aided diagnosis algorithms recognize colorectal polyps by adopting Narrow-Band Imaging (NBI).
NBI often goes unused in real clinical scenarios, since acquiring these images requires manually switching the light mode.
We propose a novel method to directly achieve accurate white-light colonoscopy image classification by conducting structured cross-modal representation consistency.
- Score: 16.225144477302923
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Colorectal polyp classification is a critical clinical examination. To
improve the classification accuracy, most computer-aided diagnosis algorithms
recognize colorectal polyps by adopting Narrow-Band Imaging (NBI). However,
NBI often goes unused in real clinical scenarios, since acquiring these
images requires manually switching the light mode after polyps have been
detected using White-Light (WL) images. To avoid the
above situation, we propose a novel method to directly achieve accurate
white-light colonoscopy image classification by conducting structured
cross-modal representation consistency. In practice, a pair of multi-modal
images, i.e. NBI and WL, are fed into a shared Transformer to extract
hierarchical feature representations. Then a newly designed Spatial Attention
Module (SAM) is adopted to calculate the similarities between the class token
and patch tokens from multiple levels for a specific modality image. By aligning
the class tokens and spatial attention maps of paired NBI and WL images at
different levels, the Transformer achieves the ability to keep both global and
local representation consistency for the above two modalities. Extensive
experimental results show that the proposed method outperforms recent
studies by a margin, realizing multi-modal prediction with a single
Transformer while greatly improving classification accuracy when only
WL images are available.
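The alignment idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how SAM measures similarity or which loss aligns the two modalities, so the cosine similarity, the softmax normalization, the mean-squared-error penalty, and every function name below are assumptions chosen only to show the shape of the computation. Each "level" is taken to be a (class token, patch tokens) pair produced by the shared Transformer.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def spatial_attention(cls_token, patch_tokens):
    """Softmax over class-token/patch-token similarities (assumed form of SAM)."""
    sims = [cosine(cls_token, p) for p in patch_tokens]
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

def mse(u, v):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)

def consistency_loss(nbi_levels, wl_levels):
    """Sum global (class token) and local (attention map) gaps over levels.

    The MSE penalty is an assumption; the abstract only says the two are aligned.
    """
    total = 0.0
    for (nbi_cls, nbi_patches), (wl_cls, wl_patches) in zip(nbi_levels, wl_levels):
        total += mse(nbi_cls, wl_cls)  # global consistency
        total += mse(spatial_attention(nbi_cls, nbi_patches),
                     spatial_attention(wl_cls, wl_patches))  # local consistency
    return total

# Toy features: one level, 2-D tokens, three patches per image.
nbi = [([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])]
wl = [([0.0, 1.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])]
print(consistency_loss(nbi, nbi))  # identical inputs give 0.0
print(consistency_loss(nbi, wl) > 0.0)
```

In a training setup of this kind, such a term would presumably be added to the classification loss so that WL features are pulled toward their NBI counterparts; how the terms are weighted is not stated in the abstract.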
Related papers
- Deep Learning Based Speckle Filtering for Polarimetric SAR Images. Application to Sentinel-1 [51.404644401997736]
We propose a complete framework to remove speckle in polarimetric SAR images using a convolutional neural network.
Experiments show that the proposed approach offers exceptional results in both speckle reduction and resolution preservation.
arXiv Detail & Related papers (2024-08-28T10:07:17Z)
- RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations [0.0]
Conditional affine transformations (CAT) have been applied to different layers of GAN to control content synthesis in images.
We first model CAT and a recurrent neural network (RAT) to ensure that different layers can access global information.
We then introduce shuffle attention between RAT to mitigate the characteristic of information forgetting in recurrent neural networks.
arXiv Detail & Related papers (2024-05-13T18:49:18Z)
- Transformer-based Clipped Contrastive Quantization Learning for
Unsupervised Image Retrieval [15.982022297570108]
Unsupervised image retrieval aims to learn important visual characteristics without any given labels in order to retrieve images similar to a given query image.
In this paper, we propose the TransClippedCLR model, which encodes the global context of an image with a Transformer and captures local context through patch-based processing.
Results using the proposed clipped contrastive learning are greatly improved on all datasets as compared to same backbone network with vanilla contrastive learning.
arXiv Detail & Related papers (2024-01-27T09:39:11Z)
- MB-RACS: Measurement-Bounds-based Rate-Adaptive Image Compressed Sensing Network [65.1004435124796]
We propose a Measurement-Bounds-based Rate-Adaptive Image Compressed Sensing Network (MB-RACS) framework.
Our experiments demonstrate that the proposed MB-RACS method surpasses current leading methods.
arXiv Detail & Related papers (2024-01-19T04:40:20Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical
Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) to handle diverse segmentation tasks on medical images.
Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
- A Test Statistic Estimation-based Approach for Establishing
Self-interpretable CNN-based Binary Classifiers [7.424003880270276]
Post-hoc interpretability methods have the limitation that they can produce plausible but different interpretations.
Unlike traditional post-hoc interpretability methods, the proposed method is self-interpretable and quantitative.
arXiv Detail & Related papers (2023-03-13T05:51:35Z)
- Colorectal Polyp Classification from White-light Colonoscopy Images via
Domain Alignment [57.419727894848485]
A computer-aided diagnosis system is required to assist accurate diagnosis from colonoscopy images.
Most previous studies attempt to develop models for polyp differentiation using Narrow-Band Imaging (NBI) or other enhanced images.
We propose a novel framework based on a teacher-student architecture for accurate colorectal polyp classification.
arXiv Detail & Related papers (2021-08-05T09:31:46Z)
- A Hierarchical Transformation-Discriminating Generative Model for Few
Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.