GPF-Net: Gated Progressive Fusion Learning for Polyp Re-Identification
- URL: http://arxiv.org/abs/2512.21476v1
- Date: Thu, 25 Dec 2025 02:40:46 GMT
- Title: GPF-Net: Gated Progressive Fusion Learning for Polyp Re-Identification
- Authors: Suncheng Xiang, Xiaoyang Wang, Junjie Jiang, Hejia Wang, Dahong Qian,
- Abstract summary: Colonoscopic Polyp Re-Identification aims to match the same polyp with images from different views taken using different cameras. The coarse resolution of high-level features of a specific polyp often leads to inferior results for small objects. We propose a novel architecture, named Gated Progressive Fusion network, to selectively fuse features from multiple levels using gates.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Colonoscopic Polyp Re-Identification aims to match the same polyp from a large gallery with images from different views taken using different cameras, which plays an important role in the prevention and treatment of colorectal cancer in computer-aided diagnosis. However, the coarse resolution of high-level features of a specific polyp often leads to inferior results for small objects, where detailed information is important. To address this challenge, we propose a novel architecture, named Gated Progressive Fusion network, which selectively fuses features from multiple levels using gates in a fully connected way for polyp ReID. Building on this, a gated progressive fusion strategy is introduced to achieve layer-wise refinement of semantic information through multi-level feature interactions. Experiments on standard benchmarks show the benefits of the multimodal setting over state-of-the-art unimodal ReID models, especially when combined with the specialized multimodal fusion strategy.
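The gated fusion idea in the abstract can be sketched as follows. This is a minimal illustration, assuming a per-level scalar sigmoid gate that re-weights each level's feature before summation; the gate form, weights, and function names here are hypothetical, not the paper's exact architecture.

```python
# Sketch of gated multi-level feature fusion (illustrative, not GPF-Net's
# exact design): each level's feature vector is scaled by a sigmoid gate
# computed from that level, then all levels are summed into one fused vector.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(features, gate_weights, gate_bias):
    """Fuse same-length feature vectors from several levels.

    features:     list of L vectors (each of length D)
    gate_weights: list of L scalars, one per level (hypothetical simple gate)
    gate_bias:    scalar bias shared by all gates
    """
    fused = [0.0] * len(features[0])
    for level, feat in enumerate(features):
        # Gate: a scalar in (0, 1) deciding how much this level contributes.
        g = sigmoid(gate_weights[level] * sum(feat) / len(feat) + gate_bias)
        for d, v in enumerate(feat):
            fused[d] += g * v
    return fused

# Two levels with 3-dimensional features.
low  = [1.0, 2.0, 3.0]   # e.g. a detail-rich low-level feature
high = [0.5, 0.5, 0.5]   # e.g. a coarse high-level feature
out = gated_fusion([low, high], gate_weights=[1.0, 1.0], gate_bias=0.0)
```

In a real network the gates would be learned and computed from all levels jointly (the "fully connected" interaction the abstract describes), rather than from a single level as in this toy version.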
Related papers
- MSFNet-CPD: Multi-Scale Cross-Modal Fusion Network for Crop Pest Detection [3.5148549831413036]
Accurate identification of agricultural pests is essential for crop protection. While deep learning has advanced pest detection, most existing approaches rely solely on low-level visual features.
arXiv Detail & Related papers (2025-05-05T08:10:22Z) - Multi-Scale Target-Aware Representation Learning for Fundus Image Enhancement [11.652205644265893]
High-quality fundus images provide essential anatomical information for clinical screening and ophthalmic disease diagnosis. Recent years have witnessed promising progress in fundus image enhancement. We propose a multi-scale target-aware representation learning framework (MTRL-FIE) for efficient fundus image enhancement.
arXiv Detail & Related papers (2025-05-03T14:25:48Z) - Robust Polyp Detection and Diagnosis through Compositional Prompt-Guided Diffusion Models [32.17651741681871]
We propose a Progressive Spectrum Diffusion Model (PSDM) for generating synthetic polyp images. PSDM integrates diverse clinical annotations, such as segmentation masks, bounding boxes, and colonoscopy reports, by transforming them into compositional prompts. By augmenting training data with PSDM-generated samples, our model significantly improves polyp detection, classification, and segmentation.
arXiv Detail & Related papers (2025-02-25T08:22:45Z) - Learning Collaborative Knowledge with Multimodal Representation for Polyp Re-Identification [14.63589190319602]
Colonoscopic Polyp Re-Identification aims to match the same polyp from a large gallery with images from different views taken using different cameras. Traditional methods for object ReID that directly adopt CNN models trained on the ImageNet dataset produce unsatisfactory retrieval performance. We propose a novel Deep Multimodal Collaborative Learning framework named DMCL for polyp re-identification.
arXiv Detail & Related papers (2024-08-12T04:05:19Z) - ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases.
In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework.
It introduces Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps.
In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting
arXiv Detail & Related papers (2024-01-10T07:03:41Z) - FLDNet: A Foreground-Aware Network for Polyp Segmentation Leveraging Long-Distance Dependencies [1.7623838912231695]
We propose FLDNet, a Transformer-based neural network that captures long-distance dependencies for accurate polyp segmentation.
Our proposed method, FLDNet, was evaluated using seven metrics on common datasets and outperformed state-of-the-art methods on widely used evaluation measures.
arXiv Detail & Related papers (2023-09-12T06:32:42Z) - Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation.
It is a traditional u-shape encoder-decoder structure incorporated with a dynamic kernel generation and updating scheme.
This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z) - Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality.
In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network.
Our MANet follows a hybrid domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z) - Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers [124.01928050651466]
We propose a new type of polyp segmentation method, named Polyp-PVT.
The proposed model, named Polyp-PVT, effectively suppresses noises in the features and significantly improves their expressive capabilities.
arXiv Detail & Related papers (2021-08-16T07:09:06Z) - High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided depth super-resolution (DSR).
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z) - Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568]
We propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis.
In our Hi-Net, a modality-specific network is utilized to learn representations for each individual modality.
A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
arXiv Detail & Related papers (2020-02-11T08:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.