PDSE: A Multiple Lesion Detector for CT Images using PANet and Deformable Squeeze-and-Excitation Block
- URL: http://arxiv.org/abs/2506.03608v1
- Date: Wed, 04 Jun 2025 06:38:31 GMT
- Title: PDSE: A Multiple Lesion Detector for CT Images using PANet and Deformable Squeeze-and-Excitation Block
- Authors: Di Fan, Heng Yu, Zhiyuan Xu
- Abstract summary: We introduce a one-stage lesion detection framework, PDSE, by redesigning RetinaNet. We enhance the path aggregation flow by incorporating a low-level feature map. Our algorithm achieved an mAP of over 0.20 on the public DeepLesion benchmark.
- Score: 10.563907026873443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting lesions in Computed Tomography (CT) scans is a challenging task in medical image processing due to the diverse types, sizes, and locations of lesions. Recently, various one-stage and two-stage framework networks have been developed to focus on lesion localization. We introduce a one-stage lesion detection framework, PDSE, by redesigning RetinaNet to achieve higher accuracy and efficiency for detecting lesions in multimodal CT images. Specifically, we enhance the path aggregation flow by incorporating a low-level feature map. Additionally, to improve model representation, we utilize the adaptive Squeeze-and-Excitation (SE) block and integrate channel feature map attention. This approach has resulted in achieving new state-of-the-art performance. Our method significantly improves the detection of small and multi-scale objects. When evaluated against other advanced algorithms on the public DeepLesion benchmark, our algorithm achieved an mAP of over 0.20.
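The adaptive SE block mentioned in the abstract builds on the standard Squeeze-and-Excitation operation: global average pooling ("squeeze"), a bottleneck MLP ("excitation"), and channel-wise rescaling. Below is a minimal NumPy sketch of that standard operation, not the authors' implementation; the weight shapes, reduction ratio, and random initialization are illustrative assumptions:

```python
import numpy as np

def squeeze_excitation(feature_map, w1, w2):
    """Standard SE channel attention.
    feature_map: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    # Squeeze: global average pooling over spatial dims -> (C,)
    z = feature_map.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gating -> (C,)
    s = np.maximum(w1 @ z, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))
    # Recalibrate: scale each channel by its learned gate
    return feature_map * gate[:, None, None]

# Toy usage with random weights (reduction ratio r = 4)
rng = np.random.default_rng(0)
C, r = 8, 4
x = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
y = squeeze_excitation(x, w1, w2)
print(y.shape)  # (8, 16, 16)
```

Because the sigmoid gate lies in (0, 1), each channel is attenuated rather than amplified; in PDSE this recalibration is combined with path-aggregation features rather than applied in isolation.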
Related papers
- Cross-Modal Clustering-Guided Negative Sampling for Self-Supervised Joint Learning from Medical Images and Reports [11.734906190235066]
This paper presents a Cross-Modal Cluster-Guided Negative Sampling (CM-CGNS) method with two-fold ideas. First, it extends the k-means clustering used for local text features in the single-modal domain to the multimodal domain through cross-modal attention. Second, it introduces a Cross-Modal Masked Image Reconstruction (CM-MIR) module that leverages local text-to-image features obtained via cross-modal attention to reconstruct masked local image regions.
arXiv Detail & Related papers (2025-06-13T11:08:16Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Gravity Network for end-to-end small lesion detection [50.38534263407915]
This paper introduces a novel one-stage end-to-end detector specifically designed to detect small lesions in medical images.
Precise localization of small lesions presents challenges due to their appearance and the diverse contextual backgrounds in which they are found.
We refer to this new architecture as GravityNet, and the novel anchors as gravity points since they appear to be "attracted" by the lesions.
arXiv Detail & Related papers (2023-09-22T14:02:22Z) - Diffusion Models for Counterfactual Generation and Anomaly Detection in Brain Images [39.94162291765236]
We present a weakly supervised method to generate a healthy version of a diseased image and then use it to obtain a pixel-wise anomaly map.
We employ a diffusion model trained on healthy samples and combine Denoising Diffusion Probabilistic Models (DDPM) and Denoising Diffusion Implicit Models (DDIM) at each step of the sampling process.
arXiv Detail & Related papers (2023-08-03T21:56:50Z) - Scale-aware Super-resolution Network with Dual Affinity Learning for Lesion Segmentation from Medical Images [50.76668288066681]
We present a scale-aware super-resolution network to adaptively segment lesions of various sizes from low-resolution medical images.
Our proposed network achieved consistent improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-30T14:25:55Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation [3.57686754209902]
Quantification of retinal fluids is necessary for OCT-guided treatment management.
A new convolutional architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation.
The model benefits from hierarchical representation learning of textural, contextual, and edge features.
arXiv Detail & Related papers (2022-09-26T07:18:00Z) - Superresolution and Segmentation of OCT scans using Multi-Stage Adversarial Guided Attention Training [18.056525121226862]
We propose the multi-stage and multi-discriminatory generative adversarial network (MultiSDGAN) to translate OCT scans into high-resolution segmentation labels.
We evaluate and compare various combinations of channel and spatial attention added to the MultiSDGAN architecture to extract more powerful feature maps.
Our results demonstrate relative improvements of 21.44% and 19.45% on the Dice coefficient and SSIM, respectively.
arXiv Detail & Related papers (2022-06-10T00:26:55Z) - Mixed-UNet: Refined Class Activation Mapping for Weakly-Supervised Semantic Segmentation with Multi-scale Inference [28.409679398886304]
We develop a novel model named Mixed-UNet, which has two parallel branches in the decoding phase.
We evaluate the designed Mixed-UNet against several prevalent deep learning-based segmentation approaches on a dataset collected from a local hospital as well as on public datasets.
arXiv Detail & Related papers (2022-05-06T08:37:02Z) - Anomaly Detection in Retinal Images using Multi-Scale Deep Feature
Sparse Coding [30.097208168480826]
We introduce an unsupervised approach for detecting anomalies in retinal images.
We achieve relative AUC score improvements of 7.8%, 6.7%, and 12.1% over the state-of-the-art SPADE on the Eye-Q, IDRiD, and OCTID datasets, respectively.
arXiv Detail & Related papers (2022-01-27T13:36:22Z) - Improved Slice-wise Tumour Detection in Brain MRIs by Computing
Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We propose a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.