Related papers: Dual-Domain CLIP-Assisted Residual Optimization Perception Model for Metal Artifact Reduction

URL: http://arxiv.org/abs/2408.14342v2
Date: Thu, 29 Aug 2024 09:11:13 GMT
Title: Dual-Domain CLIP-Assisted Residual Optimization Perception Model for Metal Artifact Reduction
Authors: Xinrui Zhang, Ailong Cai, Shaoyu Wang, Linyuan Wang, Zhizhong Zheng, Lei Li, Bin Yan
Abstract summary: Metal artifacts in computed tomography (CT) imaging pose significant challenges to accurate clinical diagnosis. Deep learning-based approaches, particularly generative models, have been proposed for metal artifact reduction (MAR).
Score: 9.028901322902913
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Metal artifacts in computed tomography (CT) imaging pose significant challenges to accurate clinical diagnosis. High-density metallic implants produce artifacts that degrade image quality, manifesting as streaking, blurring, or beam-hardening effects. Various deep learning-based approaches, particularly generative models, have been proposed for metal artifact reduction (MAR). However, these methods have limited ability to perceive the diverse morphologies of different metal implants and their artifacts, so they may generate spurious anatomical structures and generalize poorly. To address these issues, we leverage a visual-language model (VLM) to identify these morphological features and introduce them into a dual-domain CLIP-assisted residual optimization perception model (DuDoCROP) for MAR. Specifically, a dual-domain CLIP (DuDoCLIP) is fine-tuned on the image domain and the sinogram domain using contrastive learning to extract semantic descriptions of anatomical structures and metal artifacts. Subsequently, a diffusion model is guided by the embeddings of DuDoCLIP, thereby enabling dual-domain prior generation. Additionally, we design prompt engineering for more precise image-text descriptions that can enhance the model's perception capability. Then, a downstream task is devised for one-step residual optimization and integration of the dual-domain priors while incorporating raw data fidelity. Finally, a new perceptual indicator is proposed to validate the model's perception and generation performance. With the assistance of DuDoCLIP, our DuDoCROP exhibits at least 63.7% higher generalization capability compared to the baseline model. Numerical experiments demonstrate that the proposed method can generate more realistic image structures and outperform other SOTA approaches both qualitatively and quantitatively.
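Illustrative note: the abstract says DuDoCLIP is fine-tuned with contrastive learning on image-text pairs in each domain. The sketch below shows a generic CLIP-style symmetric contrastive loss in NumPy, only to make that idea concrete. It is not the authors' implementation; the function names, the embedding shapes, and the temperature value are assumptions chosen for illustration, and the same loss would presumably be applied separately to image-domain and sinogram-domain pairs.

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_emb, text_emb: arrays of shape (batch, dim); row i of each
    array is assumed to describe the same sample (hypothetical setup).
    temperature: illustrative value, not taken from the paper.
    """
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix; matched pairs lie on the diagonal.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(8, 16))
    print("matched pairs:   ", clip_contrastive_loss(emb, emb))
    print("mismatched pairs:", clip_contrastive_loss(emb, np.roll(emb, 1, axis=0)))
```

Minimizing this loss pulls each image (or sinogram) embedding toward the embedding of its own text description and pushes it away from the other descriptions in the batch, so correctly matched batches score lower than mismatched ones.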

Related papers

Direct Dual-Energy CT Material Decomposition using Model-based Denoising Diffusion Model [105.95160543743984]
We propose a deep learning procedure called Dual-Energy Decomposition Model-based Diffusion (DEcomp-MoD) for quantitative material decomposition. We show that DEcomp-MoD outperforms state-of-the-art unsupervised score-based models and supervised deep learning networks.
arXiv Detail & Related papers (2025-07-24T01:00:06Z)
Structure and Smoothness Constrained Dual Networks for MR Bias Field Correction [6.078318492288723]
Deep learning models have been proposed for MR image improvement. S2DNets are proposed for self-supervised bias field correction. Experiments on both clinical and simulated MR datasets show that the proposed model outperforms other conventional and deep learning-based models.
arXiv Detail & Related papers (2025-07-02T03:23:43Z)
DiffDoctor: Diagnosing Image Diffusion Models Before Treating [57.82359018425674]
We propose DiffDoctor, a two-stage pipeline to assist image diffusion models in generating fewer artifacts. We collect a dataset of over 1M flawed synthesized images and set up an efficient human-in-the-loop annotation process. The learned artifact detector is then involved in the second stage to optimize the diffusion model by providing pixel-level feedback.
arXiv Detail & Related papers (2025-01-21T18:56:41Z)
DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation [17.396365010722423]
Retinal vascular morphology is crucial for diagnosing diseases such as diabetes, glaucoma, and hypertension. Traditional segmentation methods assume that training and testing data share similar distributions, which can lead to poor performance on unseen domains. This paper presents a novel approach, DGSSA, for retinal vessel image segmentation that enhances model generalization by combining structural and style augmentation strategies.
arXiv Detail & Related papers (2025-01-07T01:47:57Z)
Self-supervised Vision Transformer are Scalable Generative Models for Domain Generalization [0.13108652488669734]
We propose a novel generative method for domain generalization in histopathology images. Our method employs a generative, self-supervised Vision Transformer to dynamically extract characteristics of image patches. Experiments conducted on two distinct histopathology datasets demonstrate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2024-07-03T08:20:27Z)
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection. Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels. Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
Obtaining clear and visually pleasing images has become a common concern, and the task of underwater image enhancement (UIE) has emerged in response. In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN. Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z)
Orientation-Shared Convolution Representation for CT Metal Artifact Learning [63.67718355820655]
During X-ray computed tomography (CT) scanning, metallic implants carried by patients often lead to adverse artifacts. Existing deep-learning-based methods have achieved promising reconstruction performance. We propose an orientation-shared convolution representation strategy to adapt the physical prior structures of artifacts.
arXiv Detail & Related papers (2022-12-26T13:56:12Z)
ROCT-Net: A new ensemble deep convolutional model with improved spatial resolution learning for detecting common diseases from retinal OCT images [0.0]
This paper presents a new enhanced deep ensemble convolutional neural network for detecting retinal diseases from OCT images. Our model generates rich, multi-resolution features by employing the learning architectures of two robust convolutional models. Experiments on two datasets, comparing our model with other well-known deep convolutional neural networks, show that our architecture can increase classification accuracy by up to 5%.
arXiv Detail & Related papers (2022-03-03T17:51:01Z)
InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded. We analyze the CT values among different tissues, and merge the prior observations into a prior network for our InDuDoNet+, which significantly improves its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z)
DAN-Net: Dual-Domain Adaptive-Scaling Non-local Network for CT Metal Artifact Reduction [15.225899631788973]
Metal implants can heavily attenuate X-rays in computed tomography (CT) scans, leading to severe artifacts in reconstructed images. Several network models have been proposed for metal artifact reduction (MAR) in CT. We present a novel Dual-domain Adaptive-scaling Non-local network (DAN-Net) for MAR.
arXiv Detail & Related papers (2021-02-16T08:09:16Z)
Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN [52.851990439671475]
We propose a novel end-to-end GAN architecture that can generate high-resolution 3D images. We achieve this goal by using different configurations between training and inference. Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms the state of the art in image generation.
arXiv Detail & Related papers (2020-08-05T02:33:04Z)
Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation. We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.