Related papers: A study of why we need to reassess full reference image quality assessment with medical images

URL: http://arxiv.org/abs/2405.19097v4
Date: Fri, 14 Mar 2025 11:56:29 GMT
Title: A study of why we need to reassess full reference image quality assessment with medical images
Authors: Anna Breger, Ander Biguri, Malena Sabaté Landman, Ian Selby, Nicole Amberg, Elisabeth Brunner, Janek Gröhl, Sepideh Hatamikia, Clemens Karner, Lipeng Ning, Sören Dittmer, Michael Roberts, AIX-COVNET Collaboration, Carola-Bibiane Schönlieb
Abstract summary: PSNR and SSIM are known and tested for working successfully in many natural imaging tasks. Discrepancies in medical scenarios have been reported, highlighting the gap between development and actual clinical application. This paper provides a structured and comprehensive overview of examples where PSNR and SSIM prove to be unsuitable for the assessment of novel algorithms.
Score: 7.018256825895632
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Image quality assessment (IQA) is indispensable in clinical practice to ensure high standards, as well as in the development stage of machine learning algorithms that operate on medical images. The popular full reference (FR) IQA measures PSNR and SSIM are known and tested for working successfully in many natural imaging tasks, but discrepancies in medical scenarios have been reported in the literature, highlighting the gap between development and actual clinical application. Such inconsistencies are not surprising, as medical images have very different properties than natural images, and PSNR and SSIM have neither been targeted nor properly tested for medical images. This may cause unforeseen problems in clinical applications due to wrong judgment of novel methods. This paper provides a structured and comprehensive overview of examples where PSNR and SSIM prove to be unsuitable for the assessment of novel algorithms using different kinds of medical images, including real-world MRI, CT, OCT, X-Ray, digital pathology and photoacoustic imaging data. Therefore, improvement is urgently needed in particular in this era of AI to increase reliability and explainability in machine learning for medical imaging and beyond. Lastly, we will provide ideas for future research as well as suggesting guidelines for the usage of FR-IQA measures applied to medical images.

Related papers

PhotIQA: A photoacoustic image data set with image quality ratings [7.753621023890248]
PhotIQA is a data set consisting of 1134 reconstructed photoacoustic (PA) images rated by 2 experts across five quality properties. Our baseline experiments show that HaarPSI$_{med}$ significantly outperforms SSIM in correlating with the quality ratings.
arXiv Detail & Related papers (2025-07-04T11:06:54Z)
Metrics that matter: Evaluating image quality metrics for medical image generation [48.85783422900129]
This study comprehensively assesses commonly used no-reference image quality metrics using brain MRI data. We evaluate metric sensitivity to a range of challenges, including noise, distribution shifts, and, critically, morphological alterations designed to mimic clinically relevant inaccuracies.
arXiv Detail & Related papers (2025-05-12T01:57:25Z)
RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining [48.21287619304126]
We propose a novel methodology that leverages dense radiology reports to define image-wise similarity ordering at multiple granularities. We construct two comprehensive medical imaging retrieval datasets: MIMIC-IR for Chest X-rays and CTRATE-IR for CT scans. We develop two retrieval systems, RadIR-CXR and model-ChestCT, which demonstrate superior performance in traditional image-image and image-report retrieval tasks.
arXiv Detail & Related papers (2025-03-06T17:43:03Z)
Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering [70.44269982045415]
Retrieval-augmented generation (RAG) has emerged as a promising approach to enhance the performance of large language models (LLMs). We introduce Medical Retrieval-Augmented Generation Benchmark (MedRGB) that provides various supplementary elements to four medical QA datasets. Our experimental results reveal current models' limited ability to handle noise and misinformation in the retrieved documents.
arXiv Detail & Related papers (2024-11-14T06:19:18Z)
Clinical Evaluation of Medical Image Synthesis: A Case Study in Wireless Capsule Endoscopy [63.39037092484374]
This study focuses on the clinical evaluation of medical Synthetic Data Generation using Artificial Intelligence (AI) models. The paper contributes by a) presenting a protocol for the systematic evaluation of synthetic images by medical experts and b) applying it to assess TIDE-II, a novel variational autoencoder-based model for high-resolution WCE image synthesis. The results show that TIDE-II generates clinically relevant WCE images, helping to address data scarcity and enhance diagnostic tools.
arXiv Detail & Related papers (2024-10-31T19:48:50Z)
Evidence Is All You Need: Ordering Imaging Studies via Language Model Alignment with the ACR Appropriateness Criteria [22.897900474995012]
We introduce a framework to intelligently leverage language models by recommending imaging studies for patient cases aligned with evidence-based guidelines. We make available a novel dataset of patient "one-liner" scenarios to power our experiments, and optimize state-of-the-art language models to achieve an accuracy on par with clinicians in image ordering.
arXiv Detail & Related papers (2024-09-27T23:13:17Z)
SeCo-INR: Semantically Conditioned Implicit Neural Representations for Improved Medical Image Super-Resolution [25.078280843551322]
Implicit Neural Representations (INRs) have recently advanced the field of deep learning due to their ability to learn continuous representations of signals. We propose a novel framework, referred to as the Semantically Conditioned INR (SeCo-INR), that conditions an INR using local priors from a medical image. Our framework learns a continuous representation of the semantic segmentation features of a medical image and utilizes it to derive the optimal INR for each semantic region of the image.
arXiv Detail & Related papers (2024-09-02T07:45:06Z)
A study on the adequacy of common IQA measures for medical images [6.580928439802918]
The most commonly used IQA measures have been developed and tested for natural images, but not in the medical setting. In this study, we test the applicability of common IQA measures for medical image data by comparing their assessment to manually rated chest X-ray (5 experts) and photoacoustic image data (2 experts).
arXiv Detail & Related papers (2024-05-29T16:04:03Z)
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection. Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels. Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information. The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
Implicit Neural Representation in Medical Imaging: A Comparative Survey [3.478921293603811]
Implicit neural representations (INRs) have gained prominence as a powerful paradigm in scene reconstruction and computer graphics. This survey aims to provide a comprehensive overview of INR models in the field of medical imaging.
arXiv Detail & Related papers (2023-07-30T06:39:25Z)
A Trustworthy Framework for Medical Image Analysis with Deep Learning [71.48204494889505]
TRUDLMIA is a trustworthy deep learning framework for medical image analysis. It is anticipated that the framework will support researchers and clinicians in advancing the use of deep learning for dealing with public health crises including COVID-19.
arXiv Detail & Related papers (2022-12-06T05:30:22Z)
Automated SSIM Regression for Detection and Quantification of Motion Artefacts in Brain MR Images [54.739076152240024]
Motion artefacts in magnetic resonance brain images are a crucial issue. The assessment of MR image quality is fundamental before proceeding with the clinical diagnosis. An automated image quality assessment based on the structural similarity index (SSIM) regression has been proposed here.
arXiv Detail & Related papers (2022-06-14T10:16:54Z)
Assessing the ability of generative adversarial networks to learn canonical medical image statistics [10.479865560555199]
Generative adversarial networks (GANs) have gained tremendous popularity for potential applications in medical imaging. It is not clear if modern GANs reliably learn the statistics that are meaningful to a downstream medical imaging application. In this work, the ability of a state-of-the-art GAN to learn the statistics of canonical stochastic image models (SIMs) that are relevant to objective assessment of image quality is investigated.
arXiv Detail & Related papers (2022-04-26T00:30:01Z)
Image Quality Assessment for Magnetic Resonance Imaging [4.05136808278614]
Image quality assessment (IQA) algorithms aim to reproduce the human's perception of the image quality. We use outputs of neural network models trained to solve problems relevant to MRI. Seven trained radiologists assess distorted images, with their verdicts then correlated with 35 different image quality metrics.
arXiv Detail & Related papers (2022-03-15T11:52:29Z)
Artifact- and content-specific quality assessment for MRI with image rulers [11.551528894727573]
In clinical practice MR images are often first seen by radiologists long after the scan. If image quality is inadequate either patients have to return for an additional scan, or a suboptimal interpretation is rendered. We propose a framework with multi-task CNN model trained with calibrated labels and inferenced with image rulers.
arXiv Detail & Related papers (2021-11-06T02:17:12Z)
Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning. We generate a corresponding radiology image in a target domain while preserving the identity of the patient. We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment [157.1292674649519]
We propose a practical solution named degraded-reference IQA (DR-IQA). DR-IQA exploits the inputs of IR models, degraded images, as references. Our results can even be close to the performance of full-reference settings.
arXiv Detail & Related papers (2021-08-18T02:35:08Z)
Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization [112.2628296775395]
Clinical decision support using deep neural networks has become a topic of steadily growing interest. Clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered to be intransparent and difficult to comprehend. We propose a novel decision explanation scheme based on CycleGAN activation which generates high-quality visualizations of classifier decisions even in smaller data sets.
arXiv Detail & Related papers (2020-10-09T14:39:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.