Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics
- URL: http://arxiv.org/abs/2408.06075v2
- Date: Thu, 24 Oct 2024 08:15:16 GMT
- Title: Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics
- Authors: Melanie Dohmen, Tuan Truong, Ivo M. Baltruschat, Matthias Lenga,
- Abstract summary: Reference metrics have been developed to objectively and quantitatively compare two images.
The correlation of reference metrics and human perception of quality can vary strongly for different kinds of distortions.
We selected five pitfalls that showcase unexpected and probably undesired reference metric scores.
- Score: 0.9582978458237521
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reference metrics have been developed to objectively and quantitatively compare two images. These metrics have proven especially useful for evaluating the quality of reconstructed or compressed images. Extensive tests of such metrics on benchmarks of artificially distorted natural images have revealed which metrics best correlate with human perception of quality. Direct transfer of these metrics to the evaluation of generative models in medical imaging, however, can easily lead to pitfalls, because assumptions about image content, image data format and image interpretation are often very different. Also, the correlation of reference metrics and human perception of quality can vary strongly for different kinds of distortions, and commonly used metrics such as SSIM, PSNR and MAE are not the best choice for all situations. We selected five pitfalls that showcase unexpected and probably undesired reference metric scores and discuss strategies to avoid them.
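As a rough illustration of how two of the reference metrics named in the abstract are computed, the following is a minimal sketch, not the authors' code. The function names `mae` and `psnr`, the toy data, and the two `data_range` values are assumptions chosen for illustration. It shows one way an assumption about the image data format (the intensity range passed to PSNR) can change a score while the images stay the same; it is not claimed to reproduce any of the five pitfalls from the paper.

```python
import numpy as np


def mae(reference, synthetic):
    """Mean absolute error between two arrays of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    synthetic = np.asarray(synthetic, dtype=np.float64)
    return float(np.mean(np.abs(reference - synthetic)))


def psnr(reference, synthetic, data_range):
    """Peak signal-to-noise ratio in dB for a stated intensity range."""
    reference = np.asarray(reference, dtype=np.float64)
    synthetic = np.asarray(synthetic, dtype=np.float64)
    mse = np.mean((reference - synthetic) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Hypothetical toy data: a random "reference" image and a copy
# shifted by a constant intensity offset of 10.
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1000.0, size=(64, 64))
synthetic = reference + 10.0

mae_value = mae(reference, synthetic)
# Same image pair, two different assumed intensity ranges.
psnr_narrow = psnr(reference, synthetic, data_range=1000.0)
psnr_wide = psnr(reference, synthetic, data_range=4095.0)

print(f"MAE: {mae_value:.3f}")
print(f"PSNR (range 1000): {psnr_narrow:.2f} dB")
print(f"PSNR (range 4095): {psnr_wide:.2f} dB")
```

In this sketch the MAE is unaffected by the assumed range, while PSNR increases when a wider range is declared, so PSNR values are only comparable when the same range convention is used.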
Related papers
- Non-Reference Quality Assessment for Medical Imaging: Application to Synthetic Brain MRIs [0.0]
This study introduces a novel deep learning-based non-reference approach to assess brain MRI quality by training a 3D ResNet.
The network is designed to estimate quality across six distinct artifacts commonly encountered in MRI scans.
Results demonstrate superior performance in accurately estimating distortions and reflecting image quality from multiple perspectives.
arXiv Detail & Related papers (2024-07-20T22:05:30Z)
- Similarity and Quality Metrics for MR Image-To-Image Translation [0.8932296777085644]
We quantitatively analyze 11 similarity (reference) and 12 quality (non-reference) metrics for assessing synthetic images.
We investigate the sensitivity regarding 11 kinds of distortions and typical MR artifacts, and analyze the influence of different normalization methods on each metric and distortion.
arXiv Detail & Related papers (2024-05-14T08:51:16Z)
- Compressed image quality assessment using stacking [4.971244477217376]
Generalization can be regarded as the major challenge in compressed image quality assessment.
Both semantic and low-level information are employed in the presented IQA to predict the human visual system.
The method achieved an accuracy of 79.6% on the quality benchmark of the clic2024 perceptual image challenge.
arXiv Detail & Related papers (2024-02-01T20:12:26Z)
- Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? [86.58989831070426]
We study the faithfulness of hand-crafted metrics to human perception of privacy information from reconstructed images.
We propose a learning-based measure called SemSim to evaluate the Semantic Similarity between the original and reconstructed images.
arXiv Detail & Related papers (2023-09-22T17:58:04Z)
- Introspective Deep Metric Learning for Image Retrieval [80.29866561553483]
We argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training.
We propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describes the semantic characteristics and ambiguity of an image, respectively.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets.
arXiv Detail & Related papers (2022-05-09T17:51:44Z)
- Image Quality Assessment for Magnetic Resonance Imaging [4.05136808278614]
Image quality assessment (IQA) algorithms aim to reproduce human perception of image quality.
We use outputs of neural network models trained to solve problems relevant to MRI.
Seven trained radiologists assess distorted images, with their verdicts then correlated with 35 different image quality metrics.
arXiv Detail & Related papers (2022-03-15T11:52:29Z)
- Transparent Human Evaluation for Image Captioning [70.03979566548823]
We develop a rubric-based human evaluation protocol for image captioning models.
We show that human-generated captions show substantially higher quality than machine-generated ones.
We hope that this work will promote a more transparent evaluation protocol for image captioning.
arXiv Detail & Related papers (2021-11-17T07:09:59Z)
- Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment [157.1292674649519]
We propose a practical solution named degraded-reference IQA (DR-IQA).
DR-IQA exploits the inputs of IR models, degraded images, as references.
Our results can even be close to the performance of full-reference settings.
arXiv Detail & Related papers (2021-08-18T02:35:08Z)
- Malignancy Prediction and Lesion Identification from Clinical Dermatological Images [65.1629311281062]
We consider machine-learning-based malignancy prediction and lesion identification from clinical dermatological images.
The approach first identifies all lesions present in the image regardless of sub-type or likelihood of malignancy, then estimates their likelihood of malignancy, and through aggregation also generates an image-level likelihood of malignancy.
arXiv Detail & Related papers (2021-04-02T20:52:05Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging prediction consistency for a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.