Related papers: A Semantically-Aware Relevance Measure for Content-Based Medical Image Retrieval Evaluation

URL: http://arxiv.org/abs/2506.13509v1
Date: Mon, 16 Jun 2025 14:04:48 GMT
Title: A Semantically-Aware Relevance Measure for Content-Based Medical Image Retrieval Evaluation
Authors: Xiaoyang Wei, Camille Kurtz, Florence Cloppet
Abstract summary: We propose a novel relevance measure for the evaluation of CBIR by defining an approximate matching-based relevance score between two sets of medical concepts. We quantitatively demonstrate the effectiveness and feasibility of our relevance measure using a public dataset.
Score: 0.4915744683251149
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Performance evaluation for Content-Based Image Retrieval (CBIR) remains a crucial but unsolved problem today, especially in the medical domain. Various evaluation metrics have been discussed in the literature to solve this problem. Most of the existing metrics (e.g., precision, recall) are adapted from classification tasks, which require manual labels as ground truth. However, such labels are often expensive and unavailable in specific thematic domains. Furthermore, medical images are usually associated with (radiological) case reports or annotated with descriptive captions in literature figures; such text contains information that can help to assess CBIR. Several researchers have argued that the medical concepts hidden in the text can serve as the basis for CBIR evaluation. However, these works often treat these medical concepts as independent and isolated labels, neglecting the subtle relationships between various concepts. In this work, we introduce the use of knowledge graphs to measure the distance between various medical concepts and propose a novel relevance measure for the evaluation of CBIR by defining an approximate matching-based relevance score between two sets of medical concepts, which allows us to indirectly measure the similarity between medical images. We quantitatively demonstrate the effectiveness and feasibility of our relevance measure using a public dataset.
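
The abstract does not give the exact formula, so the following is only a minimal sketch of what an approximate matching-based relevance score between two concept sets might look like. Everything in it is an assumption made for illustration: the toy graph `TOY_GRAPH`, the hop-count distance, the `1 / (1 + d)` similarity, and the symmetric best-match averaging are hypothetical choices, not the authors' method.

```python
from collections import deque

# Hypothetical toy knowledge graph (undirected adjacency lists).
# A real setup would use a medical ontology; this is illustrative only.
TOY_GRAPH = {
    "lung": ["pneumonia", "organ"],
    "pneumonia": ["lung", "infection"],
    "infection": ["pneumonia"],
    "organ": ["lung", "liver"],
    "liver": ["organ"],
    "fracture": [],
}

def shortest_path_length(graph, source, target):
    """Return the BFS hop count between two concepts, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def concept_similarity(graph, a, b):
    """Assumed similarity: 1 / (1 + hop distance); 0 when no path exists."""
    d = shortest_path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

def directed_score(graph, query, candidate):
    """Average, over query concepts, of the best match in the candidate set."""
    if not query or not candidate:
        return 0.0
    best = [max(concept_similarity(graph, q, c) for c in candidate) for q in query]
    return sum(best) / len(query)

def relevance(graph, concepts_a, concepts_b):
    """Symmetric approximate-matching score in [0, 1] between two concept sets."""
    return 0.5 * (directed_score(graph, concepts_a, concepts_b)
                  + directed_score(graph, concepts_b, concepts_a))
```

Under these assumptions, identical concept sets score 1, sets whose concepts are close in the graph score between 0 and 1, and sets with no connecting path score 0, which is the kind of graded relevance that exact label matching cannot express.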

Related papers

Metrics that matter: Evaluating image quality metrics for medical image generation [48.85783422900129]
This study comprehensively assesses commonly used no-reference image quality metrics using brain MRI data. We evaluate metric sensitivity to a range of challenges, including noise, distribution shifts, and, critically, morphological alterations designed to mimic clinically relevant inaccuracies.
arXiv Detail & Related papers (2025-05-12T01:57:25Z)
Fréchet Radiomic Distance (FRD): A Versatile Metric for Comparing Medical Imaging Datasets [13.737058479403311]
We introduce a new perceptual metric tailored for medical images, FRD (Fréchet Radiomic Distance). We show that FRD is superior to other image distribution metrics for a range of medical imaging applications. FRD offers additional benefits such as stability and computational efficiency at low sample sizes.
arXiv Detail & Related papers (2024-12-02T13:49:14Z)
Image-aware Evaluation of Generated Medical Reports [11.190146577567548]
The paper proposes a novel evaluation metric for automatic medical report generation from X-ray images, VLScore. The key idea of our metric is to measure the similarity between radiology reports while considering the corresponding image. We demonstrate the benefit of our metric through evaluation on a dataset where radiologists marked errors in pairs of reports, showing notable alignment with radiologists' judgments.
arXiv Detail & Related papers (2024-10-22T18:50:20Z)
FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging [68.6715007665896]
FedMedICL is a unified framework and benchmark to holistically evaluate federated medical imaging challenges. We comprehensively evaluate several popular methods on six diverse medical imaging datasets. We find that a simple batch balancing technique surpasses advanced methods in average performance across FedMedICL experiments.
arXiv Detail & Related papers (2024-07-11T19:12:23Z)
RaTEScore: A Metric for Radiology Report Generation [59.37561810438641]
This paper introduces a novel, entity-aware metric, Radiological Report (Text) Evaluation (RaTEScore). RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions. Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, validated both on established public benchmarks and our newly proposed RaTE-Eval benchmark.
arXiv Detail & Related papers (2024-06-24T17:49:28Z)
Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z)
Semantic Textual Similarity Assessment in Chest X-ray Reports Using a Domain-Specific Cosine-Based Metric [1.7802147489386628]
We introduce a novel approach designed specifically for assessing the semantic similarity between generated medical reports and the ground truth. Our approach is validated, demonstrating its efficiency in assessing domain-specific semantic similarity within medical contexts.
arXiv Detail & Related papers (2024-02-19T07:48:25Z)
MISm: A Medical Image Segmentation Metric for Evaluation of weak labeled Data [0.440401067183266]
We propose a new medical image segmentation metric: MISm. To allow application in the community and to support reproducibility of experimental results, we included MISm in the publicly available evaluation framework MISeval.
arXiv Detail & Related papers (2022-10-24T22:55:00Z)
Impact of detecting clinical trial elements in exploration of COVID-19 literature [29.027162080682643]
We compare the results retrieved by a standard search engine with those filtered using clinically-relevant concepts and their relations. We find that the relational concept selection filters the original retrieved collection in a way that decreases the proportion of unjudged documents.
arXiv Detail & Related papers (2021-05-25T23:41:24Z)
Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition [142.42920413017163]
Current methods often generate the most common sentences for individual cases due to dataset bias. We propose a novel framework that unifies template retrieval and sentence generation to handle both common and rare abnormalities.
arXiv Detail & Related papers (2021-01-09T04:33:27Z)
Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification. It exploits the unlabeled data by encouraging prediction consistency for a given input under perturbations. Our method outperforms many state-of-the-art semi-supervised learning methods in both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.