Building RadiologyNET: Unsupervised annotation of a large-scale
multimodal medical database
- URL: http://arxiv.org/abs/2308.08517v1
- Date: Thu, 27 Jul 2023 13:00:33 GMT
- Title: Building RadiologyNET: Unsupervised annotation of a large-scale
multimodal medical database
- Authors: Mateja Napravnik, Franko Hržić, Sebastian Tschauner, Ivan Štajduhar
- Abstract summary: The usage of machine learning in medical diagnosis and treatment has witnessed significant growth in recent years.
However, the availability of large annotated image datasets remains a major obstacle since the process of annotation is time-consuming and costly.
This paper explores how to automatically annotate a database of medical radiology images with regard to their semantic similarity.
- Score: 0.4915744683251151
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Background and objective: The usage of machine learning in medical diagnosis
and treatment has witnessed significant growth in recent years through the
development of computer-aided diagnosis systems that often rely on annotated
medical radiology images. However, the availability of large
annotated image datasets remains a major obstacle since the process of
annotation is time-consuming and costly. This paper explores how to
automatically annotate a database of medical radiology images with regard to
their semantic similarity.
Material and methods: An automated, unsupervised approach is used to
construct a large annotated dataset of medical radiology images originating
from Clinical Hospital Centre Rijeka, Croatia, utilising multimodal sources,
including images, DICOM metadata, and narrative diagnoses. Several appropriate
feature extractors are tested for each of the data sources, and their utility
is evaluated using k-means and k-medoids clustering on a representative data
subset.
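As a rough illustration of this kind of evaluation, the sketch below clusters the embeddings produced by one candidate extractor and scores the result against reference labels available on the subset. The extractor names, the label source, the number of clusters, and the scoring metrics are assumptions made for the example, not the authors' exact protocol.

```python
# Illustrative sketch only: evaluate one candidate feature extractor by
# clustering its embeddings (k-means here; k-medoids from scikit-learn-extra
# could be swapped in) and scoring the clusters against reference labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, homogeneity_score

def evaluate_embeddings(embeddings: np.ndarray,
                        reference_labels: np.ndarray,
                        n_clusters: int = 50,
                        seed: int = 0) -> dict:
    """Cluster embeddings of a representative subset and report quality scores."""
    assignments = KMeans(n_clusters=n_clusters, n_init=10,
                         random_state=seed).fit_predict(embeddings)
    return {
        "silhouette": silhouette_score(embeddings, assignments),
        # e.g. homogeneity w.r.t. DICOM modality labels known for the subset
        "homogeneity": homogeneity_score(reference_labels, assignments),
    }

# Hypothetical usage: `candidates` maps each data source's extractor to its
# (n_samples, dim) embedding matrix computed on the representative subset.
rng = np.random.default_rng(0)
candidates = {
    "image_extractor": rng.random((1000, 128)),
    "dicom_metadata_extractor": rng.random((1000, 64)),
    "diagnosis_text_extractor": rng.random((1000, 256)),
}
reference = rng.integers(0, 8, size=1000)  # placeholder modality labels
for name, emb in candidates.items():
    print(name, evaluate_embeddings(emb, reference))
```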
Results: The optimal feature extractors are integrated into a multimodal
representation, which is then clustered to create an automated pipeline for
labelling a precursor dataset of 1,337,926 medical images into 50 clusters of
visually similar images. The quality of the clusters is assessed by examining
their homogeneity and mutual information, taking into account the anatomical
region and modality representation.
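One way such scores could be computed is sketched below, assuming per-image modality and anatomical-region labels can be read from the DICOM metadata; the function and variable names are illustrative.

```python
# Sketch: homogeneity and (normalized) mutual information of the 50 clusters
# with respect to DICOM-derived modality and anatomical-region labels.
from sklearn.metrics import homogeneity_score, normalized_mutual_info_score

def cluster_quality(cluster_ids, modality_labels, body_part_labels) -> dict:
    """Score cluster assignments against two DICOM-derived reference labels."""
    return {
        "homogeneity_modality": homogeneity_score(modality_labels, cluster_ids),
        "homogeneity_body_part": homogeneity_score(body_part_labels, cluster_ids),
        "nmi_modality": normalized_mutual_info_score(modality_labels, cluster_ids),
        "nmi_body_part": normalized_mutual_info_score(body_part_labels, cluster_ids),
    }
```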
Conclusion: The results suggest that fusing the embeddings of all three data
sources together works best for the task of unsupervised clustering of
large-scale medical data, resulting in the most concise clusters. Hence, this
work is the first step towards building a much larger and more fine-grained
annotated dataset of medical radiology images.
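A minimal sketch of one way the three embeddings could be fused before clustering is given below: standardize and reduce each source, then concatenate. Simple concatenation, PCA, and the dimensions chosen here are assumptions for illustration; the abstract does not specify the exact fusion procedure.

```python
# Minimal late-fusion sketch: standardize and reduce each source's embedding,
# concatenate into one multimodal vector per image, then cluster into 50 groups.
# The concatenation strategy, PCA, and dimensions are illustrative assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def fuse_and_cluster(image_emb: np.ndarray,
                     dicom_emb: np.ndarray,
                     text_emb: np.ndarray,
                     n_clusters: int = 50,
                     dim: int = 64) -> np.ndarray:
    parts = []
    for emb in (image_emb, dicom_emb, text_emb):
        scaled = StandardScaler().fit_transform(emb)
        parts.append(PCA(n_components=min(dim, emb.shape[1])).fit_transform(scaled))
    fused = np.hstack(parts)  # one fused multimodal vector per image
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(fused)
```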
Related papers
- Efficient Medical Image Retrieval Using DenseNet and FAISS for BIRADS Classification [0.0]
We propose an approach to medical image retrieval using DenseNet and FAISS.
DenseNet is well-suited for feature extraction in complex medical images.
FAISS enables efficient handling of high-dimensional data in large-scale datasets (a rough retrieval sketch follows after this list).
arXiv Detail & Related papers (2024-11-03T08:14:31Z)
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective via forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
- HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling [4.44283662576491]
We present a novel framework based on hypernetworks to fuse clinical imaging and tabular data by conditioning the image processing on the EHR's values and measurements.
We show that our framework outperforms both single-modality models and state-of-the-art MRI-tabular data fusion methods.
arXiv Detail & Related papers (2024-03-20T05:50:04Z)
- Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning [65.54680361074882]
Eye-gaze Guided Multi-modal Alignment (EGMA) framework harnesses eye-gaze data for better alignment of medical visual and textual features.
We conduct downstream tasks of image classification and image-text retrieval on four medical datasets.
arXiv Detail & Related papers (2024-03-19T03:59:14Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
- C^2M-DoT: Cross-modal consistent multi-view medical report generation with domain transfer network [67.97926983664676]
We propose a cross-modal consistent multi-view medical report generation method with a domain transfer network (C^2M-DoT).
C2M-DoT substantially outperforms state-of-the-art baselines in all metrics.
arXiv Detail & Related papers (2023-10-09T02:31:36Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- A Spatial Guided Self-supervised Clustering Network for Medical Image Segmentation [16.448375091671004]
We propose a new spatial guided self-supervised clustering network (SGSCN) for medical image segmentation.
It iteratively learns feature representations and clustering assignment of each pixel in an end-to-end fashion from a single image.
We evaluated our method on 2 public medical image datasets and compared it to existing conventional and self-supervised clustering methods.
arXiv Detail & Related papers (2021-07-11T00:40:40Z)
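As referenced in the DenseNet/FAISS entry above, the sketch below shows that retrieval pattern in rough form: pooled DenseNet-121 features indexed with an exact-L2 FAISS index. The backbone choice, preprocessing, pooling, and index type are assumptions for illustration, not details taken from that paper.

```python
# Rough sketch of DenseNet-feature extraction plus FAISS nearest-neighbour
# retrieval, as referenced in the DenseNet/FAISS entry above. Backbone,
# pooling, and the exact-L2 index are illustrative assumptions.
import numpy as np
import torch
import torch.nn.functional as F
import faiss
from torchvision.models import densenet121

# Random weights keep the sketch runnable offline; in practice a pretrained
# or domain-fine-tuned checkpoint would be loaded instead.
model = densenet121(weights=None).eval()

@torch.no_grad()
def embed(batch: torch.Tensor) -> np.ndarray:
    """Map a (N, 3, 224, 224) image batch to 1024-d pooled DenseNet features."""
    feats = model.features(batch)                        # (N, 1024, 7, 7)
    pooled = F.adaptive_avg_pool2d(feats, 1).flatten(1)  # (N, 1024)
    return pooled.cpu().numpy().astype("float32")        # FAISS expects float32

# Index a (placeholder) gallery and retrieve the 5 nearest images to a query.
gallery = embed(torch.rand(100, 3, 224, 224))
index = faiss.IndexFlatL2(gallery.shape[1])
index.add(gallery)
distances, neighbours = index.search(embed(torch.rand(1, 3, 224, 224)), 5)
```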