LIMITR: Leveraging Local Information for Medical Image-Text
Representation
- URL: http://arxiv.org/abs/2303.11755v1
- Date: Tue, 21 Mar 2023 11:20:34 GMT
- Title: LIMITR: Leveraging Local Information for Medical Image-Text
Representation
- Authors: Gefen Dawidowicz, Elad Hirsch, Ayellet Tal
- Abstract summary: This paper focuses on chest X-ray images and their corresponding radiological reports.
It presents a new model that learns a joint X-ray image & report representation.
- Score: 17.102338932907294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical imaging analysis plays a critical role in the diagnosis and treatment
of various medical conditions. This paper focuses on chest X-ray images and
their corresponding radiological reports. It presents a new model that learns a
joint X-ray image & report representation. The model is based on a novel
alignment scheme between the visual data and the text, which takes into account
both local and global information. Furthermore, the model integrates
domain-specific information of two types -- lateral images and the consistent
visual structure of chest images. Our representation is shown to benefit three
types of retrieval tasks: text-image retrieval, class-based retrieval, and
phrase-grounding.
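The local-plus-global alignment idea described in the abstract can be illustrated with a minimal retrieval-scoring sketch. Everything below (function names, embedding dimensions, the mixing weight `alpha`, the max-over-regions pooling) is an illustrative assumption, not the paper's actual model or code.

```python
# Minimal sketch: score an image-report pair by combining a global
# similarity with a local (region-word) alignment term, in the spirit
# of a local+global alignment scheme. Shapes and weights are assumed.
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors, with a small epsilon
    # to avoid division by zero.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def local_alignment_score(regions, words):
    # For each word embedding, take its best-matching image region
    # (max over regions), then average over all words.
    sims = np.array([[cosine(r, w) for w in words] for r in regions])
    return sims.max(axis=0).mean()

def joint_score(img_global, txt_global, regions, words, alpha=0.5):
    # Weighted combination of global and local similarity; alpha is
    # a hypothetical mixing weight, not taken from the paper.
    return (alpha * cosine(img_global, txt_global)
            + (1 - alpha) * local_alignment_score(regions, words))

# Toy example with random embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
d = 64
img_g, txt_g = rng.normal(size=d), rng.normal(size=d)
regions = rng.normal(size=(49, d))  # e.g. a 7x7 grid of visual patches
words = rng.normal(size=(20, d))    # report token embeddings
score = joint_score(img_g, txt_g, regions, words)
print(round(float(score), 4))
```

In a retrieval setting, such a score would be computed for every candidate pair and ranked; the local term lets individual report phrases attend to specific image regions, which is what also makes phrase-grounding possible.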
Related papers
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle-consistency objective via forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z) - Self-supervised vision-language alignment of deep learning representations for bone X-rays analysis [53.809054774037214]
This paper proposes leveraging vision-language pretraining on bone X-rays paired with French reports.
It is the first study to integrate French reports in shaping the embedding space for bone X-ray representations.
arXiv Detail & Related papers (2024-05-14T19:53:20Z) - A Novel Corpus of Annotated Medical Imaging Reports and Information Extraction Results Using BERT-based Language Models [4.023338734079828]
Medical imaging is critical to the diagnosis, surveillance, and treatment of many health conditions.
Radiologists interpret these complex images and articulate their assessments through narrative reports that remain largely unstructured.
This unstructured narrative must be converted into a structured semantic representation to facilitate secondary applications such as retrospective analyses or clinical decision support.
arXiv Detail & Related papers (2024-03-27T19:43:45Z) - Unified Medical Image Pre-training in Language-Guided Common Semantic Space [39.61770813855078]
We propose a Unified Medical Image Pre-training framework, namely UniMedI.
UniMedI uses diagnostic reports as a common semantic space to create unified representations for diverse modalities of medical images.
We evaluate its performance on both 2D and 3D images across 10 different datasets.
arXiv Detail & Related papers (2023-11-24T22:01:12Z) - Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable-sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - Cyclic Generative Adversarial Networks With Congruent Image-Report
Generation For Explainable Medical Image Analysis [5.6512908295414]
We present a novel framework for explainable labeling and interpretation of medical images.
The aim of the work is to generate trustworthy and faithful explanations for the outputs of a model diagnosing chest x-ray images.
arXiv Detail & Related papers (2022-11-16T12:41:21Z) - Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, as well as the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z) - Variational Topic Inference for Chest X-Ray Report Generation [102.04931207504173]
Report generation for medical imaging promises to reduce workload and assist diagnosis in clinical practice.
Recent work has shown that deep learning models can successfully caption natural images.
We propose variational topic inference for automatic report generation.
arXiv Detail & Related papers (2021-07-15T13:34:38Z) - Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report
Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z) - Show, Describe and Conclude: On Exploiting the Structure Information of
Chest X-Ray Reports [5.6070625920019825]
Chest X-Ray (CXR) images are commonly used for clinical screening and diagnosis.
The complex structures between and within sections of the reports pose a great challenge to the automatic report generation.
We propose a novel framework that exploits the structure information between and within report sections for generating CXR imaging reports.
arXiv Detail & Related papers (2020-04-26T02:29:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.