Enhanced Knowledge Injection for Radiology Report Generation
- URL: http://arxiv.org/abs/2311.00399v1
- Date: Wed, 1 Nov 2023 09:50:55 GMT
- Title: Enhanced Knowledge Injection for Radiology Report Generation
- Authors: Qingqiu Li, Jilan Xu, Runtian Yuan, Mohan Chen, Yuejie Zhang, Rui
Feng, Xiaobo Zhang, Shang Gao
- Abstract summary: We propose an enhanced knowledge injection framework, which utilizes two branches to extract different types of knowledge.
By integrating this finer-grained and well-structured knowledge with the current image, we are able to leverage the multi-source knowledge gain to ultimately facilitate more accurate report generation.
- Score: 21.937372129714884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic generation of radiology reports holds crucial clinical value, as it
can alleviate substantial workload on radiologists and remind less experienced
ones of potential anomalies. Despite the remarkable performance of various
image captioning methods in the natural image field, generating accurate
reports for medical images still faces challenges, namely disparities between
visual and textual data and a lack of accurate domain knowledge. To address these
issues, we propose an enhanced knowledge injection framework, which utilizes
two branches to extract different types of knowledge. The Weighted Concept
Knowledge (WCK) branch is responsible for introducing clinical medical concepts
weighted by TF-IDF scores. The Multimodal Retrieval Knowledge (MRK) branch
extracts triplets from similar reports, emphasizing crucial clinical
information related to entity positions and existence. By integrating this
finer-grained and well-structured knowledge with the current image, we are able
to leverage the multi-source knowledge gain to ultimately facilitate more
accurate report generation. Extensive experiments have been conducted on two
public benchmarks, demonstrating that our method achieves superior performance
over other state-of-the-art methods. Ablation studies further validate the
effectiveness of two extracted knowledge sources.
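The abstract's WCK branch weights clinical concepts by TF-IDF. A minimal sketch of that weighting scheme is below; the toy report corpus, concept list, and `tfidf_weights` function are invented for illustration and are not the paper's data or implementation.

```python
import math
from collections import Counter

def tfidf_weights(reports, concepts):
    """Weight clinical concepts per report by TF-IDF.

    Illustrative sketch only: plain whitespace tokenization and the
    standard tf * log(N/df) formula, not the paper's exact pipeline.
    """
    n = len(reports)
    tokenized = [r.lower().split() for r in reports]
    # Document frequency: how many reports mention each concept.
    df = {c: sum(1 for toks in tokenized if c in toks) for c in concepts}
    weights = []
    for toks in tokenized:
        tf = Counter(toks)  # term frequency within this report
        w = {c: tf[c] * math.log(n / df[c]) if df[c] else 0.0
             for c in concepts}
        weights.append(w)
    return weights

# Toy corpus of three report snippets and two clinical concepts.
reports = [
    "mild cardiomegaly with clear lungs",
    "no cardiomegaly lungs are clear",
    "focal opacity in the right lower lobe",
]
concepts = ["cardiomegaly", "opacity"]
w = tfidf_weights(reports, concepts)
```

A concept that appears in fewer reports (here, "opacity") receives a higher weight where it does occur, which is the behavior the WCK branch relies on to emphasize discriminative clinical terms.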
Related papers
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder
and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report
Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
- KiUT: Knowledge-injected U-Transformer for Radiology Report Generation [10.139767157037829]
Radiology report generation aims to automatically generate a clinically accurate and coherent paragraph from the X-ray image.
We propose a Knowledge-injected U-Transformer (KiUT) to learn multi-level visual representation and adaptively distill the information.
arXiv Detail & Related papers (2023-06-20T07:27:28Z)
- K-Diag: Knowledge-enhanced Disease Diagnosis in Radiographic Imaging [40.52487429030841]
We propose a knowledge-enhanced framework that enables training visual representations under the guidance of medical domain knowledge.
First, to explicitly incorporate experts' knowledge, we propose to learn a neural representation for the medical knowledge graph.
Second, while training the visual encoder, we keep the parameters of the knowledge encoder frozen and propose to learn a set of prompt vectors for efficient adaptation.
arXiv Detail & Related papers (2023-02-22T18:53:57Z)
- Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation [116.87918100031153]
We propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG).
CGT injects clinical relation triples into the visual features as prior knowledge to drive the decoding procedure.
Experiments on the large-scale FFA-IR benchmark demonstrate that the proposed CGT is able to outperform previous benchmark methods.
arXiv Detail & Related papers (2022-06-04T13:16:30Z)
- Radiology Report Generation with a Learned Knowledge Base and
Multi-modal Alignment [27.111857943935725]
We present an automatic, multi-modal approach for report generation from chest X-rays.
Our approach features two distinct modules: (i) Learned knowledge base and (ii) Multi-modal alignment.
With the aid of both modules, our approach clearly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-12-30T10:43:56Z)
- Knowledge Matters: Radiology Report Generation with General and Specific
Knowledge [24.995748604459013]
We propose a knowledge-enhanced radiology report generation approach.
By merging the visual features of the radiology image with general knowledge and specific knowledge, the proposed model can improve the quality of generated reports.
arXiv Detail & Related papers (2021-12-30T10:36:04Z)
- Exploring and Distilling Posterior and Prior Knowledge for Radiology
Report Generation [55.00308939833555]
The PPKED framework includes three modules: the Posterior Knowledge Explorer (PoKE), the Prior Knowledge Explorer (PrKE), and the Multi-domain Knowledge Distiller (MKD).
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z)
- Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.