Factored Attention and Embedding for Unstructured-view Topic-related
Ultrasound Report Generation
- URL: http://arxiv.org/abs/2203.06458v1
- Date: Sat, 12 Mar 2022 15:24:03 GMT
- Title: Factored Attention and Embedding for Unstructured-view Topic-related
Ultrasound Report Generation
- Authors: Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xuri Ge, Shengchuang Zhang,
Xiaojing Ma, Yue Gao
- Abstract summary: We propose a novel factored attention and embedding model (termed FAE-Gen) for unstructured-view topic-related ultrasound report generation.
The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristics across different views.
- Score: 70.7778938191405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Echocardiography is widely used in clinical practice for diagnosis and
treatment, e.g., of common congenital heart defects. Traditional manual
interpretation is error-prone owing to staff shortages, excessive workloads, and
limited experience, creating an urgent need for an automated computer-aided
reporting system that considerably lightens the workload of ultrasonologists
and assists them in decision making. Despite some recent successful attempts at
automatic medical report generation, existing methods fall short on ultrasound
report generation, which involves unstructured-view images and topic-related
descriptions. To this end, we investigate the task of unstructured-view
topic-related ultrasound report generation and propose a novel factored
attention and embedding model (termed FAE-Gen). The proposed FAE-Gen mainly
consists of two modules, i.e., view-guided factored attention and
topic-oriented factored embedding, which 1) capture the homogeneous and
heterogeneous morphological characteristics across different views, and 2)
generate descriptions with different syntactic patterns and different emphatic
contents for different topics. Experimental evaluations are conducted on a
to-be-released large-scale clinical cardiovascular ultrasound dataset
(CardUltData). Both quantitative comparisons and qualitative analysis on seven
commonly-used metrics demonstrate the effectiveness and superiority of FAE-Gen.
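The abstract gives no implementation details, so the following is a minimal, hypothetical sketch of what a view-guided factored attention could look like in PyTorch: a shared decoder query first attends within each view (homogeneous characteristics), and the per-view summaries are then attended across views (heterogeneous characteristics). All module and variable names are illustrative assumptions, not the authors' code.
```python
import torch
import torch.nn as nn

class ViewGuidedFactoredAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.intra_view = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.cross_view = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, query, view_feats):
        # query: (B, 1, D) current decoder state
        # view_feats: list of (B, N_v, D) feature grids, one per ultrasound view
        # Stage 1: attend within each view -> one summary vector per view.
        summaries = [self.intra_view(query, f, f)[0] for f in view_feats]
        views = torch.cat(summaries, dim=1)            # (B, V, D)
        # Stage 2: attend across the view summaries to fuse them.
        fused, weights = self.cross_view(query, views, views)
        return fused, weights                          # (B, 1, D), (B, 1, V)

attn = ViewGuidedFactoredAttention(dim=256)
feats = [torch.randn(2, 49, 256) for _ in range(3)]   # three views, 7x7 grids
fused, view_weights = attn(torch.randn(2, 1, 256), feats)
```
The two-stage factorization keeps attention cost linear in the number of views while still letting the decoder weight views against each other.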
Related papers
- FODA-PG for Enhanced Medical Imaging Narrative Generation: Adaptive Differentiation of Normal and Abnormal Attributes [26.912139217120874]
We propose FODA-PG, a novel Fine-grained Organ-Disease Adaptive Partitioning Graph framework.
FODA-PG constructs a granular representation of radiological findings by separating disease-related attributes into distinct "disease-specific" and "disease-free" categories.
By integrating this fine-grained semantic knowledge into a powerful transformer-based architecture, FODA-PG generates precise and clinically coherent reports.
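As a rough, hypothetical illustration of the partitioning idea described above (the paper's actual graph construction is certainly richer), findings can be bucketed per organ into "disease-specific" and "disease-free" attribute lists:
```python
from collections import defaultdict

def partition_findings(findings):
    """findings: list of (organ, attribute, is_abnormal) tuples."""
    graph = defaultdict(lambda: {"disease-specific": [], "disease-free": []})
    for organ, attribute, is_abnormal in findings:
        bucket = "disease-specific" if is_abnormal else "disease-free"
        graph[organ][bucket].append(attribute)
    return dict(graph)

nodes = partition_findings([
    ("lung", "opacity", True),
    ("lung", "clear costophrenic angles", False),
    ("heart", "cardiomegaly", True),
])
```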
arXiv Detail & Related papers (2024-09-06T00:04:35Z)
- Ultrasound Report Generation with Cross-Modality Feature Alignment via Unsupervised Guidance [37.37279393074854]
We propose a novel framework for automatic ultrasound report generation, leveraging a combination of unsupervised and supervised learning methods.
Our framework incorporates unsupervised learning methods to extract potential knowledge from ultrasound text reports.
We design a global semantic comparison mechanism that guides the model toward more comprehensive and accurate medical reports.
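A plausible minimal sketch of such a global semantic comparison, assuming pooled image and report embeddings and an InfoNCE-style objective (our assumption; the paper's exact mechanism may differ):
```python
import torch
import torch.nn.functional as F

def global_semantic_comparison(img_emb, txt_emb, temperature=0.07):
    # img_emb, txt_emb: (B, D) pooled embeddings of images and reports
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature        # (B, B) cosine similarities
    targets = torch.arange(img.size(0))         # matched pairs on the diagonal
    # Symmetric InfoNCE-style loss pulls matched image/report pairs together.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = global_semantic_comparison(torch.randn(4, 256), torch.randn(4, 256))
```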
arXiv Detail & Related papers (2024-06-02T07:16:58Z)
- Topicwise Separable Sentence Retrieval for Medical Report Generation [41.812337937025084]
We introduce Topicwise Separable Sentence Retrieval (Teaser) for medical report generation.
To ensure comprehensive learning of both common and rare topics, we categorize queries into common and rare types, and then propose Topic Contrastive Loss.
Experiments on the MIMIC-CXR and IU X-ray datasets demonstrate that Teaser surpasses state-of-the-art models.
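One common way to realize a topic contrastive loss, sketched here under the assumption that each query should match its ground-truth topic embedding and repel the others (hypothetical; details differ from the paper):
```python
import torch
import torch.nn.functional as F

def topic_contrastive_loss(queries, topic_embs, topic_ids, tau=0.1):
    # queries: (N, D) topic queries; topic_embs: (T, D) one embedding per topic
    # topic_ids: (N,) index of the ground-truth topic for each query
    q = F.normalize(queries, dim=-1)
    t = F.normalize(topic_embs, dim=-1)
    logits = q @ t.t() / tau                    # (N, T) scaled similarities
    return F.cross_entropy(logits, topic_ids)

loss = topic_contrastive_loss(torch.randn(8, 128), torch.randn(20, 128),
                              torch.randint(0, 20, (8,)))
```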
arXiv Detail & Related papers (2024-05-07T10:21:23Z)
- GEMTrans: A General, Echocardiography-based, Multi-Level Transformer Framework for Cardiovascular Diagnosis [14.737295160286939]
Vision-based machine learning (ML) methods have gained popularity as secondary layers of verification.
We propose a General, Echo-based, Multi-Level Transformer (GEMTrans) framework that provides explainability.
We show the flexibility of our framework on two critical tasks: ejection fraction (EF) estimation and aortic stenosis (AS) severity detection.
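A minimal sketch of the multi-level idea, assuming patch tokens per frame that are encoded within each frame and then across frames; dimensions and the EF regression head are illustrative assumptions, not the GEMTrans configuration:
```python
import torch
import torch.nn as nn

class MultiLevelEcho(nn.Module):
    def __init__(self, dim=192):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.frame_enc = nn.TransformerEncoder(layer(), num_layers=2)  # tokens in a frame
        self.video_enc = nn.TransformerEncoder(layer(), num_layers=2)  # frames in a clip
        self.head = nn.Linear(dim, 1)          # e.g., ejection-fraction regression

    def forward(self, x):                      # x: (B, T, N, D) patch tokens per frame
        B, T, N, D = x.shape
        frames = self.frame_enc(x.reshape(B * T, N, D)).mean(dim=1)  # (B*T, D)
        video = self.video_enc(frames.reshape(B, T, D)).mean(dim=1)  # (B, D)
        return self.head(video)

model = MultiLevelEcho()
pred = model(torch.randn(2, 16, 49, 192))      # 16 frames of 7x7 patch tokens
```
Attention weights at each level can be inspected separately, which is one way such a hierarchy supports explainability.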
arXiv Detail & Related papers (2023-08-25T07:30:18Z)
- K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment [71.27193056354741]
The problem of how to assess cross-modality medical image synthesis has been largely unexplored.
We propose a new metric K-CROSS to spur progress on this challenging problem.
K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location.
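Purely as an illustration of using a pre-trained segmentation network to focus a quality score on lesion regions (this sketch omits the k-space component that gives K-CROSS its name, and the scoring formula is our assumption):
```python
import torch

def lesion_weighted_score(real, synth, seg_net):
    # real, synth: (B, 1, H, W); seg_net predicts lesion logits of the same shape
    mask = torch.sigmoid(seg_net(real))                # soft lesion localization
    err = (real - synth).abs()
    lesion_err = (err * mask).sum() / mask.sum().clamp(min=1e-6)
    return 1.0 - 0.5 * (lesion_err + err.mean())       # higher = better quality

seg_net = torch.nn.Conv2d(1, 1, 3, padding=1)          # stand-in for the real net
score = lesion_weighted_score(torch.rand(1, 1, 64, 64),
                              torch.rand(1, 1, 64, 64), seg_net)
```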
arXiv Detail & Related papers (2023-07-10T01:26:48Z)
- Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation [116.87918100031153]
We propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG).
CGT injects clinical relation triples into the visual features as prior knowledge to drive the decoding procedure.
Experiments on the large-scale FFA-IR benchmark demonstrate that the proposed CGT is able to outperform previous benchmark methods.
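A hedged sketch of one way to inject triple embeddings into visual features, using cross-attention with a residual connection (our reading of "injects clinical relation triples ... as prior knowledge", not the released CGT code):
```python
import torch
import torch.nn as nn

class TripleInjection(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual, triples):
        # visual: (B, N, D) patch features; triples: (B, M, D) embedded
        # (subject, relation, object) triples from a clinical graph
        enriched, _ = self.attn(visual, triples, triples)
        return self.norm(visual + enriched)    # residual keeps visual content

inject = TripleInjection()
out = inject(torch.randn(2, 49, 256), torch.randn(2, 12, 256))
```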
arXiv Detail & Related papers (2022-06-04T13:16:30Z)
- AlignTransformer: Hierarchical Alignment of Visual Regions and Disease Tags for Medical Report Generation [50.21065317817769]
We propose an AlignTransformer framework, which includes the Align Hierarchical Attention (AHA) and the Multi-Grained Transformer (MGT) modules.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that the AlignTransformer can achieve results competitive with state-of-the-art methods on the two datasets.
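As an illustrative stand-in for aligning visual regions with disease tags, learned tag embeddings can query region features via attention (a hypothetical simplification of the AHA module, with assumed names and sizes):
```python
import torch
import torch.nn as nn

class TagRegionAlign(nn.Module):
    def __init__(self, dim=256, num_tags=14):
        super().__init__()
        self.tags = nn.Parameter(torch.randn(num_tags, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, regions):                # regions: (B, N, D)
        B = regions.size(0)
        q = self.tags.unsqueeze(0).expand(B, -1, -1)   # (B, num_tags, D)
        aligned, w = self.attn(q, regions, regions)    # tag-grounded features
        return aligned, w                      # (B, num_tags, D), (B, num_tags, N)

align = TagRegionAlign()
aligned, tag_to_region = align(torch.randn(2, 49, 256))
```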
arXiv Detail & Related papers (2022-03-18T13:43:53Z)
- Towards A Device-Independent Deep Learning Approach for the Automated Segmentation of Sonographic Fetal Brain Structures: A Multi-Center and Multi-Device Validation [0.0]
We propose a DL-based segmentation framework for the automated segmentation of 10 key fetal brain structures in two axial planes of 2D fetal brain ultrasound (USG) images.
The proposed DL system offers promising and generalizable performance (multi-center, multi-device) and presents evidence in support of device-induced variation in image quality.
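The abstract does not specify the architecture; the following is a generic multi-class segmentation sketch with 10 structure channels plus background, for orientation only:
```python
import torch
import torch.nn as nn

# Toy stand-in for the (unspecified) segmentation backbone.
seg = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 11, 1),                      # 10 structures + background
)
logits = seg(torch.randn(1, 1, 256, 256))      # (1, 11, 256, 256) per-pixel scores
labels = logits.argmax(dim=1)                  # per-pixel structure labels
```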
arXiv Detail & Related papers (2022-02-28T05:42:03Z)
- Variational Topic Inference for Chest X-Ray Report Generation [102.04931207504173]
Report generation for medical imaging promises to reduce workload and assist diagnosis in clinical practice.
Recent work has shown that deep learning models can successfully caption natural images.
We propose variational topic inference for automatic report generation.
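A minimal sketch of a variational topic latent with the reparameterization trick and a KL regularizer toward a standard normal prior (shapes and names are assumptions; the paper's generative model is more elaborate):
```python
import torch
import torch.nn as nn

class TopicLatent(nn.Module):
    def __init__(self, dim=256, z_dim=64):
        super().__init__()
        self.mu = nn.Linear(dim, z_dim)
        self.logvar = nn.Linear(dim, z_dim)

    def forward(self, h):                      # h: (B, D) visual/context encoding
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # sampled topic
        kl = 0.5 * (logvar.exp() + mu**2 - 1 - logvar).sum(-1).mean()
        return z, kl                           # z conditions the sentence decoder

topic = TopicLatent()
z, kl = topic(torch.randn(4, 256))
```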
arXiv Detail & Related papers (2021-07-15T13:34:38Z)
- Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
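One simple way to combine internal visual features with external medical linguistic features is a learned gate, sketched here as a hypothetical reading of ASGK's fusion step (all names are illustrative):
```python
import torch
import torch.nn as nn

class AuxiliaryFusion(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, visual, linguistic):     # both (B, D) pooled features
        g = torch.sigmoid(self.gate(torch.cat([visual, linguistic], dim=-1)))
        return g * visual + (1 - g) * linguistic   # knowledge-guided feature

fuse = AuxiliaryFusion()
out = fuse(torch.randn(2, 256), torch.randn(2, 256))
```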
arXiv Detail & Related papers (2020-06-06T01:00:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.