Multimodal Foundation Models For Echocardiogram Interpretation
- URL: http://arxiv.org/abs/2308.15670v2
- Date: Sat, 2 Sep 2023 17:47:47 GMT
- Title: Multimodal Foundation Models For Echocardiogram Interpretation
- Authors: Matthew Christensen, Milos Vukadinovic, Neal Yuan, David Ouyang
- Abstract summary: We leverage 1,032,975 cardiac ultrasound videos and corresponding expert interpretations to develop EchoCLIP.
EchoCLIP displays strong zero-shot (not explicitly trained) performance in cardiac function assessment.
We also developed a long-context variant (EchoCLIP-R) with a custom echocardiography report text tokenizer.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multimodal deep learning foundation models can learn the relationship between
images and text. In the context of medical imaging, mapping images to language
concepts reflects the clinical task of diagnostic image interpretation; however,
current general-purpose foundation models do not perform well in this context
because their training corpora contain limited medical text and images. To address
this challenge and account for the range of cardiac physiology, we leverage
1,032,975 cardiac ultrasound videos and corresponding expert interpretations to
develop EchoCLIP, a multimodal foundation model for echocardiography. EchoCLIP
displays strong zero-shot (not explicitly trained) performance in cardiac
function assessment (external validation left ventricular ejection fraction
mean absolute error (MAE) of 7.1%) and identification of implanted intracardiac
devices (areas under the curve (AUC) between 0.84 and 0.98 for pacemakers and
artificial heart valves). We also developed a long-context variant (EchoCLIP-R)
with a custom echocardiography report text tokenizer which can accurately
identify unique patients across multiple videos (AUC of 0.86), identify
clinical changes such as orthotopic heart transplants (AUC of 0.79) or cardiac
surgery (AUC 0.77), and enable robust image-to-text search (mean cross-modal
retrieval rank in the top 1% of candidate text reports). These emergent
capabilities can be used for preliminary assessment and summarization of
echocardiographic findings.
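The zero-shot assessments described above follow the standard CLIP recipe: encode an image and a set of candidate text prompts into a shared embedding space, then score each prompt by cosine similarity. The sketch below illustrates that scoring step with toy numpy embeddings; the prompt wording, embedding dimension, and function names are illustrative assumptions, not EchoCLIP's actual API or weights.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products become cosine similarity."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_scores(image_emb, text_embs):
    """Cosine similarity between one image embedding and each candidate text embedding."""
    return l2_normalize(text_embs) @ l2_normalize(image_emb)

# Toy embeddings standing in for the image and text encoder outputs
# (dimension 4 for brevity; a real model would use hundreds of dimensions).
rng = np.random.default_rng(0)
image_emb = rng.normal(size=4)
prompts = ["pacemaker present", "artificial heart valve", "no implanted device"]
text_embs = rng.normal(size=(3, 4))

scores = zero_shot_scores(image_emb, text_embs)
prediction = prompts[int(np.argmax(scores))]
```

The same similarity matrix, computed over a whole batch of reports, is what drives the cross-modal retrieval ranking the abstract reports.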
Related papers
- CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans and Radiology Reports for Full-Body Scenarios [53.94122089629544]
We introduce CT-GLIP (Grounded Language-Image Pretraining with CT scans), a novel method that constructs organ-level image-text pairs to enhance multimodal contrastive learning.
Our method, trained on a multimodal CT dataset comprising 44,011 organ-level vision-text pairs from 17,702 patients across 104 organs, demonstrates it can identify organs and abnormalities in a zero-shot manner using natural language.
arXiv Detail & Related papers (2024-04-23T17:59:01Z) - Echocardiogram Foundation Model -- Application 1: Estimating Ejection
Fraction [2.4164193358532438]
We introduce EchoAI, an echocardiogram foundation model trained using self-supervised learning (SSL) on 1.5 million echocardiograms.
We evaluate our approach by fine-tuning EchoAI to estimate the ejection fraction achieving a mean absolute percentage error of 9.40%.
arXiv Detail & Related papers (2023-11-21T13:00:03Z) - M(otion)-mode Based Prediction of Ejection Fraction using
Echocardiograms [13.112371567924802]
We propose using the M(otion)-mode of echocardiograms for estimating the left ventricular ejection fraction (EF) and classifying cardiomyopathy.
We generate multiple artificial M-mode images from a single echocardiogram and combine them using off-the-shelf model architectures.
Our experiments show that the supervised setting converges with only ten modes and is comparable to the baseline method.
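The artificial M-mode images mentioned above can be produced by sampling pixel intensities along one fixed scan line in every frame of a video and stacking those samples over time. The sketch below shows that construction; the array shapes and scan-line choice are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def artificial_m_mode(video, column):
    """video: (T, H, W) grayscale frames -> (H, T) M-mode image.

    Each output column is the chosen vertical scan line at one time step,
    so motion of structures along the line appears as horizontal traces.
    """
    t, h, w = video.shape
    assert 0 <= column < w, "scan line must lie inside the frame"
    return video[:, :, column].T  # transpose so time runs left to right

# Toy video: 8 random 16x16 frames standing in for an echo clip.
rng = np.random.default_rng(1)
video = rng.random((8, 16, 16))
m_mode = artificial_m_mode(video, column=5)
```

Generating several such images from different scan lines is what yields the "multiple artificial M-mode images" the summary describes.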
arXiv Detail & Related papers (2023-09-07T15:00:58Z) - Multi-scale, Data-driven and Anatomically Constrained Deep Learning
Image Registration for Adult and Fetal Echocardiography [4.923733944174007]
We propose a framework that combines three strategies for deep learning image registration in both fetal and adult echo.
Our tests show that good anatomical topology and image textures are strongly linked to shape-encoded and data-driven adversarial losses.
Our approach outperforms traditional non-DL gold standard registration approaches, including Optical Flow and Elastix.
arXiv Detail & Related papers (2023-09-02T05:33:31Z) - GEMTrans: A General, Echocardiography-based, Multi-Level Transformer
Framework for Cardiovascular Diagnosis [14.737295160286939]
Vision-based machine learning (ML) methods have gained popularity to act as secondary layers of verification.
We propose a General, Echo-based, Multi-Level Transformer (GEMTrans) framework that provides explainability.
We show the flexibility of our framework by considering two critical tasks including ejection fraction (EF) and aortic stenosis (AS) severity detection.
arXiv Detail & Related papers (2023-08-25T07:30:18Z) - Learning to diagnose cirrhosis from radiological and histological labels
with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z) - Self-supervised contrastive learning of echocardiogram videos enables
label-efficient cardiac disease diagnosis [48.64462717254158]
We developed a self-supervised contrastive learning approach, EchoCLR, tailored to echocardiogram videos.
When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improved classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS).
EchoCLR is unique in its ability to learn representations of medical videos and demonstrates that SSL can enable label-efficient disease classification from small, labeled datasets.
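Contrastive pretraining of this kind typically optimizes an InfoNCE-style objective: embeddings of two clips from the same patient or study should score higher against each other than against clips from other studies. The numpy sketch below shows that loss on toy embeddings; all names, shapes, and the temperature value are illustrative assumptions, not EchoCLR's exact formulation.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """z_a, z_b: (N, D) paired embeddings; returns the mean cross-entropy loss."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature            # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal; everything else is a negative.
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(2)
z_a = rng.normal(size=(4, 8))
# Unrelated embeddings give a high loss; near-identical pairs give a low one.
loss_random = info_nce_loss(z_a, rng.normal(size=(4, 8)))
loss_aligned = info_nce_loss(z_a, z_a + 0.01 * rng.normal(size=(4, 8)))
```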
arXiv Detail & Related papers (2022-07-23T19:17:26Z) - Preservation of High Frequency Content for Deep Learning-Based Medical
Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
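The core operation behind a DWT-based encoding is a wavelet decomposition that separates an image into a low-frequency approximation and high-frequency detail subbands, so fine detail can be encoded explicitly rather than discarded by downsampling. Below is a minimal single-level 2D Haar transform, a hedged sketch of the general idea rather than the paper's exact pipeline.

```python
import numpy as np

def haar_dwt2(image):
    """image: (H, W) with even H, W -> (LL, (LH, HL, HH)) subbands."""
    a = image[0::2, :]              # even rows
    b = image[1::2, :]              # odd rows
    lo_rows = (a + b) / 2.0         # vertical average (low frequency)
    hi_rows = (a - b) / 2.0         # vertical difference (high frequency)

    def split_cols(x):
        return (x[:, 0::2] + x[:, 1::2]) / 2.0, (x[:, 0::2] - x[:, 1::2]) / 2.0

    ll, lh = split_cols(lo_rows)    # approximation and horizontal detail
    hl, hh = split_cols(hi_rows)    # vertical and diagonal detail
    return ll, (lh, hl, hh)

rng = np.random.default_rng(3)
img = rng.random((8, 8))
ll, (lh, hl, hh) = haar_dwt2(img)   # each subband is half-resolution: (4, 4)
```

On a perfectly smooth image the three detail subbands are zero, which is exactly why they isolate the high-frequency content the paper aims to preserve.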
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report
Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z) - Co-Heterogeneous and Adaptive Segmentation from Multi-Source and
Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion
Segmentation [48.504790189796836]
We present a novel segmentation strategy, co-heterogenous and adaptive segmentation (CHASe).
We propose a versatile framework that fuses appearance based semi-supervision, mask based adversarial domain adaptation, and pseudo-labeling.
CHASe can further improve pathological liver mask Dice-Sorensen coefficients by 4.2% to 9.4%.
arXiv Detail & Related papers (2020-05-27T06:58:39Z) - Uncertainty Estimation in Deep 2D Echocardiography Segmentation [0.2062593640149623]
Uncertainty estimates can be important when testing on data coming from a distribution further away from that of the training data.
We show how uncertainty estimation can be used to automatically reject poor quality images and improve state-of-the-art segmentation results.
arXiv Detail & Related papers (2020-05-19T10:19:23Z)
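A common way to operationalize the automatic rejection of poor-quality images described in the last entry is to compute a scalar uncertainty measure, such as mean predictive entropy over the per-pixel class probabilities, and discard images above a threshold. The sketch below shows that gating step; the threshold and array shapes are assumptions for demonstration, not values from the paper.

```python
import numpy as np

def mean_predictive_entropy(probs, eps=1e-12):
    """probs: (H, W, C) per-pixel class probabilities -> mean entropy in nats."""
    return float(np.mean(-np.sum(probs * np.log(probs + eps), axis=-1)))

def reject_if_uncertain(probs, threshold=0.5):
    """True means the segmentation is too uncertain and the image is rejected."""
    return mean_predictive_entropy(probs) > threshold

# Confident prediction: one class gets nearly all the mass at every pixel.
confident = np.full((4, 4, 2), 0.02)
confident[..., 0] = 0.98
# Uncertain prediction: a uniform distribution over classes (entropy ln 2).
uncertain = np.full((4, 4, 2), 0.5)
```

Only images that pass the gate proceed to downstream measurement, which is how rejecting uncertain inputs can raise aggregate segmentation quality.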
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.