Ultrasound Video Transformers for Cardiac Ejection Fraction Estimation
- URL: http://arxiv.org/abs/2107.00977v1
- Date: Fri, 2 Jul 2021 11:23:09 GMT
- Title: Ultrasound Video Transformers for Cardiac Ejection Fraction Estimation
- Authors: Hadrien Reynaud, Athanasios Vlontzos, Benjamin Hou, Arian Beqiri, Paul
Leeson, Bernhard Kainz
- Abstract summary: We propose a novel approach to ultrasound video analysis using a Residual Auto-Encoder Network and a BERT model adapted for token classification.
We apply our model to the task of End-Systolic (ES) and End-Diastolic (ED) frame detection and the automated computation of the left ventricular ejection fraction.
Our end-to-end learnable approach can estimate the ejection fraction with a MAE of 5.95 and $R^2$ of 0.52 in 0.15s per video, showing that segmentation is not the only way to predict ejection fraction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cardiac ultrasound imaging is used to diagnose various heart diseases. Common
analysis pipelines involve manual processing of the video frames by expert
clinicians. This suffers from intra- and inter-observer variability. We propose
a novel approach to ultrasound video analysis using a transformer architecture
based on a Residual Auto-Encoder Network and a BERT model adapted for token
classification. This enables videos of any length to be processed. We apply our
model to the task of End-Systolic (ES) and End-Diastolic (ED) frame detection
and the automated computation of the left ventricular ejection fraction. We
achieve an average frame distance of 3.36 frames for the ES and 7.17 frames for
the ED on videos of arbitrary length. Our end-to-end learnable approach can
estimate the ejection fraction with a MAE of 5.95 and $R^2$ of 0.52 in 0.15s
per video, showing that segmentation is not the only way to predict ejection
fraction. Code and models are available at https://github.com/HReynaud/UVT.
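The abstract's headline numbers can be made concrete with the standard definitions: the ejection fraction is the fraction of end-diastolic volume ejected per beat, and MAE and $R^2$ are the usual regression metrics. The following is a minimal sketch, not the authors' code; the function names and the volume values are illustrative assumptions.

```python
def ejection_fraction(edv: float, esv: float) -> float:
    """EF (%) = (EDV - ESV) / EDV * 100 -- the quantity the model regresses.
    EDV/ESV are the end-diastolic and end-systolic left ventricular volumes."""
    return (edv - esv) / edv * 100.0

def mae(preds, targets):
    """Mean absolute error between predicted and reference EF values."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def r2(preds, targets):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for p, t in zip(preds, targets))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

# Illustrative volumes (mL); a healthy EF is typically in the 50-70% range.
ef = ejection_fraction(edv=120.0, esv=50.0)
```

An MAE of 5.95 on this task means the predicted EF is, on average, about six percentage points from the clinician-derived value.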
Related papers
- Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development [59.74920439478643]
In this paper, we collect and annotate the first benchmark dataset covering diverse ERUS scenarios.
Our ERUS-10K dataset comprises 77 videos and 10,000 high-resolution annotated frames.
We introduce a benchmark model for colorectal cancer segmentation, named the Adaptive Sparse-context TRansformer (ASTR).
arXiv Detail & Related papers (2024-08-19T15:04:42Z) - CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset consisting of unseen synthetic data and images collected from silicone aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Automated interpretation of congenital heart disease from multi-view
echocardiograms [10.238433789459624]
Congenital heart disease (CHD) is the most common birth defect and the leading cause of neonate death in China.
This study proposes to automatically analyze the multi-view echocardiograms with a practical end-to-end framework.
arXiv Detail & Related papers (2023-11-30T18:37:21Z) - Hierarchical Vision Transformers for Cardiac Ejection Fraction
Estimation [0.0]
We propose a deep learning approach, based on hierarchical vision Transformers, to estimate the ejection fraction from echocardiogram videos.
The proposed method can estimate the ejection fraction without requiring prior left ventricle segmentation, making it more efficient than other methods.
arXiv Detail & Related papers (2023-03-31T23:42:17Z) - Feature-Conditioned Cascaded Video Diffusion Models for Precise
Echocardiogram Synthesis [5.102090025931326]
We extend elucidated diffusion models for video modelling to generate plausible video sequences from single images.
Our image-to-sequence approach achieves an $R^2$ score of 93%, 38 points higher than recently proposed sequence-to-sequence generation methods.
arXiv Detail & Related papers (2023-03-22T15:26:22Z) - EchoCoTr: Estimation of the Left Ventricular Ejection Fraction from
Spatiotemporal Echocardiography [0.0]
We propose a method that addresses the limitations we typically face when training on medical video data such as echocardiographic scans.
The algorithm we propose (EchoCoTr) utilizes the strength of vision transformers and CNNs to tackle the problem of estimating the left ventricular ejection fraction (LVEF) on ultrasound videos.
arXiv Detail & Related papers (2022-09-09T11:01:59Z) - Focused Decoding Enables 3D Anatomical Detection by Transformers [64.36530874341666]
We propose a novel Detection Transformer for 3D anatomical structure detection, dubbed Focused Decoder.
Focused Decoder leverages information from an anatomical region atlas to simultaneously deploy query anchors and restrict the cross-attention's field of view.
We evaluate our proposed approach on two publicly available CT datasets and demonstrate that Focused Decoder not only provides strong detection results and thus alleviates the need for a vast amount of annotated data but also exhibits exceptional and highly intuitive explainability of results via attention weights.
arXiv Detail & Related papers (2022-07-21T22:17:21Z) - D'ARTAGNAN: Counterfactual Video Generation [3.4079278794252232]
Causally-enabled machine learning frameworks could help clinicians to identify the best course of treatments by answering counterfactual questions.
We combine deep neural networks, twin causal networks and generative adversarial methods for the first time to build D'ARTAGNAN.
We generate new ultrasound videos, retaining the video style and anatomy of the original patient, with variations of the Ejection Fraction conditioned on a given input.
arXiv Detail & Related papers (2022-06-03T15:53:32Z) - Preservation of High Frequency Content for Deep Learning-Based Medical
Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z) - Towards Unsupervised Learning for Instrument Segmentation in Robotic
Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows to train image segmentation models without the need to acquire expensive annotations.
We test our proposed method on Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.