Unsupervised multi-latent space reinforcement learning framework for
video summarization in ultrasound imaging
- URL: http://arxiv.org/abs/2109.01309v1
- Date: Fri, 3 Sep 2021 04:50:35 GMT
- Authors: Roshan P Mathews, Mahesh Raveendranatha Panicker, Abhilash R
Hareendranathan, Yale Tung Chen, Jacob L Jaremko, Brian Buchanan, Kiran
Vishnu Narayan, Kesavadas C, Greeta Mathews
- Abstract summary: The COVID-19 pandemic has highlighted the need for a tool to speed up triage in ultrasound scans.
The proposed video-summarization technique is a step in this direction.
We propose a new unsupervised reinforcement learning framework with novel rewards.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The COVID-19 pandemic has highlighted the need for a tool to speed up triage
in ultrasound scans and provide clinicians with fast access to relevant
information. The proposed video-summarization technique is a step in this
direction that provides clinicians access to relevant key-frames from a given
ultrasound scan (such as lung ultrasound) while reducing resource, storage and
bandwidth requirements. We propose a new unsupervised reinforcement learning
(RL) framework with novel rewards that enables unsupervised learning, avoiding
the tedious and impractical manual labelling otherwise required to summarize
ultrasound videos, and enhances its utility as a triage tool in the emergency
department (ED) and in telemedicine. Using an attention ensemble of encoders,
the high-dimensional image is projected into a low-dimensional latent space in
terms of: a) reduced distance to a normal or abnormal class (classifier
encoder), b) adherence to a topology of landmarks (segmentation encoder), and
c) a distance- and topology-agnostic latent representation (convolutional
autoencoder). The decoder is implemented as a bidirectional long short-term
memory (Bi-LSTM) network that utilizes the latent space representation from the
encoders. Our new paradigm for video summarization delivers classification
labels and segmentation of key landmarks for each summarized keyframe.
Validation is performed on a lung ultrasound (LUS) dataset representing
potential use cases in telemedicine and ED triage, acquired from medical
centers across geographies (India, Spain and Canada).
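The attention ensemble described in the abstract, where per-encoder latent vectors are combined into a single frame representation, can be illustrated with a minimal sketch. The latent dimension, the fixed attention logits, and the function names below are hypothetical; in the paper the encoders and attention weights are learned, and only the fusion step is sketched here:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of attention logits."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def fuse_latents(latents, attention_logits):
    """Attention-weighted fusion of per-encoder latent vectors.

    latents: list of (d,) arrays, e.g. from the classifier, segmentation
             and autoencoder branches.
    attention_logits: one score per encoder (learned in the paper; fixed here).
    Returns a single (d,) fused latent vector for the frame.
    """
    weights = softmax(np.asarray(attention_logits, dtype=float))  # (k,)
    stacked = np.stack(latents)                                   # (k, d)
    return weights @ stacked                                      # (d,)

# Equal logits reduce the fusion to a plain average of the encoder latents;
# a dominant logit makes the fused vector follow that encoder's output.
z_cls, z_seg, z_ae = np.ones(4), 2 * np.ones(4), 3 * np.ones(4)
fused = fuse_latents([z_cls, z_seg, z_ae], [0.0, 0.0, 0.0])
```

A Bi-LSTM decoder would then consume one such fused vector per frame and score frames for inclusion in the summary.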
Related papers
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset consisting of unseen synthetic data and images collected from silicone aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - AiAReSeg: Catheter Detection and Segmentation in Interventional
Ultrasound using Transformers [75.20925220246689]
Endovascular surgeries are performed using the gold standard of fluoroscopy, which uses ionising radiation to visualise catheters and vasculature.
This work proposes a solution using an adaptation of a state-of-the-art machine learning transformer architecture to detect and segment catheters in axial interventional Ultrasound image sequences.
arXiv Detail & Related papers (2023-09-25T19:34:12Z) - EDMAE: An Efficient Decoupled Masked Autoencoder for Standard View
Identification in Pediatric Echocardiography [16.215207742732893]
The Efficient Decoupled Masked Autoencoder (EDMAE) is a novel self-supervised method for recognizing standard views in pediatric echocardiography.
EDMAE uses pure convolution operations instead of the ViT structure in the MAE encoder.
The proposed method achieves high classification accuracy in 27 standard views of pediatric echocardiography.
arXiv Detail & Related papers (2023-02-27T15:17:01Z) - Focused Decoding Enables 3D Anatomical Detection by Transformers [64.36530874341666]
We propose a novel Detection Transformer for 3D anatomical structure detection, dubbed Focused Decoder.
Focused Decoder leverages information from an anatomical region atlas to simultaneously deploy query anchors and restrict the cross-attention's field of view.
We evaluate our proposed approach on two publicly available CT datasets and demonstrate that Focused Decoder not only provides strong detection results and thus alleviates the need for a vast amount of annotated data but also exhibits exceptional and highly intuitive explainability of results via attention weights.
arXiv Detail & Related papers (2022-07-21T22:17:21Z) - Temporally Constrained Neural Networks (TCNN): A framework for
semi-supervised video semantic segmentation [5.0754434714665715]
We present Temporally Constrained Neural Networks (TCNN), a semi-supervised framework used for video semantic segmentation of surgical videos.
In this work, we show that autoencoder networks can be used to efficiently provide both spatial and temporal supervisory signals.
We demonstrate that lower-dimensional representations of predicted masks can be leveraged to provide a consistent improvement on sparsely labeled datasets.
arXiv Detail & Related papers (2021-12-27T18:06:12Z) - Voice-assisted Image Labelling for Endoscopic Ultrasound Classification
using Neural Networks [48.732863591145964]
We propose a multi-modal convolutional neural network architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure.
Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels.
arXiv Detail & Related papers (2021-10-12T21:22:24Z) - Deep Learning for Ultrasound Beamforming [120.12255978513912]
Beamforming, the process of mapping received ultrasound echoes to the spatial image domain, lies at the heart of the ultrasound image formation chain.
Modern ultrasound imaging leans heavily on innovations in powerful digital receive channel processing.
Deep learning methods can play a compelling role in the digital beamforming pipeline.
arXiv Detail & Related papers (2021-09-23T15:15:21Z) - Atrous Residual Interconnected Encoder to Attention Decoder Framework
for Vertebrae Segmentation via 3D Volumetric CT Images [1.8146155083014204]
This paper proposes a novel algorithm for automated vertebrae segmentation via 3D volumetric spine CT images.
The proposed model is based on the structure of encoder to decoder, using layer normalization to optimize mini-batch training performance.
The experimental results show that our model achieves competitive performance compared with other state-of-the-art medical semantic segmentation methods.
arXiv Detail & Related papers (2021-04-08T12:09:16Z) - Deep Q-Network-Driven Catheter Segmentation in 3D US by Hybrid
Constrained Semi-Supervised Learning and Dual-UNet [74.22397862400177]
We propose a novel catheter segmentation approach, which requests fewer annotations than the supervised learning method.
Our scheme considers a deep Q learning as the pre-localization step, which avoids voxel-level annotation.
With the detected catheter, patch-based Dual-UNet is applied to segment the catheter in 3D volumetric data.
arXiv Detail & Related papers (2020-06-25T21:10:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.