EDMAE: An Efficient Decoupled Masked Autoencoder for Standard View
Identification in Pediatric Echocardiography
- URL: http://arxiv.org/abs/2302.13869v3
- Date: Thu, 3 Aug 2023 07:46:03 GMT
- Title: EDMAE: An Efficient Decoupled Masked Autoencoder for Standard View
Identification in Pediatric Echocardiography
- Authors: Yiman Liu, Xiaoxiang Han, Tongtong Liang, Bin Dong, Jiajun Yuan,
Menghan Hu, Qiaohong Liu, Jiangang Chen, Qingli Li, Yuqi Zhang
- Abstract summary: The Efficient Decoupled Masked Autoencoder (EDMAE) is a novel self-supervised method for recognizing standard views in pediatric echocardiography.
EDMAE uses pure convolution operations instead of the ViT structure in the MAE encoder.
The proposed method achieves high classification accuracy in 27 standard views of pediatric echocardiography.
- Score: 16.215207742732893
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces the Efficient Decoupled Masked Autoencoder (EDMAE), a
novel self-supervised method for recognizing standard views in pediatric
echocardiography. EDMAE introduces a new proxy task based on the
encoder-decoder structure. The EDMAE encoder is composed of a teacher and a
student encoder. The teacher encoder extracts the latent representation of
the masked image blocks, while the student encoder extracts the latent
representation of the visible image blocks. The loss is calculated between the
feature maps output by the two encoders to ensure consistency in the latent
representations they extract. EDMAE uses pure convolution operations instead of
the ViT structure in the MAE encoder. This improves training efficiency and
convergence speed. EDMAE is pre-trained on a large-scale private dataset of
pediatric echocardiography using self-supervised learning, and then fine-tuned
for standard view recognition. The proposed method achieves high classification
accuracy in 27 standard views of pediatric echocardiography. To further verify
the effectiveness of the proposed method, the authors perform another
downstream task of cardiac ultrasound segmentation on the public dataset CAMUS.
The experimental results demonstrate that the proposed method outperforms some
popular supervised and recent self-supervised methods, and is more competitive
on different downstream tasks.
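The teacher/student consistency idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `split_patches`, `toy_encoder`, and `consistency_loss` are hypothetical names, the "encoder" is a trivial average-pooling stand-in for a convolutional network, the 0.75 mask ratio is an assumed value, and mean squared error is an assumed choice of loss (the abstract does not name one).

```python
import random


def split_patches(num_patches, mask_ratio, seed=0):
    """Randomly partition patch indices into (visible, masked) index lists."""
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(round(num_patches * mask_ratio))
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def toy_encoder(patches, weight):
    """Hypothetical stand-in for an encoder: scale by `weight`, then
    average over patches so the output size does not depend on patch count."""
    dim = len(patches[0])
    count = len(patches)
    return [weight * sum(p[d] for p in patches) / count for d in range(dim)]


def consistency_loss(student_feat, teacher_feat):
    """Mean squared error between two equally sized feature vectors (assumed loss)."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)


# Toy "image": 16 patches, each flattened to 4 values.
data_rng = random.Random(0)
patches = [[data_rng.random() for _ in range(4)] for _ in range(16)]

# Assumed mask ratio of 0.75: 12 masked patches, 4 visible patches.
visible_idx, masked_idx = split_patches(len(patches), mask_ratio=0.75)

# Student sees only visible patches; teacher sees only masked patches.
student_feat = toy_encoder([patches[i] for i in visible_idx], weight=1.0)
teacher_feat = toy_encoder([patches[i] for i in masked_idx], weight=1.0)

loss = consistency_loss(student_feat, teacher_feat)
print(f"visible={len(visible_idx)} masked={len(masked_idx)} loss={loss:.4f}")
```

In an actual training loop the loss would be backpropagated through the student encoder; how the teacher is updated and how the two feature maps are aligned spatially are details the abstract does not specify, so they are left out here.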
Related papers
- Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
Contrastive learning (CL) for Vision Transformers (ViTs) in image domains has achieved performance comparable to CL for traditional convolutional backbones.
In 3D point cloud pretraining with ViTs, masked autoencoder (MAE) modeling remains dominant.
arXiv Detail & Related papers (2024-07-08T12:28:56Z)
- DopUS-Net: Quality-Aware Robotic Ultrasound Imaging based on Doppler Signal [48.97719097435527]
DopUS-Net combines the Doppler images with B-mode images to increase the segmentation accuracy and robustness of small blood vessels.
An artery re-identification module qualitatively evaluates the real-time segmentation results and automatically optimizes the probe pose for enhanced Doppler images.
arXiv Detail & Related papers (2023-05-15T18:19:29Z)
- Rethinking Boundary Detection in Deep Learning Models for Medical Image Segmentation [27.322629156662547]
A novel network architecture, referred to as Convolution, Transformer, and Operator (CTO) is proposed.
CTO employs a combination of Convolutional Neural Networks (CNNs), Vision Transformer (ViT), and an explicit boundary detection operator to achieve high recognition accuracy.
The performance of the proposed method is evaluated on six challenging medical image segmentation datasets.
arXiv Detail & Related papers (2023-05-01T06:13:08Z)
- Attentive Symmetric Autoencoder for Brain MRI Segmentation [56.02577247523737]
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks.
In the pre-training stage, the proposed auto-encoder pays more attention to reconstructing the informative patches according to the gradient metrics.
Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
arXiv Detail & Related papers (2022-09-19T09:43:19Z)
- Masked Autoencoders that Listen [79.99280830830854]
This paper studies a simple extension of image-based Masked Autoencoders (MAE) to self-supervised representation learning from audio spectrograms.
Following the Transformer encoder-decoder design in MAE, our Audio-MAE first encodes audio spectrogram patches with a high masking ratio, feeding only the non-masked tokens through encoder layers.
The decoder then re-orders and decodes the encoded context padded with mask tokens, in order to reconstruct the input spectrogram.
arXiv Detail & Related papers (2022-07-13T17:59:55Z)
- LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval [117.15862403330121]
We propose LoopITR, which combines dual encoders and cross encoders in the same network for joint learning.
Specifically, we let the dual encoder provide hard negatives to the cross encoder, and use the more discriminative cross encoder to distill its predictions back to the dual encoder.
arXiv Detail & Related papers (2022-03-10T16:41:12Z)
- Scribble-Supervised Medical Image Segmentation via Dual-Branch Network and Dynamically Mixed Pseudo Labels Supervision [15.414578073908906]
We propose a simple yet efficient scribble-supervised image segmentation method and apply it to cardiac MRI segmentation.
By combining the scribble supervision and auxiliary pseudo labels supervision, the dual-branch network can efficiently learn from scribble annotations end-to-end.
arXiv Detail & Related papers (2022-03-04T02:50:30Z)
- Context Autoencoder for Self-Supervised Representation Learning [64.63908944426224]
We pretrain an encoder by making predictions in the encoded representation space.
The network is an encoder-regressor-decoder architecture.
We demonstrate the effectiveness of our CAE through superior transfer performance in downstream tasks.
arXiv Detail & Related papers (2022-02-07T09:33:45Z)
- Unsupervised multi-latent space reinforcement learning framework for video summarization in ultrasound imaging [0.0]
The COVID-19 pandemic has highlighted the need for a tool to speed up triage in ultrasound scans.
The proposed video-summarization technique is a step in this direction.
We propose a new unsupervised reinforcement learning framework with novel rewards.
arXiv Detail & Related papers (2021-09-03T04:50:35Z)
- EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning [27.54202989524394]
We propose EncoderMI, the first membership inference method against image encoders pre-trained by contrastive learning.
We evaluate EncoderMI on image encoders pre-trained on multiple datasets by ourselves as well as the Contrastive Language-Image Pre-training (CLIP) image encoder, which is pre-trained on 400 million (image, text) pairs collected from the Internet and released by OpenAI.
arXiv Detail & Related papers (2021-08-25T03:00:45Z)
- Atrous Residual Interconnected Encoder to Attention Decoder Framework for Vertebrae Segmentation via 3D Volumetric CT Images [1.8146155083014204]
This paper proposes a novel algorithm for automated vertebrae segmentation via 3D volumetric spine CT images.
The proposed model is based on an encoder-decoder structure, using layer normalization to optimize mini-batch training performance.
The experimental results show that our model achieves competitive performance compared with other state-of-the-art medical semantic segmentation methods.
arXiv Detail & Related papers (2021-04-08T12:09:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.