Spatiotemporal Feature Learning Based on Two-Step LSTM and Transformer
for CT Scans
- URL: http://arxiv.org/abs/2207.01579v1
- Date: Mon, 4 Jul 2022 16:59:05 GMT
- Title: Spatiotemporal Feature Learning Based on Two-Step LSTM and Transformer
for CT Scans
- Authors: Chih-Chung Hsu, Chi-Han Tsai, Guan-Lin Chen, Sin-Di Ma, Shen-Chieh Tai
- Abstract summary: We propose a novel, effective two-step approach to thoroughly tackle this issue for COVID-19 symptom classification.
First, the semantic feature embedding of each slice of a CT scan is extracted by conventional backbone networks.
Then, we propose a long short-term memory (LSTM) and Transformer-based sub-network to handle temporal feature learning.
- Score: 2.3682456328966115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computed tomography (CT) imaging can be very practical for
diagnosing various diseases. However, CT images are highly diverse in nature,
since the resolution and the number of slices of a CT scan are determined by
the machine and its settings. Conventional deep learning models struggle to
handle such diverse data because deep neural networks essentially require
input data of a consistent shape. In this paper, we propose a novel, effective
two-step approach to thoroughly tackle this issue for COVID-19 symptom
classification. First, the semantic feature embedding of each slice of a CT
scan is extracted by conventional backbone networks. Then, we propose a long
short-term memory (LSTM) and Transformer-based sub-network to handle temporal
feature learning, leading to spatiotemporal feature representation learning.
In this fashion, the proposed two-step LSTM model can prevent overfitting as
well as increase performance. Comprehensive experiments reveal that the
proposed two-step models not only show excellent performance but also
complement each other. More specifically, the two-step LSTM model has a lower
false-negative rate, while the two-step Swin model has a lower false-positive
rate. In summary, it is suggested that a model ensemble could be adopted for
more stable and promising performance in real-world applications.
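The two-step design described in the abstract maps naturally onto a small
model: a 2D backbone embeds each slice, and a recurrent sub-network aggregates
the variable-length slice sequence into a scan-level prediction. Below is a
minimal sketch, assuming a PyTorch implementation with a torchvision ResNet-18
standing in for the "conventional backbone networks"; the class name, hidden
size, and other details are illustrative assumptions, not the authors' exact
configuration.

# Minimal sketch of the two-step pipeline (illustrative; PyTorch assumed).
import torch
import torch.nn as nn
from torchvision.models import resnet18

class TwoStepLSTMClassifier(nn.Module):
    def __init__(self, num_classes=2, hidden_dim=256):
        super().__init__()
        # Step 1: a conventional 2D backbone extracts a semantic embedding
        # per CT slice (its classification head is removed).
        backbone = resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()
        self.backbone = backbone
        # Step 2: an LSTM aggregates the per-slice embeddings along the
        # slice axis, yielding a spatiotemporal representation.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, volume):
        # volume: (batch, num_slices, 3, H, W); num_slices may vary per scan.
        b, s, c, h, w = volume.shape
        slice_feats = self.backbone(volume.reshape(b * s, c, h, w))
        slice_feats = slice_feats.reshape(b, s, -1)
        _, (h_n, _) = self.lstm(slice_feats)  # final hidden state
        return self.classifier(h_n[-1])       # scan-level logits

# Example: one scan with 64 slices of size 224x224.
logits = TwoStepLSTMClassifier()(torch.randn(1, 64, 3, 224, 224))

The Transformer-based variant mentioned in the abstract would presumably swap
the LSTM in step two for a Transformer encoder over the same slice embeddings.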
Related papers
- PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation [5.056996354878645]
When both CT and PET scans are available, it is common to combine them as two channels of the input to the segmentation model.
This method requires both scan types during training and inference, posing a challenge due to the limited availability of PET scans.
We propose a parameter-efficient multi-modal adaptation framework for lightweight upgrading of a transformer-based segmentation model.
arXiv Detail & Related papers (2024-04-21T16:29:49Z) - A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection [8.215897530386343]
We introduce an enhanced Spatial-Slice Feature Learning (SSFL++) framework specifically designed for CT scans.
It aims to filter out out-of-distribution (OOD) data within the whole CT scan, enabling us to select crucial spatial slices for analysis by reducing overall redundancy by 70% (a rough illustrative sketch appears after this list).
Experiments demonstrate the promising performance of our model using a simple EfficientNet-2D (E2D) model, even with only 1% of the training data.
arXiv Detail & Related papers (2024-04-02T05:19:27Z) - Simple 2D Convolutional Neural Network-based Approach for COVID-19 Detection [8.215897530386343]
This study explores the use of deep learning techniques for analyzing lung Computed Tomography (CT) images.
We propose an advanced Spatial-Slice Feature Learning (SSFL++) framework specifically tailored for CT scans.
It aims to filter out out-of-distribution (OOD) data within the entire CT scan, allowing us to select essential spatial-slice features for analysis by reducing data redundancy by 70%.
arXiv Detail & Related papers (2024-03-17T14:34:51Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion
Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - Strong Baseline and Bag of Tricks for COVID-19 Detection of CT Scans [2.696776905220987]
Traditional deep learning frameworks encounter compatibility issues due to variations in slice numbers and resolutions in CT images.
We propose a novel slice selection method for each CT dataset to address this limitation.
In addition to the aforementioned methods, we explore various high-performance classification models, ultimately achieving promising results.
arXiv Detail & Related papers (2023-03-15T09:52:28Z) - A Light-weight CNN Model for Efficient Parkinson's Disease Diagnostics [1.382077805849933]
The proposed model consists of a convolutional neural network (CNN) connected to a long short-term memory (LSTM) network to adapt to the characteristics of the collected time-series signals.
Experimental results show that the proposed model achieves a high-quality diagnostic result over multiple evaluation metrics with much fewer parameters and operations.
arXiv Detail & Related papers (2023-02-02T09:49:07Z) - InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal
Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded.
We analyze the CT values among different tissues and merge the prior observations into a prior network for our InDuDoNet+, which significantly improves its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z) - Incremental Cross-view Mutual Distillation for Self-supervised Medical
CT Synthesis [88.39466012709205]
This paper builds a novel medical slice synthesis method to increase the between-slice resolution.
Considering that the ground-truth intermediate medical slices are always absent in clinical practice, we introduce the incremental cross-view mutual distillation strategy.
Our method outperforms state-of-the-art algorithms by clear margins.
arXiv Detail & Related papers (2021-12-20T03:38:37Z) - CyTran: A Cycle-Consistent Transformer with Multi-Level Consistency for
Non-Contrast to Contrast CT Translation [56.622832383316215]
We propose a novel approach to translate unpaired contrast computed tomography (CT) scans to non-contrast CT scans.
Our approach is based on cycle-consistent generative adversarial convolutional transformers, for short, CyTran.
Our empirical results show that CyTran outperforms all competing methods.
arXiv Detail & Related papers (2021-10-12T23:25:03Z) - Modality Completion via Gaussian Process Prior Variational Autoencoders
for Multi-Modal Glioma Segmentation [75.58395328700821]
We propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
MGP-VAE can leverage the Gaussian Process (GP) prior on the Variational Autoencoder (VAE) to exploit correlations across subjects/patients and sub-modalities.
We show the applicability of MGP-VAE on brain tumor segmentation, where one, two, or three of the four sub-modalities may be missing.
arXiv Detail & Related papers (2021-07-07T19:06:34Z) - A Multi-Stage Attentive Transfer Learning Framework for Improving
COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
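Several entries above (SSFL++ and the slice-selection baseline) revolve around
the same practical step: discarding redundant or out-of-distribution slices
before classification. The sketch below is a rough illustration of that idea
only, not the SSFL++ criterion; the variance-based score and the keep ratio
are assumptions.

# Hypothetical slice filtering: keep the most informative slices of a CT
# volume using a simple per-slice intensity-variance score (illustrative only).
import numpy as np

def select_informative_slices(volume: np.ndarray, keep_ratio: float = 0.3):
    """volume: (num_slices, H, W) array of CT slices.
    Returns indices of the top keep_ratio fraction of slices, ranked by
    intensity variance (a crude proxy for anatomical content)."""
    scores = volume.reshape(volume.shape[0], -1).var(axis=1)
    k = max(1, int(round(keep_ratio * volume.shape[0])))
    return np.sort(np.argsort(scores)[-k:])  # keep original slice order

# Example: keep ~30% of 100 slices, i.e., drop ~70% as redundant.
vol = np.random.rand(100, 512, 512).astype(np.float32)
print(select_informative_slices(vol))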
This list is automatically generated from the titles and abstracts of the papers on this site.