Siamese Network Features for Endoscopy Image and Video Localization
- URL: http://arxiv.org/abs/2103.08504v1
- Date: Mon, 15 Mar 2021 16:24:30 GMT
- Title: Siamese Network Features for Endoscopy Image and Video Localization
- Authors: Mohammad Reza Mohebbian, Seyed Shahim Vedaei, Khan A. Wahid and Paul Babyn
- Abstract summary: Localizing frames provides valuable information about the anomaly location.
In this study, we present a combination of meta-learning and deep learning for localizing both endoscopy images and video.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conventional Endoscopy (CE) and Wireless Capsule Endoscopy (WCE) are known
tools for diagnosing gastrointestinal (GI) tract disorders. Localizing frames
provides valuable information about the anomaly location and can also help
clinicians determine a more appropriate treatment plan. There are many
automated algorithms to detect the anomaly. However, very few of the existing
works address the issue of localization. In this study, we present a
combination of meta-learning and deep learning for localizing both endoscopy
images and video. A dataset was collected from 10 different anatomical positions
of the human GI tract. In the meta-learning section, the system was trained using
78 CE and 27 WCE annotated frames with a modified Siamese Neural Network (SNN)
to predict the location of one single image/frame. Then, a postprocessing
section using bidirectional long short-term memory is proposed for localizing a
sequence of frames. Here, we employ the feature vector, distance, and
predicted location obtained from a trained SNN. The postprocessing section is
trained and tested on 1,028 and 365 seconds of CE and WCE videos using hold-out
validation (50%), and achieved F1-scores of 86.3% and 83.0%, respectively. In
addition, we performed a subjective evaluation with nine gastroenterologists.
The results show that the computer-aided methods can outperform
gastroenterologists' assessment of localization. The proposed method is compared
with various approaches, such as a support vector machine with hand-crafted
features, a convolutional neural network, and transfer learning-based methods,
and showed better results. Therefore, it can be used in frame localization,
which can help in video summarization and anomaly detection.
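As a rough illustration of the single-frame step described in the abstract, the sketch below assigns a frame to the anatomical position whose reference embedding is nearest, and then scores the predictions with an F1 measure. It is a minimal sketch under stated assumptions, not the authors' implementation: the embeddings are made-up 2-D vectors standing in for SNN features, the position names are hypothetical, Euclidean distance and macro-averaged F1 are assumed choices, and the bidirectional LSTM postprocessing stage is not modeled.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def localize(embedding, references):
    """Return (label, distance) of the reference embedding nearest to `embedding`.

    `references` maps a position label to one reference embedding.
    Assumption: nearest-reference matching stands in for the trained SNN.
    """
    best_label, best_dist = None, math.inf
    for label, ref in references.items():
        d = euclidean(embedding, ref)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over all labels seen in either list (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical reference embeddings, one per anatomical position.
references = {
    "esophagus": (0.0, 0.0),
    "stomach": (1.0, 1.0),
    "duodenum": (2.0, 2.0),
}

# Hypothetical frame embeddings with their true positions.
frames = [(0.1, 0.0), (0.9, 1.1), (2.1, 1.9)]
truth = ["esophagus", "stomach", "duodenum"]

predicted = [localize(f, references)[0] for f in frames]
print(predicted)
print(macro_f1(truth, predicted))
```

The distance returned by `localize` is the kind of per-frame signal that, according to the abstract, is passed on together with the feature vector and predicted location to the sequence-level postprocessing.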
Related papers
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Non-invasive Localization of the Ventricular Excitation Origin Without Patient-specific Geometries Using Deep Learning [0.6999972048611302]
Ventricular tachycardia (VT) can be one cause of sudden cardiac death affecting 4.25 million persons per year worldwide.
To facilitate and expedite the localization during the ablation procedure, we present two novel localization techniques based on convolutional neural networks (CNNs).
arXiv Detail & Related papers (2022-09-16T09:30:13Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- A Temporal Learning Approach to Inpainting Endoscopic Specularities and Its effect on Image Correspondence [13.25903945009516]
We propose using a temporal generative adversarial network (GAN) to inpaint the hidden anatomy under specularities.
This is achieved using in-vivo data of gastric endoscopy (Hyper-Kvasir) in a fully unsupervised manner.
We also assess the effect of our method in computer vision tasks that underpin 3D reconstruction and camera motion estimation.
arXiv Detail & Related papers (2022-03-31T13:14:00Z)
- Localized Perturbations For Weakly-Supervised Segmentation of Glioma Brain Tumours [0.5801621787540266]
This work proposes the use of localized perturbations as a weakly-supervised solution to extract segmentation masks of brain tumours from a pretrained 3D classification model.
We also propose a novel optimal perturbation method that exploits 3D superpixels to find the most relevant area for a given classification using a U-net architecture.
arXiv Detail & Related papers (2021-11-29T21:01:20Z)
- Hepatic vessel segmentation based on 3Dswin-transformer with inductive biased multi-head self-attention [46.46365941681487]
We propose a robust end-to-end vessel segmentation network called Indu BIased Multi-Head Attention Vessel Net.
We introduce the voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels.
On the other hand, we propose inductive biased multi-head self-attention which learns inductive biased relative positional embedding from absolute position embedding.
arXiv Detail & Related papers (2021-11-05T10:17:08Z)
- Explaining Predictions of Deep Neural Classifier via Activation Analysis [0.11470070927586014]
We present a novel approach to explain and support an interpretation of the decision-making process to a human expert operating a deep learning system based on a Convolutional Neural Network (CNN).
Our results indicate that our method is capable of detecting distinct prediction strategies that enable us to identify the most similar predictions from an existing atlas.
arXiv Detail & Related papers (2020-12-03T20:36:19Z)
- Accurate and Efficient Intracranial Hemorrhage Detection and Subtype Classification in 3D CT Scans with Convolutional and Long Short-Term Memory Neural Networks [20.4701676109641]
We present our system for the RSNA Intracranial Hemorrhage Detection challenge.
The proposed system is based on a lightweight deep neural network architecture composed of a convolutional neural network (CNN)
We report a weighted mean log loss of 0.04989 on the final test set, which places us in the top 30 ranking (2%) from a total of 1345 participants.
arXiv Detail & Related papers (2020-08-01T17:28:25Z)
- Y-Net for Chest X-Ray Preprocessing: Simultaneous Classification of Geometry and Segmentation of Annotations [70.0118756144807]
This work introduces a general pre-processing step for chest x-ray input into machine learning algorithms.
A modified Y-Net architecture based on the VGG11 encoder is used to simultaneously learn geometric orientation and segmentation of radiographs.
Results were evaluated by expert clinicians, with acceptable geometry in 95.8% and annotation mask in 96.2%, compared to 27.0% and 34.9% respectively in control images.
arXiv Detail & Related papers (2020-05-08T02:16:17Z)
- 3D medical image segmentation with labeled and unlabeled data using autoencoders at the example of liver segmentation in CT images [58.720142291102135]
This work investigates the potential of autoencoder-extracted features to improve segmentation with a convolutional neural network.
A convolutional autoencoder was used to extract features from unlabeled data and a multi-scale, fully convolutional CNN was used to perform the target task of 3D liver segmentation in CT images.
arXiv Detail & Related papers (2020-03-17T20:20:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.