Multi-Class Abnormality Classification in Video Capsule Endoscopy Using Deep Learning
- URL: http://arxiv.org/abs/2410.18879v2
- Date: Sun, 01 Dec 2024 11:58:37 GMT
- Title: Multi-Class Abnormality Classification in Video Capsule Endoscopy Using Deep Learning
- Authors: Arnav Samal, Ranya Batsyas
- Abstract summary: This report outlines Team Seq2Cure's deep learning approach for the Capsule Vision 2024 Challenge. We leverage an ensemble of convolutional neural networks (CNNs) and transformer-based architectures for multi-class abnormality classification in video capsule endoscopy frames. Our approach achieved a balanced accuracy of 86.34 percent and a mean AUC-ROC score of 0.9908 on the validation set, earning our submission 5th place in the challenge.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This report outlines Team Seq2Cure's deep learning approach for the Capsule Vision 2024 Challenge, leveraging an ensemble of convolutional neural networks (CNNs) and transformer-based architectures for multi-class abnormality classification in video capsule endoscopy frames. The dataset comprised over 50,000 frames from three public sources and one private dataset, labeled across 10 abnormality classes. To overcome the limitations of traditional CNNs in capturing global context, we integrated CNN and transformer models within a multi-model ensemble. Our approach achieved a balanced accuracy of 86.34 percent and a mean AUC-ROC score of 0.9908 on the validation set, earning our submission 5th place in the challenge. Code is available at http://github.com/arnavs04/capsule-vision-2024 .
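The abstract does not state how the CNN and transformer predictions are fused; a minimal sketch of probability-level averaging, together with the two reported validation metrics (balanced accuracy and mean AUC-ROC), is shown below. The model list, class count, and averaging rule are illustrative assumptions rather than the authors' exact pipeline (see the linked repository for that).

```python
# Minimal sketch of probability-level ensembling plus the two reported metrics.
# The backbones, class count, and averaging rule are assumptions for illustration.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.metrics import balanced_accuracy_score, roc_auc_score

NUM_CLASSES = 10  # the challenge labels span 10 abnormality classes

@torch.no_grad()
def ensemble_probs(models, images):
    """Average softmax probabilities from several CNN/transformer classifiers."""
    probs = [F.softmax(m(images), dim=1) for m in models]  # each (batch, NUM_CLASSES)
    return torch.stack(probs).mean(dim=0)

def validation_metrics(probs: np.ndarray, labels: np.ndarray):
    """Balanced accuracy and macro (mean) AUC-ROC over all classes."""
    preds = probs.argmax(axis=1)
    bal_acc = balanced_accuracy_score(labels, preds)
    mean_auc = roc_auc_score(labels, probs, multi_class="ovr", average="macro")
    return bal_acc, mean_auc

# Usage: probs = ensemble_probs(models, images).cpu().numpy()
```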
Related papers
- Perception Encoder: The best visual embeddings are not at the output of the network [70.86738083862099]
We introduce Perception Encoder (PE), a vision encoder for image and video understanding trained via simple vision-language learning.
We find that contrastive vision-language training alone can produce strong, general embeddings for all of these downstream tasks.
Together, our PE family of models achieves best-in-class results on a wide variety of tasks.
arXiv Detail & Related papers (2025-04-17T17:59:57Z) - Capsule Vision Challenge 2024: Multi-Class Abnormality Classification for Video Capsule Endoscopy [1.124958340749622]
We present an approach to developing a model for classifying abnormalities in video capsule endoscopy (VCE) frames.
We implement a tiered augmentation strategy using the albumentations library to enhance minority class representation.
Our pipeline, developed in PyTorch, employs a flexible architecture enabling seamless adjustments to classification complexity.
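As an illustration of such a tiered policy, the sketch below applies a heavier albumentations tier only to under-represented classes; the specific transforms and the minority threshold are assumptions, not the authors' configuration.

```python
# Hypothetical tiered augmentation policy: heavier transforms for rarer classes.
# Transforms and threshold are illustrative, not the paper's actual settings.
import albumentations as A
from albumentations.pytorch import ToTensorV2

base = [A.Resize(224, 224), A.HorizontalFlip(p=0.5)]
heavy = [
    A.VerticalFlip(p=0.5),
    A.Rotate(limit=30, p=0.7),
    A.RandomBrightnessContrast(p=0.5),
    A.GaussNoise(p=0.3),
]
finish = [A.Normalize(), ToTensorV2()]

def build_transform(class_count: int, minority_threshold: int = 1000) -> A.Compose:
    """Apply the heavier tier only when a class has few training frames."""
    tier = base + heavy if class_count < minority_threshold else base
    return A.Compose(tier + finish)
```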
arXiv Detail & Related papers (2024-11-03T08:34:04Z) - Multi-Class Abnormality Classification Task in Video Capsule Endoscopy [3.656114607436271]
We address the challenge of multi-class anomaly classification in Video Capsule Endoscopy (VCE) with a variety of deep learning models.
The purpose is to correctly classify diverse gastrointestinal disorders, which is critical for increasing diagnostic efficiency in clinical settings.
arXiv Detail & Related papers (2024-10-25T21:22:52Z) - Classification of Endoscopy and Video Capsule Images using CNN-Transformer Model [1.0994755279455526]
This study proposes a hybrid model that combines the advantages of Transformers and Convolutional Neural Networks (CNNs) to enhance classification performance.
For the GastroVision dataset, our proposed model demonstrates excellent performance with Precision, Recall, F1 score, Accuracy, and Matthews Correlation Coefficient (MCC) of 0.8320, 0.8386, 0.8324, 0.8386, and 0.8191, respectively.
arXiv Detail & Related papers (2024-08-20T11:05:32Z) - CoNe: Contrast Your Neighbours for Supervised Image Classification [62.12074282211957]
Contrast Your Neighbours (CoNe) is a learning framework for supervised image classification.
CoNe employs the features of its similar neighbors as anchors to generate more adaptive and refined targets.
Our CoNe achieves 80.8% Top-1 accuracy on ImageNet with ResNet-50, which surpasses the recent Timm training recipe.
arXiv Detail & Related papers (2023-08-21T14:49:37Z) - Severity classification of ground-glass opacity via 2-D convolutional neural network and lung CT scans: a 3-day exploration [0.0]
Ground-glass opacity is a hallmark of numerous lung diseases, including COVID-19, pneumonia, pulmonary fibrosis, and tuberculosis.
This note presents experimental results of a proof-of-concept framework that was implemented and tested over three days for the third challenge, entitled "COVID-19 Competition".
As part of the challenge requirement, the source code produced during the course of this exercise is posted at https://github.com/lisatwyw/cov19.
arXiv Detail & Related papers (2023-03-23T22:35:37Z) - Capsule Network based Contrastive Learning of Unsupervised Visual Representations [13.592112044121683]
The Contrastive Capsule (CoCa) model is a Siamese-style Capsule Network that uses a contrastive loss with our novel architecture, training, and testing algorithm.
We evaluate the model on unsupervised image classification CIFAR-10 dataset and achieve a top-1 test accuracy of 70.50% and top-5 test accuracy of 98.10%.
Due to our efficient architecture, our model has 31 times fewer parameters and 71 times fewer FLOPs than the current SOTA in both supervised and unsupervised learning.
arXiv Detail & Related papers (2022-09-22T19:05:27Z) - Video Capsule Endoscopy Classification using Focal Modulation Guided Convolutional Neural Network [3.1374864575817214]
We propose FocalConvNet, a focal modulation network integrated with lightweight convolutional layers for the classification of small bowel anatomical landmarks and luminal findings.
We report the highest throughput of 148.02 images/second, establishing the potential of FocalConvNet in a real-time clinical environment.
arXiv Detail & Related papers (2022-06-16T16:57:45Z) - Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF (equiangular tight frame) and fixed during training.
Our experimental results show that our method is able to achieve similar performances on image classification for balanced datasets.
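For context, one standard construction of a fixed simplex ETF classifier is sketched below; the feature dimension and the way the frozen prototypes are used are illustrative rather than the paper's exact setup.

```python
# One standard construction of a simplex equiangular tight frame (ETF) used as a
# fixed, non-learnable classifier; dimensions here are illustrative.
import torch

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Return a (feat_dim, num_classes) matrix of ETF class prototypes."""
    assert feat_dim >= num_classes > 1
    # Random orthonormal columns U of shape (feat_dim, num_classes).
    u, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
    center = torch.eye(num_classes) - torch.ones(num_classes, num_classes) / num_classes
    m = (num_classes / (num_classes - 1)) ** 0.5 * u @ center
    return m  # unit-norm columns with equal, maximally separated pairwise angles

# Usage: logits = features @ simplex_etf(10, 512)  # classifier kept frozen during training
```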
arXiv Detail & Related papers (2022-03-17T04:34:28Z) - UniFormer: Unifying Convolution and Self-attention for Visual Recognition [69.68907941116127]
Convolutional neural networks (CNNs) and vision transformers (ViTs) have been the two dominant frameworks in the past few years.
We propose a novel Unified transFormer (UniFormer) which seamlessly integrates the merits of convolution and self-attention in a concise transformer format.
Our UniFormer achieves 86.3 top-1 accuracy on ImageNet-1K classification.
arXiv Detail & Related papers (2022-01-24T04:39:39Z) - Parallel Capsule Networks for Classification of White Blood Cells [1.5749416770494706]
Capsule Networks (CapsNets) are a machine learning architecture proposed to overcome some of the shortcomings of convolutional neural networks (CNNs).
We present a new architecture, parallel CapsNets, which exploits the concept of branching the network to isolate certain capsules.
arXiv Detail & Related papers (2021-08-05T14:30:44Z) - Multiclass Anomaly Detection in GI Endoscopic Images using Optimized Deep One-class Classification in an Imbalanced Dataset [0.0]
Wireless Capsule Endoscopy helps physicians examine the gastrointestinal (GI) tract noninvasively.
Many available datasets, such as KID2 and Kvasir, suffer from class imbalance, which makes it difficult to train an effective artificial intelligence (AI) system.
In this study, an ensemble of one-class classifiers is used for anomaly detection.
arXiv Detail & Related papers (2021-03-15T16:28:42Z) - PS-DeVCEM: Pathology-sensitive deep learning model for video capsule endoscopy based on weakly labeled data [0.0]
We propose a pathology-sensitive deep learning model (PS-DeVCEM) for frame-level anomaly detection and multi-label classification of different colon diseases in video capsule endoscopy (VCE) data.
Our model is driven by attention-based deep multiple instance learning and is trained end-to-end on weakly labeled data.
We show our model's ability to temporally localize frames with pathologies, without frame annotation information during training.
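A minimal sketch of attention-based MIL pooling of frame features, the general mechanism behind this kind of weakly supervised temporal localization, is given below; the layer sizes are assumptions and this is not the PS-DeVCEM architecture itself.

```python
# Minimal attention-based MIL pooling, sketched to illustrate weak supervision
# with per-frame attention weights; not the PS-DeVCEM model.
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    def __init__(self, feat_dim: int = 512, hidden: int = 128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(feat_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, frame_feats: torch.Tensor):
        # frame_feats: (num_frames, feat_dim) features from one weakly labeled video
        weights = torch.softmax(self.score(frame_feats), dim=0)   # (num_frames, 1)
        video_feat = (weights * frame_feats).sum(dim=0)           # (feat_dim,)
        return video_feat, weights  # weights indicate which frames drive the prediction
```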
arXiv Detail & Related papers (2020-11-22T15:33:37Z) - A Two-Stage Approach to Device-Robust Acoustic Scene Classification [63.98724740606457]
A two-stage system based on fully convolutional neural networks (CNNs) is proposed to improve device robustness.
Our results show that the proposed ASC system attains a state-of-the-art accuracy on the development set.
Neural saliency analysis with class activation mapping gives new insights on the patterns learnt by our models.
arXiv Detail & Related papers (2020-11-03T03:27:18Z) - Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs now maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Automatic sleep stage classification with deep residual networks in a mixed-cohort setting [63.52264764099532]
We developed a novel deep neural network model to assess the generalizability of several large-scale cohorts.
Overall classification accuracy improved with increasing fractions of training data.
arXiv Detail & Related papers (2020-08-21T10:48:35Z) - Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection [85.53263670166304]
One-stage detectors basically formulate object detection as dense classification and localization.
A recent trend for one-stage detectors is to introduce an individual prediction branch to estimate the quality of localization.
This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization.
arXiv Detail & Related papers (2020-06-08T07:24:33Z) - A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z) - Subspace Capsule Network [85.69796543499021]
SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time.
arXiv Detail & Related papers (2020-02-07T17:51:56Z) - Identifying and Compensating for Feature Deviation in Imbalanced Deep Learning [59.65752299209042]
We investigate learning a ConvNet under such a scenario.
We found that a ConvNet significantly over-fits the minor classes.
We propose to incorporate class-dependent temperatures (CDT) when training the ConvNet.
arXiv Detail & Related papers (2020-01-06T03:52:11Z)