Related papers: Hard-Attention Gates with Gradient Routing for Endoscopic Image Computing

URL: http://arxiv.org/abs/2407.04400v1
Date: Fri, 5 Jul 2024 10:20:24 GMT
Title: Hard-Attention Gates with Gradient Routing for Endoscopic Image Computing
Authors: Giorgio Roffo, Carlo Biffi, Pietro Salvagnini, Andrea Cherubini
Abstract summary: We introduce Feature-Selection Gates (FSG) or Hard-Attention Gates (HAG) for dynamic feature selection. This technique aims to boost Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) by promoting sparse connectivity. We show that our HAG-enhanced networks substantially enhance performance in both binary and triclass classification tasks related to polyp sizing.
Score: 3.146247125118741
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: To address overfitting and enhance model generalization in gastroenterological polyp size assessment, our study introduces Feature-Selection Gates (FSG) or Hard-Attention Gates (HAG) alongside Gradient Routing (GR) for dynamic feature selection. This technique aims to boost Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) by promoting sparse connectivity, thereby reducing overfitting and enhancing generalization. HAG achieves this through sparsification with learnable weights, serving as a regularization strategy. GR further refines this process by optimizing HAG parameters via dual forward passes, independently from the main model, to improve feature re-weighting. Our evaluation spanned multiple datasets, including CIFAR-100 for a broad impact assessment and specialized endoscopic datasets (REAL-Colon, Misawa, and SUN) focusing on polyp size estimation, covering over 200 polyps in more than 370,000 frames. The findings indicate that our HAG-enhanced networks substantially enhance performance in both binary and triclass classification tasks related to polyp sizing. Specifically, CNNs experienced an F1 Score improvement to 87.8% in binary classification, while in triclass classification, the ViT-T model reached an F1 Score of 76.5%, outperforming traditional CNNs and ViT-T models. To facilitate further research, we are releasing our codebase, which includes implementations for CNNs, multistream CNNs, ViT, and HAG-augmented variants. This resource aims to standardize the use of endoscopic datasets, providing public training-validation-testing splits for reliable and comparable research in gastroenterological polyp size estimation. The codebase is available at github.com/cosmoimd/feature-selection-gates.
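As a rough illustration of the idea in the abstract (learnable per-feature gates that re-weight inputs, updated in a second forward pass separately from the main model), here is a minimal NumPy sketch on a toy linear regression. It is not the authors' implementation, which is in the linked codebase. The sigmoid gate parameterisation, the alternating update order, the function name `train_gated_linear`, and the separate learning rates `lr_model` and `lr_gate` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_gated_linear(x, y, steps=200, lr_model=0.1, lr_gate=0.1):
    """Toy gated linear model: pred = (x * sigmoid(w)) @ beta.

    Assumption: the 'dual forward passes' are modelled as two alternating
    updates per step, one for the model and one for the gate.
    """
    n, d = x.shape
    beta = np.zeros(d)  # main-model weights
    w = np.zeros(d)     # gate logits, one per feature (hypothetical parameterisation)
    losses = []
    for _ in range(steps):
        # Pass 1: forward with the gate held fixed, update only the model.
        g = sigmoid(w)
        err = (x * g) @ beta - y
        losses.append(float(np.mean(err ** 2)))
        beta -= lr_model * (2.0 / n) * (x * g).T @ err

        # Pass 2: forward again with the model held fixed, update only the gate.
        g = sigmoid(w)
        err = (x * g) @ beta - y
        grad_g = (2.0 / n) * (x * beta).T @ err  # dLoss/dg, one entry per feature
        w -= lr_gate * grad_g * g * (1.0 - g)    # chain rule through the sigmoid
    return beta, w, losses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 4))
    y = 2.0 * x[:, 0] - x[:, 1]  # only the first two features carry signal
    beta, w, losses = train_gated_linear(x, y)
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("gate values:", sigmoid(w))
```

In this sketch, thresholding `sigmoid(w)` (for example at 0.5) would give a hard on/off mask over features. Whether the paper's gates are binarised this way is not stated in the abstract.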

Related papers

Attention-Enhanced U-Net for Accurate Segmentation of COVID-19 Infected Lung Regions in CT Scans [0.0]
We propose a robust methodology for automatic segmentation of infected lung regions in COVID-19 CT scans using convolutional neural networks. The approach is based on a modified U-Net architecture enhanced with attention mechanisms, data augmentation, and postprocessing techniques.
arXiv Detail & Related papers (2025-05-18T08:27:12Z)
GNN-Suite: a Graph Neural Network Benchmarking Framework for Biomedical Informatics [0.0]
We present GNN-Suite, a framework for constructing and benchmarking Graph Neural Network (GNN) architectures in computational biology. We demonstrate its utility in identifying cancer-driver genes by constructing molecular networks from protein-protein interaction (PPI) data. Our results show that a common framework for implementing and evaluating GNN architectures aids in identifying not only the best model but also the most effective means of incorporating complementary data.
arXiv Detail & Related papers (2025-05-15T21:14:30Z)
How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings [106.3726679697804]
We compare the two most common techniques for mitigating spectral bias: Fourier feature encodings (FFE) and multigrid parametric encodings (MPE). FFEs are seen as the standard for low dimensional mappings, but MPEs often outperform them and learn representations with higher resolution and finer detail. We prove that MPEs improve a network's performance through the structure of their grid and not their learnable embedding.
arXiv Detail & Related papers (2025-04-18T02:18:08Z)
LGPS: A Lightweight GAN-Based Approach for Polyp Segmentation in Colonoscopy Images [40.19053450600923]
We propose LGPS, a lightweight GAN-based framework for polyp segmentation. LGPS incorporates three key innovations: (1) a MobileNetV2 backbone enhanced with modified residual blocks and Squeeze-and-Excitation modules for efficient feature extraction. On the largest and challenging PolypGen test dataset, LGPS achieves a Dice of 0.7299 and an IoU of 0.7867, outperforming all SOTA works and demonstrating robust generalization.
arXiv Detail & Related papers (2025-03-24T02:41:53Z)
A Feature-Level Ensemble Model for COVID-19 Identification in CXR Images using Choquet Integral and Differential Evolution Optimization [0.7510165488300369]
An effective strategy to mitigate the COVID-19 pandemic involves integrating testing to identify infected individuals. While RT-PCR is considered the gold standard for diagnosing COVID-19, it has some limitations such as the risk of false negatives. This paper introduces a novel Deep Learning Diagnosis System that integrates pre-trained Deep Convolutional Neural Networks (DCNNs) within an ensemble learning framework.
arXiv Detail & Related papers (2025-01-14T16:28:02Z)
Dumpling GNN: Hybrid GNN Enables Better ADC Payload Activity Prediction Based on Chemical Structure [53.76752789814785]
DumplingGNN is a hybrid Graph Neural Network architecture specifically designed for predicting ADC payload activity based on chemical structure. We evaluate it on a comprehensive ADC payload dataset focusing on DNA Topoisomerase I inhibitors. It demonstrates exceptional accuracy (91.48%), sensitivity (95.08%), and specificity (97.54%) on our specialized ADC payload dataset.
arXiv Detail & Related papers (2024-09-23T17:11:04Z)
Segmentation of Non-Small Cell Lung Carcinomas: Introducing DRU-Net and Multi-Lens Distortion [0.1935997508026988]
We are proposing a segmentation model (DRU-Net) that can provide a delineation of human non-small cell lung carcinomas. We have used two datasets (Norwegian Lung Cancer Biobank and Haukeland University Hospital lung cancer cohort) to create our proposed model. The proposed spatial augmentation method (multi-lens distortion) improved the network performance by 3%.
arXiv Detail & Related papers (2024-06-20T13:14:00Z)
Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks [0.0]
This study introduces an innovative approach by integrating Graph Neural Networks (GNNs) with Capsule Networks to enhance classification performance. Our research focuses on evaluating and enhancing the Tiny Pyramid Vision GNN (Tiny Pyramid ViG) architecture by incorporating it with a Capsule Network. After 75 epochs of training, our model achieved a significant accuracy improvement, reaching 89.23% and 95.52%, surpassing established benchmarks.
arXiv Detail & Related papers (2024-03-18T17:47:39Z)
Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples. We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use trained GANs as one of the augmentation approaches in multi-class classification problems. The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs. Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors. We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
Benchmarking Deep Learning Frameworks for Automated Diagnosis of Ocular Toxoplasmosis: A Comprehensive Approach to Classification and Segmentation [1.3701366534590498]
Ocular Toxoplasmosis (OT) is a common eye infection caused by T. gondii that can cause vision problems. This research seeks to provide a guide for future researchers looking to utilise DL techniques and develop a cheap, automated, easy-to-use, and accurate diagnostic method.
arXiv Detail & Related papers (2023-05-18T13:42:15Z)
TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-Center Out-of-Distribution Testing [2.3293678240472517]
We propose a novel real-time deep learning based architecture, Transformer based Residual network (TransNetR), for colon polyp segmentation. TransNetR is an encoder-decoder network that consists of a pre-trained ResNet50 as the encoder, three decoder blocks, and an upsampling layer at the end of the network. It obtains a high Dice coefficient of 0.8706 and a mean Intersection over Union of 0.8016 and retains a real-time processing speed of 54.60 on the Kvasir-SEG dataset.
arXiv Detail & Related papers (2023-03-13T19:11:17Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans. With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet. Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z)
CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic. The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands. We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.