nnMobileNet: Rethinking CNN for Retinopathy Research
- URL: http://arxiv.org/abs/2306.01289v4
- Date: Mon, 15 Apr 2024 20:35:03 GMT
- Title: nnMobileNet: Rethinking CNN for Retinopathy Research
- Authors: Wenhui Zhu, Peijie Qiu, Xiwen Chen, Xin Li, Natasha Lepore, Oana M. Dumitrascu, Yalin Wang
- Abstract summary: Convolutional neural networks (CNNs) have been at the forefront of the detection and tracking of various retinal diseases (RD).
The emergence of vision transformers (ViT) in the 2020s has shifted the trajectory of RD model development.
We revisited and updated the architecture of a CNN model, specifically MobileNet, to enhance its utility in RD diagnostics.
- Score: 4.882524311496886
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Over the past few decades, convolutional neural networks (CNNs) have been at the forefront of the detection and tracking of various retinal diseases (RD). Despite their success, the emergence of vision transformers (ViT) in the 2020s has shifted the trajectory of RD model development. The leading-edge performance of ViT-based models in RD can be largely credited to their scalability: their ability to improve as more parameters are added. As a result, ViT-based models tend to outshine traditional CNNs in RD applications, albeit at the cost of increased data and computational demands. ViTs also differ from CNNs in their approach to processing images, working with patches rather than local regions, which can complicate the precise localization of small, variably presented lesions in RD. In our study, we revisited and updated the architecture of a CNN model, specifically MobileNet, to enhance its utility in RD diagnostics. We found that an optimized MobileNet, through selective modifications, can surpass ViT-based models in various RD benchmarks, including diabetic retinopathy grading, detection of multiple fundus diseases, and classification of diabetic macular edema. The code is available at https://github.com/Retinal-Research/NN-MOBILENET
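As a concrete reference for the kind of building block being revisited, the following is a minimal PyTorch sketch of a depthwise-separable convolution, the core MobileNet primitive. It is an illustration under the standard MobileNet formulation, not the authors' specific nnMobileNet modifications; the class name and the example input size are assumptions for demonstration only.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depthwise-separable convolution, the core MobileNet primitive (illustrative sketch).

    A depthwise 3x3 convolution filters each channel independently, then a
    pointwise 1x1 convolution mixes channels. This keeps the receptive field
    local (in contrast to ViT patch-based processing) at a fraction of the
    parameters of a full 3x3 convolution.
    """

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        # Depthwise stage: one 3x3 filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size=3,
            stride=stride, padding=1, groups=in_channels, bias=False,
        )
        self.bn1 = nn.BatchNorm2d(in_channels)
        # Pointwise stage: 1x1 convolution to combine channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU6(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act(self.bn1(self.depthwise(x)))
        x = self.act(self.bn2(self.pointwise(x)))
        return x


if __name__ == "__main__":
    # Hypothetical usage: a 512x512 RGB fundus image passed through one block.
    block = DepthwiseSeparableConv(in_channels=3, out_channels=32, stride=2)
    fundus = torch.randn(1, 3, 512, 512)
    print(block(fundus).shape)  # torch.Size([1, 32, 256, 256])
```

The locality of the depthwise stage is the property the abstract contrasts with ViT patch attention when discussing the localization of small lesions.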
Related papers
- Synthetic Trajectory Generation Through Convolutional Neural Networks [6.717469146587211]
We introduce a Reversible Trajectory-to-CNN Transformation (RTCT).
RTCT adapts trajectories into a format suitable for CNN-based models.
We evaluate its performance against an RNN-based trajectory GAN.
arXiv Detail & Related papers (2024-07-24T02:16:52Z)
- A Comparative Study of CNN, ResNet, and Vision Transformers for Multi-Classification of Chest Diseases [0.0]
Vision Transformers (ViT) are powerful tools due to their scalability and ability to process large amounts of data.
We fine-tuned two variants of ViT models, one pre-trained on ImageNet and another trained from scratch, using the NIH Chest X-ray dataset.
Our study evaluates the performance of these models in the multi-label classification of 14 distinct diseases.
arXiv Detail & Related papers (2024-05-31T23:56:42Z)
- Unveiling the Unseen: Identifiable Clusters in Trained Depthwise Convolutional Kernels [56.69755544814834]
Recent advances in depthwise-separable convolutional neural networks (DS-CNNs) have led to novel architectures.
This paper reveals another striking property of DS-CNN architectures: discernible and explainable patterns emerge in their trained depthwise convolutional kernels in all layers.
arXiv Detail & Related papers (2024-01-25T19:05:53Z)
- InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions [95.94629864981091]
This work presents a new large-scale CNN-based foundation model, termed InternImage, which can obtain the gain from increasing parameters and training data like ViTs.
The proposed InternImage reduces the strict inductive bias of traditional CNNs and makes it possible to learn stronger and more robust patterns with large-scale parameters from massive data like ViTs.
arXiv Detail & Related papers (2022-11-10T18:59:04Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while ViTs and CNNs perform on par, with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- ROOD-MRI: Benchmarking the robustness of deep learning segmentation models to out-of-distribution and corrupted data in MRI [0.4839993770067469]
ROOD-MRI is a platform for benchmarking the robustness of deep artificial neural networks to MRI data, corruptions, and artifacts.
We apply our methodology to hippocampus, ventricle, and white matter hyperintensity segmentation in several large studies.
We show that while data augmentation strategies can substantially improve robustness to OOD data for anatomical segmentation tasks, modern DNNs using augmentation still lack robustness in more challenging lesion-based segmentation tasks.
arXiv Detail & Related papers (2022-03-11T16:34:15Z)
- Supervised Training of Siamese Spiking Neural Networks with Earth Mover's Distance [4.047840018793636]
This study adapts the highly versatile Siamese neural network model to the event data domain.
We introduce a supervised training framework for optimizing the Earth Mover's Distance between spike trains with spiking neural networks (SNN).
arXiv Detail & Related papers (2022-02-20T00:27:57Z)
- Scopeformer: n-CNN-ViT Hybrid Model for Intracranial Hemorrhage Classification [0.0]
We propose a feature generator composed of an ensemble of convolutional neural networks (CNNs) to improve Vision Transformer (ViT) models.
We show that by gradually stacking several feature maps extracted using multiple Xception CNNs, we can develop a feature-rich input for the ViT model.
arXiv Detail & Related papers (2021-07-07T20:20:24Z)
- RetiNerveNet: Using Recursive Deep Learning to Estimate Pointwise 24-2 Visual Field Data based on Retinal Structure [109.33721060718392]
Glaucoma is the leading cause of irreversible blindness in the world, affecting over 70 million people.
Due to the Standard Automated Perimetry (SAP) test's innate difficulty and its high test-retest variability, we propose RetiNerveNet.
arXiv Detail & Related papers (2020-10-15T03:09:08Z)
- Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.