Related papers: nnMobileNet++: Towards Efficient Hybrid Networks for Retinal Image Analysis

URL: http://arxiv.org/abs/2512.01273v1
Date: Mon, 01 Dec 2025 04:45:39 GMT
Title: nnMobileNet++: Towards Efficient Hybrid Networks for Retinal Image Analysis
Authors: Xin Li, Wenhui Zhu, Xuanzhao Dong, Hao Wang, Yujian Xiong, Oana Dumitrascu, Yalin Wang
Abstract summary: We propose nnMobileNet++, a hybrid architecture that progressively bridges convolutional and transformer representations. nnMobileNet++ achieves state-of-the-art or highly competitive accuracy while maintaining low computational cost.
Score: 10.186038549004266
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retinal imaging is a critical, non-invasive modality for the early detection and monitoring of ocular and systemic diseases. Deep learning, particularly convolutional neural networks (CNNs), has made significant progress in automated retinal analysis, supporting tasks such as fundus image classification, lesion detection, and vessel segmentation. As a representative lightweight network, nnMobileNet has demonstrated strong performance across multiple retinal benchmarks while remaining computationally efficient. However, purely convolutional architectures inherently struggle to capture long-range dependencies and to model the irregular lesions and elongated vascular patterns that characterize retinal images, despite the critical importance of vascular features for reliable clinical diagnosis. To further advance this line of work and extend the original vision of nnMobileNet, we propose nnMobileNet++, a hybrid architecture that progressively bridges convolutional and transformer representations. The framework integrates three key components: (i) dynamic snake convolution for boundary-aware feature extraction, (ii) stage-specific transformer blocks introduced after the second down-sampling stage for global context modeling, and (iii) retinal image pretraining to improve generalization. Experiments on multiple public retinal datasets for classification, together with ablation studies, demonstrate that nnMobileNet++ achieves state-of-the-art or highly competitive accuracy while maintaining low computational cost, underscoring its potential as a lightweight yet effective framework for retinal image analysis.
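One plausible reason a hybrid network would place transformer blocks only after early down-sampling is that self-attention cost grows with the square of the token count. The abstract does not state this rationale, so the sketch below is only a back-of-the-envelope illustration. The 224x224 input size, the five stages, the stride of 2 per stage, and the function name `attention_cost_per_stage` are all assumptions made for the example, not details from the paper.

```python
# Illustrative only: input size, number of stages, and stride are assumptions,
# not values reported in the abstract.

def attention_cost_per_stage(input_size=224, num_stages=5, stride=2):
    """Return (stage, feature-map side, tokens, token pairs) per down-sampling stage.

    Self-attention compares every token with every other token, so its
    cost scales with tokens ** 2.
    """
    rows = []
    side = input_size
    for stage in range(1, num_stages + 1):
        side //= stride               # each stage shrinks the spatial side (assumed stride)
        tokens = side * side          # one token per spatial position
        rows.append((stage, side, tokens, tokens * tokens))
    return rows


if __name__ == "__main__":
    for stage, side, tokens, pairs in attention_cost_per_stage():
        print(f"stage {stage}: {side}x{side} map, {tokens} tokens, {pairs} token pairs")
```

Under these assumptions, each additional stride-2 stage cuts the number of token pairs by a factor of 16, which is why global attention tends to be far cheaper on later, lower-resolution feature maps.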

Related papers

Robust Multi-Disease Retinal Classification via Xception-Based Transfer Learning and W-Net Vessel Segmentation [0.0]
This paper presents a comprehensive study on deep learning architectures for the automated diagnosis of ocular conditions. We implement a pipeline that combines deep feature extraction with interpretable image processing modules. By grounding the model's predictions in clinically relevant morphological features, we aim to bridge the gap between algorithmic output and expert medical validation.
arXiv Detail & Related papers (2025-12-11T13:03:03Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Progressive Retinal Image Registration via Global and Local Deformable Transformations [49.032894312826244]
We propose a hybrid registration framework called HybridRetina. We use a keypoint detector and a deformation network called GAMorph to estimate the global transformation and local deformable transformation. Experiments on two widely-used datasets, FIRE and FLoRI21, show that our proposed HybridRetina significantly outperforms some state-of-the-art methods.
arXiv Detail & Related papers (2024-09-02T08:43:50Z)
Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network [84.88767228835928]
We introduce Mew, a novel framework designed to efficiently process mIF images through the lens of multiplex network. Mew innovatively constructs a multiplex network comprising two distinct layers: a Voronoi network for geometric information and a Cell-type network for capturing cell-wise homogeneity. This framework equips a scalable and efficient Graph Neural Network (GNN), capable of processing the entire graph during training.
arXiv Detail & Related papers (2024-07-25T08:22:30Z)
Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions. We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training. Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
LMBiS-Net: A Lightweight Multipath Bidirectional Skip Connection based CNN for Retinal Blood Vessel Segmentation [0.0]
Blinding eye diseases are often correlated with altered retinal morphology, which can be clinically identified by segmenting retinal structures in fundus images. Deep learning has shown promise in medical image segmentation, but its reliance on repeated convolution and pooling operations can hinder the representation of edge information. We propose a lightweight pixel-level CNN named LMBiS-Net for the segmentation of retinal vessels with an exceptionally low number of learnable parameters.
arXiv Detail & Related papers (2023-09-10T09:03:53Z)
LDMRes-Net: Enabling Efficient Medical Image Segmentation on IoT and Edge Platforms [9.626726110488386]
We propose a lightweight dual-multiscale residual block-based computational neural network tailored for medical image segmentation on IoT and edge platforms. LDMRes-Net overcomes limitations with its remarkably low number of learnable parameters (0.072M), making it highly suitable for resource-constrained devices.
arXiv Detail & Related papers (2023-06-09T10:34:18Z)
RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation [3.57686754209902]
Quantification of retinal fluids is necessary for OCT-guided treatment management. A new convolutional neural architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation. The model benefits from hierarchical representation learning of textural, contextual, and edge features.
arXiv Detail & Related papers (2022-09-26T07:18:00Z)
A novel approach for glaucoma classification by wavelet neural networks using graph-based, statisitcal features of qualitatively improved images [0.0]
We have proposed a new glaucoma classification approach that employs a wavelet neural network (WNN) on optimally enhanced retinal image features. The performance of the WNN classifier is compared with multilayer perceptron neural networks on various datasets.
arXiv Detail & Related papers (2022-06-24T06:19:30Z)
Stain Normalized Breast Histopathology Image Recognition using Convolutional Neural Networks for Cancer Detection [9.826027427965354]
Recent advances have shown that Convolutional Neural Network (CNN) architectures can be used to design a Computer Aided Diagnostic (CAD) system for breast cancer detection. We consider some contemporary CNN models for binary classification of breast histopathology images. We have validated the trained CNN networks on the publicly available BreaKHis dataset, for 200x and 400x magnified histopathology images.
arXiv Detail & Related papers (2022-01-04T03:09:40Z)
InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded. We analyze the CT values among different tissues, and merge the prior observations into a prior network for our InDuDoNet+, which significantly improves its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z)
Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights. It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness. In recent years, there has been a significant effort to automate the diagnosis using deep learning. This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN). Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.