AdaptoVision: A Multi-Resolution Image Recognition Model for Robust and Scalable Classification
- URL: http://arxiv.org/abs/2504.12652v1
- Date: Thu, 17 Apr 2025 05:23:07 GMT
- Title: AdaptoVision: A Multi-Resolution Image Recognition Model for Robust and Scalable Classification
- Authors: Md. Sanaullah Chowdhury Lameya Sabrin,
- Abstract summary: AdaptoVision is a novel convolutional neural network (CNN) architecture designed to efficiently balance computational complexity and classification accuracy.<n>By leveraging enhanced residual units, depth-wise separable convolutions, and hierarchical skip connections, AdaptoVision significantly reduces parameter count and computational requirements.<n>It achieves state-of-the-art on BreakHis dataset and comparable accuracy levels, notably 95.3% on CIFAR-10 and 85.77% on CIFAR-100, without relying on any pretrained weights.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces AdaptoVision, a novel convolutional neural network (CNN) architecture designed to efficiently balance computational complexity and classification accuracy. By leveraging enhanced residual units, depth-wise separable convolutions, and hierarchical skip connections, AdaptoVision significantly reduces parameter count and computational requirements while preserving competitive performance across various benchmark and medical image datasets. Extensive experimentation demonstrates that AdaptoVision achieves state-of-the-art on BreakHis dataset and comparable accuracy levels, notably 95.3\% on CIFAR-10 and 85.77\% on CIFAR-100, without relying on any pretrained weights. The model's streamlined architecture and strategic simplifications promote effective feature extraction and robust generalization, making it particularly suitable for deployment in real-time and resource-constrained environments.
Related papers
- Federated Learning of Low-Rank One-Shot Image Detection Models in Edge Devices with Scalable Accuracy and Compute Complexity [5.820612543019548]
LoRa-FL is designed for training low-rank one-shot image detection models deployed on edge devices.
By incorporating low-rank adaptation techniques into one-shot detection architectures, our method significantly reduces both computational and communication overhead.
arXiv Detail & Related papers (2025-04-23T08:40:44Z) - Subject Representation Learning from EEG using Graph Convolutional Variational Autoencoders [20.364067310176054]
GC-VASE is a graph convolutional-based variational autoencoder that leverages contrastive learning for subject representation learning from EEG data.<n>Our method successfully learns robust subject-specific latent representations using the split-latent space architecture tailored for subject identification.
arXiv Detail & Related papers (2025-01-13T17:29:31Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - IncSAR: A Dual Fusion Incremental Learning Framework for SAR Target Recognition [13.783950035836593]
IncSAR is an incremental learning framework designed to tackle catastrophic forgetting in target recognition.<n>To mitigate the speckle noise inherent in SAR images, we employ a denoising module based on a neural network approximation.<n>Experiments on the MSTAR, SAR-AIRcraft-1.0, and OpenSARShip benchmark datasets demonstrate that IncSAR significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-08T08:49:47Z) - SHA-CNN: Scalable Hierarchical Aware Convolutional Neural Network for Edge AI [6.168286187549952]
This paper introduces a Hierarchical Aware Convolutional Neural Network (SHA-CNN) model architecture for Edge AI applications.
The proposed hierarchical CNN model is meticulously crafted to strike a balance between computational efficiency and accuracy.
The key innovation lies in the model's hierarchical awareness, enabling it to discern and prioritize relevant features at multiple levels of abstraction.
arXiv Detail & Related papers (2024-07-31T06:44:52Z) - Restore Anything Model via Efficient Degradation Adaptation [129.38475243424563]
RAM takes a unified path that leverages inherent similarities across various degradations to enable efficient and comprehensive restoration.<n> RAM's SOTA performance confirms RAM's SOTA performance, reducing model complexity by approximately 82% in trainable parameters and 85% in FLOPs.
arXiv Detail & Related papers (2024-07-18T10:26:53Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR)
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Unifying Synergies between Self-supervised Learning and Dynamic
Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z) - Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
A simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model in this work.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Deep Adaptive Inference Networks for Single Image Super-Resolution [72.7304455761067]
Single image super-resolution (SISR) has witnessed tremendous progress in recent years owing to the deployment of deep convolutional neural networks (CNNs)
In this paper, we take a step forward to address this issue by leveraging the adaptive inference networks for deep SISR (AdaDSR)
Our AdaDSR involves an SISR model as backbone and a lightweight adapter module which takes image features and resource constraint as input and predicts a map of local network depth.
arXiv Detail & Related papers (2020-04-08T10:08:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.