MobilePlantViT: A Mobile-friendly Hybrid ViT for Generalized Plant Disease Image Classification
- URL: http://arxiv.org/abs/2503.16628v1
- Date: Thu, 20 Mar 2025 18:34:02 GMT
- Title: MobilePlantViT: A Mobile-friendly Hybrid ViT for Generalized Plant Disease Image Classification
- Authors: Moshiur Rahman Tonmoy, Md. Mithun Hossain, Nilanjan Dey, M. F. Mridha
- Abstract summary: Plant diseases significantly threaten global food security. Deep learning models have demonstrated impressive performance in plant disease identification, but deploying these models on mobile and edge devices remains challenging due to high computational demands and resource constraints. We propose MobilePlantViT, a novel hybrid Vision Transformer (ViT) architecture designed for generalized plant disease classification.
- Score: 2.0681376988193843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Plant diseases significantly threaten global food security by reducing crop yields and undermining agricultural sustainability. AI-driven automated classification has emerged as a promising solution, with deep learning models demonstrating impressive performance in plant disease identification. However, deploying these models on mobile and edge devices remains challenging due to high computational demands and resource constraints, highlighting the need for lightweight, accurate solutions for accessible smart agriculture systems. To address this, we propose MobilePlantViT, a novel hybrid Vision Transformer (ViT) architecture designed for generalized plant disease classification, which optimizes resource efficiency while maintaining high performance. Extensive experiments across diverse plant disease datasets of varying scales show our model's effectiveness and strong generalizability, achieving test accuracies ranging from 80% to over 99%. Notably, with only 0.69 million parameters, our architecture outperforms the smallest versions of MobileViTv1 and MobileViTv2, despite their higher parameter counts. These results underscore the potential of our approach for real-world, AI-powered automated plant disease classification in sustainable and resource-efficient smart agriculture systems. All codes will be available in the GitHub repository: https://github.com/moshiurtonmoy/MobilePlantViT
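The abstract does not describe the architecture in detail, but the general recipe behind such mobile-friendly hybrid ViTs (a convolutional stem for local feature extraction followed by a small Transformer encoder for global context and a linear classification head) can be sketched as below. This is a minimal, hypothetical illustration: the class name TinyHybridViT, the layer sizes, and the 38-class setting are assumptions chosen only to keep the parameter count small, not the authors' released design; the actual implementation is in the linked GitHub repository.

```python
# Minimal sketch of a mobile-friendly hybrid CNN + ViT classifier (illustrative,
# not the MobilePlantViT release): a convolutional stem extracts local features,
# a lightweight Transformer encoder models global context over the resulting
# tokens, and a linear head predicts the disease class. Sizes are guesses that
# keep the model well under 1M parameters.
import torch
import torch.nn as nn


class TinyHybridViT(nn.Module):
    def __init__(self, num_classes: int = 38, embed_dim: int = 96,
                 depth: int = 2, num_heads: int = 4):
        super().__init__()
        # Convolutional stem: 3 x 224 x 224 -> embed_dim x 14 x 14 feature map
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),   # 112 x 112
            nn.BatchNorm2d(32), nn.SiLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # 56 x 56
            nn.BatchNorm2d(64), nn.SiLU(),
            nn.Conv2d(64, embed_dim, kernel_size=3, stride=4, padding=1),  # 14 x 14
            nn.BatchNorm2d(embed_dim), nn.SiLU(),
        )
        # Lightweight Transformer encoder over the 14 * 14 = 196 tokens
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, dim_feedforward=2 * embed_dim,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.pos_embed = nn.Parameter(torch.zeros(1, 14 * 14, embed_dim))
        self.head = nn.Sequential(nn.LayerNorm(embed_dim),
                                  nn.Linear(embed_dim, num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Expects 224 x 224 inputs so the token count matches pos_embed.
        feats = self.stem(x)                       # (B, C, 14, 14)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, 196, C)
        tokens = self.encoder(tokens + self.pos_embed)
        return self.head(tokens.mean(dim=1))       # global average pooling


if __name__ == "__main__":
    # 38 classes as in the common PlantVillage split (an assumed example).
    model = TinyHybridViT(num_classes=38)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"parameters: {n_params / 1e6:.2f}M")
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 38])
```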
Related papers
- DS_FusionNet: Dynamic Dual-Stream Fusion with Bidirectional Knowledge Distillation for Plant Disease Recognition [5.665116885785105]
This study proposes a novel Dynamic Dual-Stream Fusion Network (DS_FusionNet).
The network integrates a dual-backbone architecture, deformable dynamic fusion modules, and a bidirectional knowledge distillation strategy.
Experimental results demonstrate that DS_FusionNet achieves classification accuracies exceeding 90% using only 10% of the PlantDisease and CIFAR-10 datasets.
arXiv Detail & Related papers (2025-04-29T17:15:02Z)
- Hybrid Knowledge Transfer through Attention and Logit Distillation for On-Device Vision Systems in Agricultural IoT [0.0]
This work advances real-time, energy-efficient crop monitoring in precision agriculture.
It demonstrates how we can attain ViT-level diagnostic precision on edge devices.
arXiv Detail & Related papers (2025-04-21T06:56:41Z)
- Smooth Handovers via Smoothed Online Learning [48.953313950521746]
We first analyze an extensive dataset from a commercial mobile network operator (MNO) in Europe with more than 40M users to understand and reveal important features and performance impacts on handovers (HOs). Our findings highlight a correlation between HO failures/delays and the characteristics of radio cells and end-user devices. We propose a realistic system model for smooth and accurate HOs that extends existing approaches by incorporating device and cell features into HO optimization.
arXiv Detail & Related papers (2025-01-14T13:16:33Z)
- Automatic Fused Multimodal Deep Learning for Plant Identification [1.2289361708127877]
We introduce a pioneering multimodal DL-based approach for plant classification with automatic modality fusion. Our method achieves 82.61% accuracy on 979 classes of Multimodal-PlantCLEF, surpassing state-of-the-art methods and outperforming late fusion by 10.33%.
arXiv Detail & Related papers (2024-06-03T15:43:29Z)
- Generating Diverse Agricultural Data for Vision-Based Farming Applications [74.79409721178489]
This model is capable of simulating distinct growth stages of plants, diverse soil conditions, and randomized field arrangements under varying lighting conditions.
Our dataset includes 12,000 images with semantic labels, offering a comprehensive resource for computer vision tasks in precision agriculture.
arXiv Detail & Related papers (2024-03-27T08:42:47Z)
- Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities [59.02391344178202]
Vision foundation models (VFMs) serve as potent building blocks for a wide range of AI applications.
The scarcity of comprehensive training data, the need for multi-sensor integration, and the diverse task-specific architectures pose significant obstacles to the development of VFMs.
This paper delves into the critical challenge of forging VFMs tailored specifically for autonomous driving, while also outlining future directions.
arXiv Detail & Related papers (2024-01-16T01:57:24Z)
- SugarViT -- Multi-objective Regression of UAV Images with Vision Transformers and Deep Label Distribution Learning Demonstrated on Disease Severity Prediction in Sugar Beet [3.2925222641796554]
This work introduces a machine learning framework for automated, large-scale, plant-specific trait annotation.
We develop an efficient Vision Transformer based model for disease severity scoring called SugarViT.
Although the model is evaluated on this specific use case, it is kept as generic as possible so that it can also be applied to various image-based classification and regression tasks.
arXiv Detail & Related papers (2023-11-06T13:01:17Z)
- Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices [72.61177465035031]
We propose a generative AI-empowered federated learning framework to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data.
Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy.
arXiv Detail & Related papers (2023-10-21T12:07:04Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information, to the extent that it can achieve the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- Explainable vision transformer enabled convolutional neural network for plant disease identification: PlantXViT [11.623005206620498]
Plant diseases are the primary cause of crop losses globally, with an impact on the world economy.
In this study, a Vision Transformer enabled Convolutional Neural Network model called "PlantXViT" is proposed for plant disease identification.
The proposed model has a lightweight structure with only 0.8 million trainable parameters, which makes it suitable for IoT-based smart agriculture services.
arXiv Detail & Related papers (2022-07-16T12:05:06Z)
- Vision Transformers For Weeds and Crops Classification Of High Resolution UAV Images [3.1083892213758104]
Vision Transformer (ViT) models can achieve competitive or better results without applying any convolution operations.
Our experiments show that with a small set of labelled training data, ViT models perform better than state-of-the-art CNN-based models.
arXiv Detail & Related papers (2021-09-06T19:58:54Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.