TinyViT: Field Deployable Transformer Pipeline for Solar Panel Surface Fault and Severity Screening
- URL: http://arxiv.org/abs/2512.00117v1
- Date: Thu, 27 Nov 2025 17:35:57 GMT
- Title: TinyViT: Field Deployable Transformer Pipeline for Solar Panel Surface Fault and Severity Screening
- Authors: Ishwaryah Pandiarajan, Mohamed Mansoor Roomi Sindha, Uma Maheswari Pandyan, Sharafia N
- Abstract summary: This work demonstrates that deep learning and classical machine learning may be judiciously combined to achieve robust surface anomaly categorization and severity estimation. We introduce TinyViT, a compact pipeline integrating Transformer-based segmentation, spectral-spatial feature engineering, and ensemble regression.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sustained operation of solar photovoltaic assets hinges on accurate detection and prioritization of surface faults across vast, geographically distributed modules. While multimodal imaging strategies are popular, they introduce logistical and economic barriers for routine farm-level deployment. This work demonstrates that deep learning and classical machine learning may be judiciously combined to achieve robust surface anomaly categorization and severity estimation from planar visible-band imagery alone. We introduce TinyViT, a compact pipeline integrating Transformer-based segmentation, spectral-spatial feature engineering, and ensemble regression. The system ingests consumer-grade color camera mosaics of PV panels, classifies seven nuanced surface faults, and generates actionable severity grades for maintenance triage. By eliminating reliance on electroluminescence or IR sensors, our method enables affordable, scalable upkeep for resource-limited installations, and advances the state of solar health monitoring toward universal field accessibility. Experiments on real-world public datasets validate both classification and regression submodules, achieving accuracy and interpretability competitive with specialized approaches.
Related papers
- Contrastive Heliophysical Image Pretraining for Solar Dynamics Observatory Records [9.239205316203245]
SolarCHIP is a family of contrastively pretrained visual backbones tailored to multi-instrument SDO observations. SolarCHIP addresses three key challenges in solar imaging: multimodal sensing across AIA and HMI instruments, weak inter-class separability due to slow temporal evolution, and strong intra-class variability with sparse activity signals.
arXiv Detail & Related papers (2025-11-28T08:03:46Z)
- Power Battery Detection [91.99787495748218]
Power batteries are essential components in electric vehicles, where internal structural defects can pose serious safety risks. We conduct a comprehensive study on power battery detection (PBD), which aims to localize the dense endpoints of cathode and anode plates from X-ray images for quality inspection. We present PBD5K, the first large-scale benchmark for this task, consisting of 5,000 X-ray images from nine battery types with fine-grained annotations and eight types of real-world visual interference.
arXiv Detail & Related papers (2025-08-11T09:35:25Z)
- Solar Photovoltaic Assessment with Large Language Model [5.156484100374059]
We investigate how large language models (LLMs) can be leveraged to overcome solar panel detection challenges. LLMs face several challenges in solar panel detection, including difficulties with multi-step logical processes. We propose the PV Assessment with LLMs framework, which incorporates task decomposition for more efficient output standardization.
arXiv Detail & Related papers (2025-07-25T10:26:29Z)
- A Hybrid Ensemble Learning Framework for Image-Based Solar Panel Classification [2.80608717912532]
This paper presents a novel Dual Ensemble Neural Network (DENN) to classify solar panels using image-based features. The DENN model is evaluated in comparison to current ensemble methods, showcasing its superior performance across a range of assessment metrics.
arXiv Detail & Related papers (2025-07-02T15:07:43Z)
- Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy [73.75271615101754]
We present Dita, a scalable framework that leverages Transformer architectures to directly denoise continuous action sequences. Dita employs in-context conditioning, enabling fine-grained alignment between denoised actions and raw visual tokens from historical observations. Dita effectively integrates cross-embodiment datasets across diverse camera perspectives, observation scenes, tasks, and action spaces.
arXiv Detail & Related papers (2025-03-25T15:19:56Z)
- Aerial Infrared Health Monitoring of Solar Photovoltaic Farms at Scale [0.0]
Solar photovoltaic (PV) farms represent a major source of global renewable energy generation, yet their true operational efficiency often remains unknown at scale. We present a comprehensive, data-driven framework for large-scale airborne infrared inspection of North American solar installations.
arXiv Detail & Related papers (2025-03-03T23:32:21Z)
- Machine learning approaches for automatic defect detection in photovoltaic systems [1.121744174061766]
Solar photovoltaic (PV) modules are prone to damage during manufacturing, installation and operation.
Continuous monitoring of PV modules during operation via unmanned aerial vehicles is essential.
Computer vision provides an automatic, non-destructive and cost-effective tool for monitoring defects in large-scale PV plants.
arXiv Detail & Related papers (2024-09-24T13:11:05Z)
- F$^3$Loc: Fusion and Filtering for Floorplan Localization [57.93061992125962]
We propose an efficient data-driven solution to self-localization within a floorplan. Our method does not require retraining per map and location or demand a large database of images of the area of interest.
arXiv Detail & Related papers (2024-03-05T23:32:26Z)
- SimPLR: A Simple and Plain Transformer for Efficient Object Detection and Segmentation [49.65221743520028]
We show that shifting the multiscale inductive bias into the attention mechanism can work well, resulting in a plain detector, SimPLR. We find through our experiments that SimPLR with scale-aware attention is a plain and simple architecture, yet competitive with multi-scale vision transformer alternatives.
arXiv Detail & Related papers (2023-10-09T17:59:26Z)
- A Comparative Study on Generative Models for High Resolution Solar Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states.
Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high quality samples indistinguishable to human experts.
arXiv Detail & Related papers (2023-04-14T14:40:32Z)
- TransVG: End-to-End Visual Grounding with Transformers [102.11922622103613]
We present a transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to an image.
We show that the complex fusion modules can be replaced by a simple stack of transformer encoder layers with higher performance.
arXiv Detail & Related papers (2021-04-17T13:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.