Application of Graph Based Vision Transformers Architectures for Accurate Temperature Prediction in Fiber Specklegram Sensors
- URL: http://arxiv.org/abs/2511.14792v1
- Date: Sat, 15 Nov 2025 07:56:15 GMT
- Title: Application of Graph Based Vision Transformers Architectures for Accurate Temperature Prediction in Fiber Specklegram Sensors
- Authors: Abhishek Sebastian
- Abstract summary: This study investigates the use of transformer-based architectures to predict temperature from specklegram data over a range of 0 to 120 °C. The results show that ViTs achieved a Mean Absolute Error (MAE) of 1.15, outperforming traditional models such as CNNs. The study also incorporates Explainable AI (XAI) techniques, including attention maps and saliency maps, to provide insights into the decision-making processes of the transformer models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Fiber Specklegram Sensors (FSS) are highly effective for environmental monitoring, particularly for detecting temperature variations. However, the nonlinear nature of specklegram data presents significant challenges for accurate temperature prediction. This study investigates the use of transformer-based architectures, including Vision Transformers (ViTs), Swin Transformers, and emerging models such as Learnable Importance Non-Symmetric Attention Vision Transformers (LINA-ViT) and Multi-Adaptive Proximity Vision Graph Attention Transformers (MAP-ViGAT), to predict temperature from specklegram data over a range of 0 to 120 °C. The results show that ViTs achieved a Mean Absolute Error (MAE) of 1.15, outperforming traditional models such as CNNs. GAT-ViT and MAP-ViGAT variants also demonstrated competitive accuracy, highlighting the importance of adaptive attention mechanisms and graph-based structures in capturing complex modal interactions and phase shifts in specklegram data. Additionally, this study incorporates Explainable AI (XAI) techniques, including attention maps and saliency maps, to provide insights into the decision-making processes of the transformer models, improving interpretability and transparency. These findings establish transformer architectures as strong benchmarks for optical fiber-based temperature sensing and offer promising directions for industrial monitoring and structural health assessment applications.
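Although the listing does not include code, the pipeline the abstract describes is straightforward to illustrate. The following is a minimal, hypothetical sketch, not the authors' implementation: a small ViT with a scalar regression head trained with L1 loss (which directly optimizes MAE), followed by a gradient-based saliency map of the kind used for XAI. Image size, patch size, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): a small Vision Transformer with a
# scalar regression head, trained with L1 loss (i.e. optimizing MAE), plus a
# gradient-based saliency map. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ViTRegressor(nn.Module):
    def __init__(self, img_size=128, patch=16, dim=192, depth=6, heads=3):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Patch embedding as a strided convolution over 1-channel specklegrams.
        self.embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, 1)  # scalar temperature output

    def forward(self, x):
        tok = self.embed(x).flatten(2).transpose(1, 2)            # (B, N, dim)
        tok = torch.cat([self.cls.expand(len(x), -1, -1), tok], 1) + self.pos
        return self.head(self.encoder(tok)[:, 0]).squeeze(-1)

model = ViTRegressor()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.L1Loss()  # MAE

imgs = torch.rand(8, 1, 128, 128)      # stand-in specklegram batch
temps = torch.rand(8) * 120.0          # targets in the 0-120 °C range
loss = loss_fn(model(imgs), temps)
opt.zero_grad(); loss.backward(); opt.step()

# Saliency map: gradient of the prediction with respect to the input image.
imgs.requires_grad_(True)
model(imgs).sum().backward()
saliency = imgs.grad.abs().squeeze(1)  # (B, H, W) pixel-importance map
```

The encoder's per-layer attention weights can likewise be averaged over heads to produce the attention maps the abstract mentions.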
Related papers
- Benchmarking Vision Transformers and CNNs for Thermal Photovoltaic Fault Detection with Explainable AI Validation [0.0]
This study provides a systematic comparison of convolutional neural networks (ResNet-18, EfficientNet-B0) and vision transformers (ViT-Tiny, Swin-Tiny) for thermal PV fault detection. Evaluation on 20,000 infrared images spanning normal operation and 11 fault categories shows that the Swin Transformer achieves the highest performance. XRAI analysis reveals that models learn physically meaningful features, such as localized hotspots for cell defects, linear thermal paths for diode failures, and thermal boundaries for vegetation shading.
arXiv Detail & Related papers (2025-09-08T08:38:53Z) - SSTAF: Spatial-Spectral-Temporal Attention Fusion Transformer for Motor Imagery Classification [0.0]
Brain-computer interfaces (BCI) for electroencephalography (EEG)-based motor imagery classification offer promising solutions in neurorehabilitation and assistive technologies. The non-stationary nature of EEG signals and significant inter-subject variability pose substantial challenges for developing robust cross-subject classification models. This paper introduces a novel Spatial-Spectral-Temporal Attention Fusion (SSTAF) Transformer specifically designed for upper-limb motor imagery classification.
arXiv Detail & Related papers (2025-04-17T07:45:14Z) - Prion-ViT: Prions-Inspired Vision Transformers for Temperature prediction with Specklegrams [0.0]
This study introduces Prion-ViT, a prion-inspired Vision Transformer model, to improve long-term dependency modeling and temperature prediction accuracy using FSS data. Prion-ViT leverages a persistent memory state to retain and propagate key features across layers, reducing mean absolute error (MAE) to 0.71 °C and outperforming models like ResNet, Inception Net V2, and standard Vision Transformers.
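The summary does not spell out how the persistent memory state is implemented; the sketch below is one plausible, hypothetical reading (a memory vector that the patch tokens attend to and that is updated with a gate at every layer), not Prion-ViT's actual mechanism.

```python
# Hypothetical "persistent memory" transformer block: a memory vector is
# carried across layers, attended to alongside the patch tokens, and updated
# by a gated rule. One plausible reading of the summary, not Prion-ViT's code.
import torch
import torch.nn as nn

class MemoryBlock(nn.Module):
    def __init__(self, dim=192, heads=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, tokens, memory):
        # Tokens attend over [memory; tokens], so past state stays visible.
        ctx = torch.cat([memory, tokens], dim=1)
        out, _ = self.attn(self.norm(tokens), ctx, ctx)
        tokens = tokens + out
        # Gated update keeps part of the old memory ("persistence").
        pooled = tokens.mean(dim=1, keepdim=True)
        g = torch.sigmoid(self.gate(torch.cat([memory, pooled], dim=-1)))
        memory = g * memory + (1 - g) * pooled
        return tokens, memory

blocks = nn.ModuleList(MemoryBlock() for _ in range(6))
tokens = torch.randn(4, 64, 192)   # patch tokens
memory = torch.zeros(4, 1, 192)    # persistent state
for blk in blocks:                 # memory propagates across layers
    tokens, memory = blk(tokens, memory)
```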
arXiv Detail & Related papers (2024-11-06T12:28:26Z) - Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization [88.5582111768376]
We study the optimization of a Transformer composed of a self-attention layer with softmax followed by a fully connected layer under gradient descent on a certain data distribution model.
Our results establish a sharp condition, based on the signal-to-noise ratio in the data model, that distinguishes the small test error regime from the large test error regime.
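For context, a hedged rendering of the model class this describes (the paper's exact parameterization may differ) is a single softmax self-attention layer feeding a fully connected layer, trained by gradient descent on the empirical loss:

```latex
f_\theta(X) \;=\; W_2\,\phi\!\left(\operatorname{softmax}\!\left(\frac{X W_Q (X W_K)^\top}{\sqrt{d}}\right) X W_V\right),
\qquad
\theta^{t+1} \;=\; \theta^{t} \;-\; \eta\,\nabla_\theta\,\frac{1}{n}\sum_{i=1}^{n}\ell\big(f_\theta(X_i),\,y_i\big).
```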
arXiv Detail & Related papers (2024-09-28T13:24:11Z) - Automatic Graph Topology-Aware Transformer [50.2807041149784]
We build a comprehensive graph Transformer search space spanning both micro-level and macro-level designs.
EGTAS evolves graph Transformer topologies at the macro level and graph-aware strategies at the micro level.
We demonstrate the efficacy of EGTAS across a range of graph-level and node-level tasks.
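As a rough illustration of this kind of two-level evolutionary search, the skeleton below uses generic stand-in genes, mutation rules, and fitness, not EGTAS's actual search space.

```python
# Generic evolutionary-search skeleton in the spirit of the summary above:
# macro-level topology genes plus micro-level graph-aware strategy genes.
# Encoding, mutation, and fitness are illustrative stand-ins.
import random

MACRO = ["layers", "hidden_dim"]            # topology-level genes
MICRO = ["attn_type", "pos_encoding"]       # graph-aware strategy genes
CHOICES = {
    "layers": [2, 4, 6, 8],
    "hidden_dim": [64, 128, 256],
    "attn_type": ["full", "sparse", "hop-aware"],
    "pos_encoding": ["laplacian", "random-walk", "none"],
}

def random_genome():
    return {g: random.choice(CHOICES[g]) for g in MACRO + MICRO}

def mutate(genome):
    child = dict(genome)
    gene = random.choice(MACRO + MICRO)     # mutate one gene at a time
    child[gene] = random.choice(CHOICES[gene])
    return child

def fitness(genome):
    # Stand-in for "train the candidate and return a validation score".
    return random.random()

population = [random_genome() for _ in range(20)]
for _ in range(10):                         # evolutionary loop
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[: len(scored) // 2]    # keep the top half
    population = parents + [mutate(random.choice(parents)) for _ in parents]
best = max(population, key=fitness)
```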
arXiv Detail & Related papers (2024-05-30T07:44:31Z) - Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants [0.6223528900192875]
This article introduces a spatio-temporal model of transformer winding temperature and ageing.
It combines physics-based partial differential equations with data-driven neural networks.
Results are validated with a distribution transformer operating on a floating photovoltaic power plant.
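A hedged sketch of the general residual-based attention idea in a PINN follows: collocation points with large PDE residuals receive higher loss weights. The 1-D heat-style equation and all constants are illustrative stand-ins, not the paper's winding thermal model.

```python
# Residual-based attention PINN sketch: weight each collocation point's
# loss by its normalized residual magnitude. Equation and constants are
# illustrative, not the paper's model.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 64),
                    nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
alpha = 0.1                                    # illustrative diffusivity

xt = torch.rand(256, 2, requires_grad=True)    # collocation points (x, t)
for step in range(1000):
    u = net(xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, :1], grads[:, 1:]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, :1]
    residual = u_t - alpha * u_xx              # heat-equation residual
    # Residual-based attention: normalize |residual| into loss weights.
    w = (residual.abs() / residual.abs().max()).detach()
    loss = (w * residual ** 2).mean()          # data/BC terms omitted
    opt.zero_grad(); loss.backward(); opt.step()
```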
arXiv Detail & Related papers (2024-05-10T12:48:57Z) - Performance Evaluation of Swin Vision Transformer Model using Gradient Accumulation Optimization Technique [0.0]
This paper evaluates the performance of the Swin ViT model using the gradient accumulation optimization (GAO) technique.
Applying the GAO technique leads to a significant decrease in the accuracy of the Swin ViT model.
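For reference, gradient accumulation itself is a generic technique: gradients from several micro-batches are summed before a single optimizer step, emulating a larger effective batch. A minimal sketch with illustrative hyperparameters:

```python
# Minimal gradient-accumulation sketch (generic technique; hyperparameters
# are illustrative). Gradients from several micro-batches are summed before
# one optimizer step.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 4                                # effective batch = 4 micro-batches

opt.zero_grad()
for i in range(16):                            # stream of micro-batches
    x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
    loss = loss_fn(model(x), y) / accum_steps  # scale so the sum averages
    loss.backward()                            # gradients accumulate in .grad
    if (i + 1) % accum_steps == 0:
        opt.step()
        opt.zero_grad()
```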
arXiv Detail & Related papers (2023-07-31T23:30:16Z) - Visualizing and Understanding Patch Interactions in Vision Transformer [96.70401478061076]
Vision Transformer (ViT) has become a leading tool in various computer vision tasks.
We propose a novel explainable visualization approach to analyze and interpret the crucial attention interactions among patches in vision transformers.
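As a starting point for this kind of analysis, a patch-to-patch attention matrix can be read directly out of a self-attention layer; the sketch below is generic, not the paper's method.

```python
# Generic sketch: extract a patch-to-patch attention matrix from a
# self-attention layer so it can be rendered as a patch-interaction heat map.
import torch
import torch.nn as nn

dim, heads, n_patches = 192, 3, 64
attn = nn.MultiheadAttention(dim, heads, batch_first=True)
tokens = torch.randn(1, n_patches, dim)  # stand-in patch embeddings

# average_attn_weights=True averages over heads -> (B, N, N) matrix whose
# entry (i, j) is how strongly patch i attends to patch j.
_, weights = attn(tokens, tokens, tokens,
                  need_weights=True, average_attn_weights=True)
interaction = weights[0]                 # (64, 64) patch-interaction map
row = interaction[0].reshape(8, 8)       # patch 0's attention as an 8x8 grid
```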
arXiv Detail & Related papers (2022-03-11T13:48:11Z) - Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction [51.80191416661064]
We propose a novel vision transformer with latent variables following an informative energy-based prior for salient object detection.
Both the vision transformer network and the energy-based prior model are jointly trained via Markov chain Monte Carlo-based maximum likelihood estimation.
With the generative vision transformer, we can easily obtain a pixel-wise uncertainty map from an image, which indicates the model confidence in predicting saliency from the image.
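A hedged sketch of the generic building block named here, Langevin-dynamics sampling from an energy-based latent prior, with illustrative network sizes and step counts:

```python
# Langevin sampling from an energy-based latent prior (generic technique;
# sizes, step counts, and step sizes are illustrative).
import torch
import torch.nn as nn

energy = nn.Sequential(nn.Linear(32, 128), nn.SiLU(), nn.Linear(128, 1))

def sample_prior(n, steps=60, step_size=0.1):
    z = torch.randn(n, 32)                     # start from N(0, I)
    for _ in range(steps):
        z = z.detach().requires_grad_(True)
        # Energy plus Gaussian reference term; lower energy = more likely.
        e = energy(z).sum() + 0.5 * (z ** 2).sum()
        grad = torch.autograd.grad(e, z)[0]
        z = z - 0.5 * step_size * grad \
              + (step_size ** 0.5) * torch.randn_like(z)
    return z.detach()

z = sample_prior(16)                           # latents for the generator
```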
arXiv Detail & Related papers (2021-12-27T06:04:33Z) - Efficient Vision Transformers via Fine-Grained Manifold Distillation [96.50513363752836]
Vision transformer architectures have shown extraordinary performance on many computer vision tasks.
Although network performance is boosted, transformers often require more computational resources.
We propose to excavate useful information from the teacher transformer through the relationship between images and their divided patches.
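One plausible, hypothetical reading of such patch-relation distillation (not the paper's exact loss) is to make the student match the teacher's patch-similarity structure:

```python
# Hypothetical patch-relation distillation sketch: the student mimics the
# teacher's per-image patch-to-patch similarity matrices.
import torch
import torch.nn.functional as F

def relation_loss(student_tokens, teacher_tokens):
    # tokens: (B, N, dim); build per-image patch-similarity maps.
    s = F.normalize(student_tokens, dim=-1)
    t = F.normalize(teacher_tokens, dim=-1)
    rel_s = s @ s.transpose(1, 2)              # (B, N, N)
    rel_t = t @ t.transpose(1, 2)
    return F.mse_loss(rel_s, rel_t.detach())   # student mimics teacher relations

student = torch.randn(4, 64, 192)
teacher = torch.randn(4, 64, 192)  # in practice, projected to a shared dim
loss = relation_loss(student, teacher)
```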
arXiv Detail & Related papers (2021-07-03T08:28:34Z) - Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z) - TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking [74.82415271960315]
We propose a solution named TransMOT to efficiently model the spatial and temporal interactions among objects in a video.
TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy.
The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20.
arXiv Detail & Related papers (2021-04-01T01:49:05Z)