Applying Vision Transformers on Spectral Analysis of Astronomical Objects
- URL: http://arxiv.org/abs/2506.00294v1
- Date: Fri, 30 May 2025 22:53:45 GMT
- Title: Applying Vision Transformers on Spectral Analysis of Astronomical Objects
- Authors: Luis Felipe Strano Moraes, Ignacio Becker, Pavlos Protopapas, Guillermo Cabrera-Vives
- Abstract summary: We fine-tune a ViT pretrained on ImageNet using millions of spectra from the SDSS and LAMOST surveys, represented as spectral plots. We achieve classification accuracy higher than Support Vector Machines and Random Forests, and attain $R^2$ values comparable to AstroCLIP's spectrum encoder.
- Score: 1.4061979259370274
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We apply pre-trained Vision Transformers (ViTs), originally developed for image recognition, to the analysis of astronomical spectral data. By converting traditional one-dimensional spectra into two-dimensional image representations, we enable ViTs to capture both local and global spectral features through spatial self-attention. We fine-tune a ViT pretrained on ImageNet using millions of spectra from the SDSS and LAMOST surveys, represented as spectral plots. Our model is evaluated on key tasks including stellar object classification and redshift ($z$) estimation, where it demonstrates strong performance and scalability. We achieve classification accuracy higher than Support Vector Machines and Random Forests, and attain $R^2$ values comparable to AstroCLIP's spectrum encoder, even when generalizing across diverse object types. These results demonstrate the effectiveness of using pretrained vision models for spectroscopic data analysis. To our knowledge, this is the first large-scale application of ViTs that leverages real spectroscopic data and does not rely on synthetic inputs.
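The pipeline described in the abstract is straightforward to prototype. Below is a minimal sketch, assuming timm's `vit_base_patch16_224` ImageNet checkpoint, a 224x224 matplotlib rendering, and a three-class stellar head; the authors' actual plot styling, survey preprocessing, and training recipe are not reproduced here.

```python
# Minimal sketch (not the authors' released code): render a 1D spectrum as a
# 2D plot image, then fine-tune an ImageNet-pretrained ViT on such images.
import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt
import numpy as np
import timm
import torch

def spectrum_to_image(wavelength, flux, size=224):
    """Render a 1D spectrum as a size x size RGB tensor (C, H, W) in [0, 1]."""
    dpi = 100
    fig, ax = plt.subplots(figsize=(size / dpi, size / dpi), dpi=dpi)
    ax.plot(wavelength, flux, linewidth=0.5, color="black")
    ax.axis("off")
    fig.canvas.draw()
    img = np.asarray(fig.canvas.buffer_rgba())[..., :3]  # drop the alpha channel
    plt.close(fig)
    return torch.from_numpy(img.copy()).permute(2, 0, 1).float() / 255.0

# ImageNet-pretrained ViT with its head replaced for, e.g., 3 stellar classes.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=3)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

# One toy fine-tuning step on a synthetic spectrum (stand-in for SDSS/LAMOST data).
wl = np.linspace(3800, 9200, 4000)          # wavelength grid in Angstroms
fl = np.random.rand(4000)                   # placeholder flux values
x = spectrum_to_image(wl, fl).unsqueeze(0)  # batch of one image
y = torch.tensor([1])                       # placeholder class label
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```

For the redshift task, the same backbone can instead be given a single-output head (`num_classes=1`) trained with a mean-squared-error loss, which is where the $R^2$ comparison to AstroCLIP's spectrum encoder applies.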
Related papers
- HyperspectralMAE: The Hyperspectral Imagery Classification Model using Fourier-Encoded Dual-Branch Masked Autoencoder [0.04332259966721321]
Hyperspectral imagery provides rich spectral detail but poses unique challenges because of its high dimensionality in both spatial and spectral domains. We propose HyperspectralMAE, a Transformer-based model for hyperspectral data that employs a dual masking strategy (a conceptual sketch follows this entry). HyperspectralMAE achieves state-of-the-art transfer-learning accuracy on Indian Pines, confirming that masked dual-dimensional pre-training yields robust spectral-spatial representations.
arXiv Detail & Related papers (2025-05-09T01:16:42Z)
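A minimal sketch of a dual masking strategy, assuming tokens laid out on a spatial-by-spectral grid; the masking ratios, shapes, and union rule are illustrative assumptions, not the paper's implementation (which also uses Fourier-encoded embeddings and a dual-branch design).

```python
# Conceptual sketch: mask tokens of a hyperspectral cube independently along
# the spatial axis and the spectral axis before MAE-style reconstruction.
import torch

def dual_mask(tokens, spatial_ratio=0.5, spectral_ratio=0.5):
    """tokens: (num_spatial_patches, num_spectral_bands, dim).
    Returns masked tokens and a boolean mask (True = hidden)."""
    p, b, _ = tokens.shape
    hide_spatial = torch.rand(p) < spatial_ratio    # hide whole spatial patches
    hide_spectral = torch.rand(b) < spectral_ratio  # hide whole spectral bands
    mask = hide_spatial[:, None] | hide_spectral[None, :]  # (p, b), union of both
    masked = tokens.masked_fill(mask[..., None], 0.0)      # zero out hidden tokens
    return masked, mask

# Example: 64 spatial patches, 200 bands, 128-dim embeddings.
masked, mask = dual_mask(torch.randn(64, 200, 128))
# An MAE-style decoder would be trained to reconstruct the original values
# at the positions where mask is True.
```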
- CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis [75.25966323298003]
Spectral imaging offers promising applications across diverse domains, including medicine and urban scene understanding. Variability in channel dimensionality and captured wavelengths among spectral cameras impedes the development of AI-driven methodologies. We introduce CARL, a model for Camera-Agnostic Representation Learning across RGB, multispectral, and hyperspectral imaging modalities.
arXiv Detail & Related papers (2025-04-27T13:06:40Z)
- SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking [21.664141982246598]
Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously. Existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction. In this paper, a spatial-spectral fusion network with spectral angle awareness (SSF-Net) is proposed for hyperspectral (HS) object tracking.
arXiv Detail & Related papers (2024-03-09T09:37:13Z)
- SpectralGPT: Spectral Remote Sensing Foundation Model [60.023956954916414]
A universal RS foundation model, named SpectralGPT, is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT).
Compared to existing foundation models, SpectralGPT accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS big data.
Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS big data applications within the field of geoscience.
arXiv Detail & Related papers (2023-11-13T07:09:30Z)
- AstroCLIP: A Cross-Modal Foundation Model for Galaxies [40.43521617393482]
AstroCLIP embeds galaxy images and spectra separately by pretraining separate transformer-based image and spectrum encoders in self-supervised settings (a conceptual sketch follows this entry).
We find remarkable performance on all downstream tasks, even relative to supervised baselines.
Our approach represents the first cross-modal self-supervised model for galaxies, and the first self-supervised transformer-based architectures for galaxy images and spectra.
arXiv Detail & Related papers (2023-10-04T17:59:38Z)
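A minimal sketch of CLIP-style cross-modal alignment, assuming a symmetric InfoNCE objective over paired image and spectrum embeddings; the embedding size and temperature are illustrative stand-ins for AstroCLIP's transformer encoders.

```python
# Conceptual sketch: pull paired galaxy-image and spectrum embeddings together
# with a symmetric contrastive (InfoNCE) loss, as in CLIP-style pretraining.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, spec_emb, temperature=0.07):
    """img_emb, spec_emb: (batch, dim) embeddings of paired images/spectra."""
    img = F.normalize(img_emb, dim=-1)
    spec = F.normalize(spec_emb, dim=-1)
    logits = img @ spec.t() / temperature  # (batch, batch) similarity matrix
    labels = torch.arange(len(img))        # matching pairs lie on the diagonal
    # Symmetric loss: image-to-spectrum and spectrum-to-image retrieval.
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

# Toy usage with random embeddings standing in for encoder outputs.
loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
```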
- ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z)
- Specformer: Spectral Graph Neural Networks Meet Transformers [51.644312964537356]
Spectral graph neural networks (GNNs) learn graph representations via spectral-domain graph convolutions.
We introduce Specformer, which effectively encodes the set of all eigenvalues and performs self-attention in the spectral domain.
By stacking multiple Specformer layers, one can build a powerful spectral GNN (a conceptual sketch follows this entry).
arXiv Detail & Related papers (2023-03-02T07:36:23Z)
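A minimal sketch of self-attention over the Laplacian spectrum, assuming eigenvalues are embedded as tokens and decoded back into a learned spectral filter; the layer sizes and filter form are illustrative assumptions rather than the Specformer architecture.

```python
# Conceptual sketch: embed graph-Laplacian eigenvalues as tokens, apply a
# Transformer encoder over them, and decode a learned spectral filter.
import torch
import torch.nn as nn

class SpectralFilter(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.embed = nn.Linear(1, dim)   # eigenvalue -> token
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decode = nn.Linear(dim, 1)  # token -> new eigenvalue

    def forward(self, adj, x):
        """adj: (n, n) adjacency matrix, x: (n, f) node features."""
        deg = adj.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.clamp(min=1e-8).rsqrt())
        lap = torch.eye(len(adj)) - d_inv_sqrt @ adj @ d_inv_sqrt  # normalized Laplacian
        evals, evecs = torch.linalg.eigh(lap)
        tokens = self.embed(evals.unsqueeze(-1)).unsqueeze(0)    # (1, n, dim)
        new_evals = self.decode(self.encoder(tokens)).squeeze()  # learned spectrum
        # Graph convolution with the learned filter: U diag(g(lambda)) U^T x
        return evecs @ torch.diag(new_evals) @ evecs.t() @ x

# Toy usage on a random symmetric 10-node graph with 4-dim features.
adj = (torch.rand(10, 10) > 0.5).float()
adj = ((adj + adj.t()) > 0).float()  # symmetrize
out = SpectralFilter()(adj, torch.randn(10, 4))
```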
- Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z)
- Cross-Spectral Periocular Recognition with Conditional Adversarial Networks [59.17685450892182]
We propose Conditional Generative Adversarial Networks, trained to convert periocular images between visible and near-infrared spectra (a conceptual sketch follows this entry).
We obtain a cross-spectral periocular performance of EER=1%, and GAR>99% @ FAR=1%, which is comparable to the state-of-the-art with the PolyU database.
arXiv Detail & Related papers (2020-08-26T15:02:04Z)
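A minimal sketch of a pix2pix-style conditional GAN for visible-to-near-infrared conversion; the tiny generator and discriminator, the L1 weight, and the image sizes are illustrative assumptions, not the paper's networks.

```python
# Conceptual sketch: a conditional GAN translates visible-spectrum periocular
# images to near-infrared, judged by a discriminator on (condition, output) pairs.
import torch
import torch.nn as nn

gen = nn.Sequential(                  # visible image -> near-infrared image
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, 3, padding=1), nn.Tanh())
disc = nn.Sequential(                 # judges concatenated (condition, output) pairs
    nn.Conv2d(3 + 1, 64, 3, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 3, padding=1))   # PatchGAN-style real/fake map
bce = nn.BCEWithLogitsLoss()

vis = torch.randn(4, 3, 64, 64)       # visible-spectrum batch (placeholder)
nir = torch.randn(4, 1, 64, 64)       # paired near-infrared targets (placeholder)

fake = gen(vis)
pred = disc(torch.cat([vis, fake], dim=1))
# Generator objective: fool the discriminator + stay close to the target (L1).
g_loss = bce(pred, torch.ones_like(pred)) + 100 * nn.functional.l1_loss(fake, nir)
# Discriminator objective: real pairs -> 1, generated pairs -> 0.
d_real = disc(torch.cat([vis, nir], dim=1))
d_fake = disc(torch.cat([vis, fake.detach()], dim=1))
d_loss = (bce(d_real, torch.ones_like(d_real)) +
          bce(d_fake, torch.zeros_like(d_fake))) / 2
```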