Surya: Foundation Model for Heliophysics
- URL: http://arxiv.org/abs/2508.14112v1
- Date: Mon, 18 Aug 2025 05:44:25 GMT
- Title: Surya: Foundation Model for Heliophysics
- Authors: Sujit Roy, Johannes Schmude, Rohit Lal, Vishal Gaur, Marcus Freitag, Julian Kuehnert, Theodore van Kessel, Dinesha V. Hegde, Andrés Muñoz-Jaramillo, Johannes Jakubik, Etienne Vos, Kshitiz Mandal, Ata Akbari Asanjan, Joao Lucas de Sousa Almeida, Amy Lin, Talwinder Singh, Kang Yang, Chetraj Pandey, Jinsu Hong, Berkay Aydin, Thorsten Kurth, Ryan McGranaghan, Spiridon Kasapis, Vishal Upendran, Shah Bahauddin, Daniel da Silva, Nikolai V. Pogorelov, Campbell Watson, Manil Maskey, Madhulika Guhathakurta, Juan Bernabe-Moreno, Rahul Ramachandran
- Abstract summary: We introduce Surya, a 366M parameter foundation model for heliophysics designed to learn general-purpose solar representations. We show its ability to forecast solar dynamics and flare events, while downstream fine-tuning with parameter-efficient Low-Rank Adaptation (LoRA) shows strong performance. Its novel architecture and performance suggest that the model is able to learn the underlying physics behind solar evolution.
- Score: 3.5997539202699724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heliophysics is central to understanding and forecasting space weather events and solar activity. Despite decades of high-resolution observations from the Solar Dynamics Observatory (SDO), most models remain task-specific and constrained by scarce labeled data, limiting their capacity to generalize across solar phenomena. We introduce Surya, a 366M parameter foundation model for heliophysics designed to learn general-purpose solar representations from multi-instrument SDO observations, including eight Atmospheric Imaging Assembly (AIA) channels and five Helioseismic and Magnetic Imager (HMI) products. Surya employs a spatiotemporal transformer architecture with spectral gating and long-short range attention, pretrained on high-resolution solar image forecasting tasks and further optimized through autoregressive rollout tuning. Zero-shot evaluations demonstrate its ability to forecast solar dynamics and flare events, while downstream fine-tuning with parameter-efficient Low-Rank Adaptation (LoRA) shows strong performance on solar wind forecasting, active region segmentation, solar flare forecasting, and EUV spectra. Surya is the first foundation model in heliophysics that uses time advancement as a pretext task on full-resolution SDO data. Its novel architecture and performance suggest that the model is able to learn the underlying physics behind solar evolution.
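The abstract mentions time advancement as a pretext task and autoregressive rollout. As a rough illustration of that idea only (not Surya's actual code), the sketch below feeds each predicted frame back into the input window to forecast several steps ahead. The names `rollout` and `toy_step`, the list-of-floats `Frame` type, and the linear-extrapolation "model" are all hypothetical stand-ins chosen for this example.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for one multi-channel solar image (flattened).
Frame = List[float]


def rollout(step: Callable[[Sequence[Frame]], Frame],
            history: Sequence[Frame],
            n_steps: int) -> List[Frame]:
    """Forecast n_steps frames by feeding each prediction back as input."""
    window = list(history)
    predictions: List[Frame] = []
    for _ in range(n_steps):
        nxt = step(window)               # one time-advancement step
        predictions.append(nxt)
        window = window[1:] + [nxt]      # drop oldest frame, append prediction
    return predictions


def toy_step(window: Sequence[Frame]) -> Frame:
    """Assumed toy model: linear extrapolation from the last two frames."""
    prev, last = window[-2], window[-1]
    return [2 * l - p for p, l in zip(prev, last)]


history = [[0.0, 1.0], [1.0, 3.0]]
forecast = rollout(toy_step, history, 3)
print(forecast)  # [[2.0, 5.0], [3.0, 7.0], [4.0, 9.0]]
```

In a real setting the `step` callable would be a trained network, and rollout tuning would backpropagate a loss through several such chained steps; the toy extrapolator here only shows the data flow.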
Related papers
- Meteorological data and Sky Images meets Neural Models for Photovoltaic Power Forecasting [18.633528239379483]
This work develops a hybrid approach for short and long-term forecasting based on two studies with the same purpose. A multimodal approach that combines images of the sky and photovoltaic energy history with meteorological data is proposed. The main goal is to improve the accuracy of ramp event prediction, increase the robustness of forecasts in cloudy conditions, and extend capabilities beyond nowcasting.
arXiv Detail & Related papers (2026-02-17T18:14:15Z) - Ultra-short-term solar power forecasting by deep learning and data reconstruction [60.200987006598524]
We propose a deep-learning based ultra-short-term solar power prediction with data reconstruction. We employ deep-learning models to capture long- and short-term dependencies towards the target prediction period.
arXiv Detail & Related papers (2025-09-21T14:22:35Z) - SuryaBench: Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction [2.288747975391298]
This paper introduces a high resolution, machine learning-ready heliophysics dataset derived from NASA's Solar Dynamics Observatory (SDO). The dataset includes processed imagery from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI). To ensure suitability for ML tasks, the data has been preprocessed, including correction of spacecraft roll angles, orbital adjustments, exposure normalization, and degradation compensation.
arXiv Detail & Related papers (2025-08-18T00:05:01Z) - CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer [47.65152457550307]
We propose the geometric-inspired Circular Transformer (CirT) to model the cyclic characteristic of the graticule. Experiments on the Earth Reanalysis 5 (ERA5) reanalysis dataset demonstrate our model yields a significant improvement over the advanced data-driven models.
arXiv Detail & Related papers (2025-02-27T04:26:23Z) - Solar Active Regions Emergence Prediction Using Long Short-Term Memory
Networks [44.99833362998488]
We develop Long Short-Term Memory (LSTM) models to predict the formation of active regions (ARs) on the solar surface.
Time-series datasets of acoustic power and magnetic flux are used to train LSTM models to predict continuum intensity 12 hours in advance.
These novel machine learning (ML) models are able to capture variations of the acoustic power density associated with upcoming magnetic flux emergence and continuum intensity decrease.
arXiv Detail & Related papers (2024-09-25T23:09:46Z) - A Foundation Model for the Earth System [82.73624748093333]
We introduce Aurora, a large-scale foundation model for the Earth system trained on over a million hours of diverse data.
Aurora outperforms operational forecasts for air quality, ocean waves, tropical cyclone tracks, and high-resolution weather forecasting at orders of magnitude smaller computational expense than dedicated existing systems.
arXiv Detail & Related papers (2024-05-20T14:45:18Z) - FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting [56.73502043159699]
This work presents FengWu-GHR, the first data-driven global weather forecasting model running at the 0.09$^\circ$ horizontal resolution.
It introduces a novel approach that opens the door for operating ML-based high-resolution forecasts by inheriting prior knowledge from a low-resolution model.
The hindcast of weather prediction in 2022 indicates that FengWu-GHR is superior to the IFS-HRES.
arXiv Detail & Related papers (2024-01-28T13:23:25Z) - Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context [46.72071291175356]
Solar power harbors immense potential in mitigating climate change by substantially reducing CO$_2$ emissions.
However, the inherent variability of solar irradiance poses a significant challenge for seamlessly integrating solar power into the electrical grid.
In this paper, we put forth a deep learning architecture designed to harness spatio-temporal context using satellite data.
arXiv Detail & Related papers (2023-06-01T19:54:39Z) - A Comparative Study on Generative Models for High Resolution Solar Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states.
Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high quality samples indistinguishable to human experts.
arXiv Detail & Related papers (2023-04-14T14:40:32Z) - Computational Solar Energy -- Ensemble Learning Methods for Prediction of Solar Power Generation based on Meteorological Parameters in Eastern India [0.0]
It is important to estimate the amount of solar photovoltaic (PV) power generation for a specific geographical location.
In this paper, the impact of weather parameters on solar PV power generation is estimated by several Ensemble ML (EML) models like Bagging, Boosting, Stacking, and Voting.
The results demonstrate greater prediction accuracy of around 96% for Stacking and Voting models.
arXiv Detail & Related papers (2023-01-21T19:16:03Z) - Short term solar energy prediction by machine learning algorithms [0.47791962198275073]
We report daily prediction of solar energy by exploiting the strength of machine learning techniques.
Forecast models of baseline regressors including linear, ridge, lasso, decision tree, random forest and artificial neural networks have been implemented.
It has been observed that improved accuracy is achieved through random forest and ridge regressor for both grid sizes.
arXiv Detail & Related papers (2020-10-25T17:56:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.