A Foundation Model for the Solar Dynamics Observatory
- URL: http://arxiv.org/abs/2410.02530v1
- Date: Thu, 3 Oct 2024 14:36:32 GMT
- Title: A Foundation Model for the Solar Dynamics Observatory
- Authors: James Walsh, Daniel G. Gass, Raul Ramos Pollan, Paul J. Wright, Richard Galvez, Noah Kasmanoff, Jason Naradowsky, Anne Spalding, James Parr, Atılım Güneş Baydin
- Abstract summary: SDO-FM is a foundation model using data from NASA's Solar Dynamics Observatory (SDO) spacecraft.
This paper marks the release of our pretrained models and embedding datasets, available to the community on Hugging Face and sdofm.org.
- Score: 2.63089646549647
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: SDO-FM is a foundation model trained on data from NASA's Solar Dynamics Observatory (SDO) spacecraft, integrating three separate instruments to encapsulate the Sun's complex physical interactions in a multi-modal embedding space. The model can streamline scientific investigations involving SDO by making its enormous datasets more computationally accessible for heliophysics research, and it enables investigations that require instrument fusion. We discuss four key components: an ingestion pipeline that creates machine-learning-ready datasets, the model architecture and training approach, the resultant embeddings and fine-tunable models, and downstream fine-tuned applications. A key part of this effort has been including subject-matter specialists at each stage of development, who reviewed the scientific value and provided guidance on model architecture, dataset, and training-paradigm decisions. This paper marks the release of our pretrained models and embedding datasets, available to the community on Hugging Face and sdofm.org.
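The abstract describes fusing three instruments into a multi-modal embedding space. As a rough, hypothetical illustration only (the function names, matrix shapes, and instrument feature sizes below are invented for this sketch and do not come from the SDO-FM release), instrument fusion can be sketched as projecting each instrument's features into a shared dimension and averaging:

```python
import math

def project(features, weights):
    """Linearly project a feature vector into the shared embedding dimension."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def fuse(instrument_features, instrument_weights):
    """Toy instrument fusion: project each instrument's features into a
    shared space, then average the projections into one embedding."""
    projections = [project(f, w)
                   for f, w in zip(instrument_features, instrument_weights)]
    dim = len(projections[0])
    fused = [sum(p[i] for p in projections) / len(projections)
             for i in range(dim)]
    # L2-normalize so embeddings are comparable regardless of instrument scale
    norm = math.sqrt(sum(x * x for x in fused)) or 1.0
    return [x / norm for x in fused]

# Three hypothetical instruments with different native feature sizes,
# each with its own projection matrix into a 2-dimensional shared space.
feats = [[1.0, 2.0], [0.5, 0.5, 1.0], [2.0]]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],            # 2 -> 2
    [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],  # 3 -> 2
    [[1.0], [0.0]],                      # 1 -> 2
]
embedding = fuse(feats, weights)
```

In a real foundation model the projections would be learned encoder networks rather than fixed matrices, but the shape of the operation (per-instrument encoding into a shared space, then fusion) is the same.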
Related papers
- Combining Physics-based and Data-driven Modeling for Building Energy Systems [5.437298646956505]
Building energy modeling plays a vital role in optimizing the operation of building energy systems.
Researchers are combining physics-based and data-driven models into hybrid approaches.
We evaluate four predominant hybrid approaches in building energy modeling through a real-world case study.
arXiv Detail & Related papers (2024-11-01T21:56:39Z)
- Foundation Models for Remote Sensing and Earth Observation: A Survey [101.77425018347557]
This survey systematically reviews the emerging field of Remote Sensing Foundation Models (RSFMs)
It begins with an outline of their motivation and background, followed by an introduction of their foundational concepts.
We benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions.
arXiv Detail & Related papers (2024-10-22T01:08:21Z)
- AI Foundation Model for Heliophysics: Applications, Design, and Implementation [1.2851259989174175]
Foundation models (FMs) are pre-trained on large-scale datasets.
This paper provides our perspective on the criteria for designing an FM for heliophysics.
We believe that this is the first study to design an FM in the domain of heliophysics.
arXiv Detail & Related papers (2024-09-30T15:48:28Z)
- On the Opportunities of (Re)-Exploring Atmospheric Science by Foundation Models: A Case Study [2.672038860046272]
Most state-of-the-art AI applications in atmospheric science are based on classic deep learning approaches.
This report explores how the state-of-the-art foundation model, i.e., GPT-4o, performs various atmospheric scientific tasks.
arXiv Detail & Related papers (2024-07-25T07:57:34Z)
- Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a novel sandbox suite tailored for integrated data-model co-development.
This sandbox provides a comprehensive experimental platform, enabling rapid iteration and insight-driven refinement of both data and models.
We also uncover fruitful insights gleaned from exhaustive benchmarks, shedding light on the critical interplay between data quality, diversity, and model behavior.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- Neural Plasticity-Inspired Multimodal Foundation Model for Earth Observation [48.66623377464203]
Our novel approach introduces the Dynamic One-For-All (DOFA) model, leveraging the concept of neural plasticity in brain science.
This dynamic hypernetwork, adjusting to different wavelengths, enables a single versatile Transformer jointly trained on data from five sensors to excel across 12 distinct Earth observation tasks.
arXiv Detail & Related papers (2024-03-22T17:11:47Z)
- Interfacing Foundation Models' Embeddings [131.0352288172788]
We present FIND, a generalized interface for aligning foundation models' embeddings with unified image and dataset-level understanding spanning modality and granularity.
In light of the interleaved embedding space, we introduce FIND-Bench, which introduces new training and evaluation annotations to the COCO dataset for interleaved segmentation and retrieval.
arXiv Detail & Related papers (2023-12-12T18:58:02Z)
- End-to-end Phase Field Model Discovery Combining Experimentation, Crowdsourcing, Simulation and Learning [9.763339269757227]
We present Phase-Field-Lab platform for end-to-end phase field model discovery.
Phase-Field-Lab combines (i) a streamlined annotation tool which reduces the annotation time; (ii) an end-to-end neural model which automatically learns phase field models from data; and (iii) novel interfaces and visualizations.
Our platform is deployed in the analysis of nano-structure evolution in materials under extreme conditions.
arXiv Detail & Related papers (2023-09-13T22:44:04Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- The Open Catalyst 2020 (OC20) Dataset and Community Challenges [36.556154866045894]
Catalyst discovery and optimization is key to solving many societal and energy challenges.
It remains an open challenge to build models that can generalize across both elemental compositions of surfaces and adsorbates.
arXiv Detail & Related papers (2020-10-20T03:29:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.