Related papers: StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars

StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars

URL: http://arxiv.org/abs/2510.06200v1
Date: Tue, 07 Oct 2025 17:53:56 GMT
Title: StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars
Authors: Weijian Li, Hong-Yu Chen, Qinjie Lin, Nabeel Rehemtulla, Ved G. Shah, Dennis Wu, Adam A. Miller, Han Liu,
Abstract summary: Time series foundation models (TSFMs) are increasingly being adopted as highly-capable general-purpose time series representation learners.<n>We introduce StarEmbed, the first public benchmark for rigorous and standardized evaluation of state-of-the-art TSFMs on stellar time series observations.<n>We benchmark on three scientifically-motivated downstream tasks: unsupervised clustering, supervised classification, and out-of-distribution source detection.
Score: 12.329789568475045
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Time series foundation models (TSFMs) are increasingly being adopted as highly-capable general-purpose time series representation learners. Although their training corpora are vast, they exclude astronomical time series data. Observations of stars produce peta-scale time series with unique challenges including irregular sampling and heteroskedasticity. We introduce StarEmbed, the first public benchmark for rigorous and standardized evaluation of state-of-the-art TSFMs on stellar time series observations (``light curves''). We benchmark on three scientifically-motivated downstream tasks: unsupervised clustering, supervised classification, and out-of-distribution source detection. StarEmbed integrates a catalog of expert-vetted labels with multi-variate light curves from the Zwicky Transient Facility, yielding ~40k hand-labeled light curves spread across seven astrophysical classes. We evaluate the zero-shot representation capabilities of three TSFMs (MOIRAI, Chronos, Chronos-Bolt) and a domain-specific transformer (Astromer) against handcrafted feature extraction, the long-standing baseline in the astrophysics literature. Our results demonstrate that these TSFMs, especially the Chronos models, which are trained on data completely unlike the astronomical observations, can outperform established astrophysics-specific baselines in some tasks and effectively generalize to entirely new data. In particular, TSFMs deliver state-of-the-art performance on our out-of-distribution source detection benchmark. With the first benchmark of TSFMs on astronomical time series data, we test the limits of their generalization and motivate a paradigm shift in time-domain astronomy from using task-specific, fully supervised pipelines toward adopting generic foundation model representations for the analysis of peta-scale datasets from forthcoming observatories.

Related papers

It's TIME: Towards the Next Generation of Time Series Forecasting Benchmarks [87.7937890373758]
Time series foundation models (TSFMs) are revolutionizing the forecasting landscape from specific dataset modeling to generalizable task evaluation.<n>We introduce TIME, a next-generation task-centric benchmark comprising 50 fresh datasets and 98 forecasting tasks.<n>We propose a novel pattern-level evaluation perspective that moves beyond traditional dataset-level evaluations based on static meta labels.
arXiv Detail & Related papers (2026-02-12T16:31:01Z)
Universal Redundancies in Time Series Foundation Models [3.8551402560229806]
Time Series Foundation Models (TSFMs) leverage extensive pretraining to accurately predict unseen time series during inference.<n>We introduce a set of tools for mechanistic interpretability of TSFMs, including ablations of specific components and direct logit attribution on the residual stream.
arXiv Detail & Related papers (2026-02-02T03:53:46Z)
Simulation-Based Pretraining and Domain Adaptation for Astronomical Time Series with Minimal Labeled Data [0.12744523252873352]
We present a pre-training approach that leverages simulations, significantly reducing the need for labeled examples from real observations.<n>Our models, trained on simulated data from multiple astronomical surveys (ZTF and LSST), learn generalizable representations that transfer effectively to downstream tasks.<n>Remarkably, our models exhibit effective zero-shot transfer capabilities, achieving comparable performance on future telescope (LSST) simulations when trained solely on existing telescope (ZTF) data.
arXiv Detail & Related papers (2025-10-14T20:07:14Z)
Time-RA: Towards Time Series Reasoning for Anomaly with LLM Feedback [55.284574165467525]
Time-series Reasoning for Anomaly (Time-RA) transforms classical time series anomaly detection into a generative, reasoning-intensive task.<n>Also, we introduce the first real-world multimodal benchmark dataset, RATs40K, explicitly annotated for anomaly reasoning.
arXiv Detail & Related papers (2025-07-20T18:02:50Z)
StellarF: A Lora-Adapter Integrated Large Model Framework for Stellar Flare Forecasting with Historical & Statistical Data [3.901857423144103]
This study introduces StellarF (Stellar Flare Forecasting), a novel large model for stellar flare forecasting.<n>At its core, StellarF integrates an flare statistical information module with a historical flare record module, enabling multi-scale pattern recognition from observational data.<n>The proposed prediction paradigm establishes a novel methodological framework for advancing astrophysical research and cross-disciplinary applications.
arXiv Detail & Related papers (2025-07-15T04:59:22Z)
TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery.<n>Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism.<n>TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data [61.163542597764796]
We show that time series with different time granularities (or corresponding frequency resolutions) exhibit distinct joint distributions in the frequency domain.<n>A novel Fourier knowledge attention mechanism is proposed to enable learning time-aware representations from both the temporal and frequency domains.<n>An autoregressive blank infilling pre-training framework is incorporated to time series analysis for the first time, leading to a generative tasks agnostic pre-training strategy.
arXiv Detail & Related papers (2025-02-05T15:20:04Z)
AstroM$^3$: A self-supervised multimodal model for astronomy [0.0]
We propose AstroM$3$, a self-supervised pre-training approach that enables a model to learn from multiple modalities simultaneously. Specifically, we extend the CLIP (Contrastive Language-Image Pretraining) model to a trimodal setting, allowing the integration of time-series photometry data, spectra, and astrophysical metadata. Results demonstrate that CLIP pre-training improves classification performance for time-series photometry, where accuracy increases from 84.6% to 91.5%.
arXiv Detail & Related papers (2024-11-13T18:20:29Z)
Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, using a single input/output projection layer while delegating the modeling of diverse time series patterns to the sparse mixture of experts. Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z)
SpectralEarth: Training Hyperspectral Foundation Models at Scale [47.93167977587301]
We introduce SpectralEarth, a large-scale multitemporal dataset designed to pretrain hyperspectral foundation models.<n>We pretrain a series of foundation models on SpectralEarth, integrating a spectral adapter into classical vision backbones.<n>In tandem, we construct nine downstream datasets for land-cover, crop-type mapping, and tree-species classification.
arXiv Detail & Related papers (2024-08-15T22:55:59Z)
From Chaos to Clarity: Time Series Anomaly Detection in Astronomical Observations [6.903396830919462]
We propose a two-stage framework for unsupervised anomaly detection in astronomical observations. In the first stage, we employ a Transformer-based encoder-decoder architecture to learn the normal temporal patterns on each star. In the second stage, we enhance the graph neural network with a window-wise graph structure learning to tackle the occurrence of concurrent noise.
arXiv Detail & Related papers (2024-03-15T11:39:12Z)
Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer [1.122225892380515]
We introduce Astroconformer, a Transformer-based model to analyze stellar light curves from the Kepler mission. We demonstrate that Astrconformer can robustly infer the stellar surface gravity as a supervised task. We also show that the method can generalize to sparse cadence light curves from the Rubin Observatory.
arXiv Detail & Related papers (2022-07-06T16:22:37Z)
Deep Autoregressive Models with Spectral Attention [74.08846528440024]
We propose a forecasting architecture that combines deep autoregressive models with a Spectral Attention (SA) module. By characterizing in the spectral domain the embedding of the time series as occurrences of a random process, our method can identify global trends and seasonality patterns. Two spectral attention models, global and local to the time series, integrate this information within the forecast and perform spectral filtering to remove time series's noise.
arXiv Detail & Related papers (2021-07-13T11:08:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.