Astromer 2
- URL: http://arxiv.org/abs/2502.02717v1
- Date: Tue, 04 Feb 2025 20:56:14 GMT
- Title: Astromer 2
- Authors: Cristobal Donoso-Oliva, Ignacio Becker, Pavlos Protopapas, Guillermo Cabrera-Vives, Martina Cádiz-Leyton, Daniel Moreno-Cartagena,
- Abstract summary: We introduce Astromer 2 as an enhanced iteration of our self-supervised model for light curve analysis.
Astromer 2 is pretrained on 1.5 million single-band light curves from the MACHO survey using a self-supervised learning task.
Our results demonstrate that Astromer 2 significantly outperforms Astromer 1 across all evaluated scenarios.
- Score: 1.236974227340167
- License:
- Abstract: Foundational models have emerged as a powerful paradigm in deep learning, leveraging their capacity to learn robust representations from large-scale datasets and to transfer effectively to diverse downstream applications such as classification. In this paper, we present Astromer 2, a foundational model specifically designed for extracting light curve embeddings. We introduce Astromer 2 as an enhanced iteration of our self-supervised model for light curve analysis. This paper highlights the advantages of its pre-trained embeddings, compares its performance with that of its predecessor, Astromer 1, and provides a detailed empirical analysis of its capabilities, offering deeper insights into the model's representations. Astromer 2 is pretrained on 1.5 million single-band light curves from the MACHO survey using a self-supervised learning task that predicts randomly masked observations within sequences. Fine-tuning on a smaller labeled dataset allows us to assess its performance in classification tasks. The quality of the embeddings is measured by the F1 score of an MLP classifier trained on Astromer-generated embeddings. Our results demonstrate that Astromer 2 significantly outperforms Astromer 1 across all evaluated scenarios, including limited datasets of 20, 100, and 500 samples per class. The use of weighted per-sample embeddings, which integrate intermediate representations from Astromer's attention blocks, is particularly impactful. Notably, Astromer 2 achieves a 15% improvement in F1 score on the ATLAS dataset compared to prior models, showcasing robust generalization to new datasets. This enhanced performance, especially with minimal labeled data, underscores the potential of Astromer 2 for more efficient and scalable light curve analysis.
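The masked-observation pretraining task described in the abstract can be sketched minimally. This is an illustrative toy, not the paper's implementation: the sequence length, the 50% mask fraction, the zero placeholder, and the trivial mean-value "predictor" are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_observations(mags, mask_frac=0.5):
    """Randomly choose positions in a light-curve magnitude sequence
    to hide from the model; the pretraining loss is computed only there."""
    n = len(mags)
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=int(mask_frac * n), replace=False)] = True
    inputs = mags.copy()
    inputs[mask] = 0.0  # placeholder value for masked observations (an assumption)
    return inputs, mask

def masked_mse(pred, target, mask):
    """Reconstruction error restricted to the masked positions."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

mags = rng.normal(15.0, 0.3, size=200)       # toy single-band light curve (magnitudes)
inputs, mask = mask_observations(mags)
# Stand-in predictor: always output the mean magnitude. A real model would
# be a Transformer producing per-position predictions from `inputs`.
loss = masked_mse(np.full_like(mags, mags.mean()), mags, mask)
```

A trained encoder's per-position outputs would replace the mean-value stand-in, and the resulting per-sample embeddings would then feed the MLP classifier whose F1 score is reported in the abstract.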
Related papers
- Analysis of Zero Day Attack Detection Using MLP and XAI [0.0]
This paper analyzes Machine Learning (ML) and Deep Learning (DL) based approaches to create Intrusion Detection Systems (IDS).
The focus is on using the KDD99 dataset, which has the most research done among all the datasets for detecting zero-day attacks.
We evaluate the performance of four multilayer perceptron (MLP) models trained on the KDD99 dataset, including baseline ML models, weighted ML models, truncated ML models, and weighted truncated ML models.
arXiv Detail & Related papers (2025-01-28T02:20:34Z)
- SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications.
Current state-of-the-art methods focus on training innovative architectural designs on confined datasets.
We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z)
- DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data [61.62554324594797]
We propose DreamMask, which explores how to generate training data in the open-vocabulary setting, and how to train the model with both real and synthetic data.
In general, DreamMask significantly simplifies the collection of large-scale training data, serving as a plug-and-play enhancement for existing methods.
For instance, when trained on COCO and tested on ADE20K, the model equipped with DreamMask outperforms the previous state-of-the-art by a substantial margin of 2.1% mIoU.
arXiv Detail & Related papers (2025-01-03T19:00:00Z)
- AstroM$^3$: A self-supervised multimodal model for astronomy [0.0]
We propose AstroM$^3$, a self-supervised pre-training approach that enables a model to learn from multiple modalities simultaneously.
Specifically, we extend the CLIP (Contrastive Language-Image Pretraining) model to a trimodal setting, allowing the integration of time-series photometry data, spectra, and astrophysical metadata.
Results demonstrate that CLIP pre-training improves classification performance for time-series photometry, where accuracy increases from 84.6% to 91.5%.
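The CLIP-style alignment underlying this trimodal extension can be illustrated with a toy symmetric InfoNCE loss between two of the three modalities. The embedding dimensions, batch size, and temperature below are assumptions for the sketch, and plain two-modality contrastive alignment stands in for the paper's full trimodal objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(z):
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def clip_loss(za, zb, temperature=0.07):
    """Symmetric InfoNCE: matched (photometry, spectrum) pairs on the
    diagonal of the similarity matrix are pulled together, mismatched
    pairs pushed apart."""
    za, zb = l2_normalize(za), l2_normalize(zb)
    logits = za @ zb.T / temperature
    log_probs_a = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_probs_b = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return float(-(np.diag(log_probs_a).mean() + np.diag(log_probs_b).mean()) / 2)

photo = rng.normal(size=(8, 16))                 # toy photometry embeddings
spec = photo + 0.01 * rng.normal(size=(8, 16))   # nearly-matched spectra embeddings
loss_matched = clip_loss(photo, spec)
```

In the trimodal setting, the same pairwise term would be applied across photometry-spectra, photometry-metadata, and spectra-metadata pairs.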
arXiv Detail & Related papers (2024-11-13T18:20:29Z)
- AstroMLab 2: AstroLLaMA-2-70B Model and Benchmarking Specialised LLMs for Astronomy [4.729846733874557]
This study aims to quantitatively assess specialized LLMs in astronomy.
We find that the previously released AstroLLaMA series, based on LLaMA-2-7B, underperforms compared to the base model.
Despite the observed catastrophic forgetting in smaller models, our results indicate that continual pretraining on the 70B model can yield significant improvements.
arXiv Detail & Related papers (2024-09-29T16:02:22Z)
- AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs [61.13296177652599]
This paper demonstrates that the optimal composition of training data from different domains is scale-dependent.
We introduce *AutoScale*, a novel, practical approach for optimizing data compositions at potentially large training data scales.
Our evaluation on GPT-2 Large and BERT pre-training demonstrates *AutoScale*'s effectiveness in improving training convergence and downstream performance.
arXiv Detail & Related papers (2024-07-29T17:06:30Z)
- The Scaling Law in Stellar Light Curves [3.090476527764192]
We investigate the scaling law properties that emerge when learning from astronomical time series data using self-supervised techniques.
A self-supervised Transformer model achieves 3-10 times the sample efficiency compared to the state-of-the-art supervised learning model.
Our research lays the groundwork for analyzing stellar light curves by examining them through large-scale auto-regressive generative models.
arXiv Detail & Related papers (2024-05-27T13:31:03Z)
- Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification [7.592813175419603]
We present a comprehensive evaluation of deep-learning and large language model (LLM) based models for the automatic classification of variable star light curves.
Special emphasis is placed on Cepheids, RR Lyrae, and eclipsing binaries, examining the influence of observational cadence and phase distribution on classification precision.
We unveil StarWhisper LightCurve (LC), an innovative series comprising three LLM-based models: a large language model (LLM), a multimodal large language model (MLLM), and a large audio language model (LALM).
arXiv Detail & Related papers (2024-04-16T17:35:25Z)
- Convolutional Neural Networks for the classification of glitches in gravitational-wave data streams [52.77024349608834]
We classify transient noise signals (i.e., glitches) and gravitational waves in data from the Advanced LIGO detectors.
We use models trained from scratch with a supervised learning approach on the Gravity Spy dataset.
We also explore a self-supervised approach, pre-training models with automatically generated pseudo-labels.
arXiv Detail & Related papers (2023-03-24T11:12:37Z)
- FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using a validation set with sensitive attributes.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
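The two-module design in the last entry can be sketched minimally. Everything below is illustrative: the matrix shapes, the binary feature-data graph, and mean-aggregation message passing stand in for the paper's backbone network and graph neural network.

```python
import numpy as np

rng = np.random.default_rng(1)

# Feature-data bipartite graph: X[i, j] != 0 links sample i to feature j.
X = rng.binomial(1, 0.3, size=(6, 4)).astype(float)   # observed feature indicators
d = 8                                                  # embedding width (assumed)
feat_emb = rng.normal(size=(4, d))                     # embeddings of known features

def sample_states(X, feat_emb):
    """One message-passing step: a sample's state is the mean of the
    embeddings of the features it exhibits."""
    deg = X.sum(axis=1, keepdims=True).clip(min=1)
    return (X @ feat_emb) / deg

def extrapolate(new_col, states):
    """A new feature's embedding is extrapolated as the mean state of
    the samples in which that feature is observed."""
    deg = max(new_col.sum(), 1.0)
    return (new_col @ states) / deg

states = sample_states(X, feat_emb)
new_col = rng.binomial(1, 0.5, size=6).astype(float)   # a feature unseen at training time
new_emb = extrapolate(new_col, states)                 # embedding for the new feature
```

A backbone classifier (the lower model in the paper's framework) would then consume the extrapolated embedding alongside the known ones, which is what lets the system handle features that appear only at test time.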
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.