Active learning with RESSPECT: Resource allocation for extragalactic
astronomical transients
- URL: http://arxiv.org/abs/2010.05941v2
- Date: Mon, 26 Oct 2020 21:05:03 GMT
- Authors: Noble Kennamer, Emille E. O. Ishida, Santiago Gonzalez-Gaitan, Rafael
S. de Souza, Alexander Ihler, Kara Ponder, Ricardo Vilalta, Anais Moller,
David O. Jones, Mi Dai, Alberto Krone-Martins, Bruno Quint, Sreevarsha
Sreejith, Alex I. Malz, Lluis Galbany (The LSST Dark Energy Science
Collaboration and the COIN collaboration)
- Abstract summary: The RESSPECT project aims to enable the construction of optimized training samples for the Rubin Observatory Legacy Survey of Space and Time (LSST).
We test the robustness of active learning techniques in a realistic simulated astronomical data scenario.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent increase in volume and complexity of available astronomical data
has led to a wide use of supervised machine learning techniques. Active
learning strategies have been proposed as an alternative to optimize the
distribution of scarce labeling resources. However, due to the specific
conditions in which labels can be acquired, fundamental assumptions, such as
sample representativeness and labeling cost stability, cannot be fulfilled. The
Recommendation System for Spectroscopic follow-up (RESSPECT) project aims to
enable the construction of optimized training samples for the Rubin Observatory
Legacy Survey of Space and Time (LSST), taking into account a realistic
description of the astronomical data environment. In this work, we test the
robustness of active learning techniques in a realistic simulated astronomical
data scenario. Our experiment takes into account the evolution of training and
pool samples, different costs per object, and two different sources of budget.
Results show that traditional active learning strategies significantly
outperform random sampling. Nevertheless, more complex batch strategies are not
able to significantly overcome simple uncertainty sampling techniques. Our
findings illustrate three important points: 1) active learning strategies are a
powerful tool to optimize the label-acquisition task in astronomy, 2) for
upcoming large surveys like LSST, such techniques allow us to tailor the
construction of the training sample for the first day of the survey, and 3) the
peculiar data environment related to the detection of astronomical transients
is a fertile ground that calls for the development of tailored machine learning
algorithms.
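The uncertainty-sampling baseline that the more complex batch strategies fail to beat can be sketched as a simple loop: train on the current labeled sample, score each pool object by classifier uncertainty, and spend the follow-up budget on the most uncertain object that is still affordable. The sketch below is illustrative only, assuming a scikit-learn random forest on toy Gaussian data with simulated per-object costs; it is not the RESSPECT implementation, and all names are made up for the example.

```python
# Cost-aware uncertainty sampling: a minimal sketch, NOT the RESSPECT code.
# Toy data stands in for transient features; costs stand in for the varying
# price of spectroscopic follow-up per object.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Two toy classes of "transients" with 4 features each.
X_pool = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
                    rng.normal(1.5, 1.0, (200, 4))])
y_pool = np.array([0] * 200 + [1] * 200)
costs = rng.uniform(1.0, 5.0, len(X_pool))  # simulated labeling cost per object

# Seed training set: a few labeled objects from each class.
train_idx = [0, 1, 200, 201]
pool_idx = [i for i in range(len(X_pool)) if i not in train_idx]

budget = 30.0  # total follow-up budget to spend
clf = RandomForestClassifier(n_estimators=50, random_state=0)

while budget > 0 and pool_idx:
    clf.fit(X_pool[train_idx], y_pool[train_idx])
    proba = clf.predict_proba(X_pool[pool_idx])
    # Uncertainty = 1 - max class probability for each pool object.
    uncertainty = 1.0 - proba.max(axis=1)
    # Only objects we can still afford are candidates.
    affordable = [k for k in range(len(pool_idx)) if costs[pool_idx[k]] <= budget]
    if not affordable:
        break
    k = max(affordable, key=lambda k: uncertainty[k])
    chosen = pool_idx.pop(k)
    budget -= costs[chosen]   # spend budget on the follow-up observation
    train_idx.append(chosen)  # label acquired; object moves to training set

print(f"acquired {len(train_idx) - 4} labels, remaining budget {budget:.2f}")
```

In a realistic setting the pool and training samples would also evolve between iterations as new alerts arrive, and the budget would come from more than one source, as in the experiment described above; the loop structure, however, stays the same.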
Related papers
- Enhancing Multivariate Time Series-based Solar Flare Prediction with Multifaceted Preprocessing and Contrastive Learning
  Accurate solar flare prediction is crucial due to the significant risks that intense solar flares pose to astronauts, space equipment, and satellite communication systems.
  Our research enhances solar flare prediction by utilizing advanced data preprocessing and classification methods.
  arXiv Detail & Related papers (2024-09-21T05:00:34Z)
- Semi-Supervised One-Shot Imitation Learning
  One-shot Imitation Learning (OSIL) aims to imbue AI agents with the ability to learn a new task from a single demonstration.
  We introduce the semi-supervised OSIL problem setting, where the learning agent is presented with a large dataset of trajectories.
  We develop an algorithm specifically applicable to this semi-supervised OSIL setting.
  arXiv Detail & Related papers (2024-08-09T18:11:26Z)
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization
  A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
  Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
  We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
  arXiv Detail & Related papers (2024-02-22T04:10:57Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models
  Supervised finetuning on instruction datasets has played a crucial role in achieving remarkable zero-shot generalization capabilities.
  Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
  We propose using experimental design to circumvent the computational bottlenecks of active learning.
  arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- deep-REMAP: Parameterization of Stellar Spectra Using Regularized Multi-Task Learning
  Deep-Regularized Ensemble-based Multi-task Learning with Asymmetric Loss for Probabilistic Inference (deep-REMAP).
  We develop a novel framework that utilizes the rich synthetic spectra from the PHOENIX library and observational data from the MARVELS survey to accurately predict stellar atmospheric parameters.
  arXiv Detail & Related papers (2023-11-07T05:41:48Z)
- GenCo: An Auxiliary Generator from Contrastive Learning for Enhanced Few-Shot Learning in Remote Sensing
  We introduce a generator-based contrastive learning framework (GenCo) that pre-trains backbones and simultaneously explores variants of feature samples.
  In fine-tuning, the auxiliary generator can be used to enrich limited labeled data samples in feature space.
  We demonstrate the effectiveness of our method in improving few-shot learning performance on two key remote sensing datasets.
  arXiv Detail & Related papers (2023-07-27T03:59:19Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
  Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
  We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
  Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
  arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Sampling Through the Lens of Sequential Decision Making
  We propose a reward-guided sampling strategy called Adaptive Sample with Reward (ASR).
  Our approach adaptively adjusts the sampling process to achieve optimal performance.
  Empirical results in information retrieval and clustering demonstrate ASR's superb performance across different datasets.
  arXiv Detail & Related papers (2022-08-17T04:01:29Z)
- Improving Astronomical Time-series Classification via Data Augmentation with Generative Adversarial Networks
  We propose a data augmentation methodology based on Generative Adversarial Networks (GANs) to generate a variety of synthetic light curves from variable stars.
  The classification accuracy of variable stars is improved significantly when training with synthetic data and testing with real data.
  arXiv Detail & Related papers (2022-05-13T16:39:54Z)
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data
  We present training objectives that use offline datasets to learn a factored transition model.
  Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
  To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
  arXiv Detail & Related papers (2021-10-27T21:05:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.