Exoplanet Detection Using Machine Learning Models Trained on Synthetic Light Curves
- URL: http://arxiv.org/abs/2507.19520v1
- Date: Fri, 18 Jul 2025 17:25:25 GMT
- Title: Exoplanet Detection Using Machine Learning Models Trained on Synthetic Light Curves
- Authors: Ethan Lo, Dan C. Lo
- Abstract summary: As of now there have been about 5,000 confirmed exoplanets since the late 1900s. Recent machine learning (ML) has proven to be extremely valuable and efficient in various fields. We report the results and potential benefits of various, well-known ML models in the discovery and validation of extrasolar planets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With manual searches, scientists and astronomers discover exoplanets slowly, because the process requires extensive, laborious inspection. In fact, only about 5,000 exoplanets have been confirmed since the late 1900s. Recently, machine learning (ML) has proven to be extremely valuable and efficient in various fields, capable of processing massive amounts of data and improving its accuracy by learning. Though ML models for discovering exoplanets already exist at large organizations (e.g. NASA), they largely depend on complex algorithms and supercomputers. To reduce such complexity, this paper reports the results and potential benefits of several well-known ML models in the discovery and validation of extrasolar planets. The models examined in this study are logistic regression, k-nearest neighbors, and random forest. The dataset on which the models train and predict comes from NASA's Kepler space telescope. The initial results show promising scores for each model. However, potential biases and dataset imbalances necessitate data augmentation techniques to ensure fairer predictions and improved generalization. This study concludes that, in the context of searching for exoplanets, data augmentation techniques significantly improve recall and precision, while the effect on accuracy varies by model.
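The abstract's distinction between accuracy on one hand and recall and precision on the other can be illustrated with a small sketch. The labels below are hypothetical toy data (5 planets among 100 light curves), not the paper's Kepler dataset or its results, and the helper names `confusion_counts` and `scores` are introduced here only for illustration. The sketch shows why accuracy alone can look strong on an imbalanced dataset even when no planets are found.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means 'planet'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    """Compute accuracy, precision, and recall from binary predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted positive
    # or when there are no true positives in the labels.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical imbalanced labels: 5 planets, 95 non-planets.
y_true = [1] * 5 + [0] * 95

# A trivial classifier that never predicts a planet.
always_negative = [0] * 100

# A hypothetical detector: finds 4 of 5 planets, raises 2 false alarms.
detector = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

baseline = scores(y_true, always_negative)
model = scores(y_true, detector)

print("always negative:", baseline)
print("detector:       ", model)
```

In this toy setup the always-negative classifier is correct on 95 of 100 light curves yet recovers no planets, while the detector differs only slightly in accuracy but substantially in recall and precision. This is one reason studies on imbalanced data report all three metrics.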
Related papers
- Predictable Scale: Part II, Farseer: A Refined Scaling Law in Large Language Models [62.3458061002951]
We introduce Farseer, a novel and refined scaling law offering enhanced predictive accuracy across scales. By systematically constructing a model loss surface $L(N,D)$, Farseer achieves a significantly better fit to empirical data than prior laws. Our methodology yields accurate, robust, and highly generalizable predictions, demonstrating excellent extrapolation capabilities.
arXiv Detail & Related papers (2025-06-12T17:59:23Z)
- Towards LLM Agents for Earth Observation [49.92444022073444]
We introduce datasetnamenospace, a benchmark of 140 yes/no questions from NASA Earth Observatory articles across 13 topics and 17 satellite sensors. Using Google Earth Engine API as a tool, LLM agents can only achieve an accuracy of 33% because the code fails to run over 58% of the time. We improve the failure rate for open models by fine-tuning synthetic data, allowing much smaller models to achieve comparable accuracy to much larger ones.
arXiv Detail & Related papers (2025-04-16T14:19:25Z)
- Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery [64.41455104593304]
Reinforcement Learning from Human Feedback (RLHF) can improve machine learning models and align them with human preferences.
We propose to adapt similar RL-based methods to unsupervised object discovery.
We demonstrate that our approach is not only more accurate, but also orders of magnitude faster to train.
arXiv Detail & Related papers (2023-10-29T17:03:12Z)
- A Comparative Study on Generative Models for High Resolution Solar Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states.
Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high quality samples indistinguishable to human experts.
arXiv Detail & Related papers (2023-04-14T14:40:32Z)
- Predictive World Models from Real-World Partial Observations [66.80340484148931]
We present a framework for learning a probabilistic predictive world model for real-world road environments.
While prior methods require complete states as ground truth for learning, we present a novel sequential training method to allow HVAEs to learn to predict complete states from partially observed states only.
arXiv Detail & Related papers (2023-01-12T02:07:26Z)
- Exoplanet Detection by Machine Learning with Data Augmentation [0.0]
Deep learning has potential to automate parts of the exoplanet detection pipeline.
The small size of available datasets makes it difficult to realize the level of performance one expects from powerful network architectures.
We demonstrate that data augmentation has the potential to improve model performance for the exoplanet detection problem.
arXiv Detail & Related papers (2022-11-28T17:35:16Z)
- Constructing Effective Machine Learning Models for the Sciences: A Multidisciplinary Perspective [77.53142165205281]
We show how flexible non-linear solutions will not always improve upon manually adding transforms and interactions between variables to linear regression models.
We discuss how to recognize this before constructing a data-driven model and how such analysis can help us move to intrinsically interpretable regression models.
arXiv Detail & Related papers (2022-11-21T17:48:44Z)
- ExoSGAN and ExoACGAN: Exoplanet Detection using Adversarial Training Algorithms [0.0]
We use two variations of generative adversarial networks to detect transiting exoplanets in K2 data.
Our techniques are able to categorize the light curves with a recall and precision of 1.00 on the test data.
arXiv Detail & Related papers (2022-07-20T05:45:36Z)
- Exoplanet atmosphere evolution: emulation with random forests [0.0]
Atmospheric mass-loss plays a leading role in sculpting the demographics of small, close-in exoplanets.
We implement random forests trained on atmospheric evolution models to predict a given planet's final radius and atmospheric mass.
Our new approach opens the door to highly sophisticated models of atmospheric evolution being used in demographic analysis.
arXiv Detail & Related papers (2021-10-28T14:39:19Z)
- Automated identification of transiting exoplanet candidates in NASA Transiting Exoplanets Survey Satellite (TESS) data with machine learning methods [1.9491825010518622]
The AI/ML ThetaRay system is trained initially with Kepler exoplanetary data and validated with confirmed exoplanets.
By the application of ThetaRay to 10,803 light curves of threshold crossing events (TCEs) produced by the TESS mission, we uncover 39 new exoplanetary candidates.
arXiv Detail & Related papers (2021-02-20T12:28:39Z)
- Exoplanet Detection using Machine Learning [0.0]
We introduce a new machine learning based technique to detect exoplanets using the transit method.
For Kepler data, the method is able to predict a planet with an AUC of 0.948, so that 94.8 per cent of the true planet signals are ranked higher than non-planet signals.
For the Transiting Exoplanet Survey Satellite (TESS) data, we found our method can classify light curves with an accuracy of 0.98, and is able to identify planets with a recall of 0.82 at a precision of 0.63.
arXiv Detail & Related papers (2020-11-28T14:06:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.