Cycle-StarNet: Bridging the gap between theory and data by leveraging
large datasets
- URL: http://arxiv.org/abs/2007.03109v3
- Date: Sat, 14 Nov 2020 04:00:37 GMT
- Title: Cycle-StarNet: Bridging the gap between theory and data by leveraging
large datasets
- Authors: Teaghan O'Briain, Yuan-Sen Ting, Sébastien Fabbro, Kwang M. Yi, Kim
Venn, Spencer Bialek
- Abstract summary: Current automated methods for analyzing spectra are either (a) data-driven, which requires prior knowledge of stellar parameters and elemental abundances, or (b) based on theoretical synthetic models that are susceptible to the gap between theory and practice.
We present a hybrid generative domain adaptation method that turns simulated stellar spectra into realistic spectra by applying unsupervised learning to large spectroscopic surveys.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancements in stellar spectroscopy data acquisition have made it
necessary to accomplish similar improvements in efficient data analysis
techniques. Current automated methods for analyzing spectra are either (a)
data-driven, which requires prior knowledge of stellar parameters and elemental
abundances, or (b) based on theoretical synthetic models that are susceptible
to the gap between theory and practice. In this study, we present a hybrid
generative domain adaptation method that turns simulated stellar spectra into
realistic spectra by applying unsupervised learning to large spectroscopic
surveys. We apply our technique to the APOGEE H-band spectra at R=22,500 and
the Kurucz synthetic models. As a proof of concept, two case studies are
presented. The first is the calibration of synthetic data to become
consistent with observations. To accomplish this, synthetic models are morphed
into spectra that resemble observations, thereby reducing the gap between
theory and observations. Fitting the observed spectra shows an improved average
reduced $\chi_R^2$ from 1.97 to 1.22, along with a reduced mean residual from
0.16 to -0.01 in normalized flux. The second case study is the identification
of the elemental source of missing spectral lines in the synthetic modelling. A
mock dataset is used to show that absorption lines can be recovered when they
are absent in one of the domains. This method can be applied to other fields
that use large data sets and are currently limited by modelling accuracy. The
code used in this study is made publicly available on GitHub.
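As a rough illustration of the two fit statistics quoted in the abstract (reduced $\chi_R^2$ and mean residual in normalized flux), the following is a minimal sketch and not the authors' code. The function names, the residual sign convention (observed minus model), the per-pixel uncertainty array `sigma`, the `n_free_params` argument, and the toy flux values are all assumptions made for illustration.

```python
import numpy as np


def reduced_chi2(observed, model, sigma, n_free_params=0):
    """Reduced chi-squared between an observed and a model spectrum.

    Assumes independent per-pixel uncertainties `sigma` and that the
    degrees of freedom are (number of pixels - n_free_params).
    """
    observed = np.asarray(observed, dtype=float)
    model = np.asarray(model, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dof = observed.size - n_free_params
    if dof <= 0:
        raise ValueError("need more pixels than free parameters")
    chi2 = np.sum(((observed - model) / sigma) ** 2)
    return float(chi2 / dof)


def mean_residual(observed, model):
    """Mean residual in normalized flux (assumed sign: observed - model)."""
    observed = np.asarray(observed, dtype=float)
    model = np.asarray(model, dtype=float)
    return float(np.mean(observed - model))


# Toy normalized-flux values, invented purely for illustration.
toy_observed = [1.0, 0.9, 0.8, 1.0]
toy_model = [1.0, 1.0, 0.9, 0.9]
toy_sigma = [0.1, 0.1, 0.1, 0.1]

print(reduced_chi2(toy_observed, toy_model, toy_sigma))
print(mean_residual(toy_observed, toy_model))
```

A reduced $\chi_R^2$ close to 1 indicates residuals comparable to the quoted uncertainties, which is why a drop toward 1 is read as an improved fit.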
Related papers
- Graph Generation via Spectral Diffusion [51.60814773299899]
We present GRASP, a novel graph generative model based on 1) the spectral decomposition of the graph Laplacian matrix and 2) a diffusion process.
Specifically, we propose to use a denoising model to sample eigenvectors and eigenvalues from which we can reconstruct the graph Laplacian and adjacency matrix.
Our permutation invariant model can also handle node features by concatenating them to the eigenvectors of each node.
arXiv Detail & Related papers (2024-02-29T09:26:46Z)
- Data Augmentation Scheme for Raman Spectra with Highly Correlated
Annotations [0.23090185577016453]
We exploit the additive nature of spectra in order to generate additional data points from a given dataset that have statistically independent labels.
We show that training a CNN on these generated data points improves the performance on datasets where the annotations do not bear the same correlation as the dataset that was used for model training.
arXiv Detail & Related papers (2024-02-01T18:46:28Z)
- Hodge-Aware Contrastive Learning [101.56637264703058]
Simplicial complexes prove effective in modeling data with multiway dependencies.
We develop a contrastive self-supervised learning approach for processing simplicial data.
arXiv Detail & Related papers (2023-09-14T00:40:07Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- The Manifold Hypothesis for Gradient-Based Explanations [55.01671263121624]
Gradient-based explanation algorithms provide perceptually-aligned explanations.
We show that the more a feature attribution is aligned with the tangent space of the data, the more perceptually-aligned it tends to be.
We suggest that explanation algorithms should actively strive to align their explanations with the data manifold.
arXiv Detail & Related papers (2022-06-15T08:49:24Z)
- Identifying charge density and dielectric environment of graphene using
Raman spectroscopy and deep learning [0.0]
The impact of the environment on graphene's properties can be evaluated by Raman spectroscopy.
We develop a deep learning model to overcome the effects of such variations and classify graphene Raman spectra according to different charge densities and dielectric environments.
arXiv Detail & Related papers (2022-02-25T00:25:01Z)
- Simpler is better: spectral regularization and up-sampling techniques
for variational autoencoders [1.2234742322758418]
Characterization of the spectral behavior of generative models based on neural networks remains an open issue.
Recent research has focused heavily on generative adversarial networks and the high-frequency discrepancies between real and generated images.
We propose a simple 2D Fourier transform-based spectral regularization loss for Variational Autoencoders (VAEs).
arXiv Detail & Related papers (2022-01-19T11:49:57Z)
- Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet
Transmission Spectra [68.8204255655161]
We focus on unsupervised techniques for analyzing spectral data from transiting exoplanets.
We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations.
We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes.
arXiv Detail & Related papers (2022-01-07T22:26:33Z)
- Unsupervised Spectral Unmixing For Telluric Correction Using A Neural
Network Autoencoder [58.720142291102135]
We present a neural network autoencoder approach for extracting a telluric transmission spectrum from a large set of high-precision observed solar spectra from the HARPS-N radial velocity spectrograph.
arXiv Detail & Related papers (2021-11-17T12:54:48Z)
- Sibling Regression for Generalized Linear Models [22.16690904610619]
Field observations form the basis of many scientific studies, especially in ecological and social sciences.
Despite efforts to conduct such surveys in a standardized way, observations can be prone to systematic measurement errors.
Existing non-parametric techniques for correcting such errors assume linear additive noise models.
We present an approach based on residual functions to address this limitation.
arXiv Detail & Related papers (2021-07-03T04:07:11Z)
- Blind Source Separation for NMR Spectra with Negative Intensity [0.0]
We benchmark several blind source separation techniques for analysis of NMR spectral datasets containing negative intensity.
FastICA, SIMPLISMA, and NNMF are top-performing techniques.
The accuracy of FastICA and SIMPLISMA degrades quickly if excess (unreal) pure components are predicted.
arXiv Detail & Related papers (2020-02-07T20:57:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.