A machine learning approach to galaxy properties: joint redshift-stellar
mass probability distributions with Random Forest
- URL: http://arxiv.org/abs/2012.05928v2
- Date: Fri, 19 Feb 2021 10:38:57 GMT
- Title: A machine learning approach to galaxy properties: joint redshift-stellar
mass probability distributions with Random Forest
- Authors: S. Mucesh, W. G. Hartley, A. Palmese, O. Lahav, L. Whiteway, A. F. L.
Bluck, A. Alarcon, A. Amon, K. Bechtol, G. M. Bernstein, A. Carnero Rosell,
M. Carrasco Kind, A. Choi, K. Eckert, S. Everett, D. Gruen, R. A. Gruendl, I.
Harrison, E. M. Huff, N. Kuropatkin, I. Sevilla-Noarbe, E. Sheldon, B. Yanny,
M. Aguena, S. Allam, D. Bacon, E. Bertin, S. Bhargava, D. Brooks, J.
Carretero, F. J. Castander, C. Conselice, M. Costanzi, M. Crocce, L. N. da
Costa, M. E. S. Pereira, J. De Vicente, S. Desai, H. T. Diehl, A.
Drlica-Wagner, A. E. Evrard, I. Ferrero, B. Flaugher, P. Fosalba, J. Frieman,
J. García-Bellido, E. Gaztanaga, D. W. Gerdes, J. Gschwend, G. Gutierrez,
S. R. Hinton, D. L. Hollowood, K. Honscheid, D. J. James, K. Kuehn, M. Lima,
H. Lin, M. A. G. Maia, P. Melchior, F. Menanteau, R. Miquel, R. Morgan, F.
Paz-Chinchón, A. A. Plazas, E. Sanchez, V. Scarpine, M. Schubnell, S.
Serrano, M. Smith, E. Suchyta, G. Tarle, D. Thomas, C. To, T. N. Varga, and
R.D. Wilkinson
- Abstract summary: We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning algorithm.
We use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses.
In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We demonstrate that highly accurate joint redshift-stellar mass probability
distribution functions (PDFs) can be obtained using the Random Forest (RF)
machine learning (ML) algorithm, even with few photometric bands available. As
an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015
catalogue for redshifts and stellar masses. We build two ML models: one
containing deep photometry in the $griz$ bands, and the second reflecting the
photometric scatter present in the main DES survey, with carefully constructed
representative training data in each case. We validate our joint PDFs for
$10,699$ test galaxies by utilizing the copula probability integral transform
and the Kendall distribution function, and their univariate counterparts to
validate the marginals. Benchmarked against a basic set-up of the
template-fitting code BAGPIPES, our ML-based method outperforms template
fitting on all of our predefined performance metrics. In addition to accuracy,
the RF is extremely fast, able to compute joint PDFs for a million galaxies in
just under $6$ min with consumer computer hardware. Such speed enables PDFs to
be derived in real time within analysis codes, solving potential storage
issues. As part of this work we have developed GALPRO, a highly intuitive and
efficient Python package to rapidly generate multivariate PDFs on-the-fly.
GALPRO is documented and available for researchers to use in their cosmology
and galaxy evolution studies.
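The core idea behind RF-based joint PDFs can be sketched with scikit-learn: pool the training targets that fall in the same leaf as a query galaxy across all trees, and treat the pooled draws as samples from the joint redshift-stellar mass PDF. This is a hedged toy illustration, not the GALPRO implementation; the synthetic data and the `joint_pdf_samples` helper are invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Toy "photometry" (4 bands, standing in for griz) and two targets:
# column 0 ~ redshift, column 1 ~ log stellar mass.
X = rng.normal(size=(2000, 4))
y = np.column_stack([
    1.0 + 0.5 * X[:, 0] + 0.1 * rng.standard_normal(2000),
    10.0 + X[:, 1] + 0.2 * rng.standard_normal(2000),
])

# Multi-output RF: each leaf stores training galaxies with similar colours.
rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0)
rf.fit(X, y)

def joint_pdf_samples(rf, X_train, y_train, x_query):
    """Pool the training targets sharing a leaf with the query in each tree;
    the pooled draws approximate the joint (z, log M*) PDF for the query."""
    leaves_query = rf.apply(x_query.reshape(1, -1))[0]  # leaf index per tree
    leaves_train = rf.apply(X_train)                    # (n_train, n_trees)
    samples = [y_train[leaves_train[:, t] == leaves_query[t]]
               for t in range(leaves_train.shape[1])]
    return np.vstack(samples)

samples = joint_pdf_samples(rf, X, y, X[0])
print(samples.shape)  # (n_pooled_draws, 2)
```

Because `apply` is a single vectorized pass over the forest, generating PDFs this way amortizes well over large catalogues, which is consistent with the speed the abstract reports.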
Related papers
- Automatic Machine Learning Framework to Study Morphological Parameters of AGN Host Galaxies within $z < 1.4$ in the Hyper Suprime-Cam Wide Survey [4.6218496439194805]
We present a machine learning framework to estimate posterior distributions of bulge-to-total light ratio, half-light radius, and flux for AGN host galaxies.
We use PSFGAN to decompose the AGN point source light from its host galaxy, and invoke the Galaxy Morphology Posterior Estimation Network (GaMPEN) to estimate morphological parameters.
Our framework runs at least three orders of magnitude faster than traditional light-profile fitting methods.
arXiv Detail & Related papers (2025-01-27T03:04:34Z) - PICZL: Image-based Photometric Redshifts for AGN [1.5194351731792657]
We introduce PICZL, a machine-learning algorithm leveraging an ensemble of CNNs.
PICZL integrates distinct SED features from images with those obtained from catalog-level data.
On a validation sample of 8098 AGN, PICZL achieves a variance $\sigma_{\textrm{NMAD}}$ of 4.5% with an outlier fraction $\eta$ of 5.6%.
arXiv Detail & Related papers (2024-11-11T19:01:08Z) - PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling [63.93112754821312]
Multimodal document understanding is a challenging task to process and comprehend large amounts of textual and visual information.
Recent advances in Large Language Models (LLMs) have significantly improved the performance of this task.
We introduce PDF-WuKong, a multimodal large language model (MLLM) which is designed to enhance multimodal question-answering (QA) for long PDF documents.
arXiv Detail & Related papers (2024-10-08T12:17:42Z) - Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency [48.81897136561015]
Minimal Frame Averaging (MFA) is a mathematical framework for constructing provably minimal frames that are exactly equivariant.
Results demonstrate the efficiency and effectiveness of encoding symmetries via MFA across a diverse range of tasks.
arXiv Detail & Related papers (2024-06-11T15:58:56Z) - Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control [66.78146440275093]
Learned sparse retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors.
We explore the application of LSR to the multi-modal domain, with a focus on text-image retrieval.
Current approaches like LexLIP and STAIR require complex multi-step training on massive datasets.
Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
arXiv Detail & Related papers (2024-02-27T14:21:56Z) - ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models [69.50316788263433]
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained vision-language models.
We quantify the calibration of embedding uncertainties in retrieval tasks and show that ProbVLM outperforms other methods.
We present a novel technique for visualizing the embedding distributions using a large-scale pre-trained latent diffusion model.
arXiv Detail & Related papers (2023-07-01T18:16:06Z) - AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly
Estimating Complex SO(3) Distributions [0.6526824510982799]
AQuaMaM is a neural network capable of both learning complex distributions on the rotation manifold and calculating exact likelihoods for query rotations in a single forward pass.
When trained on a constructed dataset of 500,000 renders of a die in different rotations, AQuaMaM reaches a test log-likelihood 14% higher than IPDF.
Compared to IPDF, AQuaMaM uses 24% fewer parameters, has a prediction throughput 52$\times$ faster on a single GPU, and converges in a similar amount of time during training.
arXiv Detail & Related papers (2023-01-21T00:40:21Z) - Generalized Differentiable RANSAC [95.95627475224231]
$\nabla$-RANSAC is a differentiable RANSAC that allows learning the entire randomized robust estimation pipeline.
$\nabla$-RANSAC is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives.
arXiv Detail & Related papers (2022-12-26T15:13:13Z) - Uncertainty Inspired RGB-D Saliency Detection [70.50583438784571]
We propose the first framework to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a generative architecture to achieve probabilistic RGB-D saliency detection.
Results on six challenging RGB-D benchmark datasets show our approach's superior performance in learning the distribution of saliency maps.
arXiv Detail & Related papers (2020-09-07T13:01:45Z) - Gravitational-wave parameter estimation with autoregressive neural
network flows [0.0]
We introduce the use of autoregressive normalizing flows for rapid likelihood-free inference of binary black hole system parameters from gravitational-wave data with deep neural networks.
A normalizing flow is an invertible mapping on a sample space that can be used to induce a transformation from a simple probability distribution to a more complex one.
We build a more powerful latent variable model by incorporating autoregressive flows within the variational autoencoder framework.
arXiv Detail & Related papers (2020-02-18T15:44:04Z)
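For reference, the two headline photometric-redshift metrics quoted for PICZL above can be computed as follows. This is a minimal sketch: the 1.4826 factor makes NMAD consistent with a Gaussian sigma, and the 0.15 outlier threshold is a common convention that may differ from the paper's exact definition.

```python
import numpy as np

def sigma_nmad(z_true, z_pred):
    """Normalized median absolute deviation of the scaled residuals,
    a robust scatter metric for photometric redshifts."""
    dz = (z_pred - z_true) / (1 + z_true)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))

def outlier_fraction(z_true, z_pred, threshold=0.15):
    """Fraction of objects with |z_pred - z_true| / (1 + z_true) above
    a threshold (0.15 is a common choice in the literature)."""
    dz = np.abs(z_pred - z_true) / (1 + z_true)
    return np.mean(dz > threshold)

# Synthetic demo: predictions with ~3% scatter in (1+z) units.
rng = np.random.default_rng(1)
z_true = rng.uniform(0.1, 3.0, 5000)
z_pred = z_true + 0.03 * (1 + z_true) * rng.standard_normal(5000)
s = sigma_nmad(z_true, z_pred)
eta = outlier_fraction(z_true, z_pred)
print(round(s, 3), round(eta, 4))
```

With purely Gaussian 3% scatter, σ_NMAD recovers roughly the input scatter and the outlier fraction is near zero; real catalogues typically show heavier tails, which is exactly what η is designed to capture.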
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.