Data Compression and Inference in Cosmology with Self-Supervised Machine
Learning
- URL: http://arxiv.org/abs/2308.09751v2
- Date: Thu, 14 Dec 2023 20:12:37 GMT
- Authors: Aizhan Akhmetzhanova, Siddharth Mishra-Sharma, Cora Dvorkin
- Abstract summary: We introduce a method that leverages the paradigm of self-supervised machine learning in a novel manner to construct representative summaries of massive datasets.
Deploying the method on hydrodynamical cosmological simulations, we show that it can deliver highly informative summaries.
Results indicate that self-supervised machine learning techniques offer a promising new approach for compression of cosmological data as well as its analysis.
- Score: 0.86325068644655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The influx of massive amounts of data from current and upcoming cosmological
surveys necessitates compression schemes that can efficiently summarize the
data with minimal loss of information. We introduce a method that leverages the
paradigm of self-supervised machine learning in a novel manner to construct
representative summaries of massive datasets using simulation-based
augmentations. Deploying the method on hydrodynamical cosmological simulations,
we show that it can deliver highly informative summaries, which can be used for
a variety of downstream tasks, including precise and accurate parameter
inference. We demonstrate how this paradigm can be used to construct summary
representations that are insensitive to prescribed systematic effects, such as
the influence of baryonic physics. Our results indicate that self-supervised
machine learning techniques offer a promising new approach for compression of
cosmological data as well as its analysis.
Related papers
- Discovering Interpretable Physical Models using Symbolic Regression and
Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- A spectrum of physics-informed Gaussian processes for regression in
engineering [0.0]
Despite the growing availability of sensing and data in general, we remain unable to fully characterise many in-service engineering systems and structures from a purely data-driven approach.
This paper pursues the combination of machine learning technology and physics-based reasoning to enhance our ability to make predictive models with limited data.
arXiv Detail & Related papers (2023-09-19T14:39:03Z)
- Learning minimal representations of stochastic processes with
variational autoencoders [52.99137594502433]
We introduce an unsupervised machine learning approach to determine the minimal set of parameters required to describe a process.
Our approach enables the autonomous discovery of unknown parameters describing processes.
arXiv Detail & Related papers (2023-07-21T14:25:06Z)
- Addressing computational challenges in physical system simulations with
machine learning [0.0]
We present a machine learning-based data generator framework tailored to aid researchers who utilize simulations to examine various physical systems or processes.
Our approach involves a two-step process: first, we train a supervised predictive model using a limited simulated dataset to predict simulation outcomes.
Subsequently, a reinforcement learning agent is trained to generate accurate, simulation-like data by leveraging the supervised model.
arXiv Detail & Related papers (2023-05-16T17:31:50Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
- Learning Transport Processes with Machine Intelligence [0.0]
We present a machine learning based approach to address the study of transport processes.
Our model is capable of learning latent representations of the transport process substantially closer to the ground truth than expected.
arXiv Detail & Related papers (2021-09-27T14:49:22Z)
- A Scaling Law for Synthetic-to-Real Transfer: A Measure of Pre-Training [52.93808218720784]
Synthetic-to-real transfer learning is a framework in which we pre-train models with synthetically generated images and ground-truth annotations for real tasks.
Although synthetic images overcome the data scarcity issue, it remains unclear how the fine-tuning performance scales with pre-trained models.
We observe a simple and general scaling law that consistently describes learning curves in various tasks, models, and complexities of synthesized pre-training data.
arXiv Detail & Related papers (2021-08-25T02:29:28Z)
- Using Data Assimilation to Train a Hybrid Forecast System that Combines
Machine-Learning and Knowledge-Based Components [52.77024349608834]
We consider the problem of data-assisted forecasting of chaotic dynamical systems when the available data is noisy partial measurements.
We show that by using partial measurements of the state of the dynamical system, we can train a machine learning model to improve predictions made by an imperfect knowledge-based model.
arXiv Detail & Related papers (2021-02-15T19:56:48Z)
- Using machine-learning modelling to understand macroscopic dynamics in a
system of coupled maps [0.0]
We consider as a case study the macroscopic motion emerging from a system of globally coupled maps.
We build a coarse-grained Markov process for the macroscopic dynamics both with a machine learning approach and with a direct numerical computation of the transition probability of the coarse-grained process.
We are able to infer important information about the effective dimension of the attractor, the persistence of memory effects and the multi-scale structure of the dynamics.
arXiv Detail & Related papers (2020-11-08T15:38:12Z)
- Simulation-based inference methods for particle physics [12.451050883955071]
We explain why the likelihood function of high-dimensional LHC data cannot be explicitly evaluated, why this matters for data analysis, and reframe what the field has traditionally done to circumvent this problem.
We then review new simulation-based inference methods that let us directly analyze high-dimensional data by combining machine learning techniques and information from the simulator.
arXiv Detail & Related papers (2020-10-13T14:55:28Z)