The causal structure of galactic astrophysics
- URL: http://arxiv.org/abs/2510.01112v2
- Date: Sat, 08 Nov 2025 12:15:38 GMT
- Title: The causal structure of galactic astrophysics
- Authors: Harry Desmond, Joseph Ramsey
- Abstract summary: Data-driven astrophysics currently relies on the detection and characterisation of correlations between objects' properties. This process fails to utilise information in the data that forms a crucial part of the theories' predictions. We propose to recover this information through causal discovery.
- Score: 0.0687531213383208
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-driven astrophysics currently relies on the detection and characterisation of correlations between objects' properties, which are then used to test physical theories that make predictions for them. This process fails to utilise information in the data that forms a crucial part of the theories' predictions, namely which variables are directly correlated (as opposed to accidentally correlated through others), the directions of these determinations, and the presence or absence of confounders that correlate variables in the dataset but are themselves absent from it. We propose to recover this information through causal discovery, a well-developed methodology for inferring the causal structure of datasets that is, however, almost entirely unknown to astrophysics. We develop a causal discovery algorithm suitable for large astrophysical datasets and illustrate it on $\sim 5\times10^5$ low-redshift galaxies from the NASA-Sloan Atlas, demonstrating its ability to distinguish physical mechanisms that are degenerate on the basis of correlations alone.
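The distinction the abstract draws between direct correlation and correlation "through others" can be illustrated with a toy example. The sketch below is not the authors' algorithm; it is a minimal illustration, under assumed linear-Gaussian toy data, of the conditional-independence test that constraint-based causal discovery builds on. The variable names `x`, `y`, `z` and the coefficient 0.8 are hypothetical choices made for this example only.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def partial_corr(a, b, c):
    """First-order partial correlation of a and b, controlling for c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Toy causal chain x -> y -> z (assumed purely for illustration).
rng = random.Random(0)
n = 20_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + rng.gauss(0, 1) for xi in x]
z = [0.8 * yi + rng.gauss(0, 1) for yi in y]

marginal = pearson(x, z)             # clearly non-zero: x and z are correlated
conditional = partial_corr(x, z, y)  # close to zero: the link runs through y

print(f"corr(x, z)     = {marginal:.3f}")
print(f"corr(x, z | y) = {conditional:.3f}")
```

In this toy chain, `x` and `z` are correlated overall, but the correlation vanishes once `y` is controlled for, which indicates that `x` and `z` are not directly linked. A test of this kind cannot by itself fix the direction of the links; orienting edges and detecting unobserved confounders require further assumptions or additional structure in the data.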
Related papers
- Coarsening Causal DAG Models [0.0]
We propose an efficient, provably consistent algorithm for learning abstract causal graphs from interventional data with unknown intervention targets. As proof of concept, we apply our algorithm on synthetic and real datasets with known ground truths.
arXiv Detail & Related papers (2026-01-15T15:56:20Z)
- Interpreting deep learning-based stellar mass estimation via causal analysis and mutual information decomposition [8.62611856496081]
Using data from the Sloan Digital Sky Survey (SDSS) and the Wide-field Infrared Survey Explorer (WISE), we obtained meaningful results that provide physical interpretations for image-based models. Our work demonstrates the gains from combining deep learning with interpretability techniques, and holds promise in promoting more data-driven astrophysical research.
arXiv Detail & Related papers (2025-09-28T14:17:25Z)
- Physics-Guided Dual Implicit Neural Representations for Source Separation [70.38762322922211]
We develop a self-supervised machine-learning approach for source separation using a dual implicit neural representation framework. Our method learns directly from the raw data by minimizing a reconstruction-based loss function. Our method offers a versatile framework for addressing source separation problems across diverse domains.
arXiv Detail & Related papers (2025-07-07T17:56:31Z)
- Sparse mixed linear modeling with anchor-based guidance for high-entropy alloy discovery [0.12499537119440242]
In this study, we focus on local data structures that emerge from the greedy search behavior inherent to experimental data acquisition. We develop an algorithm that simultaneously performs prediction and feature selection. Through a case study on high-entropy alloys, this study introduces a method that combines anchor-guided clustering and sparse linear modeling.
arXiv Detail & Related papers (2025-04-29T01:44:15Z)
- Causal Discovery from Data Assisted by Large Language Models [50.193740129296245]
It is essential to integrate experimental data with prior domain knowledge for knowledge-driven discovery. Here we demonstrate this approach by combining high-resolution scanning transmission electron microscopy (STEM) data with insights derived from large language models (LLMs). By fine-tuning ChatGPT on domain-specific literature, we construct adjacency matrices for Directed Acyclic Graphs (DAGs) that map the causal relationships between structural, chemical, and polarization degrees of freedom in Sm-doped BiFeO3 (SmBFO).
arXiv Detail & Related papers (2025-03-18T02:14:49Z)
- A method based on Generative Adversarial Networks for disentangling physical and chemical properties of stars in astronomical spectra [0.16385815610837165]
In this work, an encoder-decoder architecture has been designed, where adversarial training is used in the context of astrophysical spectral analysis.
A deep learning scheme is used with the aim of disentangling, in the latent space, the desired parameters from the rest of the information contained in the data.
To test the effectiveness of the method, synthetic astronomical data are used from the APOGEE and Gaia surveys.
arXiv Detail & Related papers (2024-11-08T20:45:09Z)
- Unsupervised Pairwise Causal Discovery on Heterogeneous Data using Mutual Information Measures [49.1574468325115]
Causal Discovery is a technique that tackles the challenge by analyzing the statistical properties of the constituent variables.
We question the current (possibly misleading) baseline results on the basis that they were obtained through supervised learning.
In consequence, we approach this problem in an unsupervised way, using robust Mutual Information measures.
arXiv Detail & Related papers (2024-08-01T09:11:08Z)
- Sample, estimate, aggregate: A recipe for causal discovery foundation models [28.116832159265964]
Causal discovery has the potential to uncover mechanistic insights from biological experiments. We propose a supervised model trained on large-scale, synthetic data to predict causal graphs. Our approach is enabled by the observation that typical errors in the outputs of a discovery algorithm remain comparable across datasets.
arXiv Detail & Related papers (2024-02-02T21:57:58Z)
- A Causal Framework for Decomposing Spurious Variations [68.12191782657437]
We develop tools for decomposing spurious variations in Markovian and Semi-Markovian models.
We prove the first results that allow a non-parametric decomposition of spurious effects.
The described approach has several applications, ranging from explainable and fair AI to questions in epidemiology and medicine.
arXiv Detail & Related papers (2023-06-08T09:40:28Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods which formalize this goal and provide estimation procedures for practical application.
We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z)
- Learning Causal Models Online [103.87959747047158]
Predictive models can rely on spurious correlations in the data for making predictions.
One solution for achieving strong generalization is to incorporate causal structures in the models.
We propose an online algorithm that continually detects and removes spurious features.
arXiv Detail & Related papers (2020-06-12T20:49:20Z)
- Causal Discovery from Incomplete Data: A Deep Learning Approach [21.289342482087267]
Imputated Causal Learning (ICL) is proposed to perform iterative missing data imputation and causal structure discovery.
We show that ICL can outperform state-of-the-art methods under different missing data mechanisms.
arXiv Detail & Related papers (2020-01-15T14:28:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.