Related papers: Using causal inference to avoid fallouts in data-driven parametric analysis: a case study in the architecture, engineering, and construction industry

URL: http://arxiv.org/abs/2309.11509v1
Date: Mon, 11 Sep 2023 13:54:58 GMT
Title: Using causal inference to avoid fallouts in data-driven parametric analysis: a case study in the architecture, engineering, and construction industry
Authors: Xia Chen, Ruiji Sun, Ueli Saluz, Stefano Schiavon, Philipp Geyer
Abstract summary: The decision-making process in real-world implementations has been affected by a growing reliance on data-driven models. We investigated the synergetic pattern between the data-driven methods, empirical domain knowledge, and first-principles simulations.
Score: 0.7566148383213173
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The decision-making process in real-world implementations has been affected by a growing reliance on data-driven models. We investigated the synergetic pattern between the data-driven methods, empirical domain knowledge, and first-principles simulations. We showed the potential risk of biased results when using data-driven models without causal analysis. Using a case study assessing the implication of several design solutions on the energy consumption of a building, we proved the necessity of causal analysis during the data-driven modeling process. We concluded that: (a) Data-driven models' accuracy assessment or domain knowledge screening may not rule out biased and spurious results; (b) Data-driven models' feature selection should involve careful consideration of causal relationships, especially colliders; (c) Causal analysis results can be used as an aid to first-principles simulation design and parameter checking to avoid cognitive biases. We proved the benefits of causal analysis when applied to data-driven models in building engineering.
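Conclusion (b) of the abstract, on colliders in feature selection, can be illustrated with a small synthetic sketch. The variables below are hypothetical and are not taken from the paper's building case study: x stands for a design parameter, y for an outcome with a true effect of 2.0 per unit of x, and c for a collider caused by both. Under these assumptions, adding c as a feature distorts the estimated effect of x.

```python
import numpy as np

def ols_coefs(features, target):
    """Least-squares coefficients (intercept first) of target on features."""
    design = np.column_stack([np.ones(len(target))] + list(features))
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (an assumption for illustration only).
x = rng.normal(size=n)                      # design parameter
y = 2.0 * x + rng.normal(size=n)            # outcome; true effect of x is 2.0
c = x + y + rng.normal(scale=0.5, size=n)   # collider: caused by both x and y

# Regressing y on x alone recovers the true effect.
effect_without_collider = ols_coefs([x], y)[1]

# Adding the collider as a feature biases the coefficient on x.
effect_with_collider = ols_coefs([x, c], y)[1]

print(f"effect of x without collider: {effect_without_collider:.2f}")
print(f"effect of x with collider:    {effect_with_collider:.2f}")
```

In this setup the first estimate lands close to 2.0, while the second is pulled to around -0.4, so even the sign of the effect flips. The model that includes the collider also fits y more closely, which is consistent with conclusion (a) that accuracy assessment alone may not rule out biased results.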

Related papers

Do-PFN: In-Context Learning for Causal Effect Estimation [75.62771416172109]
We show that Prior-data fitted networks (PFNs) can be pre-trained on synthetic data to predict outcomes. Our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph.
arXiv Detail & Related papers (2025-06-06T12:43:57Z)
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops [55.07063067759609]
High-quality data is essential for training large generative models, yet the vast reservoir of real data available online has become nearly depleted. Models increasingly generate their own data for further training, forming Self-consuming Training Loops (STLs). Some models degrade or even collapse, while others successfully avoid these failures, leaving a significant gap in theoretical understanding.
arXiv Detail & Related papers (2025-02-26T06:18:13Z)
Comparing analytic and data-driven approaches to parameter identifiability: A power systems case study [41.94295877935867]
We report on a study comparing and contrasting analytical and data-driven approaches to quantify parameter identifiability. We use the infinite bus synchronous generator model, a well-understood model from the power systems domain, as our benchmark problem. We compare these results to those arrived at through data-driven manifold learning schemes: Output Diffusion - Maps and Geometric Harmonics.
arXiv Detail & Related papers (2024-12-24T19:33:12Z)
Tests for model misspecification in simulation-based inference: from local distortions to global model checks [2.0209172586699173]
We provide a solid and flexible foundation for a wide range of model discrepancy analysis tasks. We make explicit analytic connections to classical techniques: anomaly detection, model validation, and goodness-of-fit residual analysis. We show how to conduct such a distortion-driven model misspecification test for real gravitational wave data, specifically on the event GW150914.
arXiv Detail & Related papers (2024-12-19T17:48:03Z)
CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series [4.008958683836471]
CAnDOIT is a causal discovery method to reconstruct causal models using both observational and interventional data. The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics. A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub.
arXiv Detail & Related papers (2024-10-03T13:57:08Z)
How much do we really know about Structure Learning from i.i.d. Data? Interpretable, multi-dimensional Performance Indicator for Causal Discovery [3.8443430569753025]
Causal discovery from observational data imposes strict identifiability assumptions on the formulation of structural equations utilized in the data generating process. Motivated by the lack of unified performance assessment framework, we introduce an interpretable, six-dimensional evaluation metric, i.e., distance to optimal solution (DOS). This is the first research to assess the performance of structure learning algorithms from seven different families on increasing percentage of non-identifiable, nonlinear causal patterns.
arXiv Detail & Related papers (2024-09-28T15:03:49Z)
Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables. We learn the causal Bayesian network and its confounding latent variables directly from the observational data. We show that this model completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z)
Revisiting Spurious Correlation in Domain Generalization [12.745076668687748]
We build a structural causal model (SCM) to describe the causality within data generation process. We further conduct a thorough analysis of the mechanisms underlying spurious correlation. In this regard, we propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator.
arXiv Detail & Related papers (2024-06-17T13:22:00Z)
SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling [3.988614978933934]
Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions using observational data. Path models, Structural Equation Models (SEMs) and Directed Acyclic Graphs (DAGs) provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon. We propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles.
arXiv Detail & Related papers (2023-08-08T16:04:42Z)
Causal Disentangled Variational Auto-Encoder for Preference Understanding in Recommendation [50.93536377097659]
This paper introduces the Causal Disentangled Variational Auto-Encoder (CaD-VAE), a novel approach for learning causal disentangled representations from interaction data in recommender systems. The approach utilizes structural causal models to generate causal representations that describe the causal relationship between latent factors.
arXiv Detail & Related papers (2023-04-17T00:10:56Z)
Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. We provide a language for describing how training data influences predictions, through a causal framework. Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test. We train a variational inference model to predict the causal structure from observational/interventional data. Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models. One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
Domain adaptation under structural causal models [2.627046865670577]
Domain adaptation (DA) arises when the source data used to train a model is different from the target data used to test the model. Recent advances in DA have mainly been application-driven. We propose a theoretical framework via structural causal models that enables analysis and comparison of the prediction performance of DA methods.
arXiv Detail & Related papers (2020-10-29T17:09:34Z)
How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance. We formulate a quality measure for the data set, which we refer to as $\rho$-gap. We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.