Using causal inference to avoid fallouts in data-driven parametric
analysis: a case study in the architecture, engineering, and construction
industry
- URL: http://arxiv.org/abs/2309.11509v1
- Date: Mon, 11 Sep 2023 13:54:58 GMT
- Authors: Xia Chen, Ruiji Sun, Ueli Saluz, Stefano Schiavon, Philipp Geyer
- Abstract summary: The decision-making process in real-world implementations has been affected by a growing reliance on data-driven models.
We investigated the synergetic pattern between the data-driven methods, empirical domain knowledge, and first-principles simulations.
- Score: 0.7566148383213173
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The decision-making process in real-world implementations has been affected
by a growing reliance on data-driven models. We investigated the synergetic
pattern between the data-driven methods, empirical domain knowledge, and
first-principles simulations. We showed the potential risk of biased results
when using data-driven models without causal analysis. Using a case study
assessing the implication of several design solutions on the energy consumption
of a building, we proved the necessity of causal analysis during the
data-driven modeling process. We concluded that: (a) Data-driven models'
accuracy assessment or domain knowledge screening may not rule out biased and
spurious results; (b) Data-driven models' feature selection should involve
careful consideration of causal relationships, especially colliders; (c) Causal
analysis results can be used as an aid to first-principles simulation design
and parameter checking to avoid cognitive biases. We proved the benefits of
causal analysis when applied to data-driven models in building engineering.
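Conclusion (b) can be illustrated with a small synthetic sketch. This is not the paper's case study or data: the variable names `design_feature`, `energy_use`, and `collider` are hypothetical, and the data are simulated so that the design feature has no effect at all on energy use. The collider is generated as a common effect of both. Under these assumptions, a regression that includes the collider as a feature reports a clearly non-zero "effect" of the design feature, while the regression without it recovers roughly zero.

```python
import numpy as np


def ols_coefficients(features, target):
    """Return least-squares coefficients of target on features, intercept first."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def simulate_collider_bias(n_samples=20_000, seed=0):
    """Estimate the design feature's coefficient with and without a collider.

    All variables are hypothetical and simulated, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # By construction the two variables are independent: the true effect is 0.
    design_feature = rng.normal(size=n_samples)
    energy_use = rng.normal(size=n_samples)
    # The collider is caused by both the design feature and the energy use.
    collider = design_feature + energy_use + rng.normal(size=n_samples)

    # Model 1: regress energy use on the design feature only.
    without_collider = ols_coefficients(design_feature, energy_use)[1]
    # Model 2: add the collider as a feature, which opens a spurious path.
    with_collider = ols_coefficients(
        np.column_stack([design_feature, collider]), energy_use
    )[1]
    return without_collider, with_collider


effect_without, effect_with = simulate_collider_bias()
print(f"coefficient without collider: {effect_without:+.3f}")
print(f"coefficient with collider:    {effect_with:+.3f}")
```

In this setup the second model also fits the data better than the first, because the collider carries information about energy use. That is consistent with conclusion (a): an accuracy check alone would favor the biased model.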
Related papers
- Revisiting Spurious Correlation in Domain Generalization [12.745076668687748]
We build a structural causal model (SCM) to describe the causality within data generation process.
We further conduct a thorough analysis of the mechanisms underlying spurious correlation.
In this regard, we propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator.
arXiv Detail & Related papers (2024-06-17T13:22:00Z)
- Discovering Interpretable Physical Models using Symbolic Regression and
Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- SLEM: Machine Learning for Path Modeling and Causal Inference with Super
Learner Equation Modeling [3.988614978933934]
Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions using observational data.
Path models, Structural Equation Models (SEMs) and Directed Acyclic Graphs (DAGs) provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon.
We propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles.
arXiv Detail & Related papers (2023-08-08T16:04:42Z)
- Causal Disentangled Variational Auto-Encoder for Preference
Understanding in Recommendation [50.93536377097659]
This paper introduces the Causal Disentangled Variational Auto-Encoder (CaD-VAE), a novel approach for learning causal disentangled representations from interaction data in recommender systems.
The approach utilizes structural causal models to generate causal representations that describe the causal relationship between latent factors.
arXiv Detail & Related papers (2023-04-17T00:10:56Z)
- Measuring Causal Effects of Data Statistics on Language Model's
'Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Domain adaptation under structural causal models [2.627046865670577]
Domain adaptation (DA) arises when the source data used to train a model is different from the target data used to test the model.
Recent advances in DA have mainly been application-driven.
We propose a theoretical framework via structural causal models that enables analysis and comparison of the prediction performance of DA methods.
arXiv Detail & Related papers (2020-10-29T17:09:34Z)
- Causal Inference with Deep Causal Graphs [0.0]
Parametric causal modelling techniques rarely provide functionality for counterfactual estimation.
Deep Causal Graphs is an abstract specification of the required functionality for a neural network to model causal distributions.
We demonstrate its expressive power in modelling complex interactions and showcase applications to machine learning explainability and fairness.
arXiv Detail & Related papers (2020-06-15T13:03:33Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)