Interpretability and causal discovery of the machine learning models to
predict the production of CBM wells after hydraulic fracturing
- URL: http://arxiv.org/abs/2212.10718v1
- Date: Wed, 21 Dec 2022 02:06:26 GMT
- Title: Interpretability and causal discovery of the machine learning models to
predict the production of CBM wells after hydraulic fracturing
- Authors: Chao Min, Guoquan Wen, Liangjie Gou, Xiaogang Li, Zhaozhong Yang
- Abstract summary: A novel methodology is proposed to discover the latent causality from observed data.
Based on the theory of causal discovery, a causal graph is derived with explicit input, output, treatment and confounding variables.
SHAP is employed to analyze the influence of the factors on the production capability, which indirectly interprets the machine learning models.
- Score: 0.5512295869673146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning approaches are widely studied for predicting the production
of coalbed methane (CBM) wells after hydraulic fracturing, but they are rarely used in
practice because of their low generalization ability and lack of interpretability. This
article proposes a novel methodology for discovering latent causality from
observed data, with the aim of finding an indirect way to interpret
machine learning results. Based on the theory of causal discovery, a causal
graph is derived with explicit input, output, treatment and confounding
variables. Then, SHAP is employed to analyze the influence of the factors on
production capability, which indirectly interprets the machine learning
models. The proposed method can capture the underlying nonlinear relationship
between the factors and the output, which remedies a limitation of
traditional machine learning routines based on correlation analysis of
factors. Experiments on CBM data show that the relationship detected by the
presented method between production and the geological/engineering factors
is consistent with the actual physical mechanism. Moreover, compared
with traditional methods, the interpretable machine learning models perform better
in forecasting production capability, with an average accuracy improvement of 20%.
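As a rough illustration of the attribution idea behind SHAP mentioned in the abstract, the sketch below computes exact Shapley values for a single sample by enumerating feature subsets. It is not the authors' pipeline and does not use the `shap` library: the model `toy_production`, the feature names (proppant, fluid, thickness), the sample values, and the all-zero baseline are hypothetical and chosen only to show how a nonlinear interaction term is shared between the factors involved.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample x.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch; SHAP offers several such conventions).
    """
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for s in combinations(others, k):
                # Shapley weight for a coalition of size k that excludes feature i.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (value(set(s) | {i}) - value(set(s)))
    return phi


def toy_production(z):
    # Hypothetical model: one linear term plus one nonlinear interaction.
    proppant, fluid, thickness = z
    return 2.0 * proppant + fluid * thickness


x = [3.0, 2.0, 4.0]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_production, x, baseline)
print(phi)
```

In this toy case the attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term `fluid * thickness` is split equally between its two factors. Exhaustive enumeration is only practical for a handful of features, which is why libraries rely on approximations or model-specific algorithms.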
Related papers
- Revisiting Spurious Correlation in Domain Generalization [12.745076668687748]
We build a structural causal model (SCM) to describe the causality within data generation process.
We further conduct a thorough analysis of the mechanisms underlying spurious correlation.
In this regard, we propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator.
arXiv Detail & Related papers (2024-06-17T13:22:00Z)
- CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces the Cognitive Diffusion Probabilistic Models (CogDPM).
CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models and weights the guidance with precision weights estimated from the inherent properties of diffusion models.
We apply CogDPM to real-world prediction tasks using the United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z)
- Replication Study: Enhancing Hydrological Modeling with Physics-Guided Machine Learning [0.0]
Current hydrological modeling methods combine data-driven Machine Learning algorithms and traditional physics-based models.
Despite the accuracy of ML in outcome prediction, the integration of scientific knowledge is crucial for reliable predictions.
This study introduces a Physics Informed Machine Learning model, which merges the process understanding of conceptual hydrological models with the predictive efficiency of ML algorithms.
arXiv Detail & Related papers (2024-02-21T16:26:59Z)
- SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling [3.988614978933934]
Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions using observational data.
Path models, Structural Equation Models (SEMs) and Directed Acyclic Graphs (DAGs) provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon.
We propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles.
arXiv Detail & Related papers (2023-08-08T16:04:42Z)
- Neuro-Causal Factor Analysis [18.176375611711396]
We introduce a framework for Neuro-Causal Factor Analysis (NCFA).
NCFA identifies factors via latent causal discovery methods and then uses a variational autoencoder (VAE)
We evaluate NCFA on real and synthetic data sets, finding that it performs comparably to standard VAEs on data reconstruction tasks.
arXiv Detail & Related papers (2023-05-31T12:41:20Z)
- Robust Output Analysis with Monte-Carlo Methodology [0.0]
In predictive modeling with simulation or machine learning, it is critical to accurately assess the quality of estimated values.
We propose a unified output analysis framework for simulation and machine learning outputs through the lens of Monte Carlo sampling.
arXiv Detail & Related papers (2022-07-27T16:21:59Z)
- How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder (AE) based models.
We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlation during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- Predictive modeling approaches in laser-based material processing [59.04160452043105]
This study aims to automate and forecast the effect of laser processing on material structures.
The focus is centred on the performance of representative statistical and machine learning algorithms.
Results can set the basis for a systematic methodology towards reducing material design, testing and production cost.
arXiv Detail & Related papers (2020-06-13T17:28:52Z)