Estimating oil and gas recovery factors via machine learning:
Database-dependent accuracy and reliability
- URL: http://arxiv.org/abs/2210.12491v1
- Date: Sat, 22 Oct 2022 16:25:49 GMT
- Title: Estimating oil and gas recovery factors via machine learning:
Database-dependent accuracy and reliability
- Authors: Alireza Roustazadeh, Behzad Ghanbarian, Mohammad B. Shadmand, Vahid
Taslimitehrani, Larry W. Lake
- Abstract summary: A key reservoir property is hydrocarbon recovery factor (RF) whose accurate estimation would provide decisive insights to drilling and production strategies.
This study aims to estimate the hydrocarbon RF for exploration from various reservoir characteristics, such as porosity, permeability, pressure, and water saturation via the machine learning (ML) approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With recent advances in artificial intelligence, machine learning (ML)
approaches have become an attractive tool in petroleum engineering,
particularly for reservoir characterizations. A key reservoir property is
hydrocarbon recovery factor (RF) whose accurate estimation would provide
decisive insights to drilling and production strategies. Therefore, this study
aims to estimate the hydrocarbon RF for exploration from various reservoir
characteristics, such as porosity, permeability, pressure, and water saturation
via ML. We applied three regression-based models, including extreme
gradient boosting (XGBoost), support vector machine (SVM), and stepwise
multiple linear regression (MLR), and various combinations of three databases to
construct ML models and estimate the oil and/or gas RF. Using two databases and
the cross-validation method, we evaluated the performance of the ML models. In
each iteration, 90% and 10% of the data were used to train and test the models,
respectively. The third, independent database was then used to further assess the
constructed models. For both oil and gas RFs, we found that the XGBoost model
estimated the RF for the train and test datasets more accurately than the SVM
and MLR models. However, the performance of all the models was unsatisfactory
for the independent databases. Results demonstrated that the ML algorithms were
highly dependent on and sensitive to the databases on which they were
trained. Statistical tests revealed that this unsatisfactory performance arose
because the distributions of input features and target variables in the train
datasets were significantly different from those in the independent databases
(p-value < 0.05).
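The evaluation protocol in the abstract (a 90/10 train/test split in each cross-validation iteration, followed by a statistical comparison of the training and independent distributions) can be sketched as below. This is a minimal illustration, not the authors' code. The abstract does not name the statistical test, so the two-sample Kolmogorov-Smirnov test is an assumption here. The function names `kfold_indices` and `ks_two_sample` are hypothetical, and the porosity values are synthetic.

```python
import math
import random


def kfold_indices(n_rows, k=10, seed=0):
    """Yield (train, test) index lists; with k=10 each test fold is ~10% of rows."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [row for f, fold in enumerate(folds) if f != held_out for row in fold]
        yield train, test


def ks_two_sample(sample_a, sample_b):
    """Return (D, p): the two-sample KS statistic and an asymptotic p-value.

    Assumption: the KS test stands in for the unnamed "statistical tests".
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between empirical CDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    root = math.sqrt(n * m / (n + m))
    # Effective-sample-size scaling with a small-sample correction term.
    lam = (root + 0.12 + 0.11 / root) * d
    if lam < 0.2:
        # The Kolmogorov series converges slowly here; the p-value is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (t - 1) * math.exp(-2.0 * t * t * lam * lam) for t in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


# Synthetic stand-ins for one input feature (porosity) in two databases.
rng = random.Random(42)
train_porosity = [rng.gauss(0.20, 0.05) for _ in range(300)]
independent_porosity = [rng.gauss(0.30, 0.05) for _ in range(300)]

d_stat, p_value = ks_two_sample(train_porosity, independent_porosity)
shifted = p_value < 0.05  # same significance threshold as quoted in the abstract
```

A p-value below 0.05 for a feature indicates that the training and independent databases are unlikely to share that feature's distribution, which is the condition the abstract identifies as the cause of poor performance on the independent data.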
Related papers
- When More Data Hurts: Optimizing Data Coverage While Mitigating Diversity Induced Underfitting in an Ultra-Fast Machine-Learned Potential [0.0]
This study investigates how training data diversity affects the performance of machine-learned interatomic potentials (MLIPs).
We employ expert and autonomously generated data to create the training data and fit four force-field variants to subsets of the data.
Our findings reveal a critical balance in training data diversity: insufficient diversity hinders generalization, while excessive diversity can exceed the MLIP's learning capacity.
arXiv Detail & Related papers (2024-09-11T20:45:44Z)
- Retrosynthesis prediction enhanced by in-silico reaction data augmentation [66.5643280109899]
We present RetroWISE, a framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation.
On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models.
arXiv Detail & Related papers (2024-01-31T07:40:37Z)
- Machine Learning Data Suitability and Performance Testing Using Fault Injection Testing Framework [0.0]
This paper presents the Fault Injection for Undesirable Learning in input Data (FIUL-Data) testing framework.
Data mutators explore vulnerabilities of ML systems against the effects of different fault injections.
This paper evaluates the framework using data from analytical chemistry, comprising retention time measurements of anti-sense oligonucleotides.
arXiv Detail & Related papers (2023-09-20T12:58:35Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Prediction of SLAM ATE Using an Ensemble Learning Regression Model and 1-D Global Pooling of Data Characterization [3.4399698738841553]
We introduce a novel method for predicting SLAM localization error based on the characterization of raw sensor inputs.
The proposed method relies on using a random forest regression model trained on 1-D global pooled features that are generated from characterized raw sensor data.
The paper also studies the impact of 12 different 1-D global pooling functions on regression quality, and the superiority of 1-D global averaging is quantitatively proven.
arXiv Detail & Related papers (2023-03-01T16:12:47Z)
- Estimating oil recovery factor using machine learning: Applications of XGBoost classification [0.0]
In petroleum engineering, it is essential to determine the ultimate recovery factor, RF, particularly before exploitation and exploration.
We, therefore, applied machine learning (ML), using readily available features, to estimate oil RF for ten classes defined in this study.
arXiv Detail & Related papers (2022-10-28T18:21:25Z)
- Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS is able to reduce the inference time by up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
- Prediction of liquid fuel properties using machine learning models with Gaussian processes and probabilistic conditional generative learning [56.67751936864119]
The present work aims to construct cheap-to-compute machine learning (ML) models to act as closure equations for predicting the physical properties of alternative fuels.
Those models can be trained using the database from MD simulations and/or experimental measurements in a data-fusion-fidelity approach.
The results show that ML models can accurately predict the fuel properties over a wide range of pressure and temperature conditions.
arXiv Detail & Related papers (2021-10-18T14:43:50Z)
- Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation [97.42894942391575]
We propose FAST-DAD to distill arbitrarily complex ensemble predictors into individual models like boosted trees, random forests, and deep networks.
Our individual distilled models are over 10x faster and more accurate than ensemble predictors produced by AutoML tools like H2O/AutoSklearn.
arXiv Detail & Related papers (2020-06-25T09:57:47Z)