Machine learning surrogates for efficient hydrologic modeling: Insights from stochastic simulations of managed aquifer recharge
- URL: http://arxiv.org/abs/2407.20902v2
- Date: Sun, 09 Feb 2025 20:06:31 GMT
- Authors: Timothy Dai, Kate Maher, Zach Perzan
- Abstract summary: We assess a hybrid workflow in which a process-based model generates an initial set of simulations and a machine learning (ML) surrogate model performs the remainder.
We apply this workflow to simulations of variably saturated groundwater flow at a prospective managed aquifer recharge site.
ML surrogate models achieve under 10% mean absolute percentage error and yield order-of-magnitude runtime savings.
- Abstract: Process-based hydrologic models are invaluable tools for understanding the terrestrial water cycle and addressing modern water resources problems. However, many hydrologic models are computationally expensive and, depending on the resolution and scale, simulations can take on the order of hours to days to complete. While techniques such as uncertainty quantification and optimization have become valuable tools for supporting management decisions, these analyses typically require hundreds of model simulations, which are too computationally expensive to perform with a process-based hydrologic model. To address this gap, we assess a hybrid modeling workflow in which a process-based model is used to generate an initial set of simulations and a machine learning (ML) surrogate model is then trained to perform the remaining simulations required for downstream analysis. As a case study, we apply this workflow to simulations of variably saturated groundwater flow at a prospective managed aquifer recharge site. We compare the accuracy and computational efficiency of several ML architectures, including deep convolutional networks, recurrent neural networks, vision transformers, and networks with Fourier transforms. Our results demonstrate that ML surrogate models can achieve under 10% mean absolute percentage error and yield order-of-magnitude runtime savings over process-based models. Building on these findings, we examine the impacts of key modeling choices on surrogate model accuracy and efficiency. Results show that a normalized loss function improves training stability, while min-max data normalization can reduce error by up to a factor of 10 compared with other treatments such as Z-score normalization and no normalization. Downsampling input features using an autoencoder also decreases memory requirements by allowing training with tensors that are 4% of their original size. By reducing computational costs and...
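Two of the modeling choices named above, min-max input normalization and a normalized loss function, are easy to make concrete. The paper's exact formulas are not reproduced here, so the following is a minimal PyTorch sketch under common definitions: features are rescaled per channel to [0, 1], and the loss divides absolute error by target magnitude (a MAPE-like relative loss). Function names such as minmax_scale and normalized_l1_loss are illustrative, not from the paper.

```python
import torch

def minmax_scale(x: torch.Tensor, eps: float = 1e-8):
    """Rescale each feature channel of x (batch, channel, ...) to [0, 1]."""
    dims = (0, *range(2, x.ndim))                 # reduce over batch + spatial dims
    lo = x.amin(dim=dims, keepdim=True)
    hi = x.amax(dim=dims, keepdim=True)
    return (x - lo) / (hi - lo + eps), (lo, hi)   # keep (lo, hi) to un-scale predictions

def zscore_scale(x: torch.Tensor, eps: float = 1e-8):
    """Z-score treatment the paper reports as less accurate for this task."""
    dims = (0, *range(2, x.ndim))
    mu = x.mean(dim=dims, keepdim=True)
    sd = x.std(dim=dims, keepdim=True)
    return (x - mu) / (sd + eps), (mu, sd)

def normalized_l1_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """L1 error normalized by target magnitude, stable across output scales."""
    return ((pred - target).abs() / (target.abs() + eps)).mean()

# Toy usage: 8 samples, 3 input channels on a 16x16 grid
x = torch.rand(8, 3, 16, 16) * 100.0
x_scaled, _ = minmax_scale(x)
pred, target = torch.rand(8, 1, 16, 16), torch.rand(8, 1, 16, 16)
print(normalized_l1_loss(pred, target))
```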
Related papers
- Improved Long Short-Term Memory-based Wastewater Treatment Simulators for Deep Reinforcement Learning [0.0]
We implement two methods to improve trained simulator models for wastewater treatment data.
Experimental results show that these methods improve the simulators' behavior, as measured by Dynamic Time Warping, across a full year.
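Dynamic Time Warping (DTW), the metric used above, compares two time series after optimally warping the time axis, so a simulator that reproduces the shape of a signal with a small lag is not heavily penalized. A minimal textbook implementation, not code from the paper:

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(len(a)*len(b)) dynamic-programming DTW between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# A slightly lagged copy of a signal scores far better under DTW
# than under a pointwise error metric
t = np.linspace(0, 6.28, 100)
print(dtw_distance(np.sin(t), np.sin(t - 0.3)))
```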
arXiv Detail & Related papers (2024-03-22T10:20:09Z)
- Data efficient surrogate modeling for engineering design: Ensemble-free batch mode deep active learning for regression [0.6021787236982659]
We propose a simple and scalable approach for active learning that works in a student-teacher manner to train a surrogate model.
Using this approach, we achieve the same level of surrogate accuracy as baselines such as DBAL and Monte Carlo sampling.
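The summary does not spell out the acquisition rule, but one common reading of an ensemble-free student-teacher scheme is to query the pool points where a small student network disagrees most with the teacher surrogate. The sketch below illustrates that pattern under those assumptions; expensive_simulator, the network sizes, and the batch size are all placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def expensive_simulator(x):            # stand-in for the costly simulation
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1])

rng = np.random.default_rng(0)
pool = rng.uniform(-1, 1, size=(2000, 2))          # unlabeled candidate inputs
idx = rng.choice(len(pool), 20, replace=False)     # small initial design
X, y = pool[idx], expensive_simulator(pool[idx])

for round_ in range(5):
    teacher = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X, y)
    # Student mimics the teacher on labeled data only; where the two disagree
    # on the pool, the surrogate is presumed least trustworthy.
    student = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000).fit(X, teacher.predict(X))
    disagreement = np.abs(teacher.predict(pool) - student.predict(pool))
    batch = np.argsort(disagreement)[-10:]          # batch-mode query
    X = np.vstack([X, pool[batch]])
    y = np.concatenate([y, expensive_simulator(pool[batch])])

print("final training-set size:", len(X))
```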
arXiv Detail & Related papers (2022-11-16T02:31:57Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial step in chip design for ensuring that a design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamics models, simulations, and actual hardware.
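Stripped of the learned unscented Kalman filter machinery, the core idea above is to regress the residual between simulator output and real measurements and add it back at prediction time. A generic sketch of that residual-correction pattern, not the paper's formulation:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def simulator(state):                       # cheap but biased dynamics model
    return 0.9 * state

def real_system(state, rng):                # ground truth with unmodeled drag + noise
    return 0.82 * state - 0.05 * state**2 + rng.normal(0, 0.01, size=state.shape)

rng = np.random.default_rng(1)
states = rng.uniform(0, 2, size=500)
residuals = real_system(states, rng) - simulator(states)   # sparse real-world data

model = GradientBoostingRegressor().fit(states.reshape(-1, 1), residuals)

# Corrected prediction = simulator output + learned residual
s = np.array([1.5])
print(simulator(s) + model.predict(s.reshape(-1, 1)))
```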
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- TunaOil: A Tuning Algorithm Strategy for Reservoir Simulation Workloads [0.9940728137241215]
TunaOil is a new methodology to enhance the search for optimal numerical parameters of reservoir flow simulations.
We leverage ensembles of models in different oracles to extract information from each simulation and optimize the numerical parameters in their subsequent runs.
Our experiments show that the predictions improve overall workload performance by 31% on average.
arXiv Detail & Related papers (2022-08-04T12:11:13Z)
- Physics-informed machine learning with differentiable programming for heterogeneous underground reservoir pressure management [64.17887333976593]
Avoiding over-pressurization in subsurface reservoirs is critical for applications like CO2 sequestration and wastewater injection.
Managing these pressures by controlling injection/extraction is challenging because of complex subsurface heterogeneity.
We use differentiable programming with a full-physics model and machine learning to determine the fluid extraction rates that prevent over-pressurization.
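Because the full-physics model is differentiable, choosing safe extraction rates reduces to gradient descent on the controls. The sketch below substitutes a toy linear pressure response for the full-physics model; the threshold, well layout, and penalty form are illustrative assumptions, not the paper's setup.

```python
import torch

# Toy linear response: pressure at 3 monitoring wells as a function of 4
# injection/extraction rates (positive = injection). A real full-physics
# model would replace `response`.
response = torch.tensor([[0.8, 0.3, -0.5, -0.2],
                         [0.2, 0.9, -0.1, -0.6],
                         [0.4, 0.4, -0.3, -0.4]])
baseline = torch.tensor([10.0, 12.0, 11.0])
p_max = 11.5                                     # over-pressurization threshold

rates = torch.zeros(4, requires_grad=True)
opt = torch.optim.Adam([rates], lr=0.05)
for step in range(500):
    opt.zero_grad()
    pressure = baseline + response @ rates
    # Penalize pressures above the limit, plus a small cost on control effort
    loss = torch.relu(pressure - p_max).pow(2).sum() + 1e-3 * rates.pow(2).sum()
    loss.backward()                              # gradients flow through the model
    opt.step()

print(rates.detach(), (baseline + response @ rates).detach())
```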
arXiv Detail & Related papers (2022-06-21T20:38:13Z)
- Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS reduces inference time by up to a factor of 18 compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
- Multi-fidelity Hierarchical Neural Processes [79.0284780825048]
Multi-fidelity surrogate modeling reduces the computational cost by fusing different simulation outputs.
We propose Multi-fidelity Hierarchical Neural Processes (MF-HNP), a unified neural latent variable model for multi-fidelity surrogate modeling.
We evaluate MF-HNP on epidemiology and climate modeling tasks, achieving competitive performance in terms of accuracy and uncertainty estimation.
arXiv Detail & Related papers (2022-06-10T04:54:13Z)
- Differentiable, learnable, regionalized process-based models with physical outputs can approach state-of-the-art hydrologic prediction accuracy [1.181206257787103]
We show that differentiable, learnable, process-based models (called delta models here) can approach the performance level of LSTM for the intensively-observed variable (streamflow) with regionalized parameterization.
We use the simple hydrologic model HBV as the backbone, with embedded neural networks that can only be trained in a differentiable programming framework.
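The mechanics of such a delta model, a neural network that maps catchment attributes to the parameters of a differentiable process-based backbone so the whole chain trains end to end, can be sketched with a one-bucket toy model standing in for HBV. The bucket physics and network sizes below are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class DeltaToy(nn.Module):
    """NN regionalizes two bucket parameters; the bucket itself is differentiable."""
    def __init__(self, n_attrs: int):
        super().__init__()
        self.param_net = nn.Sequential(nn.Linear(n_attrs, 16), nn.Tanh(),
                                       nn.Linear(16, 2), nn.Sigmoid())

    def forward(self, attrs, precip):              # precip: (batch, time)
        k, cap = self.param_net(attrs).unbind(-1)  # recession rate, storage capacity
        storage = torch.zeros_like(k)
        flows = []
        for t in range(precip.shape[1]):           # simple differentiable bucket loop
            storage = torch.minimum(storage + precip[:, t], cap * 10.0)
            q = k * storage
            storage = storage - q
            flows.append(q)
        return torch.stack(flows, dim=1)

model = DeltaToy(n_attrs=5)
attrs, precip = torch.rand(8, 5), torch.rand(8, 30)
q = model(attrs, precip)                           # streamflow is a physical output
q.sum().backward()                                 # gradients reach param_net weights
```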
arXiv Detail & Related papers (2022-03-28T15:06:53Z)
- Accelerating Part-Scale Simulation in Liquid Metal Jet Additive Manufacturing via Operator Learning [0.0]
Part-scale predictions require many small-scale simulations.
A model describing droplet coalescence for LMJ may include coupled incompressible fluid flow, heat transfer, and phase change equations.
We apply an operator learning approach to learn a mapping between initial and final states of the droplet coalescence process.
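Operator learning here means regressing directly from the full initial field to the full final field, bypassing the expensive time-stepping of the coupled flow, heat-transfer, and phase-change solve. A minimal sketch with a plain MLP standing in for the learned operator and a fake solver generating training pairs; the paper's architecture and physics are not reproduced.

```python
import torch
import torch.nn as nn

N = 64                                          # 1-D field resolution
op = nn.Sequential(nn.Linear(N, 256), nn.GELU(),
                   nn.Linear(256, 256), nn.GELU(),
                   nn.Linear(256, N))           # maps initial state -> final state

def slow_solver(u0):                            # stand-in for the coupled-physics solve
    return torch.roll(u0, 5, dims=-1) * 0.7     # fake "evolved" field

u0 = torch.rand(256, N)                         # many small-scale initial conditions
uT = slow_solver(u0)

opt = torch.optim.Adam(op.parameters(), lr=1e-3)
for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(op(u0), uT)
    loss.backward()
    opt.step()

# Once trained, op(u0_new) replaces the expensive solve at inference time
```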
arXiv Detail & Related papers (2022-02-02T17:24:16Z)
- Deep Bayesian Active Learning for Accelerating Stochastic Simulation [74.58219903138301]
Interactive Neural Process (INP) is a deep active learning framework for stochastic simulations.
For active learning, we propose a novel acquisition function, Latent Information Gain (LIG), calculated in the latent space of NP-based models.
The results demonstrate that STNP outperforms the baselines in the learning setting and that LIG achieves the state of the art for active learning.
arXiv Detail & Related papers (2021-06-05T01:31:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.