An interpretable prediction model for longitudinal dispersion
coefficient in natural streams based on evolutionary symbolic regression
network
- URL: http://arxiv.org/abs/2106.11026v1
- Date: Thu, 17 Jun 2021 07:06:05 GMT
- Title: An interpretable prediction model for longitudinal dispersion
coefficient in natural streams based on evolutionary symbolic regression
network
- Authors: Yifeng Zhao, Zicheng Liu, Pei Zhang, Stan Z. Li, S.A. Galindo-Torres
- Abstract summary: Various methods have been proposed for predicting the
longitudinal dispersion coefficient (LDC).
In this paper, we first present an in-depth analysis of those methods and
identify their shortcomings.
We then design a novel symbolic regression method called the evolutionary
symbolic regression network (ESRN).
- Score: 30.99493442296212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A better understanding of dispersion in natural streams requires
knowledge of the longitudinal dispersion coefficient (LDC). Various methods
have been proposed for predicting the LDC, and these studies can be grouped
into three types: analytical, statistical, and ML-driven (implicit and
explicit). However, a comprehensive evaluation of them is still lacking. In
this paper, we first present an in-depth analysis of those methods and
identify their shortcomings. The analysis is carried out on an extensive
database of 660 samples of hydraulic and channel properties collected
worldwide. The reliability and representativeness of the data are enhanced by
applying Subset Selection of Maximum Dissimilarity (SSMD) for testing-set
selection and the interquartile range (IQR) rule for outlier removal. The
evaluation ranks the methods as: ML-driven > statistical > analytical.
Whereas implicit ML-driven methods are black boxes by nature, explicit
ML-driven methods have greater potential for LDC prediction. Moreover,
overfitting is a pervasive problem in existing models, which also rely on a
fixed combination of input parameters. To establish an interpretable LDC
prediction model with higher performance, we then design a novel symbolic
regression method, the evolutionary symbolic regression network (ESRN), which
combines genetic algorithms with neural networks. Strategies are introduced
to avoid overfitting and to explore more parameter combinations. Results show
that the ESRN model outperforms other existing symbolic models. The proposed
model is suitable for practical engineering problems because it requires few
input parameters (only the channel width w and the shear velocity U* are
needed), and it can provide convincing predictions when a field test cannot
be carried out or only limited field information is available.
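As a minimal illustration of the preprocessing steps named in the abstract
(not the authors' code), the Python sketch below applies an IQR outlier fence
to the feature table and then picks a maximally dissimilar testing subset in
the spirit of SSMD. The 1.5*IQR factor, the Euclidean distance metric, and the
Kennard-Stone-style greedy max-min rule are assumed defaults; the abstract
does not state the exact settings used in the paper.

import numpy as np

def iqr_filter(X, k=1.5):
    # Keep rows whose every feature lies inside [Q1 - k*IQR, Q3 + k*IQR].
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    mask = np.all((X >= lo) & (X <= hi), axis=1)
    return X[mask], mask

def ssmd_test_indices(X, n_test):
    # Greedy max-min (Kennard-Stone-like) choice of n_test mutually dissimilar rows.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise Euclidean distances
    chosen = [int(np.argmax(d.sum(axis=1)))]                    # seed with the most isolated sample
    while len(chosen) < n_test:
        # The next sample maximizes its distance to the closest already-chosen sample.
        nearest = d[:, chosen].min(axis=1)
        nearest[chosen] = -np.inf                               # never re-select a chosen sample
        chosen.append(int(np.argmax(nearest)))
    return np.array(chosen)

# Hypothetical usage on a (samples x features) array of channel properties:
# X_clean, keep = iqr_filter(raw_features)
# test_idx = ssmd_test_indices(X_clean, n_test=int(0.2 * len(X_clean)))

The greedy rule places each new test sample as far as possible from the
samples already chosen, so the held-out set spans the full range of hydraulic
and channel conditions rather than clustering near the bulk of the data.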
Related papers
- Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-29T04:15:58Z) - Toward Physically Plausible Data-Driven Models: A Novel Neural Network
Approach to Symbolic Regression [2.7071541526963805]
This paper proposes a novel neural network-based symbolic regression method.
It constructs physically plausible models based on even very small training data sets and prior knowledge about the system.
We experimentally evaluate the approach on four test systems: the TurtleBot 2 mobile robot, the magnetic manipulation system, the equivalent resistance of two resistors in parallel, and the longitudinal force of the anti-lock braking system.
arXiv Detail & Related papers (2023-02-01T22:05:04Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Variational Inference with NoFAS: Normalizing Flow with Adaptive
Surrogate for Computationally Expensive Models [7.217783736464403]
Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive.
New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space.
We propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternately updates the normalizing flow parameters and the weights of a neural network surrogate model.
arXiv Detail & Related papers (2021-08-28T14:31:45Z) - A Data-driven feature selection and machine-learning model benchmark for
the prediction of longitudinal dispersion coefficient [29.58577229101903]
An accurate prediction of the longitudinal dispersion (LD) coefficient can produce a performance leap in related simulations.
In this study, a globally optimal feature set was proposed through numerical comparison of the distilled local optima in performance with representative ML models.
Results show that the support vector machine has significantly better performance than other models.
arXiv Detail & Related papers (2021-07-16T09:50:38Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to infer from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Variable selection with missing data in both covariates and outcomes:
Imputation and machine learning [1.0333430439241666]
The missing data issue is ubiquitous in health studies.
Machine learning methods weaken parametric assumptions.
XGBoost and BART have the overall best performance across various settings.
arXiv Detail & Related papers (2021-04-06T20:18:29Z) - Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z) - Localized Debiased Machine Learning: Efficient Inference on Quantile
Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)