Prediction Intervals and Confidence Regions for Symbolic Regression
Models based on Likelihood Profiles
- URL: http://arxiv.org/abs/2209.06454v1
- Date: Wed, 14 Sep 2022 07:07:55 GMT
- Title: Prediction Intervals and Confidence Regions for Symbolic Regression
Models based on Likelihood Profiles
- Authors: Fabricio Olivetti de Franca and Gabriel Kronberger
- Abstract summary: Quantification of uncertainty of regression models is important for the interpretation of models and for decision making.
The linear approximation and so-called likelihood profiles are well-known possibilities for the calculation of confidence and prediction intervals.
These simple and effective techniques have been completely ignored so far in the genetic programming literature.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symbolic regression is a nonlinear regression method which is commonly
performed by an evolutionary computation method such as genetic programming.
Quantification of uncertainty of regression models is important for the
interpretation of models and for decision making. The linear approximation and
so-called likelihood profiles are well-known possibilities for the calculation
of confidence and prediction intervals for nonlinear regression models. These
simple and effective techniques have been completely ignored so far in the
genetic programming literature. In this work we describe the calculation of
likelihood profiles in detail and also provide some illustrative examples with
models created with three different symbolic regression algorithms on two
different datasets. The examples highlight the importance of the likelihood
profiles to understand the limitations of symbolic regression models and to
help the user make an informed post-prediction decision.
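The profile-likelihood technique the abstract refers to can be sketched for a simple nonlinear regression. The sketch below is a minimal illustration, not the authors' implementation: the exponential model, the synthetic dataset, and the 95% level are assumptions chosen for the example. For a fixed value of the parameter `b`, the remaining parameter is re-optimized and the likelihood-ratio statistic is compared against a chi-square threshold to trace out the interval.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

# Synthetic data from a nonlinear model y = a * exp(b * x) + noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 2, 50)
a_true, b_true = 2.0, 0.8
y = a_true * np.exp(b_true * x) + rng.normal(0, 0.2, size=x.size)

def ssr(b):
    """Sum of squared residuals with b fixed; the model is linear in a,
    so the conditional least-squares estimate of a is closed-form."""
    basis = np.exp(b * x)
    a_hat = (basis @ y) / (basis @ basis)
    return np.sum((y - a_hat * basis) ** 2)

# Full fit: minimize the SSR over b (a is profiled out analytically).
res = minimize_scalar(ssr, bounds=(0.1, 2.0), method="bounded")
b_hat, ssr_min, n = res.x, res.fun, x.size

def profile_stat(b):
    """Likelihood-ratio statistic under Gaussian errors with the noise
    variance profiled out: n * log(SSR(b) / SSR_min)."""
    return n * np.log(ssr(b) / ssr_min)

# 95% profile-likelihood interval for b: keep the values of b whose
# statistic stays below the chi-square(1) quantile.
thresh = chi2.ppf(0.95, df=1)
grid = np.linspace(b_hat - 0.5, b_hat + 0.5, 2001)
inside = grid[np.array([profile_stat(b) <= thresh for b in grid])]
lo, hi = inside.min(), inside.max()
print(f"b_hat = {b_hat:.3f}, 95% profile CI = [{lo:.3f}, {hi:.3f}]")
```

Unlike the linear approximation, this interval can be asymmetric around the point estimate, which is the main practical difference the paper exploits for symbolic regression models.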
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Branch and Bound to Assess Stability of Regression Coefficients in Uncertain Models [0.6990493129893112]
We introduce our algorithm, along with supporting mathematical results, an example application, and a link to our computer code.
It helps researchers summarize high-dimensional data and assess the stability of regression coefficients in uncertain models.
arXiv Detail & Related papers (2024-08-19T01:37:14Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Adaptive Optimization for Prediction with Missing Data [6.800113478497425]
We show that some adaptive linear regression models are equivalent to learning an imputation rule and a downstream linear regression model simultaneously.
In settings where data is strongly not missing at random, our methods achieve a 2-10% improvement in out-of-sample accuracy.
arXiv Detail & Related papers (2024-02-02T16:35:51Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Engression: Extrapolation through the Lens of Distributional Regression [2.519266955671697]
We propose a neural network-based distributional regression methodology called "engression".
An engression model is generative in the sense that we can sample from the fitted conditional distribution and is also suitable for high-dimensional outcomes.
We show that engression can successfully perform extrapolation under some assumptions such as monotonicity, whereas traditional regression approaches such as least-squares or quantile regression fall short under the same assumptions.
arXiv Detail & Related papers (2023-07-03T08:19:00Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Logistic Regression Through the Veil of Imprecise Data [0.0]
Logistic regression is an important statistical tool for assessing the probability of an outcome based upon some predictive variables.
Standard methods can only deal with precisely known data; however, many datasets contain uncertainties that traditional methods either reduce to a single point or disregard entirely.
arXiv Detail & Related papers (2021-06-01T13:51:46Z)
- Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in the NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z)
- Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
- Symbolic Regression Driven by Training Data and Prior Knowledge [0.0]
In symbolic regression, the search for analytic models is driven purely by the prediction error observed on the training data samples.
We propose a multi-objective symbolic regression approach that is driven by both the training data and the prior knowledge of the properties the desired model should manifest.
arXiv Detail & Related papers (2020-04-24T19:15:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.