Exhaustive Symbolic Regression
- URL: http://arxiv.org/abs/2211.11461v2
- Date: Mon, 29 May 2023 09:17:33 GMT
- Title: Exhaustive Symbolic Regression
- Authors: Deaglan J. Bartlett, Harry Desmond and Pedro G. Ferreira
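- Abstract summary: Exhaustive Symbolic Regression (ESR) systematically considers all possible equations built from a given basis set of operators up to a specified maximum complexity, and uses the minimum description length principle to combine accuracy and simplicity into a single objective.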
We apply it to a catalogue of cosmic chronometers and the Pantheon+ sample of supernovae to learn the Hubble rate.
We make our code and full equation sets publicly available.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symbolic Regression (SR) algorithms attempt to learn analytic expressions
which fit data accurately and in a highly interpretable manner. Conventional SR
suffers from two fundamental issues which we address here. First, these methods
search the space stochastically (typically using genetic programming) and hence
do not necessarily find the best function. Second, the criteria used to select
the equation optimally balancing accuracy with simplicity have been variable
and subjective. To address these issues we introduce Exhaustive Symbolic
Regression (ESR), which systematically and efficiently considers all possible
equations -- made with a given basis set of operators and up to a specified
maximum complexity -- and is therefore guaranteed to find the true optimum (if
parameters are perfectly optimised) and a complete function ranking subject to
these constraints. We implement the minimum description length principle as a
rigorous method for combining these preferences into a single objective. To
illustrate the power of ESR we apply it to a catalogue of cosmic chronometers
and the Pantheon+ sample of supernovae to learn the Hubble rate as a function
of redshift, finding $\sim$40 functions (out of 5.2 million trial functions)
that fit the data more economically than the Friedmann equation. These
low-redshift data therefore do not uniquely prefer the expansion history of the
standard model of cosmology. We make our code and full equation sets publicly
available.
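The two ingredients described in the abstract, exhaustive enumeration up to a maximum complexity and a single description-length objective, can be illustrated with a minimal sketch. This is not the authors' ESR code. The operator basis, the tuple encoding of trees, and all function names here are hypothetical. The score keeps only a Gaussian negative log-likelihood plus a structural cost of k log(n) for k nodes drawn from n basis symbols; free parameters and their encoding cost are omitted, which is an assumption of this toy.

```python
import math

# Hypothetical toy basis, chosen for illustration only.
LEAVES = ["x", "1"]
UNARY = {"sq": lambda a: a * a, "neg": lambda a: -a}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
N_BASIS = len(LEAVES) + len(UNARY) + len(BINARY)


def enumerate_trees(k):
    """Yield every expression tree with exactly k nodes, as nested tuples."""
    if k == 1:
        for leaf in LEAVES:
            yield (leaf,)
        return
    for op in UNARY:
        for child in enumerate_trees(k - 1):
            yield (op, child)
    for op in BINARY:
        # Split the remaining k - 1 nodes between the two children.
        for left_size in range(1, k - 1):
            for left in enumerate_trees(left_size):
                for right in enumerate_trees(k - 1 - left_size):
                    yield (op, left, right)


def size(tree):
    """Number of nodes in the tree (the complexity k)."""
    return 1 + sum(size(child) for child in tree[1:])


def evaluate(tree, x):
    """Evaluate a tree at a single input value x."""
    head = tree[0]
    if head == "x":
        return x
    if head == "1":
        return 1.0
    if head in UNARY:
        return UNARY[head](evaluate(tree[1], x))
    return BINARY[head](evaluate(tree[1], x), evaluate(tree[2], x))


def description_length(tree, xs, ys, sigma):
    """Simplified score: Gaussian NLL (up to a constant) plus k * log(n)."""
    nll = sum((y - evaluate(tree, x)) ** 2 for x, y in zip(xs, ys))
    nll /= 2.0 * sigma ** 2
    return nll + size(tree) * math.log(N_BASIS)


def exhaustive_rank(xs, ys, sigma, max_complexity):
    """Score every tree up to max_complexity and return them best first."""
    candidates = [
        tree
        for k in range(1, max_complexity + 1)
        for tree in enumerate_trees(k)
    ]
    return sorted(candidates, key=lambda t: description_length(t, xs, ys, sigma))


# Noise-free toy data drawn from y = x^2 + 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [x * x + 1.0 for x in xs]
ranking = exhaustive_rank(xs, ys, sigma=0.1, max_complexity=4)
best = ranking[0]
print(best, description_length(best, xs, ys, 0.1))
```

Because every tree is scored, the output is a complete ranking rather than a single winner. This toy does not remove equivalent duplicates such as a + b and b + a, and the number of trees grows rapidly with the complexity limit, so a practical implementation would need to handle both.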
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Highly Adaptive Ridge [84.38107748875144]
We propose a regression method that achieves an $n^{-2/3}$ dimension-free L2 convergence rate in the class of right-continuous functions with square-integrable sectional derivatives.
HAR is exactly kernel ridge regression with a specific data-adaptive kernel based on a saturated zero-order tensor-product spline basis expansion.
We demonstrate empirical performance better than state-of-the-art algorithms for small datasets in particular.
arXiv Detail & Related papers (2024-10-03T17:06:06Z) - Scalable Sparse Regression for Model Discovery: The Fast Lane to Insight [0.0]
Sparse regression applied to symbolic libraries has quickly emerged as a powerful tool for learning governing equations directly from data.
I present a general-purpose, model-agnostic sparse regression algorithm that extends a recently proposed exhaustive search.
It is intended to maintain sensitivity to small coefficients and to be of reasonable computational cost for large symbolic libraries.
arXiv Detail & Related papers (2024-05-14T18:09:43Z) - Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z) - Globally Convergent Accelerated Algorithms for Multilinear Sparse
Logistic Regression with $\ell_0$-constraints [2.323238724742687]
Multilinear logistic regression serves as a powerful tool for the analysis of multidimensional data.
We propose an Accelerated Proximal Alternating Linearized Minimization method (APALM$^+$) to solve the $\ell_0$-MLSR model.
We also show that APALM$^+$ is globally convergent to a first-order critical point, and establish its convergence using the Kurdyka-Łojasiewicz property.
arXiv Detail & Related papers (2023-09-17T11:05:08Z) - Maximize to Explore: One Objective Function Fusing Estimation, Planning,
and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z) - Priors for symbolic regression [0.0]
We develop methods to incorporate detailed prior information on both functions and their parameters into symbolic regression.
Our prior on the structure of a function is based on an $n$-gram language model.
We also develop a formalism based on the Fractional Bayes Factor to treat numerical parameter priors.
arXiv Detail & Related papers (2023-04-13T08:29:16Z) - Dual-sPLS: a family of Dual Sparse Partial Least Squares regressions for
feature selection and prediction with tunable sparsity; evaluation on
simulated and near-infrared (NIR) data [1.6099403809839032]
The variant presented in this paper, Dual-sPLS, generalizes the classical PLS1 algorithm.
It provides balance between accurate prediction and efficient interpretation.
Code is provided as an open-source package in R.
arXiv Detail & Related papers (2023-01-17T21:50:35Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and
Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly nonconvex functions.
We generalize the PAGE algorithm so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bound in the smooth nonconvex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - Human-in-the-loop: Provably Efficient Preference-based Reinforcement
Learning with General Function Approximation [107.54516740713969]
We study human-in-the-loop reinforcement learning (RL) with trajectory preferences.
Instead of receiving a numeric reward at each step, the agent only receives preferences over trajectory pairs from a human overseer.
We propose the first optimistic model-based algorithm for PbRL with general function approximation.
arXiv Detail & Related papers (2022-05-23T09:03:24Z) - Minimum discrepancy principle strategy for choosing $k$ in $k$-NN regression [2.0411082897313984]
We present a novel data-driven strategy to choose the hyperparameter $k$ in the $k$-NN regression estimator without using any hold-out data.
We propose an easily implemented strategy based on the idea of early stopping and the minimum discrepancy principle.
arXiv Detail & Related papers (2020-08-20T00:13:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.