Local Optimization Often is Ill-conditioned in Genetic Programming for
Symbolic Regression
- URL: http://arxiv.org/abs/2209.00942v1
- Date: Fri, 2 Sep 2022 10:39:26 GMT
- Title: Local Optimization Often is Ill-conditioned in Genetic Programming for
Symbolic Regression
- Authors: Gabriel Kronberger
- Abstract summary: We use a singular value decomposition of NLS Jacobian matrices to determine the numeric rank and the condition number.
Our results show that rank-deficient and ill-conditioned Jacobian matrices occur frequently and for all datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient-based local optimization has been shown to improve results of
genetic programming (GP) for symbolic regression. Several state-of-the-art GP
implementations use iterative nonlinear least squares (NLS) algorithms such as
the Levenberg-Marquardt algorithm for local optimization. The effectiveness of
NLS algorithms depends on appropriate scaling and conditioning of the
optimization problem. This has so far been ignored in the symbolic regression and
GP literature. In this study we use a singular value decomposition of NLS
Jacobian matrices to determine the numeric rank and the condition number. We
perform experiments with a GP implementation and six different benchmark
datasets. Our results show that rank-deficient and ill-conditioned Jacobian
matrices occur frequently and for all datasets. The issue is less extreme when
restricting GP tree size and when using many non-linear functions in the
function set.
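The following is a minimal sketch, assuming NumPy, of the SVD-based diagnostic described above: compute the singular values of a Jacobian, derive its numeric rank with a standard tolerance rule, and report the condition number. The example model with a redundant parameter is an illustrative assumption, not one of the paper's benchmark problems.

```python
import numpy as np

def jacobian_diagnostics(J, rtol=None):
    """Numeric rank and condition number of a Jacobian via SVD.

    The tolerance follows the common convention also used by
    numpy.linalg.matrix_rank: sigma_max * max(m, n) * machine epsilon.
    """
    s = np.linalg.svd(J, compute_uv=False)          # singular values, descending
    if rtol is None:
        rtol = max(J.shape) * np.finfo(J.dtype).eps
    rank = int(np.sum(s > s[0] * rtol))
    # Report a rank-deficient Jacobian as infinitely ill-conditioned.
    cond = np.inf if rank < min(J.shape) else s[0] / s[-1]
    return rank, cond

# Illustrative (hypothetical) model: f(x; a, b, c) = a*exp(b*x) + c*exp(b*x).
# The parameters a and c are redundant, the kind of structure GP trees can
# easily produce, so the Jacobian w.r.t. (a, b, c) is rank-deficient.
x = np.linspace(0.0, 1.0, 50)
a, b, c = 1.0, 2.0, 3.0
J = np.column_stack([
    np.exp(b * x),                # df/da
    (a + c) * x * np.exp(b * x),  # df/db
    np.exp(b * x),                # df/dc
])
print(jacobian_diagnostics(J))    # -> (2, inf): rank 2 of 3, ill-conditioned
```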
Related papers
- Taylor Genetic Programming for Symbolic Regression [5.371028373792346]
Genetic programming (GP) is a commonly used approach to solve symbolic regression (SR) problems.
We propose Taylor genetic programming (TaylorGP), which leverages a Taylor polynomial to approximate the symbolic equation that fits the dataset.
TaylorGP not only achieves higher accuracy than the nine baseline methods but also finds stable results faster; a simplified polynomial-fitting sketch follows this entry.
arXiv Detail & Related papers (2022-04-28T13:43:39Z)
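As a rough, hedged illustration of fitting a low-order polynomial (Taylor-style) approximation to a dataset, here is a NumPy sketch; it is not the TaylorGP algorithm itself, just an ordinary least-squares fit of monomial features for a single input variable.

```python
import numpy as np

def polynomial_approximation(x, y, degree=3):
    """Fit y ~ c0 + c1*x + ... + c_degree*x**degree by least squares."""
    A = np.vander(x, N=degree + 1, increasing=True)   # columns 1, x, x^2, ...
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

# Example: a cubic approximation of sin(x) on [-1, 1],
# close to the Taylor coefficients [0, 1, 0, -1/6].
x = np.linspace(-1.0, 1.0, 200)
print(polynomial_approximation(x, np.sin(x)))
```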
- Alternating Mahalanobis Distance Minimization for Stable and Accurate CP Decomposition [4.847980206213335]
We introduce a new formulation for deriving singular values and vectors of a tensor by considering the critical points of a function different from the one used in previous work.
We show that a subsweep of this algorithm can achieve a superlinear convergence rate for exact CPD with known rank.
We then view the algorithm as optimizing a Mahalanobis distance with respect to each factor with ground metric dependent on the other factors.
arXiv Detail & Related papers (2022-04-14T19:56:36Z)
- Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic gradient Langevin dynamics is one of the most fundamental algorithms for solving non-convex optimization problems.
In this paper, we present two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics; a sketch of the plain, non-variance-reduced update follows this entry.
arXiv Detail & Related papers (2022-03-30T11:39:00Z)
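For context, here is a hedged NumPy sketch of the plain stochastic gradient Langevin dynamics update that both variants build on: a gradient step plus Gaussian noise scaled by sqrt(2 * step / beta). The step size, temperature, and quadratic toy objective are illustrative assumptions, and no variance reduction is included.

```python
import numpy as np

def sgld(grad_fn, theta0, n_steps=1000, step=1e-2, beta=50.0, seed=0):
    """Vanilla SGLD: theta <- theta - step*grad + sqrt(2*step/beta)*noise."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(n_steps):
        noise = rng.normal(size=theta.shape)
        theta = theta - step * grad_fn(theta) + np.sqrt(2.0 * step / beta) * noise
    return theta

# Toy objective f(x) = ||x||^2 / 2, so grad_fn is the identity; iterates
# concentrate near the minimizer at the origin for large beta.
print(sgld(lambda th: th, theta0=np.ones(3)))
```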
- Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times [119.41129787351092]
We show that sequential black-box optimization based on GPs can be made efficient by sticking to a candidate solution for multiple evaluation steps.
We modify two well-established GP-Opt algorithms, GP-UCB and GP-EI, to adapt rules from batched GP-Opt; a minimal sketch of this sticky-evaluation idea follows this entry.
arXiv Detail & Related papers (2022-01-30T20:42:14Z)
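A hedged sketch of that sticky-evaluation idea, assuming scikit-learn's GaussianProcessRegressor: the candidate that maximizes a UCB score is evaluated several times before the model is re-fit. The noisy objective, grid, batch size, and UCB coefficient are illustrative assumptions, not the paper's modified GP-UCB/GP-EI rules.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def noisy_objective(x, rng):
    return -(x - 0.3) ** 2 + 0.05 * rng.normal()

def gp_ucb_sticky(n_rounds=10, repeats=3, kappa=2.0, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
    X = [[rng.uniform()]]
    y = [noisy_objective(X[0][0], rng)]
    for _ in range(n_rounds):
        gp = GaussianProcessRegressor(alpha=0.05).fit(np.array(X), np.array(y))
        mu, sigma = gp.predict(grid, return_std=True)
        x_next = grid[np.argmax(mu + kappa * sigma)][0]
        # Stick to the chosen candidate for several noisy evaluations
        # before updating the surrogate model.
        for _ in range(repeats):
            X.append([x_next])
            y.append(noisy_objective(x_next, rng))
    return X[int(np.argmax(y))][0]

print(gp_ucb_sticky())   # should end up near the optimum at x = 0.3
```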
- Accelerating Genetic Programming using GPUs [0.0]
Genetic Programming (GP) has multiple applications in machine learning, such as curve fitting, data modelling, feature selection, and classification.
This paper describes a GPU-accelerated, stack-based variant of the generational GP algorithm that can be used for symbolic regression and binary classification; a CPU-only sketch of stack-based expression evaluation follows this entry.
arXiv Detail & Related papers (2021-10-15T06:13:01Z)
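A hedged, CPU-only NumPy sketch of the stack-based (postfix) expression evaluation that such GPU implementations parallelize, vectorized here over all dataset rows at once; the opcode set and example program are illustrative assumptions, not the paper's instruction format.

```python
import numpy as np

def eval_postfix(program, X):
    """Evaluate a postfix GP program on every row of X simultaneously.

    Tokens: ('var', i) pushes column i, ('const', c) pushes a constant,
    and 'add' / 'sub' / 'mul' pop two operands and push the result.
    """
    stack = []
    for token in program:
        if token == 'add':
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif token == 'sub':
            b, a = stack.pop(), stack.pop(); stack.append(a - b)
        elif token == 'mul':
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
        elif token[0] == 'var':
            stack.append(X[:, token[1]])
        elif token[0] == 'const':
            stack.append(np.full(X.shape[0], token[1]))
    return stack.pop()

# Example program for x0 * x1 + 2.0 in postfix notation.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
print(eval_postfix([('var', 0), ('var', 1), 'mul', ('const', 2.0), 'add'], X))  # [ 4. 14.]
```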
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear-algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperforms pure model-based graph classifiers and achieves performance comparable to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Symbolic Regression using Mixed-Integer Nonlinear Optimization [9.638685454900047]
The Symbolic Regression (SR) problem is a hard problem in machine learning.
We propose a hybrid algorithm that combines mixed-integer nonlinear optimization with explicit enumeration.
We show that our algorithm is competitive, for some synthetic data sets, with a state-of-the-art SR software and a recent physics-inspired method called AI Feynman.
arXiv Detail & Related papers (2020-06-11T20:53:17Z)
- Improved SVRG for quadratic functions [0.0]
We analyse an iterative algorithm to minimize quadratic functions whose Hessian matrix $H$ is the expectation of a random symmetric $d \times d$ matrix.
In several applications, including least-squares regressions, ridge regressions, linear discriminant analysis and regularized linear discriminant analysis, the running time of each iteration is proportional to $d$; a sketch of standard SVRG on a least-squares objective follows this entry.
arXiv Detail & Related papers (2020-06-01T15:28:08Z)
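A hedged NumPy sketch of standard SVRG on a least-squares (quadratic) objective, i.e. the generic variance-reduced scheme this entry refers to rather than the paper's improved variant; the step size, epoch counts, and random problem are illustrative assumptions.

```python
import numpy as np

def svrg_least_squares(A, b, n_epochs=30, step=0.02, seed=0):
    """SVRG on f(x) = (1/2n) * ||A x - b||^2 using per-row stochastic gradients."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        x_ref = x.copy()
        full_grad = A.T @ (A @ x_ref - b) / n         # full gradient at the snapshot
        for _ in range(n):                            # one pass of inner updates
            i = rng.integers(n)
            gi = A[i] * (A[i] @ x - b[i])             # stochastic gradient at x
            gi_ref = A[i] * (A[i] @ x_ref - b[i])     # same sample at the snapshot
            x = x - step * (gi - gi_ref + full_grad)  # variance-reduced step
    return x

# Example: recover x_true from a noiseless least-squares problem.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5))
x_true = np.arange(1.0, 6.0)
print(svrg_least_squares(A, A @ x_true))   # approximately [1, 2, 3, 4, 5]
```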
- Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on only a few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces convergence time by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)
- Optimization of Graph Total Variation via Active-Set-based Combinatorial Reconditioning [48.42916680063503]
We propose a novel adaptive preconditioning strategy for proximal algorithms on this problem class.
We show that nested-forest decomposition of the inactive edges yields a guaranteed local linear convergence rate.
Our results suggest that local convergence analysis can serve as a guideline for selecting variable metrics in proximal algorithms.
arXiv Detail & Related papers (2020-02-27T16:33:09Z)
- Supervised Quantile Normalization for Low-rank Matrix Approximation [50.445371939523305]
We learn the parameters of quantile normalization operators that can operate row-wise on the values of $X$ and/or of its factorization $UV$ to improve the quality of the low-rank representation of $X$ itself.
We demonstrate the applicability of these techniques on synthetic and genomics datasets; a sketch of plain row-wise quantile normalization follows this entry.
arXiv Detail & Related papers (2020-02-08T21:06:02Z)
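As a rough reference point for what a quantile normalization operator does, here is a hedged NumPy sketch of plain, non-learned row-wise quantile normalization; the supervised, differentiable operators in the paper learn the target quantiles, which this simplification does not.

```python
import numpy as np

def quantile_normalize_rows(X, target=None):
    """Map each row of X onto shared target quantiles by rank.

    Each row's sorted values are replaced by `target` (default: the mean of
    the sorted rows), so afterwards every row contains the same set of values,
    arranged in that row's original rank order.
    """
    order = np.argsort(X, axis=1)                       # per-row ranking
    sorted_rows = np.take_along_axis(X, order, axis=1)
    if target is None:
        target = sorted_rows.mean(axis=0)               # shared target quantiles
    out = np.empty_like(X, dtype=float)
    np.put_along_axis(out, order, np.broadcast_to(target, X.shape), axis=1)
    return out

X = np.array([[5.0, 2.0, 3.0],
              [4.0, 1.0, 9.0]])
print(quantile_normalize_rows(X))   # both rows now share the values {1.5, 3.5, 7.0}
```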
This list is automatically generated from the titles and abstracts of the papers in this site.