Local Search, Semantics, and Genetic Programming: a Global Analysis
- URL: http://arxiv.org/abs/2305.16956v1
- Date: Fri, 26 May 2023 14:13:03 GMT
- Title: Local Search, Semantics, and Genetic Programming: a Global Analysis
- Authors: Fabio Anselmi, Mauro Castelli, Alberto d'Onofrio, Luca Manzoni, Luca
Mariot, Martina Saletta
- Abstract summary: Geometric Semantic Genetic Programming (GSGP) is one of the most prominent Genetic Programming (GP) variants.
Here we explore multiple possibilities to limit the overfitting of GSM-LS and GSGP-reg.
Results show that it is possible to consistently outperform standard GSGP on both training and unseen data.
- Score: 7.486818142115522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geometric Semantic Genetic Programming (GSGP) is one of the most prominent
Genetic Programming (GP) variants, thanks to its solid theoretical background,
the excellent performance achieved, and the execution time significantly
smaller than standard syntax-based GP. In recent years, a new mutation
operator, Geometric Semantic Mutation with Local Search (GSM-LS), has been
proposed to include a local search step in the mutation process based on the
idea that performing a linear regression during the mutation can allow for a
faster convergence to good-quality solutions. While GSM-LS helps the
convergence of the evolutionary search, it is prone to overfitting. Thus, it
was suggested to use GSM-LS only for a limited number of generations and,
subsequently, to switch back to standard geometric semantic mutation. A more
recently defined variant of GSGP (called GSGP-reg) also includes a local search
step but shares similar strengths and weaknesses with GSM-LS. Here we explore
multiple possibilities to limit the overfitting of GSM-LS and GSGP-reg, ranging
from adaptive methods to estimate the risk of overfitting at each mutation to a
simple regularized regression. The results show that the specific method used to
limit overfitting is not crucial: provided that some technique to control
overfitting is used, it is possible to consistently outperform standard GSGP on
both training and unseen data. The obtained results allow practitioners to
better understand the role of local search in GSGP and demonstrate that simple
regularization strategies are effective in controlling overfitting.
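To make the mechanism described in the abstract concrete, below is a minimal sketch (not the paper's actual implementation) of a GSM-LS-style mutation step with a ridge-regularized fit: the offspring's semantics are modeled as a linear combination of the parent's semantics and a random perturbation, and the coefficients are obtained by (regularized) least squares on the training cases. The numpy-based representation of semantics as plain vectors, all names, and the penalty value are illustrative assumptions.

```python
import numpy as np

def gsm_ls_ridge(parent_sem, rand_sem_1, rand_sem_2, target, lam=1e-2):
    """Fit the coefficients of one GSM-LS-style mutation step by ridge regression.

    parent_sem, rand_sem_1, rand_sem_2, target: 1-D arrays holding the semantics
    (outputs on the training cases) of the parent, of two random trees, and the
    training targets. lam is the ridge penalty: lam = 0 gives the plain
    least-squares fit, a positive lam acts as a simple regularization.
    Returns (a0, a1, a2) so that the offspring semantics are
    a0 + a1 * parent_sem + a2 * (rand_sem_1 - rand_sem_2).
    """
    perturbation = rand_sem_1 - rand_sem_2
    # Design matrix: intercept, parent semantics, random semantic perturbation.
    X = np.column_stack([np.ones_like(parent_sem), parent_sem, perturbation])
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ target)

# Toy usage with random vectors standing in for real tree semantics.
rng = np.random.default_rng(0)
n_cases = 50
parent = rng.normal(size=n_cases)
t1, t2 = rng.uniform(0, 1, n_cases), rng.uniform(0, 1, n_cases)
y = 2.0 * parent + 0.5 + rng.normal(scale=0.1, size=n_cases)
a0, a1, a2 = gsm_ls_ridge(parent, t1, t2, y)
offspring_sem = a0 + a1 * parent + a2 * (t1 - t2)
```

Setting lam to zero reproduces the unregularized local-search fit that the abstract describes as prone to overfitting; a positive penalty plays the role of the simple regularized regression the paper finds sufficient to control it.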
Related papers
- Deep Transformed Gaussian Processes [0.0]
Transformed Gaussian Processes (TGPs) are processes specified by transforming samples from the joint distribution of a prior process (typically a GP) using an invertible transformation.
We propose a generalization of TGPs named Deep Transformed Gaussian Processes (DTGPs), which follows the trend of concatenating layers of processes.
Experiments conducted evaluate the proposed DTGPs in multiple regression datasets, achieving good scalability and performance.
arXiv Detail & Related papers (2023-10-27T16:09:39Z)
- A Communication-efficient Algorithm with Linear Convergence for Federated Minimax Learning [1.713291434132985]
We study a large-scale multi-agent minimax optimization problem, which models Generative Adversarial Networks (GANs).
The overall objective is a sum of agents' private local objective functions.
We show that FedGDA-GT converges linearly with a constant stepsize to a global $\epsilon$-GDA solution.
arXiv Detail & Related papers (2022-06-02T16:31:16Z)
- The Effect of Multi-Generational Selection in Geometric Semantic Genetic Programming [8.0322025529523]
Geometric Semantic Genetic Programming (GSGP) has been shown to be successfully applicable to many real-world problems.
Due to a peculiarity in its implementation, GSGP needs to store all the evolutionary history, i.e., all populations from the first one.
We exploit this stored information to define a multi-generational selection scheme that is able to use individuals from older populations.
arXiv Detail & Related papers (2022-05-05T12:26:25Z)
- Coefficient Mutation in the Gene-pool Optimal Mixing Evolutionary Algorithm for Symbolic Regression [0.0]
GP-GOMEA is a top-performing algorithm for symbolic regression.
We show how simple approaches for optimizing coefficients can be integrated into GP-GOMEA.
We find that coefficient mutation can substantially help in re-discovering the underlying equation.
arXiv Detail & Related papers (2022-04-26T08:58:47Z)
- Benign Underfitting of Stochastic Gradient Descent [72.38051710389732]
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to the training data.
We analyze the closely related with-replacement SGD, for which an analogous phenomenon does not occur and prove that its population risk does in fact converge at the optimal rate.
arXiv Detail & Related papers (2022-02-27T13:25:01Z)
- Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods [73.35353358543507]
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs).
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA).
arXiv Detail & Related papers (2022-02-15T09:17:39Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z)
- Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform scalable online prediction and model updates, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
- A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
We provide a theoretical understanding of the sample complexity of "vanilla" PG.
arXiv Detail & Related papers (2021-07-23T19:38:17Z)
- SGP-DT: Semantic Genetic Programming Based on Dynamic Targets [6.841231589814175]
This paper presents a new Semantic GP approach based on Dynamic Targets (SGP-DT).
The evolution in each run is guided by a new (dynamic) target based on the residual errors (a minimal sketch of this idea follows the list below).
SGP-DT achieves small RMSE values, on average 23.19% smaller than those of epsilon-lexicase.
arXiv Detail & Related papers (2020-01-30T19:33:58Z)
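The residual-driven "dynamic target" idea mentioned in the SGP-DT entry above can be illustrated with the following boosting-like sketch. It is only an assumption-laden illustration: the evolve_against_target helper stands in for a full GP run (here it is just a least-squares linear fit), and the number of runs and the data are invented for the example.

```python
import numpy as np

def evolve_against_target(X, target):
    """Hypothetical stand-in for one GP run: fit the current target with least squares."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, target, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ w

def dynamic_target_loop(X, y, n_runs=5):
    """Each run is trained on the residuals of the partial models found so far;
    the final predictor is the sum of all partial models."""
    residual = np.asarray(y, dtype=float).copy()
    partial_models = []
    for _ in range(n_runs):
        model = evolve_against_target(X, residual)  # a GP run would go here
        partial_models.append(model)
        residual = residual - model(X)              # next run's dynamic target
    return lambda X_new: sum(m(X_new) for m in partial_models)

# Toy usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3
predict = dynamic_target_loop(X, y)
rmse = float(np.sqrt(np.mean((predict(X) - y) ** 2)))
```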