Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic
Regression Problems
- URL: http://arxiv.org/abs/2302.04301v1
- Date: Wed, 8 Feb 2023 19:36:26 GMT
- Title: Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic
Regression Problems
- Authors: Alina Geiger, Dominik Sobania, Franz Rothlauf
- Abstract summary: Down-sampled epsilon-lexicase selection combines epsilon-lexicase selection with random subsampling to improve performance in the domain of symbolic regression.
We find that diversity is reduced under down-sampled epsilon-lexicase selection compared to standard epsilon-lexicase selection.
With down-sampled epsilon-lexicase selection, we observe an improvement in solution quality of up to 85% over standard epsilon-lexicase selection.
- Score: 1.8047694351309207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Epsilon-lexicase selection is a parent selection method in genetic
programming that has been successfully applied to symbolic regression problems.
Recently, the combination of random subsampling with lexicase selection
significantly improved performance in other genetic programming domains such as
program synthesis. However, the influence of subsampling on the solution
quality of real-world symbolic regression problems has not yet been studied. In
this paper, we propose down-sampled epsilon-lexicase selection which combines
epsilon-lexicase selection with random subsampling to improve the performance
in the domain of symbolic regression. To this end, we compare down-sampled
epsilon-lexicase selection with traditional selection methods on common real-world
symbolic regression problems and analyze its influence on the properties of the
population over a genetic programming run. We find that the diversity is
reduced by using down-sampled epsilon-lexicase selection compared to standard
epsilon-lexicase selection. This is accompanied by the high hyperselection rates
we observe for down-sampled epsilon-lexicase selection. Further, we find that
down-sampled epsilon-lexicase selection outperforms the traditional selection
methods on all studied problems. Overall, with down-sampled epsilon-lexicase
selection we observe an improvement in solution quality of up to 85% compared
to standard epsilon-lexicase selection.
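To make the method concrete, here is a minimal Python sketch of down-sampled epsilon-lexicase selection, under stated assumptions rather than the paper's exact implementation: the population is represented as a matrix of per-case errors, epsilon is the median absolute deviation (MAD) of a case's errors (a common choice for epsilon-lexicase selection), and names such as select_parent and downsample_rate are illustrative.

```python
import numpy as np

def mad(x):
    """Median absolute deviation: the usual epsilon in epsilon-lexicase selection."""
    return np.median(np.abs(x - np.median(x)))

def select_parent(errors, downsample_rate=0.1, rng=None):
    """Select one parent index via epsilon-lexicase selection on a random
    subsample of training cases.

    errors: (pop_size, n_cases) array of per-case errors; lower is better.
    downsample_rate: fraction of training cases used per selection event.
    """
    rng = rng or np.random.default_rng()
    n_pop, n_cases = errors.shape
    # Down-sampling: a random subset of training cases, visited in random
    # order (one permutation provides both the subsample and the lexicase
    # case shuffle).  In the paper's setup the subsample is typically redrawn
    # once per generation and shared by all selection events; redrawing it
    # per call, as here, is a simplification.
    n_sampled = max(1, round(downsample_rate * n_cases))
    cases = rng.permutation(n_cases)[:n_sampled]
    candidates = np.arange(n_pop)
    for case in cases:
        case_errors = errors[candidates, case]
        # Keep every candidate within epsilon of the best error on this case;
        # epsilon is the MAD of the case's errors over the current pool (some
        # variants compute it over the whole population instead).
        epsilon = mad(case_errors)
        candidates = candidates[case_errors <= case_errors.min() + epsilon]
        if len(candidates) == 1:
            break
    return int(rng.choice(candidates))
```

A full genetic programming run would call this once per parent and, crucially, evaluate individuals only on the sampled cases, which is where the savings in the evaluation budget come from.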
Related papers
- Refined Risk Bounds for Unbounded Losses via Transductive Priors [58.967816314671296]
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression.
Our key tools are based on the exponential weights algorithm with carefully chosen transductive priors (see the sketch below).
arXiv Detail & Related papers (2024-10-29T00:01:04Z)
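For context on the exponential weights tool named above, the following is a hedged sketch of the standard algorithm with a uniform prior; the paper's contribution, the carefully chosen transductive priors, is not implemented here, and all names are illustrative.

```python
import numpy as np

def exponential_weights(losses, eta=0.5):
    """Run the exponential weights algorithm over a finite expert set.

    losses: (T, n_experts) array of per-round expert losses.
    Returns the (T, n_experts) probability vectors played at each round.
    """
    T, n = losses.shape
    log_w = np.zeros(n)  # uniform prior; the paper's transductive priors
                         # would replace this initialization
    weights = np.empty((T, n))
    for t in range(T):
        w = np.exp(log_w - log_w.max())  # numerically stabilized softmax
        weights[t] = w / w.sum()
        log_w -= eta * losses[t]         # multiplicative-weights update
    return weights
```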
- Lexicase-based Selection Methods with Down-sampling for Symbolic Regression Problems: Overview and Benchmark [0.8602553195689513]
This paper evaluates random as well as informed down-sampling in combination with the relevant lexicase-based selection methods on a wide range of symbolic regression problems.
We find that for a given evaluation budget, epsilon-lexicase selection in combination with random or informed down-sampling outperforms all other methods.
arXiv Detail & Related papers (2024-07-31T14:26:22Z)
- Minimum variance threshold for epsilon-lexicase selection [0.7373617024876725]
Parent selection methods often rely on the average error over the entire dataset as the criterion for selecting parents.
We propose a new criterion that splits the errors into two partitions that minimize the total variance within partitions (see the sketch below).
Our results show better performance of our approach compared to traditional epsilon-lexicase selection on real-world datasets.
arXiv Detail & Related papers (2024-04-08T23:47:26Z)
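As a rough illustration of that splitting criterion, here is a hedged Python sketch; it assumes the split is made over the sorted errors of a single training case and reads "total variance within partitions" as the unweighted sum of the two partitions' variances (the paper may weight by partition size), and min_variance_threshold is an illustrative name.

```python
import numpy as np

def min_variance_threshold(case_errors):
    """Split the sorted errors of one case into two partitions minimizing the
    summed within-partition variance; return the boundary error value."""
    e = np.sort(np.asarray(case_errors, dtype=float))
    best_k, best_score = 1, np.inf
    for k in range(1, len(e)):  # both partitions must be non-empty
        score = e[:k].var() + e[k:].var()  # unweighted sum (an assumption)
        if score < best_score:
            best_k, best_score = k, score
    return e[best_k - 1]  # candidates with error <= threshold would survive
```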
- DALex: Lexicase-like Selection via Diverse Aggregation [6.394522608656896]
We show that DALex (for Diversely Aggregated Lexicase) achieves significant speedups over lexicase selection and its relaxed variants, with results on program synthesis, deep learning, symbolic regression, and learning systems.
arXiv Detail & Related papers (2024-01-23T01:20:15Z)
- Semi-Supervised Laplace Learning on Stiefel Manifolds [48.3427853588646]
We develop the Sequential Subspace framework for graph-based semi-supervised learning at low label rates.
We show that our methods perform well at both extremely low and high label rates.
arXiv Detail & Related papers (2023-07-31T20:19:36Z)
- Probabilistic Lexicase Selection [6.177959045971966]
We introduce probabilistic lexicase selection (plexicase selection), a novel parent selection algorithm that efficiently approximates the probability distribution of lexicase selection (a brute-force reference for this distribution is sketched below).
Our method not only demonstrates superior problem-solving capabilities as a semantic-aware selection method, but also benefits from having a probabilistic representation of the selection process.
arXiv Detail & Related papers (2023-05-19T13:57:04Z)
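Plexicase's approximation algorithm itself is not described in this summary; as a reference point only, here is a hedged brute-force sketch that computes the exact selection distribution of plain (non-epsilon) lexicase selection, i.e. the distribution plexicase approximates, for tiny populations; lexicase_probabilities is an illustrative name.

```python
import numpy as np

def lexicase_probabilities(errors):
    """Exact selection probabilities under plain lexicase selection,
    computed by exhaustive recursion over case orderings.
    Exponential in the number of cases: for tiny illustrations only."""
    n_pop, n_cases = errors.shape
    probs = np.zeros(n_pop)

    def recurse(pool, cases, p):
        # One survivor, or no cases left: split p uniformly over the pool.
        if len(pool) == 1 or not cases:
            probs[list(pool)] += p / len(pool)
            return
        for c in cases:  # each remaining case is equally likely to come next
            best = min(errors[i, c] for i in pool)
            survivors = frozenset(i for i in pool if errors[i, c] == best)
            recurse(survivors, cases - {c}, p / len(cases))

    recurse(frozenset(range(n_pop)), frozenset(range(n_cases)), 1.0)
    return probs
```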
- Regression with Label Differential Privacy [64.21020761920322]
We derive a label DP randomization mechanism that is optimal under a given regression loss function.
We prove that the optimal mechanism takes the form of a "randomized response on bins" (a generic sketch of this form follows below).
arXiv Detail & Related papers (2022-12-12T17:41:32Z)
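The following is a hedged sketch of the generic "randomized response on bins" form only, under assumed bin edges and a standard k-ary randomized response; the paper derives the loss-optimal version, which this sketch does not attempt, and all names are illustrative.

```python
import numpy as np

def randomized_response_on_bins(y, bin_edges, epsilon, rng=None):
    """Release labels under epsilon-label-DP by randomizing their bin index."""
    rng = rng or np.random.default_rng()
    edges = np.asarray(bin_edges, dtype=float)
    k = len(edges) - 1  # number of bins (assumed >= 2)
    true_bin = np.clip(np.digitize(y, edges) - 1, 0, k - 1)
    # k-ary randomized response: keep the true bin with probability
    # exp(eps) / (exp(eps) + k - 1), otherwise report one of the other
    # k - 1 bins uniformly at random; this satisfies epsilon-label-DP.
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    keep = rng.random(len(y)) < p_keep
    offset = rng.integers(1, k, size=len(y))  # shift to one of the other bins
    noisy_bin = np.where(keep, true_bin, (true_bin + offset) % k)
    mids = (edges[:-1] + edges[1:]) / 2       # report bin midpoints
    return mids[noisy_bin]
```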
- Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random order online optimization setting of Garber et al. (2020), where the loss functions may be chosen by an adversary but are then presented online in a uniformly random order.
We show that the algorithms of Garber et al. (2020) achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z)
- Problem-solving benefits of down-sampled lexicase selection [0.20305676256390928]
The reasons that down-sampling helps are not yet fully understood.
We show that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget (see the arithmetic sketch below).
arXiv Detail & Related papers (2021-06-10T23:42:09Z)
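A back-of-the-envelope illustration of that budget argument, with made-up numbers: counting program evaluations as (individual, case) pairs, down-sampling stretches a fixed budget over proportionally more individuals.

```python
# Made-up numbers illustrating the evaluation-budget argument.
n_cases = 200           # full training set size (assumed)
downsample_rate = 0.25  # fraction of cases evaluated per generation (assumed)
budget = 1_000_000      # total (individual, case) evaluations available

individuals_full = budget // n_cases                         # 5,000 individuals
individuals_down = budget // int(downsample_rate * n_cases)  # 20,000 individuals
print(individuals_full, individuals_down)  # 4x more individuals, same budget
```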
- Bloom Origami Assays: Practical Group Testing [90.2899558237778]
Group testing is a well-studied problem with several appealing solutions.
Recent biological studies of COVID-19 impose practical constraints that are incompatible with traditional group testing methods.
We develop a new method combining Bloom filters with belief propagation to scale to larger values of n (more than 100) with good empirical results.
arXiv Detail & Related papers (2020-07-21T19:31:41Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.