Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic
Regression Problems
- URL: http://arxiv.org/abs/2302.04301v1
- Date: Wed, 8 Feb 2023 19:36:26 GMT
- Title: Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic
Regression Problems
- Authors: Alina Geiger, Dominik Sobania, Franz Rothlauf
- Abstract summary: Down-sampled epsilon-lexicase selection combines epsilon-lexicase selection with random subsampling to improve the performance in the domain of symbolic regression.
We find that the diversity is reduced by using down-sampled epsilon-lexicase selection compared to standard epsilon-lexicase selection.
With down-sampled epsilon-lexicase selection we observe an improvement of the solution quality of up to 85% in comparison to standard epsilon-lexicase selection.
- Score: 1.8047694351309207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Epsilon-lexicase selection is a parent selection method in genetic
programming that has been successfully applied to symbolic regression problems.
Recently, the combination of random subsampling with lexicase selection
significantly improved performance in other genetic programming domains such as
program synthesis. However, the influence of subsampling on the solution
quality of real-world symbolic regression problems has not yet been studied. In
this paper, we propose down-sampled epsilon-lexicase selection which combines
epsilon-lexicase selection with random subsampling to improve the performance
in the domain of symbolic regression. Therefore, we compare down-sampled
epsilon-lexicase with traditional selection methods on common real-world
symbolic regression problems and analyze its influence on the properties of the
population over a genetic programming run. We find that the diversity is
reduced by using down-sampled epsilon-lexicase selection compared to standard
epsilon-lexicase selection. This comes along with high hyperselection rates we
observe for down-sampled epsilon-lexicase selection. Further, we find that
down-sampled epsilon-lexicase selection outperforms the traditional selection
methods on all studied problems. Overall, with down-sampled epsilon-lexicase
selection we observe an improvement of the solution quality of up to 85% in
comparison to standard epsilon-lexicase selection.
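The abstract describes down-sampled epsilon-lexicase selection as epsilon-lexicase selection run on a random subsample of the training cases. A minimal sketch of one plausible form of this selection step, assuming a MAD-based epsilon and a fresh subsample per selection event (all names and the exact subsampling granularity are illustrative, not taken from the paper):

```python
import random
import statistics

def epsilon_lexicase_select(population, errors, case_indices):
    """Select one parent via epsilon-lexicase selection.

    errors[i][c] is the error of individual i on training case c.
    case_indices is the (possibly down-sampled) set of cases to use.
    """
    pool = list(range(len(population)))
    cases = list(case_indices)
    random.shuffle(cases)  # cases are considered in random order
    for c in cases:
        if len(pool) == 1:
            break
        case_errors = [errors[i][c] for i in pool]
        best = min(case_errors)
        # epsilon = median absolute deviation (MAD) of the pool's errors on this case
        med = statistics.median(case_errors)
        epsilon = statistics.median(abs(e - med) for e in case_errors)
        # keep only individuals within epsilon of the best error on this case
        pool = [i for i in pool if errors[i][c] <= best + epsilon]
    return population[random.choice(pool)]

def down_sampled_epsilon_lexicase(population, errors, n_cases, sample_rate=0.1):
    # Down-sampling: evaluate selection on a random subset of training cases
    k = max(1, int(sample_rate * n_cases))
    sample = random.sample(range(n_cases), k)
    return epsilon_lexicase_select(population, errors, sample)
```

Because each selection event only touches `k` of the `n_cases` training cases, the per-generation evaluation cost drops roughly by the sampling rate, which is the budget the paper reinvests in a larger population or more generations.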
Related papers
- Minimum variance threshold for epsilon-lexicase selection [0.7373617024876725]
Methods often rely on average error over the entire dataset as a criterion to select the parents.
We propose a new criteria that splits errors into two partitions that minimize the total variance within partitions.
Our results show a better performance of our approach compared to traditional epsilon-lexicase selection in the real-world datasets.
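The minimum-variance criterion summarized above splits the error values into two partitions with minimal total within-partition variance, which for a one-dimensional list amounts to an Otsu-style threshold search. A hypothetical brute-force sketch of such a split, not the paper's implementation:

```python
def _sse(vals):
    # Sum of squared deviations from the mean (within-partition variance * n)
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def min_variance_split(errors):
    """Return a threshold splitting `errors` into two partitions that
    minimize the summed within-partition squared deviation.

    Brute force: try every split point of the sorted values.
    """
    xs = sorted(errors)
    best_cost, best_threshold = float("inf"), xs[0]
    for k in range(1, len(xs)):
        left, right = xs[:k], xs[k:]
        cost = _sse(left) + _sse(right)
        if cost < best_cost:
            # place the threshold halfway between the two partitions
            best_cost, best_threshold = cost, (left[-1] + right[0]) / 2
    return best_threshold
```

With errors clustered near 0 and near 10, the threshold lands between the clusters, so the low-error partition plays the role that the epsilon band plays in standard epsilon-lexicase filtering.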
arXiv Detail & Related papers (2024-04-08T23:47:26Z)
- DALex: Lexicase-like Selection via Diverse Aggregation [6.394522608656896]
We show that DALex (for Diversely Aggregated Lexicase) achieves significant speedups over lexicase selection and its relaxed variants.
Results on program synthesis, deep learning, symbolic regression, and learning systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants.
arXiv Detail & Related papers (2024-01-23T01:20:15Z)
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Probabilistic Lexicase Selection [6.177959045971966]
We introduce probabilistic lexicase selection (plexicase selection), a novel parent selection algorithm that efficiently approximates the probability distribution of lexicase selection.
Our method not only demonstrates superior problem-solving capabilities as a semantic-aware selection method, but also benefits from having a probabilistic representation of the selection process.
arXiv Detail & Related papers (2023-05-19T13:57:04Z)
- Analyzing the Interaction Between Down-Sampling and Selection [52.77024349608834]
Genetic programming systems often use large training sets to evaluate the quality of candidate solutions for selection.
Down-sampling training sets has long been used to decrease the computational cost of evaluation in a wide range of application domains.
arXiv Detail & Related papers (2023-04-14T12:21:19Z)
- Regression with Label Differential Privacy [64.21020761920322]
We derive a label DP randomization mechanism that is optimal under a given regression loss function.
We prove that the optimal mechanism takes the form of a "randomized response on bins".
arXiv Detail & Related papers (2022-12-12T17:41:32Z)
- Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random-order online optimization setting of Garber et al. (2020), where the loss functions may be chosen by an adversary but are then presented online in a uniformly random order.
We show that the algorithms of Garber et al. (2020) achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z)
- Problem-solving benefits of down-sampled lexicase selection [0.20305676256390928]
We show that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget.
The reasons that down-sampling helps, however, are not yet fully understood.
arXiv Detail & Related papers (2021-06-10T23:42:09Z)
- Shape-constrained Symbolic Regression -- Improving Extrapolation with Prior Knowledge [0.0]
The aim is to find models which conform to expected behaviour and which have improved capabilities.
The algorithms are tested on a set of 19 synthetic and four real-world regression problems.
Shape-constrained regression produces the best results for the test set but also significantly larger models.
arXiv Detail & Related papers (2021-03-29T14:04:18Z)
- Bloom Origami Assays: Practical Group Testing [90.2899558237778]
Group testing is a well-studied problem with several appealing solutions.
Recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
We develop a new method combining Bloom filters with belief propagation to scale to larger values of n (more than 100) with good empirical results.
arXiv Detail & Related papers (2020-07-21T19:31:41Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.