Problem-solving benefits of down-sampled lexicase selection
- URL: http://arxiv.org/abs/2106.06085v1
- Date: Thu, 10 Jun 2021 23:42:09 GMT
- Title: Problem-solving benefits of down-sampled lexicase selection
- Authors: Thomas Helmuth and Lee Spector
- Abstract summary: The reasons that down-sampling helps are not yet fully understood.
We show that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget.
- Score: 0.20305676256390928
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In genetic programming, an evolutionary method for producing computer
programs that solve specified computational problems, parent selection is
ordinarily based on aggregate measures of performance across an entire training
set. Lexicase selection, by contrast, selects on the basis of performance on
random sequences of training cases; this has been shown to enhance
problem-solving power in many circumstances. Lexicase selection can also be
seen as better reflecting biological evolution, by modeling sequences of
challenges that organisms face over their lifetimes. Recent work has
demonstrated that the advantages of lexicase selection can be amplified by
down-sampling, meaning that only a random subsample of the training cases is
used each generation. This can be seen as modeling the fact that individual
organisms encounter only subsets of the possible environments, and that
environments change over time. Here we provide the most extensive benchmarking
of down-sampled lexicase selection to date, showing that its benefits hold up
to increased scrutiny. The reasons that down-sampling helps, however, are not
yet fully understood. Hypotheses include that down-sampling allows for more
generations to be processed with the same budget of program evaluations; that
the variation of training data across generations acts as a changing
environment, encouraging adaptation; or that it reduces overfitting, leading to
more general solutions. We systematically evaluate these hypotheses, finding
evidence against all three, and instead draw the conclusion that down-sampled
lexicase selection's main benefit stems from the fact that it allows the
evolutionary process to examine more individuals within the same computational
budget, even though each individual is examined less completely.
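To make the two mechanisms described in the abstract concrete, here is a minimal sketch of down-sampled lexicase selection in Python. It is an illustration, not the authors' implementation: the names (`downsample`, `lexicase_select`, `next_generation`, `error_fn`, `vary`) and the 25% sample rate are assumptions, and errors are treated as lower-is-better.

```python
import random

def downsample(training_cases, sample_rate=0.25):
    """Draw the random subset of training cases used for one generation."""
    k = max(1, int(len(training_cases) * sample_rate))
    return random.sample(training_cases, k)

def lexicase_select(population, cases, error_fn):
    """Select one parent: consider the cases in a random order, keeping only
    the candidates with the lowest error on each case in turn."""
    pool = list(population)
    for case in random.sample(cases, len(cases)):  # random case ordering
        best = min(error_fn(ind, case) for ind in pool)
        pool = [ind for ind in pool if error_fn(ind, case) == best]
        if len(pool) == 1:
            return pool[0]
    return random.choice(pool)  # remaining ties are broken at random

def next_generation(population, training_cases, error_fn, vary, sample_rate=0.25):
    """One generation of down-sampled lexicase selection: every selection event
    this generation sees only the subsample, so the evaluations saved can buy a
    larger population or more generations under the same budget."""
    cases = downsample(training_cases, sample_rate)
    parents = [lexicase_select(population, cases, error_fn) for _ in population]
    return [vary(parent) for parent in parents]
```

In practice each individual's error on each sampled case would be computed once per generation and cached, since lexicase selection reuses the same errors across many selection events.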
Related papers
- Lexicase-based Selection Methods with Down-sampling for Symbolic Regression Problems: Overview and Benchmark [0.8602553195689513]
This paper evaluates random as well as informed down-sampling in combination with the relevant lexicase-based selection methods on a wide range of symbolic regression problems.
We find that, for a given evaluation budget, epsilon-lexicase selection in combination with random or informed down-sampling outperforms all other methods (the epsilon-lexicase filter is sketched after this entry).
arXiv Detail & Related papers (2024-07-31T14:26:22Z)
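Because errors in symbolic regression are continuous, exact ties on a case are rare, so epsilon-lexicase selection relaxes the per-case filter: candidates survive a case if their error is within epsilon of the best error on that case. The sketch below uses one common choice of epsilon (the median absolute deviation of the pool's errors on the case); this is an illustrative assumption, not necessarily the configuration benchmarked in the paper above.

```python
import random
import statistics

def epsilon_lexicase_select(population, cases, error_fn):
    """Lexicase selection with a per-case epsilon threshold: keep every
    candidate whose error is within epsilon of the best error, where epsilon
    is the median absolute deviation (MAD) of the pool's errors on that case."""
    pool = list(range(len(population)))            # indices into the population
    for case in random.sample(cases, len(cases)):  # random case ordering
        errors = [error_fn(population[i], case) for i in pool]
        best = min(errors)
        med = statistics.median(errors)
        eps = statistics.median([abs(e - med) for e in errors])
        pool = [i for i, e in zip(pool, errors) if e <= best + eps]
        if len(pool) == 1:
            return population[pool[0]]
    return population[random.choice(pool)]
```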
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods score and select data independently, in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- DALex: Lexicase-like Selection via Diverse Aggregation [6.394522608656896]
We show that DALex (for Diversely Aggregated Lexicase) achieves significant speedups over lexicase selection and its relaxed variants, with supporting results on program synthesis, deep learning, symbolic regression, and learning systems.
arXiv Detail & Related papers (2024-01-23T01:20:15Z)
- Untangling the Effects of Down-Sampling and Selection in Genetic Programming [40.05141985769286]
Genetic programming systems often use large training sets to evaluate the quality of candidate solutions for selection.
Recent studies have shown that both random and informed down-sampling can substantially improve problem-solving success.
arXiv Detail & Related papers (2023-04-14T12:21:19Z)
- Informed Down-Sampled Lexicase Selection: Identifying productive training cases for efficient problem solving [40.683810697551166]
Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection.
Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases, allowing more individuals to be explored with the same number of program executions.
In Informed Down-Sampled Lexicase Selection, we use population statistics to build down-samples that contain more distinct and therefore more informative training cases (a rough sketch of this idea follows this entry).
arXiv Detail & Related papers (2023-01-04T08:47:18Z)
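One way to read "more distinct training cases" is to describe each case by which members of a small population sample solve it, and then assemble the down-sample farthest-first, repeatedly adding the case whose solve profile differs most from anything already chosen. The sketch below illustrates that idea under stated assumptions; the helper names (`solves_fn`, `informed_downsample`), the Hamming distance, and the parameter values are ours, not the paper's exact procedure.

```python
import random

def hamming(a, b):
    """Number of positions at which two solve profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def informed_downsample(training_cases, population, solves_fn,
                        sample_rate=0.25, parent_sample_size=10):
    """Build a down-sample of distinct cases: profile each case by which
    sampled individuals solve it, then pick cases farthest-first by profile."""
    k = max(1, int(len(training_cases) * sample_rate))
    pop_sample = random.sample(population, min(parent_sample_size, len(population)))
    profiles = [tuple(int(solves_fn(ind, case)) for ind in pop_sample)
                for case in training_cases]
    chosen = [random.randrange(len(training_cases))]   # seed with a random case
    while len(chosen) < k:
        remaining = [i for i in range(len(training_cases)) if i not in chosen]
        # add the case whose solve profile is least like anything already chosen
        nxt = max(remaining,
                  key=lambda i: min(hamming(profiles[i], profiles[j]) for j in chosen))
        chosen.append(nxt)
    return [training_cases[i] for i in chosen]
```

The selected cases would then feed the same per-generation lexicase filtering shown in the earlier random down-sampling sketch.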
- Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions [78.31410227443102]
We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial.
We propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction.
arXiv Detail & Related papers (2022-08-11T14:27:49Z)
- The Environmental Discontinuity Hypothesis for Down-Sampled Lexicase Selection [0.0]
Down-sampling has proved effective in genetic programming (GP) runs that utilize the lexicase parent selection technique.
We hypothesize that the random sampling that is performed every generation causes discontinuities that result in the population being unable to adapt to the shifting environment.
We find that forcing incremental environmental change is not significantly better for evolving solutions to program synthesis problems than simple random down-sampling.
arXiv Detail & Related papers (2022-05-31T16:21:14Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning, instantiated on structured spaces, with classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- SelectAugment: Hierarchical Deterministic Sample Selection for Data Augmentation [72.58308581812149]
We propose an effective approach, dubbed SelectAugment, to select samples to be augmented in a deterministic and online manner.
Specifically, in each batch, we first determine the augmentation ratio, and then decide whether to augment each training sample under this ratio.
In this way, the negative effects of randomness in selecting samples to augment can be effectively alleviated, and the effectiveness of data augmentation (DA) is improved.
arXiv Detail & Related papers (2021-12-06T08:38:38Z)
- Robust Sampling in Deep Learning [62.997667081978825]
Deep learning requires regularization mechanisms to reduce overfitting and improve generalization.
We address this problem with a new regularization method based on distributionally robust optimization.
During training, samples are selected according to their accuracy, such that the worst-performing samples contribute the most to the optimization.
arXiv Detail & Related papers (2020-06-04T09:46:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.