Problem-solving benefits of down-sampled lexicase selection
- URL: http://arxiv.org/abs/2106.06085v1
- Date: Thu, 10 Jun 2021 23:42:09 GMT
- Title: Problem-solving benefits of down-sampled lexicase selection
- Authors: Thomas Helmuth and Lee Spector
- Abstract summary: The reasons that down-sampling helps are not yet fully understood.
We show that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget.
- Score: 0.20305676256390928
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In genetic programming, an evolutionary method for producing computer
programs that solve specified computational problems, parent selection is
ordinarily based on aggregate measures of performance across an entire training
set. Lexicase selection, by contrast, selects on the basis of performance on
random sequences of training cases; this has been shown to enhance
problem-solving power in many circumstances. Lexicase selection can also be
seen as better reflecting biological evolution, by modeling sequences of
challenges that organisms face over their lifetimes. Recent work has
demonstrated that the advantages of lexicase selection can be amplified by
down-sampling, meaning that only a random subsample of the training cases is
used each generation. This can be seen as modeling the fact that individual
organisms encounter only subsets of the possible environments, and that
environments change over time. Here we provide the most extensive benchmarking
of down-sampled lexicase selection to date, showing that its benefits hold up
to increased scrutiny. The reasons that down-sampling helps, however, are not
yet fully understood. Hypotheses include that down-sampling allows for more
generations to be processed with the same budget of program evaluations; that
the variation of training data across generations acts as a changing
environment, encouraging adaptation; or that it reduces overfitting, leading to
more general solutions. We systematically evaluate these hypotheses, finding
evidence against all three, and instead draw the conclusion that down-sampled
lexicase selection's main benefit stems from the fact that it allows the
evolutionary process to examine more individuals within the same computational
budget, even though each individual is examined less completely.
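To make the two mechanisms described in the abstract concrete, here is a minimal sketch of down-sampled lexicase selection in Python. It is an illustration, not the authors' implementation: the names (`downsample`, `lexicase_select`, `next_generation`, `error_fn`, `vary`) and the 25% sample rate are assumptions, and errors are treated as lower-is-better.

```python
import random

def downsample(training_cases, sample_rate=0.25):
    """Draw the random subset of training cases used for one generation."""
    k = max(1, int(len(training_cases) * sample_rate))
    return random.sample(training_cases, k)

def lexicase_select(population, cases, error_fn):
    """Select one parent: consider the cases in a random order, keeping only
    the candidates with the lowest error on each case in turn."""
    pool = list(population)
    for case in random.sample(cases, len(cases)):  # random case ordering
        best = min(error_fn(ind, case) for ind in pool)
        pool = [ind for ind in pool if error_fn(ind, case) == best]
        if len(pool) == 1:
            return pool[0]
    return random.choice(pool)  # remaining ties are broken at random

def next_generation(population, training_cases, error_fn, vary, sample_rate=0.25):
    """One generation of down-sampled lexicase selection: every selection event
    this generation sees only the subsample, so the evaluations saved can buy a
    larger population or more generations under the same budget."""
    cases = downsample(training_cases, sample_rate)
    parents = [lexicase_select(population, cases, error_fn) for _ in population]
    return [vary(parent) for parent in parents]
```

In practice each individual's error on each sampled case would be computed once per generation and cached, since lexicase selection reuses the same errors across many selection events.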
Related papers
- Lexicase-based Selection Methods with Down-sampling for Symbolic Regression Problems: Overview and Benchmark [0.8602553195689513]
This paper evaluates random as well as informed down-sampling in combination with the relevant lexicase-based selection methods on a wide range of symbolic regression problems.
We find that, for a given evaluation budget, epsilon-lexicase selection in combination with random or informed down-sampling outperforms all other methods (the epsilon-lexicase filter is sketched after this entry).
arXiv Detail & Related papers (2024-07-31T14:26:22Z)
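Because errors in symbolic regression are continuous, exact ties on a case are rare, so epsilon-lexicase selection relaxes the per-case filter: candidates survive a case if their error is within epsilon of the best error on that case. The sketch below uses one common choice of epsilon (the median absolute deviation of the pool's errors on the case); this is an illustrative assumption, not necessarily the configuration benchmarked in the paper above.

```python
import random
import statistics

def epsilon_lexicase_select(population, cases, error_fn):
    """Lexicase selection with a per-case epsilon threshold: keep every
    candidate whose error is within epsilon of the best error, where epsilon
    is the median absolute deviation (MAD) of the pool's errors on that case."""
    pool = list(range(len(population)))            # indices into the population
    for case in random.sample(cases, len(cases)):  # random case ordering
        errors = [error_fn(population[i], case) for i in pool]
        best = min(errors)
        med = statistics.median(errors)
        eps = statistics.median([abs(e - med) for e in errors])
        pool = [i for i, e in zip(pool, errors) if e <= best + eps]
        if len(pool) == 1:
            return population[pool[0]]
    return population[random.choice(pool)]
```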
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods score and select data independently, in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- DALex: Lexicase-like Selection via Diverse Aggregation [6.394522608656896]
We show that DALex (for Diversely Aggregated Lexicase) achieves significant speedups over lexicase selection and its relaxed variants, with supporting results on program synthesis, deep learning, symbolic regression, and learning systems.
arXiv Detail & Related papers (2024-01-23T01:20:15Z)
- Untangling the Effects of Down-Sampling and Selection in Genetic Programming [40.05141985769286]
Genetic programming systems often use large training sets to evaluate the quality of candidate solutions for selection.
Recent studies have shown that both random and informed down-sampling can substantially improve problem-solving success.
arXiv Detail & Related papers (2023-04-14T12:21:19Z)
- Informed Down-Sampled Lexicase Selection: Identifying productive training cases for efficient problem solving [40.683810697551166]
Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection.
Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases, allowing more individuals to be explored with the same number of program executions.
In Informed Down-Sampled Lexicase Selection, we use population statistics to build down-samples that contain more distinct and therefore more informative training cases (a rough sketch of this idea follows this entry).
arXiv Detail & Related papers (2023-01-04T08:47:18Z)
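One way to read "more distinct training cases" is to describe each case by which members of a small population sample solve it, and then assemble the down-sample farthest-first, repeatedly adding the case whose solve profile differs most from anything already chosen. The sketch below illustrates that idea under stated assumptions; the helper names (`solves_fn`, `informed_downsample`), the Hamming distance, and the parameter values are ours, not the paper's exact procedure.

```python
import random

def hamming(a, b):
    """Number of positions at which two solve profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def informed_downsample(training_cases, population, solves_fn,
                        sample_rate=0.25, parent_sample_size=10):
    """Build a down-sample of distinct cases: profile each case by which
    sampled individuals solve it, then pick cases farthest-first by profile."""
    k = max(1, int(len(training_cases) * sample_rate))
    pop_sample = random.sample(population, min(parent_sample_size, len(population)))
    profiles = [tuple(int(solves_fn(ind, case)) for ind in pop_sample)
                for case in training_cases]
    chosen = [random.randrange(len(training_cases))]   # seed with a random case
    while len(chosen) < k:
        remaining = [i for i in range(len(training_cases)) if i not in chosen]
        # add the case whose solve profile is least like anything already chosen
        nxt = max(remaining,
                  key=lambda i: min(hamming(profiles[i], profiles[j]) for j in chosen))
        chosen.append(nxt)
    return [training_cases[i] for i in chosen]
```

The selected cases would then feed the same per-generation lexicase filtering shown in the earlier random down-sampling sketch.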
- Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions [78.31410227443102]
We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial.
We propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction.
arXiv Detail & Related papers (2022-08-11T14:27:49Z)
- The Environmental Discontinuity Hypothesis for Down-Sampled Lexicase Selection [0.0]
Down-sampling has proved effective in genetic programming (GP) runs that utilize the lexicase parent selection technique.
We hypothesize that the random sampling that is performed every generation causes discontinuities that result in the population being unable to adapt to the shifting environment.
We find that forcing incremental environmental change is not significantly better for evolving solutions to program synthesis problems than simple random down-sampling.
arXiv Detail & Related papers (2022-05-31T16:21:14Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning, instantiated on structured spaces, with classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- SelectAugment: Hierarchical Deterministic Sample Selection for Data Augmentation [72.58308581812149]
We propose an effective approach, dubbed SelectAugment, to select samples to be augmented in a deterministic and online manner.
Specifically, in each batch, we first determine the augmentation ratio, and then decide whether to augment each training sample under this ratio.
In this way, the negative effects of randomness in selecting samples to augment can be effectively alleviated, and the effectiveness of data augmentation (DA) is improved.
arXiv Detail & Related papers (2021-12-06T08:38:38Z)
- Robust Sampling in Deep Learning [62.997667081978825]
Deep learning requires regularization mechanisms to reduce overfitting and improve generalization.
We address this problem with a new regularization method based on distributionally robust optimization.
During training, samples are selected according to their accuracy, such that the worst-performing samples contribute the most to the optimization.
arXiv Detail & Related papers (2020-06-04T09:46:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.