Analyzing the Interaction Between Down-Sampling and Selection
- URL: http://arxiv.org/abs/2304.07089v1
- Date: Fri, 14 Apr 2023 12:21:19 GMT
- Title: Analyzing the Interaction Between Down-Sampling and Selection
- Authors: Ryan Boldi, Ashley Bao, Martin Briesch, Thomas Helmuth, Dominik
Sobania, Lee Spector, Alexander Lalejini
- Abstract summary: Genetic programming systems often use large training sets to evaluate the quality of candidate solutions for selection.
Down-sampling training sets has long been used to decrease the computational cost of evaluation in a wide range of application domains.
- Score: 52.77024349608834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Genetic programming systems often use large training sets to evaluate the
quality of candidate solutions for selection. However, evaluating populations
on large training sets can be computationally expensive. Down-sampling training
sets has long been used to decrease the computational cost of evaluation in a
wide range of application domains. Indeed, recent studies have shown that both
random and informed down-sampling can substantially improve problem-solving
success for GP systems that use the lexicase parent selection algorithm. We use
the PushGP framework to experimentally test whether these down-sampling
techniques can also improve problem-solving success in the context of two other
commonly used selection methods, fitness-proportionate and tournament
selection, across eight GP problems (four program synthesis and four symbolic
regression). We verified that down-sampling can benefit the problem-solving
success of both fitness-proportionate and tournament selection. However, the
number of problems wherein down-sampling improved problem-solving success
varied by selection scheme, suggesting that the impact of down-sampling depends
both on the problem and choice of selection scheme. Surprisingly, we found that
down-sampling was most consistently beneficial when combined with lexicase
selection as compared to tournament and fitness-proportionate selection.
Overall, our results suggest that down-sampling should be considered more often
when solving test-based GP problems.
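As a rough illustration of the setup the abstract describes, the sketch below applies random down-sampling before tournament selection. It is not the authors' PushGP implementation: the function names (`down_sample`, `total_error`, `tournament_select`, `select_parents`), the sum-of-absolute-errors fitness, and the choice to draw one fresh sample per generation are assumptions made for this example.

```python
import random


def down_sample(cases, rate, rng):
    """Pick a random subset of the training cases (at least one case is kept)."""
    k = max(1, round(len(cases) * rate))
    return rng.sample(cases, k)


def total_error(program, cases):
    """Assumed fitness: sum of absolute errors over (input, target) pairs."""
    return sum(abs(program(x) - y) for x, y in cases)


def tournament_select(population, errors, size, rng):
    """Return the lowest-error individual among `size` random entrants."""
    entrants = rng.sample(range(len(population)), size)
    return population[min(entrants, key=lambda i: errors[i])]


def select_parents(population, cases, rate, size, n_parents, rng):
    """Evaluate everyone on one down-sample, then run repeated tournaments."""
    sample = down_sample(cases, rate, rng)  # assumption: one sample per generation
    errors = [total_error(p, sample) for p in population]
    return [tournament_select(population, errors, size, rng)
            for _ in range(n_parents)]


# Hypothetical toy problem: candidate "programs" are plain Python callables
# and the target function is x * x.
rng = random.Random(0)
cases = [(x, x * x) for x in range(10)]
population = [lambda x: x * x, lambda x: x + 1, lambda x: 2 * x + 1]
parents = select_parents(population, cases, rate=0.2, size=3, n_parents=5, rng=rng)
```

With `rate=0.2`, each individual is executed on 2 of the 10 cases instead of all 10, so one round of evaluation costs a fifth of the program executions. Selection then works from a noisier error estimate.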
Related papers
- Learning Fair Policies for Multi-stage Selection Problems from
Observational Data [4.282745020665833]
We consider the problem of learning fair policies for multi-stage selection problems from observational data.
This problem arises in several high-stakes domains such as company hiring, loan approval, or bail decisions where outcomes are only observed for those selected.
We propose a multi-stage framework that can be augmented with various fairness constraints, such as demographic parity or equal opportunity.
arXiv Detail & Related papers (2023-12-20T16:33:15Z)
- Selecting Learnable Training Samples is All DETRs Need in Crowded
Pedestrian Detection [72.97320260601347]
In crowded pedestrian detection, the performance of DETRs is still unsatisfactory due to the inappropriate sample selection method.
We propose Sample Selection for Crowded Pedestrians (SSCP), which includes the constraint-guided label assignment scheme (CGLA).
Experimental results show that the proposed SSCP effectively improves the baselines without introducing any overhead in inference.
arXiv Detail & Related papers (2023-05-18T08:28:01Z)
- A Static Analysis of Informed Down-Samples [62.997667081978825]
We study recorded populations from the first generation of genetic programming runs, as well as entirely synthetic populations.
We show that both forms of down-sampling cause greater test coverage loss than standard lexicase selection with no down-sampling.
arXiv Detail & Related papers (2023-04-04T17:34:48Z)
- In Search of Insights, Not Magic Bullets: Towards Demystification of the
Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
- Informed Down-Sampled Lexicase Selection: Identifying productive
training cases for efficient problem solving [40.683810697551166]
Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection.
Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases allowing for more individuals to be explored with the same amount of program executions.
In Informed Down-Sampled Lexicase Selection, we use population statistics to build down-samples that contain more distinct and therefore informative training cases.
arXiv Detail & Related papers (2023-01-04T08:47:18Z)
- The Environmental Discontinuity Hypothesis for Down-Sampled Lexicase
Selection [0.0]
Down-sampling has proved effective in genetic programming (GP) runs that utilize the lexicase parent selection technique.
We hypothesize that the random sampling that is performed every generation causes discontinuities that result in the population being unable to adapt to the shifting environment.
We find that forcing incremental environmental change is not significantly better for evolving solutions to program synthesis problems than simple random down-sampling.
arXiv Detail & Related papers (2022-05-31T16:21:14Z)
- Problem-solving benefits of down-sampled lexicase selection [0.20305676256390928]
We show that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget.
The reasons that down-sampling helps, however, are not yet fully understood.
arXiv Detail & Related papers (2021-06-10T23:42:09Z)
- Multi-characteristic Subject Selection from Biased Datasets [79.82881947891589]
We present a constrained optimization-based method that finds the best possible sampling fractions for the different population subgroups.
Our results show that our proposed method outperforms the baselines for all problem variations by up to 90%.
arXiv Detail & Related papers (2020-12-18T15:55:27Z)
- Bloom Origami Assays: Practical Group Testing [90.2899558237778]
Group testing is a well-studied problem with several appealing solutions.
Recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
We develop a new method combining Bloom filters with belief propagation to scale to larger values of n (more than 100) with good empirical results.
arXiv Detail & Related papers (2020-07-21T19:31:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
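The random down-sampled lexicase selection summarized in the related papers above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation from any of the listed papers: the names `evaluate`, `lexicase_select`, `down_sampled_lexicase`, and `individuals_per_budget` are hypothetical, errors are assumed to be absolute differences on numeric targets, and one down-sample is assumed to be shared by all selection events in a generation.

```python
import random


def evaluate(program, cases, case_ids):
    """Run `program` only on the sampled cases; map case id -> absolute error."""
    return {c: abs(program(cases[c][0]) - cases[c][1]) for c in case_ids}


def lexicase_select(population, errors, case_ids, rng):
    """Filter candidates case by case, in random order, keeping the best on each."""
    candidates = list(range(len(population)))
    order = list(case_ids)
    rng.shuffle(order)
    for c in order:
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] == best]
        if len(candidates) == 1:
            break
    # Any remaining ties are broken at random.
    return population[rng.choice(candidates)]


def down_sampled_lexicase(population, cases, rate, n_parents, rng):
    """Draw one random down-sample, evaluate on it, then select parents."""
    k = max(1, round(len(cases) * rate))
    case_ids = rng.sample(range(len(cases)), k)
    errors = [evaluate(p, cases, case_ids) for p in population]
    return [lexicase_select(population, errors, case_ids, rng)
            for _ in range(n_parents)]


def individuals_per_budget(budget, n_cases, rate):
    """How many individuals a fixed budget of program executions can evaluate."""
    return budget // max(1, round(n_cases * rate))
```

Under these assumptions, a budget of 1,000,000 program executions on 100 training cases covers 10,000 individuals without down-sampling and 100,000 individuals at a down-sample rate of 0.1. This is the "more individuals within the same computational budget" effect the summaries refer to.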