Runtime phylogenetic analysis enables extreme subsampling for test-based
problems
- URL: http://arxiv.org/abs/2402.01610v1
- Date: Fri, 2 Feb 2024 18:14:33 GMT
- Title: Runtime phylogenetic analysis enables extreme subsampling for test-based
problems
- Authors: Alexander Lalejini, Marcos Sanson, Jack Garbus, Matthew Andres Moreno,
Emily Dolson
- Abstract summary: We introduce phylogeny-informed subsampling, a new class of subsampling methods that exploit runtime phylogenetic analyses for solving test-based problems.
We find that phylogeny-informed subsampling methods enable problem-solving success at extreme subsampling levels where other subsampling methods fail.
Our diagnostic experiments show that phylogeny-informed subsampling improves diversity maintenance relative to random subsampling, but its effects on a selection scheme's capacity to rapidly exploit fitness gradients varied by selection scheme.
- Score: 42.642008092347986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A phylogeny describes the evolutionary history of an evolving population.
Evolutionary search algorithms can perfectly track the ancestry of candidate
solutions, illuminating a population's trajectory through the search space.
However, phylogenetic analyses are typically limited to post-hoc studies of
search performance. We introduce phylogeny-informed subsampling, a new class of
subsampling methods that exploit runtime phylogenetic analyses for solving
test-based problems. Specifically, we assess two phylogeny-informed subsampling
methods -- individualized random subsampling and ancestor-based subsampling --
on three diagnostic problems and ten genetic programming (GP) problems from
program synthesis benchmark suites. Overall, we found that phylogeny-informed
subsampling methods enable problem-solving success at extreme subsampling
levels where other subsampling methods fail. For example, phylogeny-informed
subsampling methods more reliably solved program synthesis problems when
evaluating just one training case per-individual, per-generation. However, at
moderate subsampling levels, phylogeny-informed subsampling generally performed
no better than random subsampling on GP problems. Our diagnostic experiments
show that phylogeny-informed subsampling improves diversity maintenance
relative to random subsampling, but its effects on a selection scheme's
capacity to rapidly exploit fitness gradients varied by selection scheme.
Continued refinements of phylogeny-informed subsampling techniques offer a
promising new direction for scaling up evolutionary systems to handle problems
with many expensive-to-evaluate fitness criteria.
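To make the idea in the abstract concrete, here is a minimal Python sketch of individualized random subsampling. The abstract does not spell out the procedure, so the details are assumptions: each individual draws its own random subset of training cases, and scores on unevaluated cases are looked up from the nearest ancestor that was evaluated on them, an idea borrowed from the phylogeny-informed fitness estimation entry under Related papers. The names `Individual`, `evaluate_subsample`, `estimate_score`, and the `max_depth` limit are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Individual:
    genome: List[int]
    parent: Optional["Individual"] = None  # link used to track ancestry
    # Scores only for the training cases this individual was evaluated on.
    scores: Dict[int, float] = field(default_factory=dict)


def evaluate_subsample(ind: Individual, cases: List[int], sample_size: int,
                       score_fn: Callable[[List[int], int], float],
                       rng: random.Random) -> None:
    """Assumed rule: every individual draws its own random set of cases."""
    for case_id in rng.sample(range(len(cases)), sample_size):
        ind.scores[case_id] = score_fn(ind.genome, cases[case_id])


def estimate_score(ind: Individual, case_id: int,
                   max_depth: int = 10) -> Optional[float]:
    """Assumed rule: reuse the score of the nearest evaluated ancestor."""
    node, depth = ind, 0
    while node is not None and depth <= max_depth:
        if case_id in node.scores:
            return node.scores[case_id]
        node, depth = node.parent, depth + 1
    return None  # no relative within max_depth was evaluated on this case


# Toy problem: case i checks whether bit i of the genome is set.
def bit_score(genome: List[int], case: int) -> float:
    return float(genome[case])


cases = [0, 1, 2, 3]
parent = Individual(genome=[1, 0, 1, 1])
evaluate_subsample(parent, cases, len(cases), bit_score, random.Random(1))

# The child is evaluated on a single training case per generation.
child = Individual(genome=[1, 1, 1, 1], parent=parent)
evaluate_subsample(child, cases, 1, bit_score, random.Random(0))
estimated = [estimate_score(child, c) for c in cases]
```

In this toy run the child pays for one real evaluation, and the remaining entries of `estimated` fall back on the parent's scores. Those borrowed values can be stale when a mutation changed the relevant behavior, which is the trade-off any ancestor-based estimate accepts.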
Related papers
- Subgroup analysis methods for time-to-event outcomes in heterogeneous
randomized controlled trials [7.940293148084845]
Non-significant randomized controlled trials can hide subgroups of good responders to experimental drugs.
We provide an open-source Python package, available on GitHub, containing our generation process and our comprehensive benchmark framework.
arXiv Detail & Related papers (2024-01-22T11:00:49Z)
- Phylogeny-informed fitness estimation [58.720142291102135]
We propose phylogeny-informed fitness estimation, which exploits a population's phylogeny to estimate fitness evaluations.
Our results indicate that phylogeny-informed fitness estimation can mitigate the drawbacks of down-sampled lexicase.
This work serves as an initial step toward improving evolutionary algorithms by exploiting runtime phylogenetic analysis.
arXiv Detail & Related papers (2023-06-06T19:05:01Z)
- Challenging mitosis detection algorithms: Global labels allow centroid
localization [1.7382198387953947]
Mitotic activity is a crucial biomarker for the diagnosis and prognosis of different types of cancers.
In this work, we propose to avoid complex scenarios, and we perform the localization task in a weakly supervised manner, using only image-level labels on patches.
The results obtained on the publicly available TUPAC16 dataset are competitive with state-of-the-art methods, using only one training phase.
arXiv Detail & Related papers (2022-11-30T09:52:26Z)
- Adaptive Identification of Populations with Treatment Benefit in
Clinical Trials: Machine Learning Challenges and Solutions [78.31410227443102]
We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial.
We propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction.
arXiv Detail & Related papers (2022-08-11T14:27:49Z)
- Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient
Methods [73.35353358543507]
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs).
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA).
arXiv Detail & Related papers (2022-02-15T09:17:39Z)
- Problem-solving benefits of down-sampled lexicase selection [0.20305676256390928]
We show that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget.
The reasons that down-sampling helps, however, are not yet fully understood.
arXiv Detail & Related papers (2021-06-10T23:42:09Z)
- AdaLead: A simple and robust adaptive greedy search algorithm for
sequence design [55.41644538483948]
We develop an easy-to-implement, scalable, and robust evolutionary greedy algorithm (AdaLead).
AdaLead is a remarkably strong benchmark that out-competes more complex state-of-the-art approaches in a variety of biologically motivated sequence design challenges.
arXiv Detail & Related papers (2020-10-05T16:40:38Z)
- Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to cast this problem as a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)