AdaLead: A simple and robust adaptive greedy search algorithm for
sequence design
- URL: http://arxiv.org/abs/2010.02141v1
- Date: Mon, 5 Oct 2020 16:40:38 GMT
- Title: AdaLead: A simple and robust adaptive greedy search algorithm for
sequence design
- Authors: Sam Sinai, Richard Wang, Alexander Whatley, Stewart Slocum, Elina
Locane, Eric D. Kelsic
- Abstract summary: We develop an easy-to-implement, scalable, and robust evolutionary greedy algorithm (AdaLead).
AdaLead is a remarkably strong benchmark that out-competes more complex state-of-the-art approaches in a variety of biologically motivated sequence design challenges.
- Score: 55.41644538483948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient design of biological sequences will have a great impact across many
industrial and healthcare domains. However, discovering improved sequences
requires solving a difficult optimization problem. Traditionally, this
challenge was approached by biologists through a model-free method known as
"directed evolution", the iterative process of random mutation and selection.
As the ability to build models that capture the sequence-to-function map
improves, such models can be used as oracles to screen sequences before running
experiments. In recent years, interest in better algorithms that effectively
use such oracles to outperform model-free approaches has intensified. These
span from approaches based on Bayesian Optimization to regularized generative
models and adaptations of reinforcement learning. In this work, we implement an
open-source Fitness Landscape EXploration Sandbox (FLEXS:
github.com/samsinai/FLEXS) environment to test and evaluate these algorithms
based on their optimality, consistency, and robustness. Using FLEXS, we develop
an easy-to-implement, scalable, and robust evolutionary greedy algorithm
(AdaLead). Despite its simplicity, we show that AdaLead is a remarkably strong
benchmark that out-competes more complex state-of-the-art approaches in a
variety of biologically motivated sequence design challenges.
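For concreteness, the adaptive greedy loop described in the abstract can be pictured with a minimal Python sketch. This is not the FLEXS implementation (see github.com/samsinai/FLEXS): the toy oracle, nucleotide alphabet, mutation rate, and selection threshold below are illustrative assumptions, and the published algorithm includes further components (e.g., recombination and greedy rollouts of mutants) that are omitted here.
```python
# Minimal, hedged sketch of an AdaLead-style adaptive greedy search round.
# All constants and the toy oracle are illustrative assumptions, not the
# authors' implementation.
import random

ALPHABET = "ACGT"       # assumed nucleotide alphabet for illustration
MUTATION_RATE = 1 / 20  # assumed per-position mutation probability
THRESHOLD = 0.05        # assumed tolerance below the best observed score


def oracle(seq: str) -> float:
    """Stand-in fitness model; a real run would query a trained
    sequence-to-function model instead of this toy score."""
    return seq.count("A") / len(seq)


def mutate(seq: str) -> str:
    """Apply independent point mutations at the assumed rate."""
    return "".join(
        random.choice(ALPHABET) if random.random() < MUTATION_RATE else c
        for c in seq
    )


def greedy_round(population: list[str], batch_size: int = 10) -> list[str]:
    """One round: keep sequences within THRESHOLD of the best oracle score
    (assumes non-negative scores), expand them by mutation, and return the
    top-scoring batch of candidates."""
    scores = {s: oracle(s) for s in population}
    best = max(scores.values())
    parents = [s for s, v in scores.items() if v >= best * (1 - THRESHOLD)]
    children = {mutate(p) for p in parents for _ in range(batch_size)}
    candidates = set(population) | children
    return sorted(candidates, key=oracle, reverse=True)[:batch_size]


if __name__ == "__main__":
    random.seed(0)
    pop = ["".join(random.choice(ALPHABET) for _ in range(20)) for _ in range(10)]
    for round_idx in range(5):
        pop = greedy_round(pop)
        print(round_idx, max(oracle(s) for s in pop))
```
The key design choice illustrated here is the adaptive selection rule: rather than keeping a fixed number of parents, each round keeps every sequence whose model score is within a tolerance of the current best, which is what makes the greedy search robust across landscapes of different ruggedness.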
Related papers
- Fast Genetic Algorithm for feature selection -- A qualitative approximation approach [5.279268784803583]
We propose a two-stage surrogate-assisted evolutionary approach to address the computational issues arising from using a Genetic Algorithm (GA) for feature selection.
We show that CHCQX converges faster to feature subset solutions of significantly higher accuracy, particularly for large datasets with over 100K instances.
arXiv Detail & Related papers (2024-04-05T10:15:24Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Designing Biological Sequences via Meta-Reinforcement Learning and
Bayesian Optimization [68.28697120944116]
We train an autoregressive generative model via Meta-Reinforcement Learning to propose promising sequences for selection.
We pose this problem as that of finding an optimal policy over a distribution of MDPs induced by sampling subsets of the data.
Our in-silico experiments show that meta-learning over such ensembles provides robustness against reward misspecification and achieves competitive results.
arXiv Detail & Related papers (2022-09-13T18:37:27Z) - Improving RNA Secondary Structure Design using Deep Reinforcement
Learning [69.63971634605797]
We propose a new benchmark of applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy in the sequence's secondary structure.
We present results of an ablation analysis of these algorithms, as well as graphs indicating each algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient
adaptive algorithms for neural networks [0.0]
We present a new class of Langevin-based algorithms, which overcomes many of the known shortcomings of popular adaptive algorithms.
In particular, we provide a nonasymptotic analysis and full theoretical guarantees for the convergence properties of an algorithm of this novel class, which we named TH$\varepsilon$O POULA (or, simply, TheoPouLa).
arXiv Detail & Related papers (2021-05-28T15:58:48Z) - Evolutionary Variational Optimization of Generative Models [0.0]
We combine two popular optimization approaches to derive learning algorithms for generative models: variational optimization and evolutionary algorithms.
We show that evolutionary algorithms can effectively and efficiently optimize the variational bound.
In the category of "zero-shot" learning, we observed the evolutionary variational algorithm to significantly improve the state-of-the-art in many benchmark settings.
arXiv Detail & Related papers (2020-12-22T19:06:33Z) - Combination of digital signal processing and assembled predictive models
facilitates the rational design of proteins [0.0]
Predicting the effect of mutations in proteins is one of the most critical challenges in protein engineering.
We use clustering, embedding, and dimensionality reduction techniques to select combinations of physicochemical properties for the encoding stage.
We then select the best performing predictive models in each set of properties and create an assembled model.
arXiv Detail & Related papers (2020-10-07T16:35:02Z) - Devolutionary genetic algorithms with application to the minimum
labeling Steiner tree problem [0.0]
This paper characterizes and discusses devolutionary genetic algorithms and evaluates their performances in solving the minimum labeling Steiner tree (MLST) problem.
We define devolutionary algorithms as the process of reaching a feasible solution by devolving a population of super-optimal unfeasible solutions over time.
We show how classical evolutionary concepts, such as crossover, mutation, and fitness, can be adapted to aim at reaching an optimal or close-to-optimal solution.
arXiv Detail & Related papers (2020-04-18T13:27:28Z) - Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)