Optimal Decision Making in High-Throughput Virtual Screening Pipelines
- URL: http://arxiv.org/abs/2109.11683v1
- Date: Thu, 23 Sep 2021 22:58:14 GMT
- Title: Optimal Decision Making in High-Throughput Virtual Screening Pipelines
- Authors: Hyun-Myung Woo, Xiaoning Qian, Li Tan, Shantenu Jha, Francis J. Alexander, Edward R. Dougherty, Byung-Jun Yoon
- Abstract summary: We propose two optimization frameworks, applicable to most (if not all) screening campaigns involving experimental and/or computational evaluations.
In particular, we consider the optimal computational campaign for long non-coding RNA (lncRNA) classification as a practical example.
The simulation results demonstrate that the proposed frameworks significantly reduce the effective selection cost per potential candidate.
- Score: 12.366455276434513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Effective selection of the potential candidates that meet certain conditions
in a tremendously large search space has been one of the major concerns in many
real-world applications. In addition to the nearly infinitely large search
space, rigorous evaluation of a sample on a reliable experimental or
computational platform is often prohibitively expensive, making the screening
problem more challenging. In such a case, constructing a high-throughput
screening (HTS) pipeline that pre-sifts the samples expected to be potential
candidates through efficient earlier stages results in significant
savings in resources. However, to the best of our knowledge, despite
many successful applications, no one has studied optimal pipeline design or
optimal pipeline operations. In this study, we propose two optimization
frameworks, applicable to most (if not all) screening campaigns involving
experimental and/or computational evaluations, for optimally determining the
screening thresholds of an HTS pipeline. We validate the proposed frameworks on
both analytic and practical scenarios. In particular, we consider the optimal
computational campaign for long non-coding RNA (lncRNA) classification as a
practical example. To accomplish this, we built a high-throughput virtual
screening (HTVS) pipeline for classifying lncRNAs. The simulation results
demonstrate that the proposed frameworks significantly reduce the effective
selection cost per potential candidate and make the HTS pipelines less
sensitive to their structural variations. In addition to the validation, we
provide insights on constructing a better HTS pipeline based on the simulation
results.
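The threshold trade-off described in the abstract can be illustrated with a toy two-stage pipeline. The sketch below is not the authors' optimization framework: it assumes a hypothetical cheap surrogate stage (cost 1) that scores a noisy version of the true property, an exact but expensive final stage (cost 100), and a plain grid search over the first-stage threshold. All function names, costs, and distributions are illustrative assumptions.

```python
import random


def simulate_samples(n, noise_sd=1.0, seed=0):
    """Draw (cheap_score, exact_score) pairs for n hypothetical samples.

    Assumption: the exact score is standard normal and the cheap stage
    observes it with additive Gaussian noise.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        exact = rng.gauss(0.0, 1.0)
        cheap = exact + rng.gauss(0.0, noise_sd)
        samples.append((cheap, exact))
    return samples


def cost_per_candidate(samples, t1, final_threshold=1.5,
                       cost_cheap=1.0, cost_expensive=100.0):
    """Return (total cost / number selected, number selected).

    Every sample pays the cheap-stage cost; only samples whose cheap
    score reaches t1 pay the expensive-stage cost. A sample is selected
    if its exact score reaches final_threshold.
    """
    passed = [exact for cheap, exact in samples if cheap >= t1]
    selected = sum(1 for exact in passed if exact >= final_threshold)
    total_cost = cost_cheap * len(samples) + cost_expensive * len(passed)
    if selected == 0:
        return float("inf"), 0
    return total_cost / selected, selected


def best_threshold(samples, grid):
    """Grid search for the first-stage threshold with the lowest cost."""
    return min(grid, key=lambda t1: cost_per_candidate(samples, t1)[0])


if __name__ == "__main__":
    samples = simulate_samples(20000)
    grid = [x / 10 for x in range(-30, 31)]
    t_star = best_threshold(samples, grid)
    cost, selected = cost_per_candidate(samples, t_star)
    print(f"threshold={t_star:.1f} cost_per_candidate={cost:.1f} "
          f"selected={selected}")
```

In this toy setting, a stricter first-stage threshold reduces the workload of the expensive stage but also discards some true candidates, so the cost per selected candidate is typically lowest at an intermediate threshold.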
Related papers
- Efficient Weighting Schemes for Auditing Instant-Runoff Voting Elections [57.67176250198289]
AWAIRE involves adaptively weighted averages of test statistics, essentially "learning" an effective set of hypotheses to test.
We explore schemes and settings more extensively, to identify and recommend efficient choices for practice.
A limitation of the current AWAIRE implementation is its restriction to a small number of candidates.
arXiv Detail & Related papers (2024-02-18T10:13:01Z)
- Poisson Process for Bayesian Optimization [126.51200593377739]
We propose a ranking-based surrogate model based on the Poisson process and introduce an efficient BO framework, namely Poisson Process Bayesian Optimization (PoPBO).
Compared to the classic GP-BO method, our PoPBO has lower costs and better robustness to noise, which is verified by abundant experiments.
arXiv Detail & Related papers (2024-02-05T02:54:50Z)
- Can LLMs Configure Software Tools [0.76146285961466]
In software engineering, the meticulous configuration of software tools is crucial in ensuring optimal performance within intricate systems.
In this study, we embark on an exploration of leveraging Large-Language Models (LLMs) to streamline the software configuration process.
Our work presents a novel approach that employs LLMs, such as Chat-GPT, to identify starting conditions and narrow down the search space, improving configuration efficiency.
arXiv Detail & Related papers (2023-12-11T05:03:02Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Task-specific experimental design for treatment effect estimation [59.879567967089145]
Large randomised trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z)
- Selection by Prediction with Conformal p-values [7.917044695538599]
We study screening procedures that aim to select candidates whose unobserved outcomes exceed user-specified values.
We develop a method that wraps around any prediction model to produce a subset of candidates while controlling the proportion of falsely selected units.
arXiv Detail & Related papers (2022-10-04T06:34:49Z)
- Generating Exact Optimal Designs via Particle Swarm Optimization: Assessing Efficacy and Efficiency via Case Study [0.0]
We present the results of a large computer study in which we benchmark both the efficiency and efficacy of PSO in generating high-quality candidate designs.
PSO is demonstrated, even in a single run, to generate highly efficient designs with large probability at small computing cost.
arXiv Detail & Related papers (2022-06-14T16:00:22Z)
- Reinforcement Learning based Sequential Batch-sampling for Bayesian Optimal Experimental Design [1.6249267147413522]
Sequential design of experiments (SDOE) is a popular suite of methods that has yielded promising results in recent years.
In this work, we aim to extend the SDOE strategy to query the experiment or computer code at a batch of inputs.
A unique capability of the proposed methodology is that, once it is trained, it can be applied to multiple tasks, for example the optimization of a function.
arXiv Detail & Related papers (2021-12-21T02:25:23Z)
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
- Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration [143.43658264904863]
We show how iteration under a more standard notion of low inherent Bellman error, typically employed in least-square value-style algorithms, can provide strong PAC guarantees on learning a near optimal value function.
We present a computationally tractable algorithm for the reward-free setting and show how it can be used to learn a near optimal policy for any (linear) reward function.
arXiv Detail & Related papers (2020-08-18T04:34:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.