BioProAgent: Neuro-Symbolic Grounding for Constrained Scientific Planning
- URL: http://arxiv.org/abs/2603.00876v1
- Date: Sun, 01 Mar 2026 02:36:01 GMT
- Title: BioProAgent: Neuro-Symbolic Grounding for Constrained Scientific Planning
- Authors: Yuyang Liu, Jingya Wang, Liuzhenghao Lv, Yonghong Tian
- Abstract summary: BioProAgent is a neuro-symbolic framework that anchors probabilistic planning in a Finite State Machine. We introduce a State-Augmented Planning mechanism that enforces a rigorous Design-Verify-Rectify workflow. In the extended BioProBench benchmark, BioProAgent achieves 95.6% physical compliance.
- Score: 56.04636248418112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have demonstrated significant reasoning capabilities in scientific discovery but struggle to bridge the gap to physical execution in wet labs. In these irreversible environments, probabilistic hallucinations are not merely incorrect; they can also cause equipment damage or experimental failure. To address this, we propose BioProAgent, a neuro-symbolic framework that anchors probabilistic planning in a deterministic Finite State Machine (FSM). We introduce a State-Augmented Planning mechanism that enforces a rigorous Design-Verify-Rectify workflow, ensuring hardware compliance before execution. Furthermore, we address the context bottleneck inherent in complex device schemas through Semantic Symbol Grounding, reducing token consumption by ~6× through symbolic abstraction. In the extended BioProBench benchmark, BioProAgent achieves 95.6% physical compliance (compared to 21.0% for ReAct), demonstrating that neuro-symbolic constraints are essential for reliable autonomy in irreversible physical environments. Code at https://github.com/YuyangSunshine/bioproagent and project at https://yuyangsunshine.github.io/BioPro-Project/
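As a rough illustration of the Design-Verify-Rectify idea described in the abstract, the Python sketch below drives a plan through a small deterministic finite state machine. It is not the authors' implementation. The state names, the `run_workflow` function, the callback signatures, and the centrifuge speed limit are all hypothetical and chosen only to show how a probabilistic planner's output could be gated by a deterministic check before execution.

```python
from enum import Enum, auto
from typing import Callable, Dict, List, Tuple

Plan = List[Dict[str, object]]


class State(Enum):
    DESIGN = auto()
    VERIFY = auto()
    RECTIFY = auto()
    DONE = auto()
    FAILED = auto()


def run_workflow(
    design: Callable[[], Plan],
    verify: Callable[[Plan], List[int]],
    rectify: Callable[[Plan, List[int]], Plan],
    max_rectifications: int = 3,
) -> Tuple[Plan, List[State]]:
    """Run a plan through DESIGN -> VERIFY -> (RECTIFY -> VERIFY)* -> DONE/FAILED."""
    state = State.DESIGN
    plan: Plan = []
    violations: List[int] = []
    attempts = 0
    trace: List[State] = []
    while state not in (State.DONE, State.FAILED):
        trace.append(state)
        if state is State.DESIGN:
            plan = design()  # in a real system this would call the LLM planner
            state = State.VERIFY
        elif state is State.VERIFY:
            violations = verify(plan)  # deterministic constraint check
            if not violations:
                state = State.DONE
            elif attempts < max_rectifications:
                state = State.RECTIFY
            else:
                state = State.FAILED
        elif state is State.RECTIFY:
            plan = rectify(plan, violations)
            attempts += 1
            state = State.VERIFY
    trace.append(state)
    return plan, trace


# Hypothetical hardware constraint, used only for this example.
MAX_RPM = 15000


def demo_design() -> Plan:
    # Stand-in for a planner that proposes an out-of-range setting.
    return [{"device": "centrifuge", "rpm": 20000}]


def demo_verify(plan: Plan) -> List[int]:
    # Return the indices of steps that exceed the device limit.
    return [i for i, step in enumerate(plan) if step["rpm"] > MAX_RPM]


def demo_rectify(plan: Plan, violations: List[int]) -> Plan:
    # Clamp each violating step to the limit, leaving the input plan untouched.
    fixed = [dict(step) for step in plan]
    for i in violations:
        fixed[i]["rpm"] = MAX_RPM
    return fixed


final_plan, trace = run_workflow(demo_design, demo_verify, demo_rectify)
print(final_plan)
print([s.name for s in trace])
```

In this sketch the plan only reaches `DONE` after the verifier reports no violations, and the rectification budget bounds the loop so that an uncorrectable plan ends in `FAILED` rather than being executed.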
Related papers
- Mozi: Governed Autonomy for Drug Discovery LLM Agents [21.429647382651677]
In dependency-heavy pharmaceutical pipelines, autonomous agents often drift into irreproducible trajectories.
We present Mozi, a dual-layer architecture that bridges the flexibility of generative AI with the deterministic rigor of computational biology.
We demonstrate Mozi's ability to navigate massive chemical spaces, enforce stringent toxicity filters, and generate highly competitive in silico candidates.
arXiv Detail & Related papers (2026-03-04T02:22:21Z)
- PIS: A Physics-Informed System for Accurate State Partitioning of $Aβ_{42}$ Protein Trajectories [3.7874902461360627]
We introduce PIS, a Physics-Informed System designed for robust metastable state partitioning.
Our model achieves superior performance on the $Aβ_{42}$ dataset.
PIS provides an interactive platform that features dynamic monitoring of physical characteristics and multi-dimensional result validation.
arXiv Detail & Related papers (2026-02-23T02:27:18Z)
- Protect$^*$: Steerable Retrosynthesis through Neuro-Symbolic State Encoding [0.0]
We introduce Protect$^*$, a neuro-symbolic framework that grounds the generative capabilities of Large Language Models (LLMs) in rigorous chemical logic.
Our approach combines automated rule-based reasoning and the generative capabilities of neural models.
We demonstrate this neuro-symbolic approach through case studies on complex natural products, including the discovery of a novel synthetic pathway for Erythromycin B.
arXiv Detail & Related papers (2026-02-13T19:41:55Z)
- Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning [25.852162778115808]
Test-time compute allocation in large reasoning models (LRMs) is widely used and has applications in mathematical problem solving, code synthesis, and planning.
We analyze and harness the model's tendency to restate the question, which we term the Echo of Prompt (EOP), as a front-loaded, compute-shaping mechanism.
arXiv Detail & Related papers (2026-02-06T10:53:26Z)
- ProAct: Agentic Lookahead in Interactive Environments [56.50613398808361]
ProAct is a framework that enables agents to internalize accurate lookahead reasoning through a two-stage training paradigm.
We introduce Grounded LookAhead Distillation (GLAD), where the agent undergoes supervised fine-tuning on trajectories derived from environment-based search.
We also propose the Monte-Carlo Critic (MC-Critic), a plug-and-play auxiliary value estimator designed to enhance policy-gradient algorithms.
arXiv Detail & Related papers (2026-02-05T05:45:16Z)
- GRASP: Graph Reasoning Agents for Systems Pharmacology with Human-in-the-Loop [0.6019777076722421]
We present GRASP, a multi-agent, graph-reasoning framework with a human-in-the-loop conversational interface.
It encodes QSP models as typed biological knowledge graphs and compiles them to executable/Sim code while preserving units, mass balance, and physiological constraints.
It outperforms SME-guided CoT and ToT baselines across biological plausibility, mathematical correctness, structural fidelity, and code quality.
arXiv Detail & Related papers (2025-12-05T07:59:16Z)
- Symbolic Neural Generation with Applications to Lead Discovery in Drug Design [1.3534513856953387]
We investigate a class of hybrid neurosymbolic models integrating symbolic learning with neural reasoning.
In Symbolic Neural Generators (SNGs), symbolic learners examine logical specifications of feasible data from a small set of instances.
We implement an SNG combining a restricted form of Inductive Logic Programming (ILP) with a large language model (LLM) and evaluate it on early-stage drug design.
arXiv Detail & Related papers (2025-10-27T14:29:22Z)
- Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs [78.18336140706471]
Sci-LLMs have emerged as a promising frontier for accelerating biological discovery.
Current strategies limit Sci-LLMs' reasoning capacity when processing raw biomolecular sequences.
We show that a more effective strategy is to provide Sci-LLMs with high-level structured context.
arXiv Detail & Related papers (2025-10-27T09:03:21Z)
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters.
Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks.
It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph [66.11279161533619]
CBGBench is a benchmark for structure-based drug design (SBDD).
By categorizing existing methods based on their attributes, CBGBench implements various cutting-edge methods.
We have adapted these models to a range of tasks essential in drug design, which are considered sub-tasks within the graph fill-in-the-blank tasks.
arXiv Detail & Related papers (2024-06-16T08:20:24Z)
- Neuro-Symbolic Entropy Regularization [78.16196949641079]
In structured prediction, the goal is to jointly predict many output variables that together encode a structured object.
One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions.
We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object.
arXiv Detail & Related papers (2022-01-25T06:23:10Z)
- Acting in Delayed Environments with Non-Stationary Markov Policies [57.52103323209643]
We introduce a framework for learning and planning in MDPs where the decision-maker commits actions that are executed with a delay of $m$ steps.
We prove that with execution delay, deterministic Markov policies in the original state-space are sufficient for attaining maximal reward, but need to be non-stationary.
We devise a non-stationary Q-learning style model-based algorithm that solves delayed execution tasks without resorting to state-augmentation.
arXiv Detail & Related papers (2021-01-28T13:35:37Z)
- Towards Assessment of Randomized Smoothing Mechanisms for Certifying Adversarial Robustness [50.96431444396752]
We argue that the main difficulty is how to assess the appropriateness of each randomized mechanism.
We first conclude that the Gaussian mechanism is indeed an appropriate option to certify $\ell_2$-norm.
Surprisingly, we show that the Gaussian mechanism is also an appropriate option for certifying $\ell_\infty$-norm, instead of the Exponential mechanism.
arXiv Detail & Related papers (2020-05-15T03:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.