From Literature to Hypotheses: An AI Co-Scientist System for Biomarker-Guided Drug Combination Hypothesis Generation
- URL: http://arxiv.org/abs/2603.00612v1
- Date: Sat, 28 Feb 2026 12:14:37 GMT
- Authors: Raneen Younis, Suvinava Basak, Lukas Chavez, Zahra Ahmadi
- Abstract summary: CoDHy is an interactive, human-in-the-loop system for biomarker-guided drug combination hypothesis generation in cancer research. It integrates structured biomedical databases and unstructured literature evidence into a task-specific knowledge graph. Users can configure the scientific context, inspect intermediate results, and iteratively refine hypotheses.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The rapid growth of biomedical literature and curated databases has made it increasingly difficult for researchers to systematically connect biomarker mechanisms to actionable drug combination hypotheses. We present AI Co-Scientist (CoDHy), an interactive, human-in-the-loop system for biomarker-guided drug combination hypothesis generation in cancer research. CoDHy integrates structured biomedical databases and unstructured literature evidence into a task-specific knowledge graph, which serves as the basis for graph-based reasoning and hypothesis construction. The system combines knowledge graph embeddings with agent-based reasoning to generate, validate, and rank candidate drug combinations, while explicitly grounding each hypothesis in retrievable evidence. Through a web-based interface, users can configure the scientific context, inspect intermediate results, and iteratively refine hypotheses, enabling transparent and researcher-steerable exploration rather than automated decision-making. We demonstrate CoDHy as a system for exploratory hypothesis generation and decision support in translational oncology, highlighting its design, interaction workflow, and practical use cases.
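The abstract's idea of ranking drug combinations over an evidence-grounded knowledge graph can be illustrated with a minimal sketch. This is not CoDHy's method: it swaps the knowledge graph embeddings and agent-based reasoning for a simple pathway-coverage count, and every entity name, relation, and evidence ID below is a hypothetical placeholder.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy triples: (head, relation, tail, evidence_id).
# Entity names and evidence IDs are placeholders, not data from the paper.
TRIPLES = [
    ("biomarker_X", "activates", "pathway_A", "db:1"),
    ("biomarker_X", "activates", "pathway_B", "lit:2"),
    ("drug_1", "inhibits", "pathway_A", "db:3"),
    ("drug_2", "inhibits", "pathway_B", "lit:4"),
    ("drug_3", "inhibits", "pathway_A", "lit:5"),
]


def rank_combinations(triples, biomarker):
    """Rank drug pairs by how many biomarker-linked pathways they cover.

    Coverage counting is an assumed stand-in for embedding-based scoring.
    Each result carries the evidence IDs that support it.
    """
    # Pathways the biomarker is linked to, with the supporting evidence ID.
    pathways = {
        tail: ev for head, rel, tail, ev in triples
        if head == biomarker and rel == "activates"
    }

    # For each drug, the biomarker-linked pathways it inhibits.
    drug_hits = defaultdict(dict)
    for head, rel, tail, ev in triples:
        if rel == "inhibits" and tail in pathways:
            drug_hits[head][tail] = ev

    ranked = []
    for a, b in combinations(sorted(drug_hits), 2):
        covered = set(drug_hits[a]) | set(drug_hits[b])
        evidence = sorted(
            {pathways[p] for p in covered}
            | set(drug_hits[a].values())
            | set(drug_hits[b].values())
        )
        ranked.append((len(covered), (a, b), evidence))

    # Highest coverage first; ties broken alphabetically by drug pair.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return ranked


if __name__ == "__main__":
    for score, pair, evidence in rank_combinations(TRIPLES, "biomarker_X"):
        print(score, pair, evidence)
```

Keeping an evidence ID on every triple is what lets each ranked pair be traced back to its sources, which mirrors the abstract's emphasis on grounding hypotheses in retrievable evidence.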
Related papers
- BABE: Biology Arena BEnchmark [51.53220868983288]
BABE is a benchmark designed to evaluate the experimental reasoning capabilities of biological AI systems. Our benchmark provides a robust framework for assessing how well AI systems can reason like practicing scientists.
arXiv Detail & Related papers (2026-02-05T16:39:20Z)
- SAGE: Agentic Framework for Interpretable and Clinically Translatable Computational Pathology Biomarker Discovery [0.8778472217028965]
We introduce SAGE, an agentic AI system designed to identify interpretable, engineered pathology biomarkers by grounding them in biological evidence. SAGE integrates literature-anchored reasoning with multimodal data analysis to correlate image-derived features with molecular biomarkers, such as gene expression, and clinically relevant outcomes.
arXiv Detail & Related papers (2026-02-01T01:12:12Z)
- AI Co-Scientist for Knowledge Synthesis in Medical Contexts: A Proof of Concept [0.0]
We present an AI for scalable and transparent knowledge synthesis based on explicit formalization of Population, Intervention, Comparator, Outcome, and Study design (PICOS). The platform integrates relational storage, vector-based semantic retrieval, and a Neo4j knowledge graph. Results show that PICOS-aware and explainable natural language processing can improve the scalability, transparency, and efficiency of evidence synthesis.
arXiv Detail & Related papers (2026-01-16T23:07:58Z)
- BioVerge: A Comprehensive Benchmark and Study of Self-Evaluating Agents for Biomedical Hypothesis Generation [16.117624717812863]
We introduce BioVerge, a comprehensive benchmark, and BioVerge Agent, an LLM-based agent framework, to create a standardized environment for exploring biomedical hypothesis generation. Our dataset includes structured and textual data derived from historical biomedical hypotheses and PubMed literature, organized to support exploration by LLM agents.
arXiv Detail & Related papers (2025-11-12T01:09:52Z)
- Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents [52.50038914857797]
We term this process hypothesis hunting: the cumulative search for insight through sustained exploration across vast and complex hypothesis spaces. We introduce AScience, a framework modeling discovery as the interaction of agents, networks, and evaluation norms, and implement it as ASCollab. Experiments show that such social dynamics enable the accumulation of expert-rated results along the diversity-quality-novelty frontier.
arXiv Detail & Related papers (2025-10-08T08:47:07Z)
- BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning [49.487327661584686]
We introduce BioMaze, a dataset with 5.1K complex pathway problems from real research. Our evaluation of methods such as CoT and graph-augmented reasoning shows that LLMs struggle with pathway reasoning. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation.
arXiv Detail & Related papers (2025-02-23T17:38:10Z)
- Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. The key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions. BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model. It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
- Descriptive Knowledge Graph in Biomedical Domain [26.91431888505873]
We present a novel system that automatically extracts and generates informative and descriptive sentences from the biomedical corpus.
Unlike previous search engines or exploration systems that retrieve unconnected passages, our system organizes descriptive sentences as a graph.
We spotlight the application of our system in COVID-19 research, illustrating its utility in areas such as drug repurposing and literature curation.
arXiv Detail & Related papers (2023-10-18T03:10:25Z)
- Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The Interaction between Drugs and Targets (DTI) in the human body plays a crucial role in biomedical science and applications.
As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from literature becomes an urgent demand in the industry.
We explore the first end-to-end solution for this task by using generative approaches.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)