Related papers: Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents

Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents

URL: http://arxiv.org/abs/2510.08619v1
Date: Wed, 08 Oct 2025 08:47:07 GMT
Title: Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents
Authors: Tennison Liu, Silas Ruhrberg Estévez, David L. Bentley, Mihaela van der Schaar,
Abstract summary: We term this process hypothesis hunting: the cumulative search for insight through sustained exploration across vast and complex hypothesis spaces.<n>We introduce AScience, a framework modeling discovery as the interaction of agents, networks, and evaluation norms, and implement it as ASCollab.<n> Experiments show that such social dynamics enable the accumulation of expert-rated results along the diversity-quality-novelty frontier.
Score: 52.50038914857797
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large-scale scientific datasets -- spanning health biobanks, cell atlases, Earth reanalyses, and more -- create opportunities for exploratory discovery unconstrained by specific research questions. We term this process hypothesis hunting: the cumulative search for insight through sustained exploration across vast and complex hypothesis spaces. To support it, we introduce AScience, a framework modeling discovery as the interaction of agents, networks, and evaluation norms, and implement it as ASCollab, a distributed system of LLM-based research agents with heterogeneous behaviors. These agents self-organize into evolving networks, continually producing and peer-reviewing findings under shared standards of evaluation. Experiments show that such social dynamics enable the accumulation of expert-rated results along the diversity-quality-novelty frontier, including rediscoveries of established biomarkers, extensions of known pathways, and proposals of new therapeutic targets. While wet-lab validation remains indispensable, our experiments on cancer cohorts demonstrate that socially structured, agentic networks can sustain exploratory hypothesis hunting at scale.

Related papers

From Literature to Hypotheses: An AI Co-Scientist System for Biomarker-Guided Drug Combination Hypothesis Generation [4.281508114645598]
CoDHy is an interactive, human-in-the-loop system for biomarker-guided drug combination hypothesis generation in cancer research.<n>It integrates structured biomedical databases and unstructured literature evidence into a task-specific knowledge graph.<n>Users can configure the scientific context, inspect intermediate results, and iteratively refine hypotheses.
arXiv Detail & Related papers (2026-02-28T12:14:37Z)
BABE: Biology Arena BEnchmark [51.53220868983288]
BABE is a benchmark designed to evaluate the experimental reasoning capabilities of biological AI systems.<n>Our benchmark provides a robust framework for assessing how well AI systems can reason like practicing scientists.
arXiv Detail & Related papers (2026-02-05T16:39:20Z)
FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights [63.32178443510396]
We introduce FIRE-Bench (Full-cycle Insight Rediscovery Evaluation), a benchmark that evaluates agents through the rediscovery of established findings.<n>Even the strongest agents achieve limited rediscovery success (50 F1), exhibit high variance across runs, and display recurring failure modes in experimental design, execution, and evidence-based reasoning.
arXiv Detail & Related papers (2026-02-02T23:21:13Z)
Cross-Disciplinary Knowledge Retrieval and Synthesis: A Compound AI Architecture for Scientific Discovery [1.5143261755366868]
BioSage is a novel compound AI architecture that integrates LLMs with RAG, orchestrated specialized agents and tools to enable discoveries across AI, data science, biomedical, and biosecurity domains.<n>Our system features several specialized agents including the retrieval agent with query planning and response synthesis that enable knowledge retrieval across domains with citation-backed responses.<n>Our ongoing work focuses on multimodal retrieval and reasoning over charts, tables, and structured scientific data, along with developing comprehensive multimodal benchmarks for cross-disciplinary discovery.
arXiv Detail & Related papers (2025-11-23T05:33:11Z)
BioVerge: A Comprehensive Benchmark and Study of Self-Evaluating Agents for Biomedical Hypothesis Generation [16.117624717812863]
We introduce BioVerge, a comprehensive benchmark, and BioVerge Agent, an LLM-based agent framework, to create a standardized environment for exploring biomedical hypothesis generation.<n>Our dataset includes structured and textual data derived from historical biomedical hypotheses and PubMed literature, organized to support exploration by LLM agents.
arXiv Detail & Related papers (2025-11-12T01:09:52Z)
Operationalizing Serendipity: Multi-Agent AI Workflows for Enhanced Materials Characterization with Theory-in-the-Loop [0.0]
SciLink is an open-source, multi-agent artificial intelligence framework designed to operationalize serendipity in materials research.<n>It creates a direct, automated link between experimental observation, novelty assessment, and theoretical simulations.<n>We show its application to atomic-resolution and hyperspectral data, its capacity to integrate real-time human expert guidance, and its ability to close the research loop.
arXiv Detail & Related papers (2025-08-07T04:59:17Z)
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition [67.26124739345332]
Large language models (LLMs) have demonstrated potential in assisting scientific research, yet their ability to discover high-quality research hypotheses remains unexamined.<n>We introduce the first large-scale benchmark for evaluating LLMs with a near-sufficient set of sub-tasks of scientific discovery.<n>We develop an automated framework that extracts critical components - research questions, background surveys, inspirations, and hypotheses - from scientific papers.
arXiv Detail & Related papers (2025-03-27T08:09:15Z)
Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets.<n>Key theoretical contribution is the structural sparsity of causal connections between modalities.<n>Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation [15.495976478018264]
Large language models (LLMs) have emerged as a promising tool to revolutionize knowledge interaction. We construct a dataset of background-hypothesis pairs from biomedical literature, partitioned into training, seen, and unseen test sets. We assess the hypothesis generation capabilities of top-tier instructed models in zero-shot, few-shot, and fine-tuning settings.
arXiv Detail & Related papers (2024-07-12T02:55:13Z)
Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues. We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space. A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
Literature-based Discovery for Landscape Planning [1.1939762265857434]
This project demonstrates how medical corpus hypothesis generation can be used to derive new research angles for landscape and urban planners. AGATHA was used to identify likely conceptual relationships between emerging infectious diseases (EIDs) and deforestation. This research also serves as a partial proof-of-concept for the application of medical database hypothesis generation to medicine-adjacent hypothesis discovery.
arXiv Detail & Related papers (2023-06-05T04:32:46Z)
GeneDisco: A Benchmark for Experimental Design in Drug Discovery [41.6425999218259]
In vitro cellular experimentation with genetic interventions is an essential step in early-stage drug discovery. GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.
arXiv Detail & Related papers (2021-10-22T16:01:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.