INQUIRE-Search: A Framework for Interactive Discovery in Large-Scale Biodiversity Databases
- URL: http://arxiv.org/abs/2511.15656v1
- Date: Wed, 19 Nov 2025 17:42:35 GMT
- Title: INQUIRE-Search: A Framework for Interactive Discovery in Large-Scale Biodiversity Databases
- Authors: Edward Vendrow, Julia Chae, Rupa Kurinchi-Vendhan, Isaac Eckert, Jazlynn Hall, Marta Jarzyna, Reymond Miyajima, Ruth Oliver, Laura Pollock, Lauren Schrack, Scott Yanco, Oisin Mac Aodha, Sara Beery
- Abstract summary: INQUIRE-Search is an open-source system that enables scientists to rapidly and interactively search within an ecological image database. We show the diversity of scientific applications that a tool like INQUIRE-Search can support, from seasonal variation across species to forest regrowth after wildfires.
- Score: 19.569666968746166
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large community science platforms such as iNaturalist contain hundreds of millions of biodiversity images that often capture ecological context on behaviors, interactions, phenology, and habitat. Yet most ecological workflows rely on metadata filtering or manual inspection, leaving this secondary information inaccessible at scale. We introduce INQUIRE-Search, an open-source system that enables scientists to rapidly and interactively search within an ecological image database for specific concepts using natural language, verify and export relevant observations, and utilize this discovered data for novel scientific analysis. Compared to traditional methods, INQUIRE-Search takes a fraction of the time, opening up new possibilities for scientific questions that can be explored. Through five case studies, we show the diversity of scientific applications that a tool like INQUIRE-Search can support, from seasonal variation in behavior across species to forest regrowth after wildfires. These examples demonstrate a new paradigm for interactive, efficient, and scalable scientific discovery that can begin to unlock previously inaccessible scientific value in large-scale biodiversity datasets. Finally, we emphasize that using such AI-enabled discovery tools for science calls for experts to reframe the priorities of the scientific process and develop novel methods for experiment design, data collection, survey effort, and uncertainty analysis.
Related papers
- WildSci: Advancing Scientific Reasoning from In-the-Wild Literature [50.16160754134139]
We introduce WildSci, a new dataset of domain-specific science questions automatically synthesized from peer-reviewed literature.
By framing complex scientific reasoning tasks in a multiple-choice format, we enable scalable training with well-defined reward signals.
Experiments on a suite of scientific benchmarks demonstrate the effectiveness of our dataset and approach.
arXiv Detail & Related papers (2026-01-09T06:35:23Z)
- DeepEvidence: Empowering Biomedical Discovery with Deep Knowledge Graph Research [33.51246292480848]
We introduce DeepEvidence, an AI-agent framework designed to perform Deep Research across various biomedical knowledge graphs (KGs).
Unlike generic Deep Research systems that rely primarily on internet-scale text, DeepEvidence incorporates specialized knowledge-graph tooling and coordinated exploration strategies.
DeepEvidence demonstrates substantial gains in systematic exploration and evidence synthesis across four key stages of the biomedical discovery lifecycle.
arXiv Detail & Related papers (2025-12-23T14:34:38Z)
- Towards Open-Ended Visual Scientific Discovery with Sparse Autoencoders [11.190791003373322]
We ask whether sparse autoencoders can enable open-ended feature discovery from foundation model representations.
Applying to ecological imagery, the same procedure surfaces fine-grained anatomical structure without access to segmentation or part labels.
Our results indicate that sparse decomposition provides a practical instrument for exploring what scientific foundation models have learned.
arXiv Detail & Related papers (2025-11-21T19:38:07Z)
- Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics [82.55776608452017]
Large language models (LLMs) provide a flexible and versatile framework that orchestrates interactions with human scientists, natural language, computer language and code, and physics.
This paper presents our view and vision of LLM-based scientific agents and their growing role in transforming the scientific discovery lifecycle.
We identify open research challenges and outline promising directions for building more robust, generalizable, and adaptive scientific agents.
arXiv Detail & Related papers (2025-10-10T22:26:26Z)
- Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents [52.50038914857797]
We term this process hypothesis hunting: the cumulative search for insight through sustained exploration across vast and complex hypothesis spaces.
We introduce AScience, a framework modeling discovery as the interaction of agents, networks, and evaluation norms, and implement it as ASCollab.
Experiments show that such social dynamics enable the accumulation of expert-rated results along the diversity-quality-novelty frontier.
arXiv Detail & Related papers (2025-10-08T08:47:07Z)
- Flow Matching Meets Biology and Life Science: A Survey [65.2146737141455]
Flow matching has emerged as a powerful and efficient alternative to diffusion-based generative modeling.
This paper presents the first comprehensive survey of recent developments in flow matching and its applications in biological domains.
arXiv Detail & Related papers (2025-07-23T17:44:29Z)
- Mining for Species, Locations, Habitats, and Ecosystems from Scientific Papers in Invasion Biology: A Large-Scale Exploratory Study with Large Language Models [6.364723262453785]
This paper harnesses the capabilities of large language models (LLMs) to mine key ecological entities from invasion biology literature.
Specifically, we focus on extracting species names, their locations, associated habitats, and ecosystems, information that is critical for understanding species spread.
This study lays the groundwork for more advanced, automated knowledge extraction tools that can aid researchers and practitioners in understanding and managing biological invasions.
arXiv Detail & Related papers (2025-01-30T11:55:44Z)
- A Review of BioTree Construction in the Context of Information Fusion: Priors, Methods, Applications and Trends [41.740569399988644]
Biological tree (BioTree) analysis is a foundational tool in biology, enabling the exploration of evolutionary and differentiation relationships.
Traditional tree construction methods face challenges in handling the growing complexity and scale of modern biological data.
Advances in deep learning (DL) offer transformative opportunities by enabling the fusion of biological prior knowledge with data-driven models.
arXiv Detail & Related papers (2024-10-07T08:00:41Z)
- GFlowNets for AI-Driven Scientific Discovery [74.27219800878304]
We present a new probabilistic machine learning framework called GFlowNets.
GFlowNets can be applied in the modeling, hypotheses generation and experimental design stages of the experimental science loop.
We argue that GFlowNets can become a valuable tool for AI-driven scientific discovery.
arXiv Detail & Related papers (2023-02-01T17:29:43Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- Unlocking the potential of deep learning for marine ecology: overview, applications, and outlook [8.3226670069051]
This paper aims to bridge the gap between marine ecologists and computer scientists.
We provide insight into popular deep learning approaches for ecological data analysis in plain language.
We illustrate challenges and opportunities through established and emerging applications of deep learning to marine ecology.
arXiv Detail & Related papers (2021-09-29T21:59:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.