Evaluating the Potential of Leading Large Language Models in Reasoning
Biology Questions
- URL: http://arxiv.org/abs/2311.07582v1
- Date: Sun, 5 Nov 2023 03:34:17 GMT
- Title: Evaluating the Potential of Leading Large Language Models in Reasoning
Biology Questions
- Authors: Xinyu Gong, Jason Holmes, Yiwei Li, Zhengliang Liu, Qi Gan, Zihao Wu,
Jianli Zhang, Yusong Zou, Yuxi Teng, Tian Jiang, Hongtu Zhu, Wei Liu,
Tianming Liu, Yajun Yan
- Abstract summary: This study evaluated the capabilities of leading Large Language Models (LLMs) in answering conceptual biology questions.
The models were tested on a 108-question multiple-choice exam covering biology topics in molecular biology, biological techniques, metabolic engineering, and synthetic biology.
The results indicated GPT-4's proficiency in logical reasoning and its potential to aid biology research through capabilities like data analysis, hypothesis generation, and knowledge integration.
- Score: 33.81650223615028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Large Language Models (LLMs) have presented new
opportunities for integrating Artificial General Intelligence (AGI) into
biological research and education. This study evaluated the capabilities of
leading LLMs, including GPT-4, GPT-3.5, PaLM2, Claude2, and SenseNova, in
answering conceptual biology questions. The models were tested on a
108-question multiple-choice exam covering biology topics in molecular biology,
biological techniques, metabolic engineering, and synthetic biology. Among the
models, GPT-4 achieved the highest average score of 90 and demonstrated the
greatest consistency across trials with different prompts. The results
indicated GPT-4's proficiency in logical reasoning and its potential to aid
biology research through capabilities like data analysis, hypothesis
generation, and knowledge integration. However, further development and
validation are still required before the promise of LLMs in accelerating
biological discovery can be realized.
Related papers
- Understanding Biology in the Age of Artificial Intelligence [4.299566787216408]
Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems.
Although machine learning (ML) models are useful for identifying patterns in large, complex data sets, their widespread application in the biological sciences represents a significant deviation from traditional methods of scientific inquiry.
Here, we identify general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge.
arXiv Detail & Related papers (2024-03-06T23:20:34Z)
- BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning [77.90250740041411]
This paper introduces BioT5+, an extension of the BioT5 framework, tailored to enhance biological research and drug discovery.
BioT5+ incorporates several novel features: integration of IUPAC names for molecular understanding, inclusion of extensive bio-text and molecule data from sources such as bioRxiv and PubChem, multi-task instruction tuning for generality across tasks, and a numerical tokenization technique for improved processing of numerical data.
arXiv Detail & Related papers (2024-02-27T12:43:09Z) - An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z) - Progress and Opportunities of Foundation Models in Bioinformatics [77.74411726471439]
Foundation models (FMs) have ushered in a new era in computational biology, especially in the realm of deep learning.
Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs.
The review analyzes the challenges and limitations faced by FMs in biology, such as data noise, model explainability, and potential biases.
arXiv Detail & Related papers (2024-02-06T02:29:17Z) - The Impact of Large Language Models on Scientific Discovery: a
Preliminary Study using GPT-4 [0.0]
This report focuses on GPT-4, the state-of-the-art language model.
We evaluate GPT-4's knowledge base, scientific understanding, scientific numerical calculation abilities, and various scientific prediction capabilities.
arXiv Detail & Related papers (2023-11-13T14:26:12Z) - ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z) - Large Language Models, scientific knowledge and factuality: A systematic
analysis in antibiotic discovery [0.0]
This work examines the potential of Large Language Models for dialoguing with biomedical background knowledge.
Ten state-of-the-art models are tested in two prompting-based tasks: chemical compound definition generation and chemical compound-fungus relation determination.
Results show that while recent models have improved in fluency, factual accuracy is still low and models are biased towards over-represented entities.
arXiv Detail & Related papers (2023-05-28T22:46:21Z) - BioGPT: Generative Pre-trained Transformer for Biomedical Text
Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
BioGPT achieves 44.98%, 38.42%, and 40.76% F1 scores on the BC5CDR, KD-DTI, and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.