LLMs4Life: Large Language Models for Ontology Learning in Life Sciences
- URL: http://arxiv.org/abs/2412.02035v1
- Date: Mon, 02 Dec 2024 23:31:52 GMT
- Title: LLMs4Life: Large Language Models for Ontology Learning in Life Sciences
- Authors: Nadeen Fathallah, Steffen Staab, Alsayed Algergawy
- Abstract summary: Existing Large Language Models (LLMs) struggle to generate ontologies with multiple hierarchical levels, rich interconnections, and comprehensive coverage. We extend the NeOn-GPT pipeline for ontology learning using LLMs with advanced prompt engineering techniques. Our evaluation shows the viability of LLMs for ontology learning in specialized domains, providing solutions to longstanding limitations in model performance and scalability.
- Score: 10.658387847149195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ontology learning in complex domains, such as life sciences, poses significant challenges for current Large Language Models (LLMs). Existing LLMs struggle to generate ontologies with multiple hierarchical levels, rich interconnections, and comprehensive class coverage due to constraints on the number of tokens they can generate and inadequate domain adaptation. To address these issues, we extend the NeOn-GPT pipeline for ontology learning using LLMs with advanced prompt engineering techniques and ontology reuse to enhance the generated ontologies' domain-specific reasoning and structural depth. Our work evaluates the capabilities of LLMs for ontology learning in highly specialized and complex domains such as the life sciences. To assess the logical consistency, completeness, and scalability of the generated ontologies, we use the AquaDiva ontology, developed and used in the collaborative research center AquaDiva, as a case study. Our evaluation shows the viability of LLMs for ontology learning in specialized domains, providing solutions to longstanding limitations in model performance and scalability.
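To make the described approach concrete, the sketch below shows one way a prompt-engineering loop with ontology reuse and iterative repair could look. It is a minimal illustration, not the authors' NeOn-GPT implementation: `complete()` is a stand-in for any chat-completion client, and the seed terms and prompt wording are invented for illustration.

```python
# A minimal sketch of the prompting loop described above, NOT the authors'
# NeOn-GPT implementation. `complete()` stands in for any chat-completion
# client; the seed terms and prompt wording are invented for illustration.
from rdflib import Graph

def complete(prompt: str) -> str:
    """Stand-in for an LLM chat-completion call; wire up your own client."""
    raise NotImplementedError

# Hypothetical terms reused from existing ontologies to ground the output.
REUSED_TERMS = ["envo:WaterBody", "chebi:Water"]

def generate_ontology(domain: str, max_retries: int = 3) -> Graph:
    prompt = (
        f"You are an ontology engineer. Write an OWL ontology for the {domain} "
        f"domain in Turtle syntax. Reuse these existing terms where they fit: "
        f"{', '.join(REUSED_TERMS)}. Model at least four hierarchical levels of "
        f"subclasses plus object properties linking them. Output Turtle only."
    )
    for _ in range(max_retries):
        ttl = complete(prompt)
        graph = Graph()
        try:
            graph.parse(data=ttl, format="turtle")  # syntactic validity check
            return graph
        except Exception as err:
            # Feed the parse error back so the model can repair its own output.
            prompt += f"\nYour previous Turtle failed to parse ({err}); fix it."
    raise RuntimeError("no syntactically valid ontology produced")
```

The retry-with-feedback loop and the reused seed terms mirror, at a toy scale, the prompt engineering and ontology reuse ideas the abstract names; depth and consistency checks beyond Turtle parsing would be needed in practice.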
Related papers
- Assessing the Capability of Large Language Models for Domain-Specific Ontology Generation [1.099532646524593]
Large Language Models (LLMs) have shown significant potential for ontology engineering.
We investigate the generalizability of two state-of-the-art LLMs, DeepSeek and o1-preview, by generating ontologies from a set of competency questions.
Our findings show that performance is remarkably consistent across all domains. A rough sketch of this competency-question-driven setup follows below.
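The snippet below illustrates one plausible way to condition generation on competency questions; the questions and prompt framing are invented and may differ from the paper's setup.

```python
# Illustrative only: one way to condition ontology generation on competency
# questions. Questions and prompt framing are invented for this sketch.
def complete(prompt: str) -> str:
    raise NotImplementedError  # any chat-completion client

COMPETENCY_QUESTIONS = [  # hypothetical examples
    "Which habitats does a given organism occupy?",
    "Which processes change the chemistry of groundwater?",
]

def ontology_from_cqs(questions: list[str]) -> str:
    prompt = (
        "Create an OWL ontology in Turtle whose classes and properties suffice "
        "to answer every competency question below:\n- " + "\n- ".join(questions)
    )
    return complete(prompt)  # returns Turtle text to be validated downstream
```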
arXiv Detail & Related papers (2025-04-24T09:47:14Z) - A Call for New Recipes to Enhance Spatial Reasoning in MLLMs [85.67171333213301]
Multimodal Large Language Models (MLLMs) have demonstrated impressive performance in general vision-language tasks.
Recent studies have exposed critical limitations in their spatial reasoning capabilities.
This deficiency in spatial reasoning significantly constrains MLLMs' ability to interact effectively with the physical world.
arXiv Detail & Related papers (2025-04-21T11:48:39Z) - A Survey on Post-training of Large Language Models [185.51013463503946]
Large Language Models (LLMs) have fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration.
Remaining shortcomings, such as restricted reasoning capacities, ethical uncertainties, and suboptimal domain-specific performance, necessitate advanced post-training language models (PoLMs).
This paper presents the first comprehensive survey of PoLMs, systematically tracing their evolution across five core paradigms.
arXiv Detail & Related papers (2025-03-08T05:41:42Z) - OntoTune: Ontology-Driven Self-training for Aligning Large Language Models [36.707858872631945]
Training on large-scale corpora often fails to effectively organize the domain knowledge of Large Language Models.
Inspired by how humans connect concepts and organize knowledge through mind maps, we propose an ontology-driven self-training framework called OntoTune.
We conduct our study in the medical domain to evaluate the effectiveness of OntoTune.
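As a rough illustration of ontology-driven self-training in the spirit of this entry, the sketch below turns is-a edges into instruction data and keeps only generations that agree with the ontology. The data format and the consistency filter are assumptions, not OntoTune's exact recipe.

```python
# A hedged sketch of ontology-driven self-training; the data format and the
# agreement filter are assumptions, not OntoTune's exact recipe.
from typing import Iterable

def complete(prompt: str) -> str:
    raise NotImplementedError  # any chat-completion client

def build_self_training_set(is_a_edges: Iterable[tuple[str, str]]) -> list[dict]:
    """Turn ontology is-a edges into instruction pairs, keeping only model
    generations that agree with the ontology."""
    dataset = []
    for child, parent in is_a_edges:
        answer = complete(f"In one sentence, name the direct superclass of '{child}'.")
        if parent.lower() in answer.lower():  # crude agreement check (assumed)
            dataset.append({
                "instruction": f"Explain how '{child}' relates to '{parent}'.",
                "output": complete(
                    f"Explain for a domain expert why '{child}' is a kind of '{parent}'."
                ),
            })
    return dataset  # hand off to any supervised fine-tuning pipeline
```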
arXiv Detail & Related papers (2025-02-08T07:38:45Z) - Biology Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models [51.316001071698224]
We introduce Biology-Instructions, the first large-scale instruction-tuning dataset for multi-omics biological sequences.
This dataset can bridge the gap between large language models (LLMs) and complex tasks involving biological sequences.
We also develop a strong baseline called ChatMultiOmics with a novel three-stage training pipeline.
arXiv Detail & Related papers (2024-12-26T12:12:23Z) - CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge [44.59258397967782]
Large language models (LLMs) have demonstrated impressive capabilities across various natural language processing tasks.
We present a systematic evaluation of state-of-the-art LLMs' complex logical reasoning abilities.
We find that LLMs excel at reasoning over general world knowledge but face significant challenges with specialized domain-specific knowledge.
arXiv Detail & Related papers (2024-07-30T05:40:32Z) - Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
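The sketch below shows the general "corpus-in-context" setup this benchmark evaluates: the entire corpus goes into the prompt and the long-context model retrieves and answers without an external retriever. The prompt format here is an assumption, not LOFT's actual template.

```python
# Illustrative "corpus-in-context" retrieval: the whole corpus is placed in
# the prompt; no external retriever. Prompt format is an assumption.
def complete(prompt: str) -> str:
    raise NotImplementedError  # a long-context chat-completion client

def in_context_retrieval(corpus: list[str], question: str) -> str:
    numbered = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(corpus))
    prompt = (
        f"Documents:\n{numbered}\n\nQuestion: {question}\n"
        "First list the IDs of the documents you used, then answer from them only."
    )
    return complete(prompt)
```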
arXiv Detail & Related papers (2024-06-19T00:28:58Z) - SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models [70.01883340129204]
Spatial reasoning is a crucial component of both biological and artificial intelligence.
We present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
arXiv Detail & Related papers (2024-06-07T01:06:34Z) - On the Use of Large Language Models to Generate Capability Ontologies [43.06143768014157]
Large Language Models (LLMs) have shown that they can generate machine-interpretable models from natural language text input.
This paper investigates how LLMs can be used to create capability ontologies.
arXiv Detail & Related papers (2024-04-26T16:41:00Z) - Towards Complex Ontology Alignment using Large Language Models [1.3218260503808055]
Ontology alignment is a critical process in the Semantic Web for detecting relationships between different ontologies.
Recent advancements in Large Language Models (LLMs) present new opportunities for enhancing ontology engineering practices.
This paper investigates the application of LLM technologies to tackle the complex ontology alignment challenge, as sketched below.
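The following is a hedged sketch of LLM-assisted alignment: each candidate concept pair is classified with a constrained label set. The labels and prompt are illustrative; truly complex (1:n) alignments would need richer prompting than shown here.

```python
# A hedged sketch of LLM-assisted ontology alignment; labels and prompt are
# illustrative, not the paper's method.
from itertools import product

def complete(prompt: str) -> str:
    raise NotImplementedError  # any chat-completion client

LABELS = ("equivalent", "broader", "narrower", "related", "unrelated")

def align(source_concepts: list[str], target_concepts: list[str]):
    mappings = []
    for s, t in product(source_concepts, target_concepts):
        verdict = complete(
            f"Ontology alignment. Source concept: '{s}'. Target concept: '{t}'. "
            f"Reply with exactly one of: {', '.join(LABELS)}."
        ).strip().lower()
        if verdict in LABELS and verdict != "unrelated":
            mappings.append((s, t, verdict))
    return mappings
```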
arXiv Detail & Related papers (2024-04-16T07:13:22Z) - Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)