Machine Learning-Friendly Biomedical Datasets for Equivalence and
Subsumption Ontology Matching
- URL: http://arxiv.org/abs/2205.03447v8
- Date: Sun, 23 Jul 2023 00:13:46 GMT
- Title: Machine Learning-Friendly Biomedical Datasets for Equivalence and
Subsumption Ontology Matching
- Authors: Yuan He, Jiaoyan Chen, Hang Dong, Ernesto Jiménez-Ruiz, Ali Hadian,
Ian Horrocks
- Abstract summary: We introduce five new biomedical Ontology Matching (OM) tasks involving ontologies extracted from Mondo and UMLS.
Each task includes both equivalence and subsumption matching; the quality of reference mappings is ensured by human curation.
A comprehensive evaluation framework is proposed to measure OM performance from various perspectives for both ML-based and non-ML-based OM systems.
- Score: 35.76522395991403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ontology Matching (OM) plays an important role in many domains such as
bioinformatics and the Semantic Web, and its research is becoming increasingly
popular, especially with the application of machine learning (ML) techniques.
Although the Ontology Alignment Evaluation Initiative (OAEI) represents an
impressive effort for the systematic evaluation of OM systems, it still suffers
from several limitations including limited evaluation of subsumption mappings,
suboptimal reference mappings, and limited support for the evaluation of
ML-based systems. To tackle these limitations, we introduce five new biomedical
OM tasks involving ontologies extracted from Mondo and UMLS. Each task includes
both equivalence and subsumption matching; the quality of reference mappings is
ensured by human curation, ontology pruning, etc.; and a comprehensive
evaluation framework is proposed to measure OM performance from various
perspectives for both ML-based and non-ML-based OM systems. We report
evaluation results for OM systems of different types to demonstrate the usage
of these resources, all of which are publicly available as part of the new
BioML track at OAEI 2022.
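The evaluation framework scores systems from two complementary perspectives: global matching, where a predicted mapping set is compared against the references with precision, recall, and F1, and local ranking, where a system ranks candidate classes for each source class and is scored with Hits@K and MRR. The following is a minimal Python sketch of these standard metrics; the function names, data layout, and the SNOMED/FMA-style identifiers in the usage example are illustrative assumptions, not the released BioML tooling.

```python
# Minimal sketch of the two evaluation views for OM systems:
# global matching scored with precision/recall/F1 against a reference
# mapping set, and local ranking scored with Hits@K and MRR over
# candidate lists. Layout and names are illustrative, not BioML tooling.

def matching_scores(predicted, reference):
    """Precision/recall/F1 over sets of (source, target) mapping pairs."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)  # true positives: mappings in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def ranking_scores(examples, k=1):
    """Hits@K and MRR; each example is (gold_target, ranked_candidates)."""
    hits, reciprocal_ranks = 0, 0.0
    for gold, candidates in examples:
        if gold in candidates[:k]:
            hits += 1
        if gold in candidates:
            reciprocal_ranks += 1.0 / (candidates.index(gold) + 1)
    n = len(examples)
    return hits / n, reciprocal_ranks / n

if __name__ == "__main__":
    # Hypothetical identifiers, purely for illustration.
    p, r, f1 = matching_scores(
        predicted=[("snomed:111", "fma:222"), ("snomed:333", "fma:999")],
        reference=[("snomed:111", "fma:222"), ("snomed:333", "fma:444")],
    )
    print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")   # P=0.50 R=0.50 F1=0.50
    h1, mrr = ranking_scores(
        [("fma:222", ["fma:222", "fma:999"]),
         ("fma:444", ["fma:999", "fma:444"])], k=1)
    print(f"Hits@1={h1:.2f} MRR={mrr:.2f}")     # Hits@1=0.50 MRR=0.75
```

One motivation for the ranking view, as suggested by the "ML-friendly" framing, is that ML-based matchers which score candidate pairs can be evaluated directly, without first committing to a thresholded final mapping set.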
Related papers
- MLLM4PUE: Toward Universal Embeddings in Computational Pathology through Multimodal LLMs [34.454047458272505]
We highlight the need for universal multimodal embeddings that can support multiple downstream tasks.
Previous approaches often involve fine-tuning CLIP-based models, which handle images and text separately.
We introduce the Pathology Multimodal Embedding Benchmark (PMEB), a benchmark designed to assess the quality of pathology multimodal embeddings.
arXiv Detail & Related papers (2025-02-11T03:28:55Z)
- EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents [57.4686961979566]
EmbodiedEval is a comprehensive and interactive evaluation benchmark for MLLMs with embodied tasks.
It covers a broad spectrum of existing embodied AI tasks with significantly enhanced diversity.
We evaluated the state-of-the-art MLLMs on EmbodiedEval and found that they have a significant shortfall compared to human level on embodied tasks.
arXiv Detail & Related papers (2025-01-21T03:22:10Z)
- OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain [62.89809156574998]
We introduce an omnidirectional and automatic RAG benchmark, OmniEval, in the financial domain.
Our benchmark is characterized by its multi-dimensional evaluation framework.
Our experiments demonstrate the comprehensiveness of OmniEval, which includes extensive test datasets.
arXiv Detail & Related papers (2024-12-17T15:38:42Z)
- MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs [97.94579295913606]
Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia.
In the development process, evaluation is critical since it provides intuitive feedback and guidance on improving models.
This work aims to offer researchers an easy grasp of how to effectively evaluate MLLMs according to different needs and to inspire better evaluation methods.
arXiv Detail & Related papers (2024-11-22T18:59:54Z)
- Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering [70.44269982045415]
Retrieval-augmented generation (RAG) has emerged as a promising approach to enhance the performance of large language models (LLMs).
We introduce Medical Retrieval-Augmented Generation Benchmark (MedRGB) that provides various supplementary elements to four medical QA datasets.
Our experimental results reveal current models' limited ability to handle noise and misinformation in the retrieved documents.
arXiv Detail & Related papers (2024-11-14T06:19:18Z)
- Surveying the MLLM Landscape: A Meta-Review of Current Surveys [17.372501468675303]
Multimodal Large Language Models (MLLMs) have become a transformative force in the field of artificial intelligence.
This survey aims to provide a systematic review of benchmark tests and evaluation methods for MLLMs.
arXiv Detail & Related papers (2024-09-17T14:35:38Z)
- A Survey for Large Language Models in Biomedicine [31.719451674137844]
This review is based on an analysis of 484 publications sourced from databases including PubMed, Web of Science, and arXiv.
We explore the capabilities of LLMs in zero-shot learning across a broad spectrum of biomedical tasks, including diagnostic assistance, drug discovery, and personalized medicine.
We discuss the challenges that LLMs face in the biomedicine domain including data privacy concerns, limited model interpretability, issues with dataset quality, and ethics.
arXiv Detail & Related papers (2024-08-29T12:39:16Z)
- Agent-OM: Leveraging LLM Agents for Ontology Matching [4.222245509121683]
This study introduces a novel agent-powered design paradigm for ontology matching systems.
We propose a framework, namely Agent-OM (Agent for Ontology Matching), consisting of two Siamese agents for retrieval and matching.
Our system can achieve results very close to the long-standing best performance on simple OM tasks.
arXiv Detail & Related papers (2023-12-01T03:44:54Z)
- A systematic evaluation of large language models for biomedical natural language processing: benchmarks, baselines, and recommendations [22.668383945059762]
We present a systematic evaluation of four representative Large Language Models (LLMs) across 12 BioNLP datasets.
The evaluation is conducted under four settings: zero-shot, static few-shot, dynamic K-nearest few-shot, and fine-tuning.
We compare these models against state-of-the-art (SOTA) approaches that fine-tune (domain-specific) BERT or BART models.
arXiv Detail & Related papers (2023-05-10T13:40:06Z)
- Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)