EBOCA: Evidences for BiOmedical Concepts Association Ontology
- URL: http://arxiv.org/abs/2208.01093v1
- Date: Mon, 1 Aug 2022 18:47:03 GMT
- Title: EBOCA: Evidences for BiOmedical Concepts Association Ontology
- Authors: Andrea Álvarez Pérez, Ana Iglesias-Molina, Lucía Prieto
Santamaría, María Poveda-Villalón, Carlos Badenes-Olmedo, Alejandro
Rodríguez-González
- Abstract summary: This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts has been transformed to create a Knowledge Graph that can be used in real scenarios.
- Score: 55.41644538483948
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A large number of online document data sources are available nowadays.
Their lack of structure and the differences between formats are the main
obstacles to extracting information from them automatically, which also has a
negative impact on its use and reuse. In the biomedical domain, the DISNET
platform emerged to provide researchers with a resource to obtain information
in the scope of human disease networks by means of large-scale heterogeneous
sources. Specifically in this domain, it is critical to offer not only the
information extracted from different sources, but also the evidence that
supports it. This paper proposes EBOCA, an ontology that describes (i)
biomedical domain concepts and associations between them, and (ii) evidences
supporting these associations, with the objective of providing a schema to
improve the publication and description of evidences and biomedical
associations in this domain. The ontology has been successfully evaluated to
ensure that it is free of errors and modelling pitfalls and that it meets the
previously defined functional requirements. Test data coming from a subset of DISNET and
automatic association extractions from texts has been transformed according to
the proposed ontology to create a Knowledge Graph that can be used in real
scenarios, and which has also been used for the evaluation of the presented
ontology.
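As a rough illustration of the concept/association/evidence schema the abstract describes, here is a minimal Python sketch. The class names (Concept, Association, Evidence), the `hasSymptom` relation, and the identifiers are hypothetical stand-ins, not the actual EBOCA vocabulary:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Concept:
    # A biomedical domain concept, e.g. a disease or a symptom.
    uri: str
    label: str

@dataclass(frozen=True)
class Evidence:
    # A piece of evidence supporting an association, e.g. a text
    # fragment extracted from a source document.
    source: str   # provenance, e.g. a DISNET document id (made up here)
    excerpt: str  # the supporting text fragment

@dataclass
class Association:
    # A directed association between two concepts, carrying the
    # evidences that support it.
    subject: Concept
    object: Concept
    relation: str
    evidences: list = field(default_factory=list)

disease = Concept("ex:D001", "Gastroesophageal reflux")
symptom = Concept("ex:S001", "Heartburn")
assoc = Association(disease, symptom, "hasSymptom")
assoc.evidences.append(
    Evidence("disnet:doc42", "GERD commonly presents with heartburn.")
)
print(len(assoc.evidences))  # prints 1
```

In an actual EBOCA Knowledge Graph these objects would be serialized as RDF triples against the published ontology terms rather than held as in-memory dataclasses.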
Related papers
- Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and metadata.
One straightforward solution is to integrate statistical analysis and machine learning.
Numerous papers have been published on embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
- Towards Ontology-Enhanced Representation Learning for Large Language Models [0.18416014644193066]
We propose a novel approach to improve an embedding-Large Language Model (embedding-LLM) of interest by infusing knowledge from a reference ontology.
The linguistic information (i.e. concept synonyms and descriptions) and structural information (i.e. is-a relations) are utilized to compile a comprehensive set of concept definitions.
These concept definitions are then employed to fine-tune the target embedding-LLM using a contrastive learning framework.
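The contrastive fine-tuning step described in this entry can be sketched with a generic InfoNCE-style objective in plain Python; the temperature value, similarity function, and negative-sampling scheme here are illustrative assumptions, not the paper's actual setup:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # InfoNCE-style loss: pull the anchor (e.g. a concept mention
    # embedding) toward its positive (the matching concept definition
    # embedding) and away from negatives (other concepts' definitions).
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)  # subtract the max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))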
arXiv Detail & Related papers (2024-05-30T23:01:10Z)
- A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis [48.84443450990355]
Deep networks have achieved broad success in analyzing natural images, but when applied to medical scans, they often fail in unexpected situations.
We investigate this challenge and focus on model sensitivity to domain shifts, such as data sampled from different hospitals or data confounded by demographic variables such as sex and race, in the context of chest X-rays and skin lesion images.
Taking inspiration from medical training, we propose giving deep networks a prior grounded in explicit medical knowledge communicated in natural language.
arXiv Detail & Related papers (2024-05-23T17:55:02Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling [3.8599767910528917]
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM + CRF performs sequence labelling, modelling dependencies between the words in the text.
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
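The CRF decoding step in a BiLSTM + CRF tagger like the one above can be illustrated with a toy Viterbi decoder; the scores below are made up, whereas a real model would supply neural emission and transition scores:

```python
def viterbi(emissions, transitions, tags):
    # emissions[t][tag]: per-token score for each tag;
    # transitions[prev][tag]: pairwise tag-transition score.
    # Returns the highest-scoring tag sequence, as a CRF layer decodes it.
    score = {tag: emissions[0][tag] for tag in tags}
    back = []
    for t in range(1, len(emissions)):
        new_score, ptr = {}, {}
        for tag in tags:
            best_prev = max(tags, key=lambda p: score[p] + transitions[p][tag])
            new_score[tag] = (score[best_prev]
                              + transitions[best_prev][tag]
                              + emissions[t][tag])
            ptr[tag] = best_prev
        score = new_score
        back.append(ptr)
    # Backtrack from the best final tag.
    last = max(tags, key=lambda tag: score[tag])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

For example, with a strong `B`-to-`B` transition penalty, the decoder prefers a `B` followed by `O` even when per-token scores alone would be ambiguous.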
arXiv Detail & Related papers (2023-12-24T21:45:36Z)
- Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers [24.481854035628434]
Existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts.
We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers.
Our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.
arXiv Detail & Related papers (2023-10-24T09:56:46Z)
- PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z)
- Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text [5.008513565240167]
We propose a cross-domain data integration method that transfers structural knowledge from a general text knowledge base to the medical domain.
We utilize our integration scheme to augment structural resources and generate a large biomedical NED dataset for pretraining.
Our pretrained model with injected structural knowledge achieves state-of-the-art performance on two benchmark medical NED datasets: MedMentions and BC5CDR.
arXiv Detail & Related papers (2021-10-15T17:38:16Z)
- Low Resource Recognition and Linking of Biomedical Concepts from a Large Ontology [30.324906836652367]
PubMed, the best-known database of biomedical papers, relies on human curators to add such annotations.
Our approach achieves new state-of-the-art results for the UMLS in both traditional recognition/linking and semantic indexing-based evaluation.
arXiv Detail & Related papers (2021-01-26T06:41:12Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.