Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction?
- URL: http://arxiv.org/abs/2112.02955v1
- Date: Thu, 25 Nov 2021 10:27:10 GMT
- Title: Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction?
- Authors: Anfu Tang (LISN), Louise Deléger, Robert Bossy, Pierre Zweigenbaum (LISN), Claire Nédellec
- Abstract summary: The DrugProt track at BioCreative VII provides a manually-annotated corpus for the development and evaluation of relation extraction systems.
We describe the ensemble system that we used for our submission, which combines predictions of fine-tuned bioBERT, sciBERT and const-bioBERT models by majority voting.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, many studies have been conducted on relation extraction.
The DrugProt track at BioCreative VII provides a manually-annotated corpus for
the development and evaluation of relation extraction systems, in which
interactions between chemicals and genes are studied. We describe the ensemble
system that we used for our submission, which combines the predictions of
fine-tuned bioBERT, sciBERT and const-bioBERT models by majority voting. We
specifically tested the contribution of syntactic information to relation
extraction with BERT. We observed that adding constituent-based syntactic
information to BERT improved precision but decreased recall, since relations
rarely seen in the training set were less likely to be predicted by BERT
models in which the syntactic information is infused. Our code is available
online [https://github.com/Maple177/drugprot-relation-extraction].
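The ensemble step is simple enough to sketch. Below is a minimal, hypothetical illustration of majority voting over per-instance label predictions from the three fine-tuned models; the label names and `predict`-style inputs are assumptions for illustration, not the authors' actual interface (see their repository for the real implementation).

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-instance label predictions from several models.

    predictions_per_model: list of lists, one inner list of labels per model,
    all aligned on the same candidate chemical-gene pairs.
    """
    ensembled = []
    for labels in zip(*predictions_per_model):
        # Pick the label predicted by the most models; ties fall back to
        # the first (e.g., most-trusted) model's prediction.
        top_label, top_count = Counter(labels).most_common(1)[0]
        ensembled.append(top_label if top_count > 1 else labels[0])
    return ensembled

# Hypothetical usage with three models' outputs on three candidate pairs:
biobert_preds = ["INHIBITOR", "NONE", "ACTIVATOR"]
scibert_preds = ["INHIBITOR", "AGONIST", "NONE"]
const_biobert_preds = ["NONE", "AGONIST", "NONE"]
print(majority_vote([biobert_preds, scibert_preds, const_biobert_preds]))
# -> ['INHIBITOR', 'AGONIST', 'NONE']
```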
Related papers
- Extracting Protein-Protein Interactions (PPIs) from Biomedical Literature using Attention-based Relational Context Information [5.456047952635665]
This work presents a unified, multi-source PPI corpus with vetted interaction definitions, augmented by binary interaction type labels.
A Transformer-based deep learning method exploits the relational context of entities to build relation representations, improving relation classification performance.
The model's performance is evaluated on four widely studied biomedical relation extraction datasets.
arXiv Detail & Related papers (2024-03-08T01:43:21Z)
- Learning to Denoise Biomedical Knowledge Graph for Robust Molecular Interaction Prediction [50.7901190642594]
We propose BioKDN (Biomedical Knowledge Graph Denoising Network) for robust molecular interaction prediction.
BioKDN refines the reliable structure of local subgraphs by denoising noisy links in a learnable manner.
It maintains consistent and robust semantics by smoothing relations around the target interaction.
arXiv Detail & Related papers (2023-12-09T07:08:00Z)
- BioREx: Improving Biomedical Relation Extraction by Leveraging Heterogeneous Datasets [7.7587371896752595]
Biomedical relation extraction (RE) is a central task in biomedical natural language processing (NLP) research.
We present a novel framework for systematically addressing the data heterogeneity of individual datasets and combining them into a large dataset.
Our evaluation shows that BioREx achieves significantly higher performance than the benchmark system trained on individual datasets.
arXiv Detail & Related papers (2023-06-19T22:48:18Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment, offering better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- CU-UD: text-mining drug and chemical-protein interactions with ensembles of BERT-based models [12.08949974675794]
The BioCreative VII Track 1 DrugProt task aims to promote the development and evaluation of systems that can automatically detect relations between chemical compounds/drugs and genes/proteins in PubMed abstracts.
We describe our submission, an ensemble system combining multiple BERT-based language models.
Our system obtained 0.7708 in precision and 0.7770 in recall, for an F1 score of 0.7739, demonstrating the effectiveness of using ensembles of BERT-based language models for automatically detecting relations between chemicals and proteins.
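For reference, the reported F1 is simply the harmonic mean of precision and recall, which a quick check confirms:

```python
precision, recall = 0.7708, 0.7770
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.7739
```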
arXiv Detail & Related papers (2021-11-11T13:55:21Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- D-REX: Dialogue Relation Extraction with Explanations [65.3862263565638]
This work focuses on extracting explanations that indicate that a relation exists while using only partially labeled data.
We propose our model-agnostic framework, D-REX, a policy-guided semi-supervised algorithm that explains and ranks relations.
We find that about 90% of the time, human annotators prefer D-REX's explanations over a strong BERT-based joint relation extraction and explanation model.
arXiv Detail & Related papers (2021-09-10T22:30:48Z)
- Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction [13.354066085659198]
Contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data.
In this work, we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation extraction.
The experimental results on three relation extraction benchmark datasets demonstrate that our method can improve the BERT model representation and achieve state-of-the-art performance.
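The paper's exact augmentation scheme is not given here, but the core contrastive idea is standard: pull two views of the same sentence together in representation space and push other sentences apart. A minimal NT-Xent-style loss sketch, assuming paired embeddings from two augmented views (not the authors' code):

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent contrastive loss over a batch of paired sentence embeddings.

    z1, z2: (batch, dim) embeddings of two augmented views of the same texts.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)            # (2B, dim)
    sim = z @ z.t() / temperature             # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))         # exclude self-similarity
    batch = z1.size(0)
    # The positive for row i is its other view at index i+B (mod 2B).
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)])
    return F.cross_entropy(sim, targets)

# Hypothetical usage: embeddings from, e.g., a BERT [CLS] head.
loss = nt_xent_loss(torch.randn(8, 768), torch.randn(8, 768))
```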
arXiv Detail & Related papers (2021-04-28T17:50:24Z)
- Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction [84.64435075778988]
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
arXiv Detail & Related papers (2020-11-27T06:21:12Z)
- Investigation of BERT Model on Biomedical Relation Extraction Based on Revised Fine-tuning Mechanism [2.8881198461098894]
We investigate a method that utilizes all layers of the BERT model during fine-tuning.
In addition, further analysis indicates that the key knowledge about relations can be learned from the last layer of the BERT model.
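One common way to use all layers during fine-tuning is to pool the [CLS] vector from every encoder layer rather than only the last one, e.g., via a learned weighted sum. A sketch with Hugging Face transformers; the softmax weighting here is an assumption, not necessarily the paper's exact mechanism:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

inputs = tokenizer("Aspirin inhibits COX-1.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states: tuple of (num_layers + 1) tensors, each (batch, seq, dim);
# stack the [CLS] vector from every layer, skipping the embedding layer.
cls_per_layer = torch.stack([h[:, 0] for h in outputs.hidden_states[1:]], dim=1)

# Softmax-normalized per-layer weights (would be learned during fine-tuning).
layer_weights = torch.softmax(torch.zeros(cls_per_layer.size(1)), dim=0)
pooled = (layer_weights[None, :, None] * cls_per_layer).sum(dim=1)
print(pooled.shape)  # torch.Size([1, 768])
```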
arXiv Detail & Related papers (2020-11-01T01:47:16Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)