Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical
Relation Extraction?
- URL: http://arxiv.org/abs/2212.10784v3
- Date: Thu, 19 Oct 2023 05:46:10 GMT
- Title: Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical
Relation Extraction?
- Authors: Jiashu Xu, Mingyu Derek Ma, Muhao Chen
- Abstract summary: Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels.
Existing approaches, which treat biomedical RE as a multi-class classification task, often result in poor generalization in low-resource settings.
We present NBR, which reformulates biomedical RE as a natural language inference (NLI) task through indirect supervision.
- Score: 39.17663642263077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two key obstacles in biomedical relation extraction (RE) are the scarcity of
annotations and the prevalence of instances without explicitly pre-defined
labels due to low annotation coverage. Existing approaches, which treat
biomedical RE as a multi-class classification task, often generalize poorly in
low-resource settings and cannot make selective predictions on unknown cases,
instead guessing from seen relations, which hinders their applicability. We
present NBR, which reformulates biomedical RE as natural language inference
(NLI) through indirect supervision. By converting relations to natural
language hypotheses, NBR is
capable of exploiting semantic cues to alleviate annotation scarcity. By
incorporating a ranking-based loss that implicitly calibrates abstinent
instances, NBR learns a clearer decision boundary and is instructed to abstain
on uncertain instances. Extensive experiments on three widely-used biomedical
RE benchmarks, namely ChemProt, DDI and GAD, verify the effectiveness of NBR in
both full-set and low-resource regimes. Our analysis demonstrates that indirect
supervision benefits biomedical RE even when a domain gap exists, and combining
NLI knowledge with biomedical knowledge leads to the best performance gains.
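The abstract's core recipe (verbalize each candidate relation as a natural-language hypothesis, rank hypotheses by entailment score, and abstain when no relation clearly beats the "no relation" hypothesis) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the templates, the stand-in scorer, and the margin values are assumptions, and a real system would score each (premise, hypothesis) pair with a pretrained NLI model.

```python
# Minimal sketch of NLI-style relation extraction with abstention,
# in the spirit of NBR. The entailment scorer is passed in as a
# function so a toy scorer can stand in for a real NLI model.

# Hypothetical verbalization templates: relation label -> hypothesis.
TEMPLATES = {
    "CPR:3": "{chem} activates {gene}.",
    "CPR:4": "{chem} inhibits {gene}.",
    "none":  "{chem} has no relation to {gene}.",
}

def verbalize(relation, chem, gene):
    """Turn a relation label into a natural-language hypothesis."""
    return TEMPLATES[relation].format(chem=chem, gene=gene)

def predict(premise, chem, gene, score_fn, margin=0.1):
    """Rank every relation hypothesis by entailment score; return
    'none' (abstain) unless the best relation beats the 'none'
    hypothesis by at least `margin`."""
    scored = {
        rel: score_fn(premise, verbalize(rel, chem, gene))
        for rel in TEMPLATES
    }
    best = max((r for r in scored if r != "none"), key=scored.get)
    if scored[best] - scored["none"] < margin:
        return "none"
    return best

def ranking_loss(scores, gold, margin=1.0):
    """Pairwise margin ranking loss: the gold hypothesis should
    outscore every other hypothesis by at least `margin`."""
    return sum(
        max(0.0, margin - (scores[gold] - s))
        for rel, s in scores.items() if rel != gold
    )
```

With a toy scorer that gives 0.9 to any hypothesis mentioning "inhibits" and 0.2 otherwise, `predict("Aspirin inhibits COX-2.", "aspirin", "COX-2", toy_score)` returns `"CPR:4"`, while a scorer that gives every hypothesis the same score makes the model abstain with `"none"`.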
Related papers
- BioNCERE: Non-Contrastive Enhancement For Relation Extraction In Biomedical Texts [0.0]
State-of-the-art models for relation extraction (RE) in the biomedical domain may suffer from the anisotropy problem.
Contrastive learning methods can reduce this anisotropy phenomenon and also help to avoid class collapse in classification problems.
BioNCERE uses transfer learning and non-contrastive learning to avoid full or dimensional collapse as well as bypass overfitting.
arXiv Detail & Related papers (2024-10-31T02:51:56Z) - Learning to Denoise Biomedical Knowledge Graph for Robust Molecular Interaction Prediction [50.7901190642594]
We propose BioKDN (Biomedical Knowledge Graph Denoising Network) for robust molecular interaction prediction.
BioKDN refines the reliable structure of local subgraphs by denoising noisy links in a learnable manner.
It maintains consistent and robust semantics by smoothing relations around the target interaction.
arXiv Detail & Related papers (2023-12-09T07:08:00Z) - Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
However, they still struggle with accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z) - Extracting Biomedical Factual Knowledge Using Pretrained Language Model
and Electronic Health Record Context [7.7971830917251275]
We use prompt methods to extract knowledge from Language Models (LMs) as new knowledge bases (LMs-as-KBs).
We specifically add EHR notes as context to the prompt to improve the lower bound in the biomedical domain.
Our experiments show that the knowledge possessed by those language models can distinguish correct knowledge from noisy knowledge in the EHR notes.
arXiv Detail & Related papers (2022-08-26T00:01:26Z) - BioRED: A Comprehensive Biomedical Relation Extraction Dataset [6.915371362219944]
We present BioRED, a first-of-its-kind biomedical RE corpus with multiple entity types and relation pairs.
We label each relation as describing either a novel finding or previously known background knowledge, enabling automated algorithms to differentiate between novel and background information.
Our results show that while existing approaches can reach high performance on the NER task, there is much room for improvement for the RE task.
arXiv Detail & Related papers (2022-04-08T19:23:49Z) - Fine-Tuning Large Neural Language Models for Biomedical Natural Language
Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z) - Gradient Imitation Reinforcement Learning for Low Resource Relation
Extraction [52.63803634033647]
Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled corpora when human annotation is scarce.
We develop a Gradient Imitation Reinforcement Learning method to encourage pseudo-labeled data to imitate the gradient descent direction on labeled data.
We also propose a framework called GradLRE, which handles two major scenarios in low-resource relation extraction.
arXiv Detail & Related papers (2021-09-14T03:51:15Z) - BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for
Text Classification [1.14219428942199]
We propose BBAEG (Biomedical BERT-based Adversarial Example Generation), a black-box attack algorithm for biomedical text classification.
We demonstrate that BBAEG performs a stronger attack with better language fluency and semantic coherence compared to prior work.
arXiv Detail & Related papers (2021-04-05T05:32:56Z) - Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions
Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets.
A Wasserstein quantified transferability framework is developed to highlight wide-range transferable contextual dependencies.
A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z) - Automatic Extraction of Ranked SNP-Phenotype Associations from
Literature through Detecting Neutral Candidates, Negation and Modality Markers [0.0]
There is no available method for extracting SNP-phenotype associations from text.
The experiments show that negation cues and scope as well as detecting neutral candidates can be employed for implementing a superior relation extraction method.
A modality-based approach is proposed to estimate the confidence level of the extracted association.
arXiv Detail & Related papers (2020-12-02T00:03:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.