Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study
- URL: http://arxiv.org/abs/2210.10678v3
- Date: Mon, 18 Sep 2023 11:16:48 GMT
- Title: Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study
- Authors: Xin Xu, Xiang Chen, Ningyu Zhang, Xin Xie, Xi Chen, Huajun Chen
- Abstract summary: This paper presents an empirical study to build relation extraction systems in low-resource settings.
We investigate three schemes to evaluate the performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution issue; and (iii) data augmentation techniques and self-training to generate more labeled in-domain data.
- Score: 51.33182775762785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an empirical study to build relation extraction systems
in low-resource settings. Based upon recent pre-trained language models, we
comprehensively investigate three schemes to evaluate the performance in
low-resource settings: (i) different types of prompt-based methods with
few-shot labeled data; (ii) diverse balancing methods to address the
long-tailed distribution issue; (iii) data augmentation techniques and
self-training to generate more labeled in-domain data. We create a benchmark
with 8 relation extraction (RE) datasets covering different languages, domains,
and contexts, and perform extensive comparisons of the proposed schemes and
their combinations. Our experiments illustrate: (i) Though prompt-based tuning is
beneficial in low-resource RE, there is still much potential for improvement,
especially in extracting relations from cross-sentence contexts with multiple
relational triples; (ii) Balancing methods are not always helpful for RE with
a long-tailed distribution; (iii) Data augmentation complements existing
baselines and can bring large performance gains, while self-training may not
consistently improve low-resource RE. Code and datasets are available at
https://github.com/zjunlp/LREBench.
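As a concrete illustration of scheme (i), the sketch below scores relation labels with a masked language model by reading out the probability of a verbalizer token at the mask position. The template, the single-token verbalizer, and the `bert-base-uncased` backbone are illustrative assumptions, not the paper's exact prompt design.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"  # assumption: any masked-LM backbone works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

# Hypothetical verbalizer: each relation label maps to one answer token.
verbalizer = {"founder": "founded", "employee": "joined", "no_relation": "ignored"}
answer_ids = {r: tokenizer.convert_tokens_to_ids(t) for r, t in verbalizer.items()}

def score_relations(context: str, head: str, tail: str) -> dict:
    """Score each relation by the MLM probability of its verbalizer token."""
    prompt = f"{context} {head} {tokenizer.mask_token} {tail}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        probs = model(**inputs).logits[0, mask_pos].softmax(-1)
    return {rel: probs[tid].item() for rel, tid in answer_ids.items()}

print(score_relations("Steve Jobs started Apple in 1976.", "Steve Jobs", "Apple"))
```

In the few-shot regime the same template is kept and the model is fine-tuned with cross-entropy on the verbalizer token, which is what makes prompt-based tuning more sample-efficient than a randomly initialized classification head.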
Related papers
- Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting [21.04933334040135]
We introduce the Self-Prompting framework, a novel method designed to fully harness the embedded RE knowledge within Large Language Models.
Our framework employs a three-stage diversity approach to prompt LLMs, generating multiple synthetic samples that encapsulate specific relations from scratch.
Experimental evaluations on benchmark datasets show our approach outperforms existing LLM-based zero-shot RE methods.
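A minimal sketch of the self-prompting recipe as summarized above: ask an LLM to synthesize labeled sentences per relation, then reuse them as in-context demonstrations for zero-shot RE. Here `llm` is a placeholder for any text-completion client, and the prompt wording and the three-stage diversification are assumptions, not the paper's exact design.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")  # placeholder

RELATIONS = ["place_of_birth", "employer", "spouse"]  # hypothetical label set

def synthesize_samples(relation: str, n: int = 3) -> list[str]:
    prompt = (f"Write {n} different sentences, one per line, that each clearly "
              f"express the relation '{relation}' between two entities, marking "
              f"the entities as <h>head</h> and <t>tail</t>.")
    return [line for line in llm(prompt).splitlines() if line.strip()]

def zero_shot_extract(sentence: str) -> str:
    # Synthetic samples serve as in-context demonstrations for the query.
    demos = "\n".join(s for r in RELATIONS for s in synthesize_samples(r))
    query = (f"{demos}\n\nWhich relation from {RELATIONS} holds between the "
             f"marked entities?\nSentence: {sentence}\nRelation:")
    return llm(query).strip()
```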
arXiv Detail & Related papers (2024-10-02T01:12:54Z) - How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation [7.151108031568037]
This paper constructs low-resource relation extraction datasets in 10 low-resource languages (LRLs) across three regions (Central Asia, Southeast Asia, and the Middle East).
The corpora are constructed by translating the original publicly available English RE datasets (NYT10, FewRel, and CrossRE) with an effective multilingual machine translation model.
Then, we use language-model perplexity (PPL) to filter out low-quality data from the translated datasets.
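A minimal sketch of such a PPL filter, scoring each translated sentence with a causal LM and keeping the low-perplexity ones. The `gpt2` checkpoint and the threshold are placeholders; a real LRL pipeline would need a model covering the target language.

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder scorer
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss                # mean token NLL
    return math.exp(loss.item())

translated = ["A well-formed translated sentence.", "garbled tokens ## ##"]
kept = [s for s in translated if perplexity(s) < 200.0]  # arbitrary threshold
```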
arXiv Detail & Related papers (2024-06-17T03:02:04Z)
- Enhancing Low-Resource Relation Representations through Multi-View Decoupling [21.32064890807893]
We propose a novel prompt-based relation representation method, named MVRE.
MVRE decouples each relation into different perspectives to encompass multi-view relation representations.
Our method achieves state-of-the-art performance in low-resource settings.
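The summary leaves the architecture open, but one hedged reading of "decoupling each relation into different perspectives" is to give every relation several learnable view vectors and aggregate their scores, as in the speculative sketch below (not MVRE's actual design):

```python
import torch
import torch.nn as nn

class MultiViewRelationHead(nn.Module):
    def __init__(self, hidden: int, n_relations: int, n_views: int = 3):
        super().__init__()
        # One learnable "view" vector per (relation, perspective) pair.
        self.views = nn.Parameter(torch.randn(n_relations, n_views, hidden))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, hidden) instance embeddings from the encoder's prompt slots
        scores = torch.einsum("bh,rvh->brv", h, self.views)
        return scores.max(dim=-1).values   # best view per relation: (batch, n_relations)

head = MultiViewRelationHead(hidden=768, n_relations=10)
logits = head(torch.randn(4, 768))         # toy batch
```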
arXiv Detail & Related papers (2023-12-26T14:16:16Z)
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method is data-efficient and consistently outperforms competitive baselines.
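A minimal sketch of one round of this kind of self-training, with placeholder components: pair unlabeled documents with generated queries, corrupt the documents with simple noise, and fold the pairs back into training. The query generator and noise scheme below are stand-ins, not the paper's components.

```python
import random

def generate_query(doc: str) -> str:
    return " ".join(doc.split()[:6]) + "?"      # placeholder query generator

def add_noise(doc: str, drop_prob: float = 0.1) -> str:
    words = [w for w in doc.split() if random.random() > drop_prob]
    return " ".join(words) or doc               # noise: random word deletion

unlabeled_docs = ["Dense retrieval maps queries and documents to vectors ..."]
synthetic_pairs = [(generate_query(d), add_noise(d)) for d in unlabeled_docs]
# Next round: fine-tune the retriever on labeled + synthetic_pairs, then repeat.
```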
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
- Continual Contrastive Finetuning Improves Low-Resource Relation Extraction [34.76128090845668]
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE with self-supervised learning.
We propose to pretrain and finetune the RE model with consistent contrastive learning objectives.
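A minimal sketch of a consistent contrastive objective, here plain InfoNCE over relation embeddings with in-batch negatives; the batch construction and encoder are assumptions, but the same loss can be applied at both pretraining and finetuning time as the summary suggests.

```python
import torch
import torch.nn.functional as F

def info_nce(anchors: torch.Tensor, positives: torch.Tensor, tau: float = 0.07):
    # anchors/positives: (batch, dim); row i of `positives` matches anchor i,
    # and all other rows serve as in-batch negatives.
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.T / tau                      # (batch, batch) similarities
    labels = torch.arange(a.size(0))            # diagonal entries are positives
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 256), torch.randn(8, 256))  # toy embeddings
```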
arXiv Detail & Related papers (2022-12-21T07:30:22Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with distributionally robust optimization (DRO) using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
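A rough sketch of the underlying min-max setup on a toy linear model: a small adversary network parameterizes the likelihood ratio as normalized per-example weights, the model minimizes the weighted loss, and the adversary maximizes it. The architectures and alternating schedule are illustrative assumptions, not the paper's three ideas.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 2)                       # toy task model
adversary = nn.Linear(10, 1)                   # parametric log-ratio network
opt_m = torch.optim.SGD(model.parameters(), lr=0.1)
opt_a = torch.optim.SGD(adversary.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))

for _ in range(10):
    # Model step: minimize the adversarially weighted loss (adversary frozen).
    with torch.no_grad():
        w = torch.softmax(adversary(x).squeeze(-1), dim=0)  # weights sum to 1
    loss_m = (w * F.cross_entropy(model(x), y, reduction="none")).sum()
    opt_m.zero_grad(); loss_m.backward(); opt_m.step()
    # Adversary step: maximize the same objective (model frozen).
    with torch.no_grad():
        per_ex = F.cross_entropy(model(x), y, reduction="none")
    w = torch.softmax(adversary(x).squeeze(-1), dim=0)
    loss_a = -(w * per_ex).sum()
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
```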
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- A Generative Model for Relation Extraction and Classification [23.1277041729626]
We present a novel generative model for relation extraction and classification, which we call GREC.
We explore various encoding representations for the source and target sequences, and design effective schemes that enable GREC to achieve state-of-the-art performance on three benchmark RE datasets.
Our approach can be extended to extract all relation triples from a sentence in one pass.
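A minimal sketch of the generative framing: linearize triples into a target string that a seq2seq model can be trained to emit, plus the inverse parse used at inference. The delimiter scheme is an illustrative assumption, not GREC's exact target format.

```python
Triple = tuple[str, str, str]                   # (head, relation, tail)

def linearize(triples: list[Triple]) -> str:
    """Encode all triples of a sentence as one generation target."""
    return " | ".join(f"<h> {h} <r> {r} <t> {t}" for h, r, t in triples)

def parse(generated: str) -> list[Triple]:
    """Recover triples from the generated string, skipping malformed chunks."""
    triples = []
    for chunk in generated.split(" | "):
        try:
            h, rest = chunk.split(" <r> ")
            r, t = rest.split(" <t> ")
            triples.append((h.removeprefix("<h> ").strip(), r.strip(), t.strip()))
        except ValueError:
            continue
    return triples

target = linearize([("Steve Jobs", "founder_of", "Apple")])
assert parse(target) == [("Steve Jobs", "founder_of", "Apple")]
```

One-pass extraction of all triples then amounts to generating the whole linearized sequence and parsing it back.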
arXiv Detail & Related papers (2022-02-26T21:17:18Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
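A rough sketch of the relational ingredient of such distillation: align the pairwise similarity structure of teacher and student representations within a batch. The hierarchical and cross-domain weighting that gives HRKD its name would sit on top of this term and is omitted here.

```python
import torch
import torch.nn.functional as F

def relational_kd_loss(t_feats: torch.Tensor, s_feats: torch.Tensor):
    # t_feats/s_feats: (batch, dim) teacher and student representations;
    # dims may differ because only batch-level similarity matrices are compared.
    t_rel = F.normalize(t_feats, dim=-1) @ F.normalize(t_feats, dim=-1).T
    s_rel = F.normalize(s_feats, dim=-1) @ F.normalize(s_feats, dim=-1).T
    return F.mse_loss(s_rel, t_rel)             # match pairwise relations

loss = relational_kd_loss(torch.randn(16, 768), torch.randn(16, 256))
```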
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
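A minimal sketch of supervising intermediate steps as auxiliary losses, with hypothetical entity-typing and evidence-selection heads; SAIS defines a broader, carefully designed task suite, so treat the heads and weights below as placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden = 768
rel_head = nn.Linear(hidden, 30)    # main task: relation classification
type_head = nn.Linear(hidden, 7)    # auxiliary step: entity typing
evid_head = nn.Linear(hidden, 1)    # auxiliary step: evidence-sentence selection

# Toy encoder outputs for entity pairs, entities, and sentences.
h_pair, h_ent, h_sent = (torch.randn(4, hidden) for _ in range(3))
y_rel, y_type = torch.randint(0, 30, (4,)), torch.randint(0, 7, (4,))
y_evid = torch.randint(0, 2, (4, 1)).float()

loss = (F.cross_entropy(rel_head(h_pair), y_rel)
        + 0.5 * F.cross_entropy(type_head(h_ent), y_type)       # weight assumed
        + 0.5 * F.binary_cross_entropy_with_logits(evid_head(h_sent), y_evid))
```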
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
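A minimal sketch of one scheme commonly studied in this setting, nearest-prototype token classification, where each entity type's prototype is the mean embedding of its support tokens; the paper compares several schemes, so this illustrates the few-shot setup rather than its method.

```python
import torch

def prototypes(support_emb: torch.Tensor, support_tags: torch.Tensor, n_types: int):
    # support_emb: (n_tokens, dim); support_tags: (n_tokens,) integer tag ids
    return torch.stack([support_emb[support_tags == t].mean(0)
                        for t in range(n_types)])

def tag_tokens(query_emb: torch.Tensor, protos: torch.Tensor) -> torch.Tensor:
    return torch.cdist(query_emb, protos).argmin(dim=-1)  # nearest prototype

protos = prototypes(torch.randn(20, 64), torch.arange(20) % 3, n_types=3)
pred = tag_tokens(torch.randn(5, 64), protos)              # one tag per token
```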
arXiv Detail & Related papers (2020-12-29T23:43:16Z)