Towards Realistic Low-resource Relation Extraction: A Benchmark with
Empirical Baseline Study
- URL: http://arxiv.org/abs/2210.10678v3
- Date: Mon, 18 Sep 2023 11:16:48 GMT
- Title: Towards Realistic Low-resource Relation Extraction: A Benchmark with
Empirical Baseline Study
- Authors: Xin Xu, Xiang Chen, Ningyu Zhang, Xin Xie, Xi Chen, Huajun Chen
- Abstract summary: This paper presents an empirical study to build relation extraction systems in low-resource settings.
We investigate three schemes to evaluate the performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution issue; and (iii) data augmentation technologies and self-training to generate more labeled in-domain data.
- Score: 51.33182775762785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an empirical study to build relation extraction systems
in low-resource settings. Based upon recent pre-trained language models, we
comprehensively investigate three schemes to evaluate the performance in
low-resource settings: (i) different types of prompt-based methods with
few-shot labeled data; (ii) diverse balancing methods to address the
long-tailed distribution issue; (iii) data augmentation technologies and
self-training to generate more labeled in-domain data. We create a benchmark
with 8 relation extraction (RE) datasets covering different languages, domains
and contexts and perform extensive comparisons over the proposed schemes with
combinations. Our experiments illustrate: (i) Though prompt-based tuning is
beneficial in low-resource RE, there is still much potential for improvement,
especially in extracting relations from cross-sentence contexts with multiple
relational triples; (ii) Balancing methods are not always helpful for RE with
long-tailed distribution; (iii) Data augmentation complements existing
baselines and can bring much performance gain, while self-training may not
consistently achieve advancement to low-resource RE. Code and datasets are in
https://github.com/zjunlp/LREBench.
Related papers
- Diversity Over Quantity: A Lesson From Few Shot Relation Classification [62.66895901654023]
We show that training on a diverse set of relations significantly enhances a model's ability to generalize to unseen relations.
We introduce REBEL-FS, a new FSRC benchmark that incorporates an order of magnitude more relation types than existing datasets.
arXiv Detail & Related papers (2024-12-06T21:41:01Z) - Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting [21.04933334040135]
We introduce the Self-Prompting framework, a novel method designed to fully harness the embedded RE knowledge within Large Language Models.
Our framework employs a three-stage diversity approach to prompt LLMs, generating multiple synthetic samples that encapsulate specific relations from scratch.
Experimental evaluations on benchmark datasets show our approach outperforms existing LLM-based zero-shot RE methods.
arXiv Detail & Related papers (2024-10-02T01:12:54Z) - A Distribution-Aware Flow-Matching for Generating Unstructured Data for Few-Shot Reinforcement Learning [1.0709300917082865]
We introduce a distribution-aware flow matching approach to generate synthetic unstructured data for few-shot reinforcement learning.
Our approach addresses key challenges in traditional model-based RL, such as overfitting and data correlation.
Results demonstrate that our method achieves stable convergence in terms of maximum Q-value while enhancing frame rates by 30% in the initial timestamps.
arXiv Detail & Related papers (2024-09-21T15:50:59Z) - How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation [7.151108031568037]
This paper constructs low-resource relation extraction datasets in 10 low-resource languages (LRLs) in three regions (Central Asia, Southeast Asia and Middle East)
The corpora are constructed by translating the original publicly available English RE datasets (NYT10, FewRel and CrossRE) using an effective multilingual machine translation.
Then, we use the language perplexity (PPL) to filter out the low-quality data from the translated datasets.
arXiv Detail & Related papers (2024-06-17T03:02:04Z) - Enhancing Low-Resource Relation Representations through Multi-View Decoupling [21.32064890807893]
We propose a novel prompt-based relation representation method, named MVRE.
MVRE decouples each relation into different perspectives to encompass multi-view relation representations.
Our method can achieve state-of-the-art in low-resource settings.
arXiv Detail & Related papers (2023-12-26T14:16:16Z) - Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z) - Continual Contrastive Finetuning Improves Low-Resource Relation
Extraction [34.76128090845668]
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE by self-supervised learning.
We propose to pretrain and finetune the RE model using consistent objectives of contrastive learning.
arXiv Detail & Related papers (2022-12-21T07:30:22Z) - HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain
Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performances compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z) - SAIS: Supervising and Augmenting Intermediate Steps for Document-Level
Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.