How to Unleash the Power of Large Language Models for Few-shot Relation
Extraction?
- URL: http://arxiv.org/abs/2305.01555v4
- Date: Fri, 9 Jun 2023 15:59:18 GMT
- Title: How to Unleash the Power of Large Language Models for Few-shot Relation
Extraction?
- Authors: Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Abstract summary: In this paper, we investigate principal methodologies, in-context learning and data generation, for few-shot relation extraction via GPT-3.5.
We observe that in-context learning can achieve performance on par with previous prompt learning approaches, and data generation with the large language model can boost previous solutions to obtain new state-of-the-art few-shot results.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scaling language models has revolutionized a wide range of NLP tasks, yet few
studies have comprehensively explored few-shot relation extraction with large language
models. In this paper, we investigate principal methodologies, in-context
learning and data generation, for few-shot relation extraction via GPT-3.5
through exhaustive experiments. To enhance few-shot performance, we further
propose task-related instructions and schema-constrained data generation. We
observe that in-context learning can achieve performance on par with previous
prompt learning approaches, and data generation with the large language model
can boost previous solutions to obtain new state-of-the-art few-shot results on
four widely-studied relation extraction datasets. We hope our work can inspire
future research on the capabilities of large language models in few-shot
relation extraction. Code is available at
https://github.com/zjunlp/DeepKE/tree/main/example/llm.
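The two approaches named in the abstract, in-context learning with task-related instructions and schema-constrained data generation, can be illustrated with a minimal prompt-building sketch. This is not the authors' implementation: the prompt wording, the `Example` type, the `SCHEMA` mapping, the relation labels, and both function names are hypothetical and chosen only for illustration. The sketch only assembles prompt strings; sending them to a model is left out.

```python
from dataclasses import dataclass


@dataclass
class Example:
    text: str
    head: str
    tail: str
    relation: str


# Hypothetical relation labels mapped to (head type, tail type); illustration only.
SCHEMA = {
    "org:founded_by": ("ORGANIZATION", "PERSON"),
    "per:employee_of": ("PERSON", "ORGANIZATION"),
}


def build_icl_prompt(demos, query_text, head, tail, relations):
    """Assemble a task instruction, the labeled demonstrations, and the query."""
    lines = [
        "Task: decide which relation holds between the head and tail entity.",
        "Choose one of: " + ", ".join(relations) + ".",
        "",
    ]
    for d in demos:
        lines += [
            f"Context: {d.text}",
            f"Head: {d.head}",
            f"Tail: {d.tail}",
            f"Relation: {d.relation}",
            "",
        ]
    # The query repeats the demonstration layout and leaves the label open.
    lines += [f"Context: {query_text}", f"Head: {head}", f"Tail: {tail}", "Relation:"]
    return "\n".join(lines)


def build_generation_prompt(relation, schema, demos):
    """Ask for a new sample whose entity types follow the relation's schema."""
    head_type, tail_type = schema[relation]
    lines = [
        f"Generate one new sample for the relation '{relation}'.",
        f"The head entity must be a {head_type} and the tail entity a {tail_type}.",
        "",
    ]
    # Only demonstrations of the target relation are shown as seeds.
    for d in demos:
        if d.relation == relation:
            lines += [f"Context: {d.text}", f"Head: {d.head}", f"Tail: {d.tail}", ""]
    lines.append("Context:")
    return "\n".join(lines)
```

In this sketch, the few-shot demonstrations serve two roles: they are shown directly to the model for classification, and they seed the generation prompt whose outputs could then be added to the training data of a smaller relation extraction model.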
Related papers
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models [26.523153535336725]
Document-level Relation Extraction (DocRE) aims to extract relations from a long context.
We propose a method integrating a large language model (LLM) and a natural language inference (NLI) module to generate relation triples.
We demonstrate the effectiveness of our approach by introducing an enhanced dataset known as DocGNRE.
arXiv Detail & Related papers (2023-11-13T13:10:44Z)
- RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
- Relational Extraction on Wikipedia Tables using Convolutional and Memory Networks [6.200672130699805]
Relation extraction (RE) is the task of extracting relations between entities in text.
We introduce a new model consisting of Convolutional Neural Network (CNN) and Bidirectional-Long Short Term Memory (BiLSTM) network to encode entities.
arXiv Detail & Related papers (2023-07-11T22:36:47Z)
- ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval [22.882301169283323]
We propose a retrieval-enhanced framework to create training data from a general-domain unlabeled corpus.
Experiments on nine datasets demonstrate that REGEN achieves 4.3% gain over the strongest baselines and saves around 70% of the time compared to baselines using large NLG models.
arXiv Detail & Related papers (2023-05-18T04:30:09Z)
- DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Super-Prompting: Utilizing Model-Independent Contextual Data to Reduce Data Annotation Required in Visual Commonsense Tasks [3.42658286826597]
We analyze different prompt-based fine-tuning techniques to improve results on both language and multimodal causal transformer models.
Our results show that by simple model-agnostic prompt-based fine-tuning, comparable results can be reached by only using 35%-40% of the fine-tuning training dataset.
arXiv Detail & Related papers (2022-04-25T18:56:55Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)