Grasping the Essentials: Tailoring Large Language Models for Zero-Shot
Relation Extraction
- URL: http://arxiv.org/abs/2402.11142v1
- Date: Sat, 17 Feb 2024 00:20:06 GMT
- Title: Grasping the Essentials: Tailoring Large Language Models for Zero-Shot
Relation Extraction
- Authors: Sizhe Zhou, Yu Meng, Bowen Jin, Jiawei Han
- Abstract summary: Relation extraction (RE) aims to identify semantic relationships between entities mentioned in texts.
Few-shot learning settings may offer incomplete and biased supervision for understanding target relation semantics.
We study the definition-only zero-shot RE setting, where only relation definitions expressed in natural language are used to train a RE model.
- Score: 36.627683488532234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Relation extraction (RE), a crucial task in NLP, aims to identify semantic
relationships between entities mentioned in texts. Despite significant
advancements in this field, existing models typically rely on extensive
annotated data for training, which can be both costly and time-consuming to
acquire. Moreover, these models often struggle to adapt to new or unseen
relationships. In contrast, few-shot learning settings, which aim to reduce
annotation requirements, may offer incomplete and biased supervision for
understanding target relation semantics, leading to degraded and unstable
performance. To provide the model with accurate and explicit descriptions of
the relation types while minimizing the annotation requirements, we
study the definition-only zero-shot RE setting, where only relation definitions
expressed in natural language are used to train a RE model. Motivated by the
strong synthetic data generation power of LLMs, we propose a framework REPaL
which consists of three stages: (1) We utilize LLMs to generate initial seed
instances based on relation definitions and an unlabeled corpus. (2) We
fine-tune a bidirectional Small Language Model (SLM) using these initial seeds
to learn the relations for the target domain. (3) We enhance pattern coverage
and mitigate bias resulting from the limited number of initial seeds by
incorporating feedback acquired from the SLM's predictions on the unlabeled corpus. To
accomplish this, we leverage the multi-turn conversation ability of LLMs to
generate new instances in follow-up dialogues. Experiments on two datasets show
that REPaL achieves better zero-shot performance by large margins over baseline
methods.
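The three-stage pipeline described in the abstract can be pictured as a small loop: seed generation, SLM fine-tuning, and dialogue-based refinement. The sketch below is a minimal illustration assuming a chat-style LLM exposed as a plain callable; the helper names, prompts, and the stubbed SLM are hypothetical and are not the authors' released code.

```python
# Minimal sketch of the three-stage REPaL loop, under the assumptions stated above.
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]                 # {"role": ..., "content": ...}
LLM = Callable[[List[Message]], str]     # multi-turn chat model: history -> reply


def generate_seeds(llm: LLM, relation_def: str, corpus: List[str],
                   k: int = 10) -> Tuple[List[str], List[Message]]:
    """Stage 1: prompt the LLM with the relation definition (grounded by a few
    unlabeled sentences) to synthesize initial positive seed instances."""
    prompt = (
        f"Relation definition: {relation_def}\n"
        "Sample sentences from the target corpus:\n" + "\n".join(corpus[:5]) + "\n"
        f"Write {k} new sentences that express this relation, one per line."
    )
    history: List[Message] = [{"role": "user", "content": prompt}]
    seeds = [s for s in llm(history).splitlines() if s.strip()]
    history.append({"role": "assistant", "content": "\n".join(seeds)})
    return seeds, history


def fine_tune_slm(seeds: List[str]) -> Callable[[str], float]:
    """Stage 2: fine-tune a bidirectional small language model (e.g. a BERT-style
    classifier) on the seeds; stubbed here to return a relation score."""
    def score(sentence: str) -> float:
        return 0.5  # placeholder for the fine-tuned SLM's predicted probability
    return score


def refine_with_feedback(llm: LLM, history: List[Message], corpus: List[str],
                         slm_score: Callable[[str], float], k: int = 10) -> List[str]:
    """Stage 3: surface the SLM's most uncertain corpus predictions in a
    follow-up turn of the same conversation to broaden pattern coverage."""
    uncertain = sorted(corpus, key=lambda s: abs(slm_score(s) - 0.5))[:5]
    followup = (
        "The current model is unsure about these sentences:\n" + "\n".join(uncertain) +
        f"\nGenerate {k} more diverse instances of the relation covering such cases."
    )
    history.append({"role": "user", "content": followup})
    return [s for s in llm(history).splitlines() if s.strip()]
```

Iterating stages 2 and 3 (re-tuning the SLM on the enlarged seed set after each follow-up turn) mirrors the feedback loop the abstract describes.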
Related papers
- Navigating Semantic Relations: Challenges for Language Models in Abstract Common-Sense Reasoning [5.4141465747474475]
Large language models (LLMs) have achieved remarkable performance in generating human-like text and solving problems of moderate complexity.
We systematically evaluate abstract common-sense reasoning in LLMs using the ConceptNet knowledge graph.
arXiv Detail & Related papers (2025-02-19T20:20:24Z)
- Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models [60.00178316095646]
Sentence embedding is essential for many NLP tasks, with contrastive learning methods achieving strong performance using datasets like NLI.
Recent studies leverage large language models (LLMs) to generate sentence pairs, reducing annotation dependency.
We propose a method for controlling the generation direction of LLMs in the latent space. Unlike unconstrained generation, the controlled approach ensures meaningful semantic divergence.
Experiments on multiple benchmarks demonstrate that our method achieves new SOTA performance with a modest cost in ranking sentence synthesis.
arXiv Detail & Related papers (2025-02-19T12:07:53Z)
- Relation Extraction with Fine-Tuned Large Language Models in Retrieval Augmented Generation Frameworks [0.0]
Relation Extraction (RE) is crucial for converting unstructured data into structured formats like Knowledge Graphs (KGs).
Recent studies leveraging pre-trained language models (PLMs) have shown significant success in this area.
This work explores the performance of fine-tuned LLMs and their integration into the Retrieval Augmented Generation (RAG)-based RE approach.
arXiv Detail & Related papers (2024-06-20T21:27:57Z)
- Factual Dialogue Summarization via Learning from Large Language Models [35.63037083806503]
Large language model (LLM)-based automatic text summarization models generate more factually consistent summaries.
We employ zero-shot learning to extract symbolic knowledge from LLMs, generating factually consistent (positive) and inconsistent (negative) summaries.
Our approach achieves better factual consistency while maintaining coherence, fluency, and relevance, as confirmed by various automatic evaluation metrics.
arXiv Detail & Related papers (2024-06-20T20:03:37Z)
- Learning from Semi-Factuals: A Debiased and Semantic-Aware Framework for Generalized Relation Discovery [12.716874398564482]
Generalized Relation Discovery (GRD) aims to identify unlabeled instances in existing pre-defined relations or discover novel relations.
We propose a novel framework, SFGRD, for this task by learning from semi-factuals in two stages.
SFGRD surpasses state-of-the-art models in terms of accuracy by 2.36%$\sim$5.78% and cosine similarity by 32.19%$\sim$84.45%.
arXiv Detail & Related papers (2024-01-12T02:38:55Z)
- Exploiting Contextual Target Attributes for Target Sentiment Classification [53.30511968323911]
Existing PTLM-based models for TSC can be categorized into two groups: 1) fine-tuning-based models that adopt PTLM as the context encoder; 2) prompting-based models that transfer the classification task to the text/word generation task.
We present a new perspective of leveraging PTLM for TSC: simultaneously leveraging the merits of both language modeling and explicit target-context interactions via contextual target attributes.
arXiv Detail & Related papers (2023-12-21T11:45:28Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Improving Distantly Supervised Relation Extraction by Natural Language Inference [9.181270251524866]
We propose a novel DSRE-NLI framework, which considers both distant supervision from existing knowledge bases and indirect supervision from pretrained language models for other tasks.
DSRE-NLI energizes an off-the-shelf natural language inference (NLI) engine with a semi-automatic relation verbalization (SARV) mechanism to provide indirect supervision.
With two simple and effective data consolidation strategies, the quality of training data is substantially improved.
arXiv Detail & Related papers (2022-07-31T02:48:34Z)
- RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction [65.4337085607711]
We introduce the task setting of Zero-Shot Relation Triplet Extraction (ZeroRTE).
Given an input sentence, each extracted triplet consists of the head entity, relation label, and tail entity where the relation label is not seen at the training stage.
We propose to synthesize relation examples by prompting language models to generate structured texts.
arXiv Detail & Related papers (2022-03-17T05:55:14Z)
- Automatically Generating Counterfactuals for Relation Extraction [18.740447044960796]
Relation extraction (RE) is a fundamental task in natural language processing.
Current deep neural models have achieved high accuracy but are easily affected by spurious correlations.
We develop a novel approach to derive contextual counterfactuals for entities.
arXiv Detail & Related papers (2022-02-22T04:46:10Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Syntax-Enhanced Pre-trained Model [49.1659635460369]
We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa.
Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from discrepancy between the two stages.
We present a model that utilizes the syntax of text in both pre-training and fine-tuning stages.
arXiv Detail & Related papers (2020-12-28T06:48:04Z)
- Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction [84.64435075778988]
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
arXiv Detail & Related papers (2020-11-27T06:21:12Z)