Zero-shot Triplet Extraction by Template Infilling
- URL: http://arxiv.org/abs/2212.10708v2
- Date: Wed, 20 Sep 2023 05:12:27 GMT
- Title: Zero-shot Triplet Extraction by Template Infilling
- Authors: Bosung Kim, Hayate Iso, Nikita Bhutani, Estevam Hruschka, Ndapa Nakashole, Tom Mitchell
- Abstract summary: Triplet extraction aims to extract pairs of entities and their corresponding relations from unstructured text.
We show that by reducing triplet extraction to a template infilling task over a pre-trained language model (LM), we can equip the extraction model with zero-shot learning capabilities.
We propose a novel framework, ZETT, that aligns the task objective to the pre-training objective of generative transformers to generalize to unseen relations.
- Score: 13.295751492744081
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of triplet extraction aims to extract pairs of entities and their
corresponding relations from unstructured text. Most existing methods train an
extraction model on training data involving specific target relations, and are
incapable of extracting new relations that were not observed at training time.
Generalizing the model to unseen relations typically requires fine-tuning on
synthetic training data which is often noisy and unreliable. We show that by
reducing triplet extraction to a template infilling task over a pre-trained
language model (LM), we can equip the extraction model with zero-shot learning
capabilities and eliminate the need for additional training data. We propose a
novel framework, ZETT (ZEro-shot Triplet extraction by Template infilling),
that aligns the task objective to the pre-training objective of generative
transformers to generalize to unseen relations. Experiments on FewRel and
Wiki-ZSL datasets demonstrate that ZETT shows consistent and stable
performance, outperforming previous state-of-the-art methods, even when using
automatically generated templates. https://github.com/megagonlabs/zett/
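As a rough illustration of the template-infilling reduction described above, the sketch below scores candidate (head, relation, tail) triplets with an off-the-shelf T5 model by measuring how well the entities fill the blanks of a relation template. The template wordings, candidate entity pair, and scoring heuristic here are illustrative assumptions, not the exact ZETT implementation; see the linked repository for the authors' code.

```python
# Minimal sketch of zero-shot triplet extraction as T5-style template infilling.
# Assumptions: the relation templates, candidate entities, and sentence below are
# hypothetical examples, and the scoring heuristic is a simplification.
import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

def score_relation(sentence: str, template: str, head: str, tail: str) -> float:
    """Score one (head, relation, tail) candidate by the log-likelihood of
    infilling the relation template's entity slots, T5 sentinel style."""
    # Input: sentence followed by the template with sentinel tokens in the
    # head and tail entity slots.
    source = f"{sentence} {template.format(head='<extra_id_0>', tail='<extra_id_1>')}"
    # Target: the sentinel-delimited infill, i.e. the candidate entities.
    target = f"<extra_id_0> {head} <extra_id_1> {tail} <extra_id_2>"
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    with torch.no_grad():
        # The model loss is the mean negative log-likelihood of the target tokens.
        loss = model(**inputs, labels=labels).loss
    return -loss.item()

# Hypothetical usage: rank relations (possibly unseen in training) for one sentence.
sentence = "Marie Curie was born in Warsaw."
templates = {
    "place_of_birth": "{head} was born in {tail}.",
    "employer": "{head} works for {tail}.",
}
candidates = [("Marie Curie", "Warsaw")]
scores = {
    (h, rel, t): score_relation(sentence, tmpl, h, t)
    for rel, tmpl in templates.items()
    for h, t in candidates
}
print(max(scores, key=scores.get))
```

The point of the reduction is that supporting an unseen relation requires only a textual template, not additional training data, because scoring reuses the infilling objective the generative LM was pre-trained on.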
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- ProtoNER: Few shot Incremental Learning for Named Entity Recognition using Prototypical Networks [7.317342506617286]
A Prototypical Network based end-to-end KVP extraction model is presented.
It has no dependency on the dataset used for the model's initial training.
It requires no intermediate synthetic data generation, which tends to add noise and degrade model performance.
arXiv Detail & Related papers (2023-10-03T18:52:19Z)
- Generative Meta-Learning for Zero-Shot Relation Triplet Extraction [12.837901211741443]
We propose a novel generative meta-learning framework to boost the generalization capability of generative models.
Specifically, we first design a task-aware generative model which can learn the general knowledge by forcing the optimization process to be conducted across multiple tasks.
Based on it, we then present three generative meta-learning approaches designated for three typical meta-learning categories.
arXiv Detail & Related papers (2023-05-03T06:34:39Z)
- PCRED: Zero-shot Relation Triplet Extraction with Potential Candidate Relation Selection and Entity Boundary Detection [11.274924966891842]
Zero-shot relation triplet extraction (ZeroRTE) aims to extract relation triplets from unstructured texts.
The previous state-of-the-art method handles this challenging task by leveraging pretrained language models to generate data as additional training samples.
We tackle this task from a new perspective and propose a novel method named PCRED for ZeroRTE with Potential Candidate Relation selection and Entity boundary Detection.
arXiv Detail & Related papers (2022-11-26T04:27:31Z)
- Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization [63.21819285337555]
We show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples.
We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries.
We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization.
arXiv Detail & Related papers (2022-05-12T10:43:42Z)
- RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction [65.4337085607711]
We introduce the task setting of Zero-Shot Relation Triplet Extraction (ZeroRTE).
Given an input sentence, each extracted triplet consists of the head entity, relation label, and tail entity where the relation label is not seen at the training stage.
We propose to synthesize relation examples by prompting language models to generate structured texts.
arXiv Detail & Related papers (2022-03-17T05:55:14Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Contrastive Triple Extraction with Generative Transformer [72.21467482853232]
We introduce a novel model, contrastive triple extraction with a generative transformer.
Specifically, we introduce a single shared transformer module for encoder-decoder-based generation.
To generate faithful results, we propose a novel triplet contrastive training objective.
arXiv Detail & Related papers (2020-09-14T05:29:24Z)
- Downstream Model Design of Pre-trained Language Model for Relation Extraction Task [6.608858001497843]
Supervised relation extraction methods based on deep neural networks play an important role in the recent information extraction field.
A new network architecture with a special loss function is designed to serve as a downstream model of PLMs for supervised relation extraction.
arXiv Detail & Related papers (2020-04-08T03:16:06Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)