Generative Meta-Learning for Zero-Shot Relation Triplet Extraction
- URL: http://arxiv.org/abs/2305.01920v2
- Date: Sat, 26 Apr 2025 11:09:43 GMT
- Title: Generative Meta-Learning for Zero-Shot Relation Triplet Extraction
- Authors: Wanli Li, Tieyun Qian, Yi Song, Zeyu Zhang, Jiawei Li, Zhuang Chen, Lixin Zou
- Abstract summary: Zero-shot Relation Triplet Extraction (ZeroRTE) aims to extract relation triplets from texts containing unseen relation types. Existing approaches typically leverage the knowledge embedded in pre-trained language models to accomplish the generalization process. We propose a generative meta-learning framework which exploits the 'learning-to-learn' ability of meta-learning to boost the generalization capability of generative models.
- Score: 20.556880137419064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot Relation Triplet Extraction (ZeroRTE) aims to extract relation triplets from texts containing unseen relation types. This capability benefits various downstream information retrieval (IR) tasks. The primary challenge lies in enabling models to generalize effectively to unseen relation categories. Existing approaches typically leverage the knowledge embedded in pre-trained language models to accomplish the generalization process. However, these methods focus solely on fitting the training data during training, without specifically improving the model's generalization performance, resulting in limited generalization capability. For this reason, we explore the integration of bi-level optimization (BLO) with pre-trained language models for learning generalized knowledge directly from the training data, and propose a generative meta-learning framework which exploits the 'learning-to-learn' ability of meta-learning to boost the generalization capability of generative models. Specifically, we introduce a BLO approach that simultaneously addresses data fitting and generalization. This is achieved by constructing an upper-level loss to focus on generalization and a lower-level loss to ensure accurate data fitting. Building on this, we subsequently develop three generative meta-learning methods, each tailored to a distinct category of meta-learning. Extensive experimental results demonstrate that our framework performs well on the ZeroRTE task. Our code is available at https://github.com/leeworry/TGM-MetaLearning.
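To make the bi-level idea concrete, below is a minimal first-order sketch of such an update: the lower-level step fits a support batch of seen relations, and the upper-level (generalization) loss is computed on a held-out query batch after adaptation. This is an illustrative approximation, not the authors' implementation (see the linked repository); it assumes a Hugging Face-style generative model whose forward pass returns a `.loss`, and the function and batch names are hypothetical.

```python
# Illustrative first-order bi-level (MAML-style) update for a generative
# triplet extractor. Not the authors' code; see the TGM-MetaLearning repo.
import copy
import torch


def meta_step(model, outer_optimizer, support_batch, query_batch, inner_lr=1e-4):
    """Lower level: fit the support (seen) relations.
    Upper level: measure generalization on the query (held-out) relations."""
    # Lower level: adapt a temporary copy of the model to the support set.
    fast_model = copy.deepcopy(model)
    inner_optimizer = torch.optim.SGD(fast_model.parameters(), lr=inner_lr)
    fitting_loss = fast_model(**support_batch).loss   # data-fitting loss
    fitting_loss.backward()
    inner_optimizer.step()
    inner_optimizer.zero_grad()

    # Upper level: generalization loss of the adapted model on the query set.
    generalization_loss = fast_model(**query_batch).loss
    generalization_loss.backward()

    # First-order approximation: copy the adapted model's gradients back onto
    # the original parameters and take the outer optimizer step.
    outer_optimizer.zero_grad()
    for p, fp in zip(model.parameters(), fast_model.parameters()):
        if fp.grad is not None:
            p.grad = fp.grad.detach().clone()
    outer_optimizer.step()
    return generalization_loss.item()
```

In a ZeroRTE-style episode, the support and query batches would cover disjoint relation types, so minimizing the upper-level loss directly rewards transfer to relations that were unseen during the lower-level fitting step.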
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models [52.03511469562013]
We introduce the Iterative Contrastive Unlearning (ICU) framework, which consists of three core components.
A Knowledge Unlearning Induction module targets specific knowledge for removal using an unlearning loss.
A Contrastive Learning Enhancement module preserves the model's expressive capabilities against the pure unlearning goal.
An Iterative Unlearning Refinement module dynamically adjusts the unlearning process through ongoing evaluation and updates.
arXiv Detail & Related papers (2024-07-25T07:09:35Z) - FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data.
We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z) - Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate [40.5601980891318]
Generalization remains a central challenge in machine learning.
We propose Learning from Teaching (LoT), a novel regularization technique for deep neural networks to enhance generalization.
LoT operationalizes this concept to improve the generalization of the main model with auxiliary student learners.
arXiv Detail & Related papers (2024-02-05T07:05:17Z) - Meta-Learned Attribute Self-Interaction Network for Continual and Generalized Zero-Shot Learning [46.6282595346048]
Zero-shot learning (ZSL) is a promising approach to generalizing a model to unseen categories during training.
We propose a Meta-learned Attribute self-Interaction Network (MAIN) for continual ZSL.
By pairing attribute self-interaction trained using meta-learning with inverse regularization of the attribute encoder, we are able to outperform state-of-the-art results without leveraging the unseen class attributes.
arXiv Detail & Related papers (2023-12-02T16:23:01Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning [117.48444197402858]
We propose ePisode cUrriculum inveRsion (ECI) during data-free meta training and invErsion calibRation following inner loop (ICFIL) during meta testing.
ECI adaptively increases the difficulty level of pseudo episodes according to the real-time feedback of the meta model.
We formulate the optimization process of meta training with ECI as an adversarial form in an end-to-end manner.
arXiv Detail & Related papers (2023-03-20T15:10:41Z) - Zero-shot Triplet Extraction by Template Infilling [13.295751492744081]
Triplet extraction aims to extract pairs of entities and their corresponding relations from unstructured text.
We show that by reducing triplet extraction to a template infilling task over a pre-trained language model (LM), we can equip the extraction model with zero-shot learning capabilities.
We propose a novel framework, ZETT, that aligns the task objective to the pre-training objective of generative transformers to generalize to unseen relations.
arXiv Detail & Related papers (2022-12-21T00:57:24Z) - General-Purpose In-Context Learning by Meta-Learning Transformers [45.63069059498147]
We show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners.
We characterize transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all.
We propose practical interventions such as biasing the training distribution that improve the meta-training and meta-generalization of general-purpose in-context learning algorithms.
arXiv Detail & Related papers (2022-12-08T18:30:22Z) - PCRED: Zero-shot Relation Triplet Extraction with Potential Candidate Relation Selection and Entity Boundary Detection [11.274924966891842]
Zero-shot relation triplet extraction (ZeroRTE) aims to extract relation triplets from unstructured texts.
The previous state-of-the-art method handles this challenging task by leveraging pretrained language models to generate data as additional training samples.
We tackle this task from a new perspective and propose a novel method named PCRED for ZeroRTE with Potential Candidate Relation selection and Entity boundary Detection.
arXiv Detail & Related papers (2022-11-26T04:27:31Z) - DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix which is deterministic and easier for model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z) - Meta-Learning via Classifier(-free) Guidance [5.812784742024491]
State-of-the-art meta-learning techniques do not optimize for zero-shot adaptation to unseen tasks.
We propose meta-learning techniques that use natural language guidance to achieve higher zero-shot performance.
arXiv Detail & Related papers (2022-10-17T11:09:35Z) - Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z) - A Generative Model for Relation Extraction and Classification [23.1277041729626]
We present a novel generative model for relation extraction and classification (which we call GREC).
We explore various encoding representations for the source and target sequences, and design effective schemes that enable GREC to achieve state-of-the-art performance on three benchmark RE datasets.
Our approach can be extended to extract all relation triples from a sentence in one pass.
arXiv Detail & Related papers (2022-02-26T21:17:18Z) - Generating meta-learning tasks to evolve parametric loss for classification learning [1.1355370218310157]
In existing meta-learning approaches, learning tasks for training meta-models are usually collected from public datasets.
We propose a meta-learning approach based on randomly generated meta-learning tasks to obtain a parametric loss for classification learning based on big data.
arXiv Detail & Related papers (2021-11-20T13:07:55Z) - Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks including question generation, summarization and paraphrase generation, show that the proposed framework achieves the new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z) - A Brief Summary of Interactions Between Meta-Learning and Self-Supervised Learning [0.0]
This paper briefly reviews the connections between meta-learning and self-supervised learning.
We show that an integration of meta-learning and self-supervised learning models can best contribute to the improvement of model generalization capability.
arXiv Detail & Related papers (2021-03-01T08:31:28Z) - KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, few-shot to evaluate its effectiveness.
Under zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z) - Contrastive Triple Extraction with Generative Transformer [72.21467482853232]
We introduce a novel model, contrastive triple extraction with a generative transformer.
Specifically, we introduce a single shared transformer module for encoder-decoder-based generation.
To generate faithful results, we propose a novel triplet contrastive training object.
arXiv Detail & Related papers (2020-09-14T05:29:24Z)