Efficient Data Learning for Open Information Extraction with Pre-trained Language Models
- URL: http://arxiv.org/abs/2310.15021v2
- Date: Wed, 26 Jun 2024 08:23:10 GMT
- Title: Efficient Data Learning for Open Information Extraction with Pre-trained Language Models
- Authors: Zhiyuan Fan, Shizhu He
- Abstract summary: Open Information Extraction (OpenIE) is a fundamental yet challenging task in Natural Language Processing.
In this paper, we introduce a novel framework, OK-IE, that ingeniously transforms the task form of OpenIE into the pre-training task form of the T5 model.
Furthermore, we introduce an innovative concept of Anchor to control the sequence of model outputs, effectively eliminating the impact of order penalty on model convergence.
- Score: 15.554865537872919
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open Information Extraction (OpenIE) is a fundamental yet challenging task in Natural Language Processing, which involves extracting all triples (subject, predicate, object) from a given sentence. While labeling-based methods have their merits, generation-based techniques offer unique advantages, such as the ability to generate tokens not present in the original sentence. However, these generation-based methods often require a significant amount of training data to learn the task form of OpenIE and substantial training time to overcome slow model convergence due to the order penalty. In this paper, we introduce a novel framework, OK-IE, that ingeniously transforms the task form of OpenIE into the pre-training task form of the T5 model, thereby reducing the need for extensive training data. Furthermore, we introduce an innovative concept of Anchor to control the sequence of model outputs, effectively eliminating the impact of order penalty on model convergence and significantly reducing training time. Experimental results indicate that, compared to previous SOTA methods, OK-IE requires only 1/100 of the training data (900 instances) and 1/120 of the training time (3 minutes) to achieve comparable results.
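The core idea in the abstract, casting OpenIE as T5's span-infilling pre-training task and using anchors to fix the output order, can be illustrated with a minimal sketch. The snippet below is only an illustrative approximation: the checkpoint name, prompt wording, and slot layout are assumptions for demonstration, and the exact serialization used by OK-IE may differ from what is shown here.
```python
# Illustrative sketch only: framing triple extraction as T5-style span infilling,
# with T5 sentinel tokens acting as "anchors" that pin each slot (subject,
# predicate, object) to a fixed position in the output sequence.
from transformers import T5TokenizerFast, T5ForConditionalGeneration

model_name = "t5-base"  # assumed checkpoint; the paper's exact backbone is not asserted here
tokenizer = T5TokenizerFast.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

sentence = "Barack Obama was born in Hawaii."

# The input mirrors T5's span-corruption format: each sentinel marks one slot to fill,
# so the order of generated spans is fixed by the anchors rather than left to the model.
source = f"{sentence} subject: <extra_id_0> predicate: <extra_id_1> object: <extra_id_2>"

# During training, the target would pair each sentinel with its gold span, e.g.:
# "<extra_id_0> Barack Obama <extra_id_1> was born in <extra_id_2> Hawaii"
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
Because this format matches the pre-training objective T5 already knows, the model does not have to learn a new task form from scratch, which is how the paper motivates its much smaller data and time requirements.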
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning [35.681853074122735]
We introduce an exact unlearning framework -- Sequence-aware Sharded Sliced Training (S3T)
S3T is designed to enhance the deletion capabilities of an exact unlearning system while minimizing the impact on the model's performance.
We demonstrate S3T attains superior deletion capabilities and enhanced performance compared to baselines across a wide range of settings.
arXiv Detail & Related papers (2024-06-24T01:45:13Z) - Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z) - Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z) - Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z) - Unlearnable Algorithms for In-context Learning [36.895152458323764]
In this paper, we focus on efficient unlearning methods for the task adaptation phase of a pretrained large language model.
We observe that an LLM's ability to do in-context learning for task adaptation allows for efficient exact unlearning of task adaptation training data.
We propose a new holistic measure of unlearning cost which accounts for varying inference costs.
arXiv Detail & Related papers (2024-02-01T16:43:04Z) - EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers)
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z) - Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably with supervised pre-training counterparts across 3 downstream tasks and 9 downstream datasets, while requiring 10x less data and 5x less pre-training time.
arXiv Detail & Related papers (2022-03-10T06:23:41Z) - Ex-Model: Continual Learning from a Stream of Trained Models [12.27992745065497]
We argue that continual learning systems should exploit the availability of compressed information in the form of trained models.
We introduce and formalize a new paradigm named "Ex-Model Continual Learning" (ExML), where an agent learns from a sequence of previously trained models instead of raw data.
arXiv Detail & Related papers (2021-12-13T09:46:16Z) - Data-Efficient Pretraining via Contrastive Self-Supervision [48.255310614527694]
In this work, we evaluate against three core challenges for resource-efficient learning.
We propose a data- and compute-efficient self-supervised, contrastive text encoder, pretrained on 60MB of task-internal text data.
We find our method outperforms RoBERTa, while pretraining and fine-tuning in 1/5th of RoBERTa's fine-tuning time.
arXiv Detail & Related papers (2020-10-02T15:41:57Z)