DeepStruct: Pretraining of Language Models for Structure Prediction
- URL: http://arxiv.org/abs/2205.10475v1
- Date: Sat, 21 May 2022 00:58:22 GMT
- Title: DeepStruct: Pretraining of Language Models for Structure Prediction
- Authors: Chenguang Wang, Xiao Liu, Zui Chen, Haoyun Hong, Jie Tang, Dawn Song
- Abstract summary: We pretrain language models on a collection of task-agnostic corpora to generate structures from text.
Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks.
We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets.
- Score: 64.84144849119554
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a method for improving the structural understanding abilities of
language models. Unlike previous approaches that finetune the models with
task-specific augmentation, we pretrain language models on a collection of
task-agnostic corpora to generate structures from text. Our structure
pretraining enables zero-shot transfer of the learned knowledge that models
have about the structure tasks. We study the performance of this approach on 28
datasets, spanning 10 structure prediction tasks including open information
extraction, joint entity and relation extraction, named entity recognition,
relation classification, semantic role labeling, event extraction, coreference
resolution, factual probe, intent detection, and dialogue state tracking. We
further enhance the pretraining with the task-specific training sets. We show
that a 10B parameter language model transfers non-trivially to most tasks and
obtains state-of-the-art performance on 21 of 28 datasets that we evaluate.
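As a concrete illustration of the text-to-structure framing, the sketch below serializes a hypothetical NER example into (head, relation, tail) triples for a sequence-to-sequence model to generate. The triple schema and helper names here are illustrative assumptions, not the paper's exact format.

```python
# Minimal sketch: serializing a structure prediction task (here NER)
# as text-to-triple generation, in the spirit of a unified
# text-to-structure formulation. Triple schema and helper names are
# illustrative, not the paper's exact format.

def ner_to_triples(entities):
    """Turn labeled entity spans into (head, relation, tail) triples."""
    return [(surface, "instance of", label) for surface, label in entities]

def serialize(triples):
    """Linearize triples into a target string for a seq2seq model."""
    return " ".join(f"( {h} ; {r} ; {t} )" for h, r, t in triples)

source = "Marie Curie was born in Warsaw."
entities = [("Marie Curie", "person"), ("Warsaw", "location")]

target = serialize(ner_to_triples(entities))
print(source)
print(target)  # ( Marie Curie ; instance of ; person ) ( Warsaw ; instance of ; location )
```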
Related papers
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
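A minimal sketch of how a segment-ordering objective might construct a training instance; the actual pointer-guided formulation differs in detail, and all names here are illustrative.

```python
import random

# Illustrative sketch of a segment-ordering pretraining instance:
# shuffle a document's segments and ask the model to recover the
# original order as a sequence of pointer indices.

def make_ordering_example(segments, rng=random):
    order = list(range(len(segments)))
    rng.shuffle(order)
    shuffled = [segments[i] for i in order]
    # Target: for each original position i, where segment i landed
    # after shuffling, so reading shuffled[target[i]] restores the doc.
    target = [order.index(i) for i in range(len(segments))]
    return shuffled, target

doc = ["First paragraph.", "Second paragraph.", "Third paragraph."]
shuffled, target = make_ordering_example(doc)
print(shuffled, target)
```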
arXiv Detail & Related papers (2024-06-06T15:17:51Z) - Punctuation Restoration Improves Structure Understanding without
Supervision [6.4736137270915215]
We show that punctuation restoration as a learning objective improves in- and out-of-distribution performance on structure-related tasks.
Punctuation restoration is an effective learning objective that can improve structure understanding and yield more robust structure-aware representations of natural language.
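A minimal sketch of the punctuation-restoration objective, assuming a simple strip-and-lowercase corruption; the paper's corruption details may differ.

```python
import string

# Sketch of the punctuation-restoration objective: corrupt text by
# stripping punctuation (and lowercasing), then train a model to map
# the corrupted input back to the original.

def corrupt(text):
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    return stripped.lower()

original = "Wait, really? Yes, it works."
pair = (corrupt(original), original)  # (model input, training target)
print(pair)  # ('wait really yes it works', 'Wait, really? Yes, it works.')
```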
arXiv Detail & Related papers (2024-02-13T11:22:52Z) - Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
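A toy illustration of scoring a predicted entity set against a gold set by approximate matching; this greedy string-similarity version is an assumption for illustration, not the paper's exact AESOP definition.

```python
from difflib import SequenceMatcher

# Toy entity-set overlap: greedily match predicted entities to gold
# entities by string similarity and average the matched scores over
# the larger set. NOT the paper's exact AESOP metric.

def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()

def entity_set_overlap(pred, gold):
    remaining = list(gold)
    total = 0.0
    for p in pred:
        if not remaining:
            break
        best = max(remaining, key=lambda g: similarity(p, g))
        total += similarity(p, best)
        remaining.remove(best)
    return total / max(len(pred), len(gold), 1)

print(entity_set_overlap(["Marie Curie", "Warsav"], ["Marie Curie", "Warsaw"]))
```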
arXiv Detail & Related papers (2024-02-06T22:15:09Z) - Expanding the Vocabulary of BERT for Knowledge Base Construction [6.412048788884728]
"Knowledge Base Construction from Pretrained Language Models" challenge was held at International Semantic Web Conference 2023.
Our focus was on Track 1 of the challenge, where the parameters are constrained to a maximum of 1 billion.
We present Vocabulary Expandable BERT for knowledge base construction, which expand the language model's vocabulary while preserving semantic embeddings.
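A minimal sketch of vocabulary expansion that preserves semantics by initializing each new word's embedding from the subword pieces it previously tokenized into; the tokenizer, shapes, and initialization rule are illustrative assumptions.

```python
import numpy as np

# Sketch of semantics-preserving vocabulary expansion: a new word's
# embedding row is the mean of its former subword-piece embeddings.
# Vocabulary and dimensions are stand-ins.

rng = np.random.default_rng(0)
vocab = {"know": 0, "##ledge": 1, "base": 2}
embeddings = rng.normal(size=(len(vocab), 8))  # (vocab_size, hidden)

def expand(word, pieces):
    """Append a new vocabulary entry initialized from its subword pieces."""
    global embeddings
    new_row = embeddings[[vocab[p] for p in pieces]].mean(axis=0, keepdims=True)
    vocab[word] = len(vocab)
    embeddings = np.vstack([embeddings, new_row])

expand("knowledge", ["know", "##ledge"])
print(embeddings.shape)  # (4, 8)
```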
arXiv Detail & Related papers (2023-10-12T12:52:46Z) - Pre-Training to Learn in Context [138.0745138788142]
The in-context learning ability of language models is not fully exploited because they are not explicitly trained to learn in context.
We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability.
Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x as many parameters.
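A minimal sketch of the PICL idea of building pretraining instances that look like few-shot prompts by concatenating paragraphs that share an intrinsic task; the toy task labels below stand in for PICL's trained retriever.

```python
from collections import defaultdict

# Sketch: gather plain-text paragraphs sharing the same "intrinsic
# task" and concatenate them, so pretraining sequences resemble
# few-shot prompts. Toy labels replace PICL's retriever.

corpus = [
    ("sentiment", "The movie was dull. Sentiment: negative"),
    ("sentiment", "A delightful read. Sentiment: positive"),
    ("sentiment", "Terrible service. Sentiment: negative"),
    ("translation", "chat -> cat"),
]

by_task = defaultdict(list)
for task, paragraph in corpus:
    by_task[task].append(paragraph)

# Keep groups with at least two paragraphs so context precedes target.
instances = ["\n".join(group) for group in by_task.values() if len(group) > 1]
for inst in instances:
    print(inst)
```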
arXiv Detail & Related papers (2023-05-16T03:38:06Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach that models structures as sequences of actions generated autoregressively with pretrained language models (PLMs).
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
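A minimal sketch of casting a structure (here NER spans) as an action sequence an autoregressive model can emit token by token; the action inventory is a simplified stand-in for the paper's scheme.

```python
# Sketch: linearize NER spans into actions ("[" opens a span,
# "]TYPE" closes it, plain tokens are copied), so an autoregressive
# model can predict the structure one action at a time.

def to_actions(tokens, spans):
    """spans: list of (start, end_exclusive, label)."""
    actions = []
    for i, tok in enumerate(tokens):
        if any(i == s for s, _, _ in spans):
            actions.append("[")
        actions.append(tok)
        for s, e, label in spans:
            if i == e - 1:
                actions.append(f"]{label}")
    return actions

tokens = "Marie Curie was born in Warsaw".split()
spans = [(0, 2, "PER"), (5, 6, "LOC")]
print(to_actions(tokens, spans))
# ['[', 'Marie', 'Curie', ']PER', 'was', 'born', 'in', '[', 'Warsaw', ']LOC']
```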
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - Learning Better Sentence Representation with Syntax Information [0.0]
We propose a novel approach to combining syntax information with a pre-trained language model.
Our model achieves 91.2% accuracy, outperforming the baseline model by 37.8% on the sentence completion task.
arXiv Detail & Related papers (2021-01-09T12:15:08Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
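A toy, deterministic rendering of the source-copy blocking idea behind Dynamic Blocking; the real algorithm samples a block dictionary and applies it probabilistically, and all names here are illustrative.

```python
import numpy as np

# Toy source-copy blocking: when the last generated token matches
# source token i, block source token i+1 at the next step, nudging
# the decoder away from verbatim copying of the source.

def block_copy_logits(logits, vocab, source_tokens, last_token):
    logits = logits.copy()
    for i, tok in enumerate(source_tokens[:-1]):
        if tok == last_token:
            nxt = source_tokens[i + 1]
            if nxt in vocab:
                logits[vocab[nxt]] = -np.inf
    return logits

vocab = {"the": 0, "cat": 1, "sat": 2, "feline": 3}
source = ["the", "cat", "sat"]
logits = np.zeros(len(vocab))
print(block_copy_logits(logits, vocab, source, last_token="the"))
# "cat" is blocked right after "the": [0., -inf, 0., 0.]
```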
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation
Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building on entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
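A minimal sketch of entity-level masking, where whole KG-linked entity spans are masked instead of random tokens; the entity list and mask rate are illustrative assumptions.

```python
import random

# Sketch of entity masking: mask whole spans linked to knowledge-graph
# entities so the model must recover entities from context, rather
# than masking random individual tokens.

def mask_entities(tokens, entity_spans, mask_token="[MASK]", rate=0.5, rng=random):
    out = list(tokens)
    for start, end in entity_spans:
        if rng.random() < rate:
            out[start:end] = [mask_token] * (end - start)
    return out

tokens = "Marie Curie was born in Warsaw".split()
entity_spans = [(0, 2), (5, 6)]  # spans linked to a KG
print(mask_entities(tokens, entity_spans))
```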
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.