Autoregressive Structured Prediction with Language Models
- URL: http://arxiv.org/abs/2210.14698v1
- Date: Wed, 26 Oct 2022 13:27:26 GMT
- Title: Autoregressive Structured Prediction with Language Models
- Authors: Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan
- Abstract summary: We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
- Score: 73.11519625765301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have seen a paradigm shift in NLP towards using pretrained
language models (PLMs) for a wide range of tasks.
However, there are many difficult design decisions involved in representing
structures (e.g., tagged text, coreference chains) in a way that they can be
captured by PLMs.
Prior work on structured prediction with PLMs typically flattens the
structured output into a sequence, which limits the quality of structural
information being learned and leads to inferior performance compared to classic
discriminative models.
In this work, we describe an approach to model structures as sequences of
actions in an autoregressive manner with PLMs, allowing in-structure
dependencies to be learned without any loss.
Our approach achieves the new state-of-the-art on all the structured
prediction tasks we looked at, namely, named entity recognition, end-to-end
relation extraction, and coreference resolution.
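As a rough, hedged illustration of what casting structured prediction as autoregressive action sequences can look like, the sketch below linearises labelled entity spans into an action sequence (the OPEN/COPY/CLOSE inventory is invented here for illustration, not the paper's action set) and shows that the original structure is recovered exactly from the actions a left-to-right model would emit.

```python
# Illustrative only: a toy action linearisation for span-labelled text.
# The paper's actual action set, model, and training objective differ.

from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, inclusive end index, label)

def spans_to_actions(tokens: List[str], spans: List[Span]) -> List[str]:
    """Linearise non-overlapping labelled spans into a left-to-right action sequence."""
    opens = {start: (end, label) for start, end, label in spans}
    actions = []
    for i, _ in enumerate(tokens):
        if i in opens:
            actions.append(f"OPEN({opens[i][1]})")
        actions.append("COPY")  # advance over the next input token
        for _, (end, _label) in opens.items():
            if end == i:
                actions.append("CLOSE")
    return actions

def actions_to_spans(actions: List[str]) -> List[Span]:
    """Replay the actions to recover the labelled spans."""
    spans, stack, i = [], [], 0
    for act in actions:
        if act.startswith("OPEN("):
            stack.append((i, act[5:-1]))
        elif act == "COPY":
            i += 1
        elif act == "CLOSE":
            start, label = stack.pop()
            spans.append((start, i - 1, label))
    return spans

tokens = ["Barack", "Obama", "visited", "Prague", "."]
gold = [(0, 1, "PER"), (3, 3, "LOC")]
actions = spans_to_actions(tokens, gold)
print(actions)  # ['OPEN(PER)', 'COPY', 'COPY', 'CLOSE', 'COPY', 'OPEN(LOC)', 'COPY', 'CLOSE', 'COPY']
assert actions_to_spans(actions) == gold
```

Because the actions are emitted strictly left to right, each decision can condition on everything emitted so far, which is one way to read the "in-structure dependencies" the abstract mentions.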
Related papers
- Structured Language Generation Model for Robust Structure Prediction [6.4736137270915215]
We propose a framework that reduces sequence-to-sequence problems to classification problems via loss calibration and decoding methods.
Our experimental results show that SLGM maintains performance without explicit dataset information, and can follow and potentially replace dataset-specific fine-tuning.
arXiv Detail & Related papers (2024-02-14T06:33:22Z)
- Promptly Predicting Structures: The Return of Inference [31.442123334313035]
We present a framework for constructing zero- and few-shot linguistic structure predictors.
Our results show that enforcing consistency not only yields structurally valid outputs, but also improves performance.
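A minimal sketch of what enforcing structural consistency at inference time can look like, assuming per-token label scores are available (the numbers below are invented stand-ins for scores read off a prompted model): a Viterbi-style search keeps only label sequences that respect BIO validity. The paper's constraint set and inference procedure are richer than this.

```python
# Hedged sketch: constrained decoding over invented per-token label scores.
LABELS = ["O", "B-PER", "I-PER"]

def valid(prev: str, cur: str) -> bool:
    # An I-X tag may only follow B-X or I-X of the same type X.
    if cur.startswith("I-"):
        return prev in {"B-" + cur[2:], "I-" + cur[2:]}
    return True

def constrained_viterbi(scores):
    """scores: one {label: score} dict per token; returns the best BIO-valid sequence."""
    NEG = float("-inf")
    # best[i][lab] = (score of the best valid prefix ending in lab at i, backpointer)
    best = [{lab: (scores[0].get(lab, NEG) if valid("O", lab) else NEG, None)
             for lab in LABELS}]
    for i in range(1, len(scores)):
        best.append({
            cur: max((best[i - 1][prev][0] + scores[i].get(cur, NEG), prev)
                     for prev in LABELS if valid(prev, cur))
            for cur in LABELS
        })
    lab = max(best[-1], key=lambda l: best[-1][l][0])
    seq = [lab]
    for i in range(len(scores) - 1, 0, -1):
        lab = best[i][lab][1]
        seq.append(lab)
    return list(reversed(seq))

token_scores = [  # e.g. for the tokens "Barack", "Obama", "visited"
    {"O": 0.1, "B-PER": 0.8, "I-PER": 0.6},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 0.7},
    {"O": 0.9, "B-PER": 0.1, "I-PER": 0.4},
]
print(constrained_viterbi(token_scores))  # ['B-PER', 'I-PER', 'O']
```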
arXiv Detail & Related papers (2024-01-12T20:08:39Z)
- On Conditional and Compositional Language Model Differentiable Prompting [75.76546041094436]
Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks.
We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata, into continuous prompts.
arXiv Detail & Related papers (2023-07-04T02:47:42Z)
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge.
We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences.
We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
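For a concrete picture of this setup, here is a toy sketch that samples strings from a small synthetic CFG; the grammar is invented here and far shallower than the CFG families studied in the paper, but it produces the kind of hierarchical data a language model would be trained to imitate.

```python
# Toy synthetic CFG, for illustration only (not the paper's grammar family).
import random

GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["det", "noun"], ["det", "adj", "noun"]],
    "VP": [["verb"], ["verb", "NP"]],
}

def generate(symbol, rng):
    """Recursively expand a nonterminal by sampling one of its productions."""
    if symbol not in GRAMMAR:      # terminal symbol: emit it as-is
        return [symbol]
    out = []
    for child in rng.choice(GRAMMAR[symbol]):
        out.extend(generate(child, rng))
    return out

rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("S", rng)))  # e.g. "det adj noun verb det noun"
```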
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
- OntoType: Ontology-Guided and Pre-Trained Language Model Assisted Fine-Grained Entity Typing [25.516304052884397]
Fine-grained entity typing (FET) assigns entities in text with context-sensitive, fine-grained semantic types.
OntoType follows a type ontological structure, from coarse to fine, and ensembles multiple PLM prompting results to generate a set of type candidates.
Our experiments on the OntoNotes, FIGER, and NYT datasets demonstrate that our method outperforms the state-of-the-art zero-shot fine-grained entity typing methods.
arXiv Detail & Related papers (2023-05-21T00:32:37Z)
- Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations [70.41385310930846]
We present an end-to-end framework Structure-CLIP to enhance multi-modal structured representations.
We use scene graphs to guide the construction of semantic negative examples, which results in an increased emphasis on learning structured representations.
A Knowledge-Enhanced Encoder (KEE) is proposed to leverage scene graph knowledge (SGK) as input to further enhance structured representations.
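A minimal sketch of the scene-graph-guided negative-example idea, assuming a caption's scene graph is already available as (subject, relation, object) triples: swapping subject and object gives a hard negative that reuses the same words with a different structure. Structure-CLIP's actual pipeline (scene graph parsing, the KEE module, contrastive training) is considerably more involved.

```python
# Illustrative only: build a structure-flipped "hard negative" from a triple.
def swap_negative(triple):
    subject, relation, obj = triple
    return (obj, relation, subject)

positive = ("cat", "chasing", "dog")
negative = swap_negative(positive)
print(" ".join(positive))   # "cat chasing dog"
print(" ".join(negative))   # "dog chasing cat" -- same tokens, different structure
```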
arXiv Detail & Related papers (2023-05-06T03:57:05Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to augment current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- DeepStruct: Pretraining of Language Models for Structure Prediction [64.84144849119554]
We pretrain language models on a collection of task-agnostic corpora to generate structures from text.
Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks.
We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets.
arXiv Detail & Related papers (2022-05-21T00:58:22Z)
- Rethinking Relational Encoding in Language Model: Pre-Training for General Sequences [23.806325599416134]
Language model pre-training fails at modeling per-sequence relations in non-natural language domains.
We develop a framework that couples LMPT with deep structure-preserving metric learning to produce richer embeddings.
Our approach offers notable performance improvements on downstream tasks.
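A hedged sketch of what coupling a language-model pretraining loss with structure-preserving metric learning might look like in PyTorch; the encoder, the toy data, and the loss weighting below are placeholders rather than the paper's framework.

```python
# Placeholder sketch: token-prediction loss + triplet metric loss over embeddings.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, hidden = 100, 32

embed = nn.Embedding(vocab_size, hidden)
lm_head = nn.Linear(hidden, vocab_size)

def encode(tokens):
    """Mean-pool token embeddings into one sequence embedding (stand-in encoder)."""
    return embed(tokens).mean(dim=1)

# Toy batch: anchor/positive are assumed structurally related, negative is not.
anchor = torch.randint(0, vocab_size, (4, 16))
positive = torch.randint(0, vocab_size, (4, 16))
negative = torch.randint(0, vocab_size, (4, 16))

# (1) Language-model-style next-token prediction loss on the anchor sequences.
states = embed(anchor)                      # (batch, seq, hidden)
logits = lm_head(states[:, :-1])            # predict token t+1 from position t
lm_loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), anchor[:, 1:].reshape(-1))

# (2) Structure-preserving metric term over pooled sequence embeddings.
metric_loss = nn.TripletMarginLoss(margin=1.0)(
    encode(anchor), encode(positive), encode(negative))

loss = lm_loss + 0.1 * metric_loss          # the 0.1 weighting is arbitrary here
print(float(lm_loss), float(metric_loss), float(loss))
```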
arXiv Detail & Related papers (2021-03-18T15:51:04Z)
- Structure by Architecture: Structured Representations without Regularization [31.75200752252397]
We study the problem of self-supervised structured representation learning using autoencoders for downstream tasks such as generative modeling.
We design a novel autoencoder architecture capable of learning a structured representation without the need for aggressive regularization.
We demonstrate how these models learn a representation that improves results in a variety of downstream tasks including generation, disentanglement, and extrapolation.
arXiv Detail & Related papers (2020-06-14T04:37:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.