Autoregressive Structured Prediction with Language Models
- URL: http://arxiv.org/abs/2210.14698v1
- Date: Wed, 26 Oct 2022 13:27:26 GMT
- Title: Autoregressive Structured Prediction with Language Models
- Authors: Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan
- Abstract summary: We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
- Score: 73.11519625765301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have seen a paradigm shift in NLP towards using pretrained
language models (PLMs) for a wide range of tasks.
However, there are many difficult design decisions involved in representing
structures (e.g., tagged text, coreference chains) in a way that they can be
captured by PLMs.
Prior work on structured prediction with PLMs typically flattens the
structured output into a sequence, which limits the quality of structural
information being learned and leads to inferior performance compared to classic
discriminative models.
In this work, we describe an approach to model structures as sequences of
actions in an autoregressive manner with PLMs, allowing in-structure
dependencies to be learned without any loss.
Our approach achieves the new state-of-the-art on all the structured
prediction tasks we looked at, namely, named entity recognition, end-to-end
relation extraction, and coreference resolution.
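As a rough, hedged illustration of what casting structured prediction as autoregressive action sequences can look like, the sketch below linearises labelled entity spans into an action sequence (the OPEN/COPY/CLOSE inventory is invented here for illustration, not the paper's action set) and shows that the original structure is recovered exactly from the actions a left-to-right model would emit.

```python
# Illustrative only: a toy action linearisation for span-labelled text.
# The paper's actual action set, model, and training objective differ.

from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, inclusive end index, label)

def spans_to_actions(tokens: List[str], spans: List[Span]) -> List[str]:
    """Linearise non-overlapping labelled spans into a left-to-right action sequence."""
    opens = {start: (end, label) for start, end, label in spans}
    actions = []
    for i, _ in enumerate(tokens):
        if i in opens:
            actions.append(f"OPEN({opens[i][1]})")
        actions.append("COPY")  # advance over the next input token
        for _, (end, _label) in opens.items():
            if end == i:
                actions.append("CLOSE")
    return actions

def actions_to_spans(actions: List[str]) -> List[Span]:
    """Replay the actions to recover the labelled spans."""
    spans, stack, i = [], [], 0
    for act in actions:
        if act.startswith("OPEN("):
            stack.append((i, act[5:-1]))
        elif act == "COPY":
            i += 1
        elif act == "CLOSE":
            start, label = stack.pop()
            spans.append((start, i - 1, label))
    return spans

tokens = ["Barack", "Obama", "visited", "Prague", "."]
gold = [(0, 1, "PER"), (3, 3, "LOC")]
actions = spans_to_actions(tokens, gold)
print(actions)  # ['OPEN(PER)', 'COPY', 'COPY', 'CLOSE', 'COPY', 'OPEN(LOC)', 'COPY', 'CLOSE', 'COPY']
assert actions_to_spans(actions) == gold
```

Because the actions are emitted strictly left to right, each decision can condition on everything emitted so far, which is one way to read the "in-structure dependencies" the abstract mentions.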
Related papers
- Structured Language Generation Model for Robust Structure Prediction [6.4736137270915215]
We propose a framework that reduces sequence-to-sequence problems to classification problems via loss calibration and decoding methods.
Our experimental results show that SLGM maintains performance without explicit dataset information, and can follow and potentially replace dataset-specific fine-tuning.
arXiv Detail & Related papers (2024-02-14T06:33:22Z)
- Promptly Predicting Structures: The Return of Inference [31.442123334313035]
We present a framework for constructing zero- and few-shot linguistic structure predictors.
Our results show that enforcing consistency not only yields structurally valid outputs, but also improves performance.
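A minimal sketch of what enforcing structural consistency at inference time can look like, assuming per-token label scores are available (the numbers below are invented stand-ins for scores read off a prompted model): a Viterbi-style search keeps only label sequences that respect BIO validity. The paper's constraint set and inference procedure are richer than this.

```python
# Hedged sketch: constrained decoding over invented per-token label scores.
LABELS = ["O", "B-PER", "I-PER"]

def valid(prev: str, cur: str) -> bool:
    # An I-X tag may only follow B-X or I-X of the same type X.
    if cur.startswith("I-"):
        return prev in {"B-" + cur[2:], "I-" + cur[2:]}
    return True

def constrained_viterbi(scores):
    """scores: one {label: score} dict per token; returns the best BIO-valid sequence."""
    NEG = float("-inf")
    # best[i][lab] = (score of the best valid prefix ending in lab at i, backpointer)
    best = [{lab: (scores[0].get(lab, NEG) if valid("O", lab) else NEG, None)
             for lab in LABELS}]
    for i in range(1, len(scores)):
        best.append({
            cur: max((best[i - 1][prev][0] + scores[i].get(cur, NEG), prev)
                     for prev in LABELS if valid(prev, cur))
            for cur in LABELS
        })
    lab = max(best[-1], key=lambda l: best[-1][l][0])
    seq = [lab]
    for i in range(len(scores) - 1, 0, -1):
        lab = best[i][lab][1]
        seq.append(lab)
    return list(reversed(seq))

token_scores = [  # e.g. for the tokens "Barack", "Obama", "visited"
    {"O": 0.1, "B-PER": 0.8, "I-PER": 0.6},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 0.7},
    {"O": 0.9, "B-PER": 0.1, "I-PER": 0.4},
]
print(constrained_viterbi(token_scores))  # ['B-PER', 'I-PER', 'O']
```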
arXiv Detail & Related papers (2024-01-12T20:08:39Z)
- On Conditional and Compositional Language Model Differentiable Prompting [75.76546041094436]
Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks.
We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata, into continuous prompts.
arXiv Detail & Related papers (2023-07-04T02:47:42Z)
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge.
We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences.
We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
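For a concrete picture of this setup, here is a toy sketch that samples strings from a small synthetic CFG; the grammar is invented here and far shallower than the CFG families studied in the paper, but it produces the kind of hierarchical data a language model would be trained to imitate.

```python
# Toy synthetic CFG, for illustration only (not the paper's grammar family).
import random

GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["det", "noun"], ["det", "adj", "noun"]],
    "VP": [["verb"], ["verb", "NP"]],
}

def generate(symbol, rng):
    """Recursively expand a nonterminal by sampling one of its productions."""
    if symbol not in GRAMMAR:      # terminal symbol: emit it as-is
        return [symbol]
    out = []
    for child in rng.choice(GRAMMAR[symbol]):
        out.extend(generate(child, rng))
    return out

rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("S", rng)))  # e.g. "det adj noun verb det noun"
```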
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
- OntoType: Ontology-Guided and Pre-Trained Language Model Assisted Fine-Grained Entity Typing [25.516304052884397]
Fine-grained entity typing (FET) assigns entities in text with context-sensitive, fine-grained semantic types.
OntoType follows a type ontological structure, from coarse to fine, and ensembles multiple PLM prompting results to generate a set of type candidates.
Our experiments on the OntoNotes, FIGER, and NYT datasets demonstrate that our method outperforms the state-of-the-art zero-shot fine-grained entity typing methods.
arXiv Detail & Related papers (2023-05-21T00:32:37Z)
- Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations [70.41385310930846]
We present an end-to-end framework Structure-CLIP to enhance multi-modal structured representations.
We use scene graphs to guide the construction of semantic negative examples, which results in an increased emphasis on learning structured representations.
A Knowledge-Enhanced Encoder (KEE) is proposed to leverage scene graph knowledge (SGK) as input to further enhance structured representations.
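A minimal sketch of the scene-graph-guided negative-example idea, assuming a caption's scene graph is already available as (subject, relation, object) triples: swapping subject and object gives a hard negative that reuses the same words with a different structure. Structure-CLIP's actual pipeline (scene graph parsing, the KEE module, contrastive training) is considerably more involved.

```python
# Illustrative only: build a structure-flipped "hard negative" from a triple.
def swap_negative(triple):
    subject, relation, obj = triple
    return (obj, relation, subject)

positive = ("cat", "chasing", "dog")
negative = swap_negative(positive)
print(" ".join(positive))   # "cat chasing dog"
print(" ".join(negative))   # "dog chasing cat" -- same tokens, different structure
```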
arXiv Detail & Related papers (2023-05-06T03:57:05Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to augment current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- DeepStruct: Pretraining of Language Models for Structure Prediction [64.84144849119554]
We pretrain language models on a collection of task-agnostic corpora to generate structures from text.
Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks.
We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets.
arXiv Detail & Related papers (2022-05-21T00:58:22Z)
- Rethinking Relational Encoding in Language Model: Pre-Training for General Sequences [23.806325599416134]
Language model pre-training fails at modeling per-sequence relations in non-natural language domains.
We develop a framework that couples LMPT with deep structure-preserving metric learning to produce richer embeddings.
Our approach offers notable performance improvements on downstream tasks.
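A hedged sketch of what coupling a language-model pretraining loss with structure-preserving metric learning might look like in PyTorch; the encoder, the toy data, and the loss weighting below are placeholders rather than the paper's framework.

```python
# Placeholder sketch: token-prediction loss + triplet metric loss over embeddings.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, hidden = 100, 32

embed = nn.Embedding(vocab_size, hidden)
lm_head = nn.Linear(hidden, vocab_size)

def encode(tokens):
    """Mean-pool token embeddings into one sequence embedding (stand-in encoder)."""
    return embed(tokens).mean(dim=1)

# Toy batch: anchor/positive are assumed structurally related, negative is not.
anchor = torch.randint(0, vocab_size, (4, 16))
positive = torch.randint(0, vocab_size, (4, 16))
negative = torch.randint(0, vocab_size, (4, 16))

# (1) Language-model-style next-token prediction loss on the anchor sequences.
states = embed(anchor)                      # (batch, seq, hidden)
logits = lm_head(states[:, :-1])            # predict token t+1 from position t
lm_loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), anchor[:, 1:].reshape(-1))

# (2) Structure-preserving metric term over pooled sequence embeddings.
metric_loss = nn.TripletMarginLoss(margin=1.0)(
    encode(anchor), encode(positive), encode(negative))

loss = lm_loss + 0.1 * metric_loss          # the 0.1 weighting is arbitrary here
print(float(lm_loss), float(metric_loss), float(loss))
```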
arXiv Detail & Related papers (2021-03-18T15:51:04Z)
- Structure by Architecture: Structured Representations without Regularization [31.75200752252397]
We study the problem of self-supervised structured representation learning using autoencoders for downstream tasks such as generative modeling.
We design a novel autoencoder architecture capable of learning a structured representation without the need for aggressive regularization.
We demonstrate how these models learn a representation that improves results in a variety of downstream tasks including generation, disentanglement, and extrapolation.
arXiv Detail & Related papers (2020-06-14T04:37:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.