Promptly Predicting Structures: The Return of Inference
- URL: http://arxiv.org/abs/2401.06877v3
- Date: Fri, 29 Mar 2024 18:27:17 GMT
- Title: Promptly Predicting Structures: The Return of Inference
- Authors: Maitrey Mehta, Valentina Pyatkin, Vivek Srikumar
- Abstract summary: We present a framework for constructing zero- and few-shot linguistic structure predictors.
Our results show that enforcing consistency not only constructs structurally valid outputs, but also improves performance.
- Score: 31.442123334313035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt-based methods have been used extensively across NLP to build zero- and few-shot label predictors. Many NLP tasks are naturally structured: that is, their outputs consist of multiple labels which constrain each other. Annotating data for such tasks can be cumbersome. Can the promise of the prompt-based paradigm be extended to such structured outputs? In this paper, we present a framework for constructing zero- and few-shot linguistic structure predictors. Our key insight is that we can use structural constraints -- and combinatorial inference derived from them -- to filter out inconsistent structures predicted by large language models. We instantiated this framework on two structured prediction tasks, and five datasets. Across all cases, our results show that enforcing consistency not only constructs structurally valid outputs, but also improves performance over the unconstrained variants.
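To make the key insight concrete, here is a minimal Python sketch of constraint-based filtering, using BIO chunk tagging as a stand-in task. The label set, scores, and brute-force search below are illustrative assumptions, not the paper's actual tasks or inference procedure.

```python
# Minimal sketch: filter per-token label scores (e.g., log-probabilities
# elicited from an LLM prompt) through a structural constraint.
# BIO tagging is a stand-in task; the paper's tasks and inference differ.
import itertools
import math

LABELS = ["B", "I", "O"]

def is_valid(seq):
    """A BIO sequence is valid iff every 'I' continues an open chunk."""
    prev = "O"
    for lab in seq:
        if lab == "I" and prev == "O":
            return False
        prev = lab
    return True

def best_valid_sequence(scores):
    """Score all label sequences and keep the best one satisfying the
    constraint. (Real systems replace this enumeration with combinatorial
    inference such as Viterbi decoding or an ILP.)"""
    best, best_score = None, -math.inf
    for seq in itertools.product(LABELS, repeat=len(scores)):
        if not is_valid(seq):
            continue  # filter out inconsistent structures
        total = sum(scores[i][lab] for i, lab in enumerate(seq))
        if total > best_score:
            best, best_score = list(seq), total
    return best

# Hypothetical per-token log-probs; unconstrained argmax would pick
# ["O", "I", "I"], which is structurally invalid.
scores = [{"B": -1.2, "I": -2.0, "O": -0.4},
          {"B": -1.5, "I": -0.3, "O": -1.0},
          {"B": -2.0, "I": -0.2, "O": -1.1}]
print(best_valid_sequence(scores))  # ["B", "I", "I"]
```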
Related papers
- Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
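As a rough illustration of the word-to-chunk and chunk-to-sentence composition described above, here is a minimal PyTorch sketch. The boundary pooling rule, layer sizes, and hard-coded chunk spans are illustrative assumptions; the paper's model and its unsupervised training are more involved.

```python
# A word-level RNN composes words into chunk vectors, and a chunk-level
# RNN composes chunk vectors into a sentence vector.
import torch
import torch.nn as nn

class HierarchicalRNN(nn.Module):
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.word_rnn = nn.GRU(dim, dim, batch_first=True)   # word -> chunk
        self.chunk_rnn = nn.GRU(dim, dim, batch_first=True)  # chunk -> sentence

    def forward(self, token_ids, chunk_spans):
        # token_ids: (1, seq_len); chunk_spans: list of (start, end) pairs
        states, _ = self.word_rnn(self.embed(token_ids))
        # Represent each chunk by the word state at its right boundary.
        chunk_vecs = torch.stack([states[0, end - 1] for _, end in chunk_spans])
        _, sent = self.chunk_rnn(chunk_vecs.unsqueeze(0))
        return sent.squeeze(0)

model = HierarchicalRNN(vocab_size=100)
tokens = torch.tensor([[5, 8, 2, 9, 3]])
print(model(tokens, [(0, 2), (2, 5)]).shape)  # torch.Size([1, 64])
```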
arXiv Detail & Related papers (2023-09-10T02:55:12Z)
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge.
We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences.
We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
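For intuition, the following toy Python sampler shows how a CFG with hierarchical (recursive) rules can generate long sentences. The grammar itself is invented for illustration and is not the paper's synthetic CFG family.

```python
# Toy context-free grammar sampler: recursive rules (NP -> NP PP,
# VP -> VP PP) let sentences grow to arbitrary depth.
import random

RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["NP", "PP"]],
    "VP": [["V", "NP"], ["VP", "PP"]],
    "PP": [["P", "NP"]],
    "Det": [["the"], ["a"]],
    "N": [["cat"], ["dog"], ["telescope"]],
    "V": [["saw"], ["chased"]],
    "P": [["with"], ["near"]],
}

def sample(symbol="S", depth=0, max_depth=6):
    if symbol not in RULES:
        return [symbol]  # terminal
    # Fall back to the first (non-recursive) expansion once deep enough.
    options = RULES[symbol] if depth < max_depth else RULES[symbol][:1]
    out = []
    for child in random.choice(options):
        out.extend(sample(child, depth + 1, max_depth))
    return out

print(" ".join(sample()))  # e.g. "the dog chased a cat with the telescope"
```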
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
- StrAE: Autoencoding for Pre-Trained Embeddings using Explicit Structure [5.2869308707704255]
StrAE is a Structured Autoencoder framework that, through strict adherence to explicit structure, enables effective learning of multi-level representations.
We show that our results are directly attributable to the informativeness of the structure provided as input, and that this does not hold for existing tree models.
We then extend StrAE to allow the model to define its own compositions using a simple localised-merge algorithm.
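A hedged sketch of what a simple localised-merge procedure can look like: repeatedly merge the most similar adjacent pair of node vectors, inducing a binary tree bottom-up. The cosine similarity and mean composition below are stand-ins, not StrAE's actual composition functions.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def localized_merge(vectors, labels):
    nodes = list(zip(labels, [np.asarray(v, dtype=float) for v in vectors]))
    while len(nodes) > 1:
        # Pick the adjacent pair with the highest similarity.
        i = max(range(len(nodes) - 1),
                key=lambda j: cosine(nodes[j][1], nodes[j + 1][1]))
        (l1, v1), (l2, v2) = nodes[i], nodes[i + 1]
        merged = ((l1, l2), (v1 + v2) / 2)  # stand-in composition
        nodes[i:i + 2] = [merged]
    return nodes[0][0]  # the induced binary tree over the labels

rng = np.random.default_rng(0)
print(localized_merge(rng.normal(size=(4, 8)), ["the", "cat", "sat", "down"]))
```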
arXiv Detail & Related papers (2023-05-09T16:20:48Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach that models structures as sequences of actions, generated autoregressively by pretrained language models (PLMs).
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
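To illustrate the idea of modeling structures as action sequences, here is a small Python sketch that linearises labeled spans into actions a language model could emit one at a time, and decodes them back. The action inventory is invented for illustration; the paper defines its own action sets per task.

```python
def linearize(tokens, spans):
    """spans: list of (start, end, label); emits copy/open/close actions."""
    actions = []
    for i, tok in enumerate(tokens):
        for start, _, label in spans:
            if start == i:
                actions.append(f"[{label}")   # open-span action
        actions.append(tok)                   # copy action
        for _, end, label in spans:
            if end == i + 1:
                actions.append(f"{label}]")   # close-span action
    return actions

def decode(actions):
    """Replay the action sequence to recover tokens and spans."""
    tokens, spans, stack = [], [], []
    for act in actions:
        if act.startswith("["):
            stack.append((act[1:], len(tokens)))
        elif act.endswith("]"):
            label, start = stack.pop()
            spans.append((start, len(tokens), label))
        else:
            tokens.append(act)
    return tokens, spans

acts = linearize(["Alice", "visited", "New", "York"],
                 [(0, 1, "PER"), (2, 4, "LOC")])
print(acts)          # ['[PER', 'Alice', 'PER]', 'visited', '[LOC', 'New', 'York', 'LOC]']
print(decode(acts))  # round-trips to the original structure
```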
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Randomized Deep Structured Prediction for Discourse-Level Processing [45.725437752821655]
Expressive text encoders have been at the center of NLP models in recent work.
We show that we can efficiently leverage deep structured prediction and expressive neural encoders for a set of tasks involving complicated argumentative structures.
arXiv Detail & Related papers (2021-01-25T21:49:32Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
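As a loose illustration of the controller idea (not ACE's actual algorithm), the sketch below samples a binary mask over candidate embeddings, scores it with a stand-in reward, and nudges the selection probabilities with a simplified score-function update.

```python
import random

CANDIDATES = ["word2vec", "glove", "bert", "char", "elmo"]
probs = [0.5] * len(CANDIDATES)  # controller's selection probabilities
LR = 0.1

def reward(mask):
    # Stand-in for training a task model on the chosen concatenation and
    # measuring accuracy; pretends "bert" and "char" help most.
    return 0.6 + 0.2 * mask[2] + 0.1 * mask[3] + random.gauss(0, 0.02)

baseline = 0.0
for step in range(200):
    mask = [1 if random.random() < p else 0 for p in probs]
    r = reward(mask)
    baseline = 0.9 * baseline + 0.1 * r  # moving-average baseline
    advantage = r - baseline
    for i, m in enumerate(mask):
        # Simplified update: push p_i toward choices made when the
        # advantage is positive, away from them when it is negative.
        probs[i] = min(0.99, max(0.01, probs[i] + LR * advantage * (m - probs[i])))

print({c: round(p, 2) for c, p in zip(CANDIDATES, probs)})
```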
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
- Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs [26.983602540576275]
Sequence-based neural networks show significant sensitivity to syntactic structure, but they still perform less well on syntactic tasks than tree-based networks.
We evaluate which of these two representational schemes more effectively introduces biases for syntactic structure.
We show that a constituency-based network generalizes more robustly than a dependency-based one, and that combining the two types of structure does not yield further improvement.
arXiv Detail & Related papers (2020-04-30T18:00:06Z)
- Discontinuous Constituent Parsing with Pointer Networks [0.34376560669160383]
Discontinuous constituent trees are crucial for representing all grammatical phenomena of languages such as German.
Recent advances in dependency parsing have shown that Pointer Networks excel in efficiently parsing syntactic relations between words in a sentence.
We propose a novel neural network architecture that is able to generate the most accurate discontinuous constituent representations.
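For intuition about the pointing mechanism itself, here is a tiny PyTorch sketch in which each word scores all positions and points at the best one. The dot-product scoring and dimensions are illustrative assumptions, not the proposed architecture.

```python
import torch

def pointer_choices(states):
    # states: (seq_len, dim) encoder states; score every (i -> j) pair
    # with a dot product, then point at the argmax position per word.
    scores = states @ states.T               # (seq_len, seq_len)
    scores.fill_diagonal_(float("-inf"))     # a word cannot point to itself
    return scores.argmax(dim=-1)             # pointed-to position per word

states = torch.randn(5, 16)
print(pointer_choices(states))  # e.g. tensor([3, 0, 4, 2, 1])
```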
arXiv Detail & Related papers (2020-02-05T15:12:03Z)
- Torch-Struct: Deep Structured Prediction Library [138.5262350501951]
We introduce Torch-Struct, a library for structured prediction.
Torch-Struct includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API.
arXiv Detail & Related papers (2020-02-03T16:43:02Z)
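A short usage sketch against Torch-Struct's distribution-based API, following the shapes documented for its linear-chain CRF; details may vary across library versions.

```python
# pip install torch-struct
import torch
from torch_struct import LinearChainCRF

# Documented convention: log-potentials of shape (batch, N - 1, C, C)
# for N tokens and C tags.
batch, N, C = 2, 6, 4
log_potentials = torch.randn(batch, N - 1, C, C)

dist = LinearChainCRF(log_potentials)
print(dist.partition)        # log normalizers, shape (batch,)
print(dist.marginals.shape)  # edge marginals, same shape as potentials
print(dist.argmax.shape)     # one-hot MAP structure, same shape
```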