Bounding the Capabilities of Large Language Models in Open Text
Generation with Prompt Constraints
- URL: http://arxiv.org/abs/2302.09185v1
- Date: Fri, 17 Feb 2023 23:30:28 GMT
- Title: Bounding the Capabilities of Large Language Models in Open Text
Generation with Prompt Constraints
- Authors: Albert Lu, Hongxin Zhang, Yanzhe Zhang, Xuezhi Wang, Diyi Yang
- Abstract summary: We take a prompt-centric approach to analyzing and bounding the abilities of open-ended generative models.
We present a generic methodology of analysis with two challenging prompt constraint types: structural and stylistic.
Our results and our in-context mitigation strategies reveal open challenges for future research.
- Score: 38.69469206527995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The limits of open-ended generative models are unclear, yet increasingly
important. What causes them to succeed and what causes them to fail? In this
paper, we take a prompt-centric approach to analyzing and bounding the
abilities of open-ended generative models. We present a generic methodology of
analysis with two challenging prompt constraint types: structural and
stylistic. These constraint types are categorized into a set of well-defined
constraints that are analyzable by a single prompt. We then systematically
create a diverse set of simple, natural, and useful prompts to robustly analyze
each individual constraint. Using the GPT-3 text-davinci-002 model as a case
study, we generate outputs from our collection of prompts and analyze the
model's generative failures. We also show the generalizability of our proposed
method on other large models like BLOOM and OPT. Our results and our in-context
mitigation strategies reveal open challenges for future research. We have
publicly released our code at https://github.com/SALT-NLP/Bound-Cap-LLM.
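To make the methodology concrete, here is a minimal sketch of a prompt-centric evaluation loop in the spirit of the paper (not its released code): each prompt carries an explicit structural constraint, and every output is checked programmatically. The generate() function is a hypothetical stand-in for a call to any LLM, such as text-davinci-002.
```python
# Minimal sketch of a prompt-centric evaluation loop (illustrative only; see
# the paper's repository for the actual code). generate() is a hypothetical
# placeholder for an LLM API call.
import re

def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call such as text-davinci-002."""
    return "One sentence. Two sentences here."  # placeholder output

def check_word_count(text: str, n: int) -> bool:
    return len(text.split()) == n

def check_sentence_count(text: str, n: int) -> bool:
    # Naive splitting on terminal punctuation; a real analysis would use a
    # proper sentence tokenizer.
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()]) == n

# Structural constraints expressed directly in natural-language prompts.
prompts = {
    "Write a story in exactly 25 words.": lambda t: check_word_count(t, 25),
    "Write a story in exactly 2 sentences.": lambda t: check_sentence_count(t, 2),
}

for prompt, satisfied in prompts.items():
    output = generate(prompt)
    print(f"{prompt!r} -> constraint satisfied: {satisfied(output)}")
```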
Related papers
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving 2-3x speedup in machine translation with minimal sacrifice in quality.
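A rough sketch of the iteratively-refined parallel decoding idea follows; predict_probs() is a toy stub standing in for the trained model, and the shrinking re-masking schedule is an illustrative assumption rather than the paper's exact procedure.
```python
# Toy sketch of iteratively-refined parallel decoding for a generative masked
# LM. Every masked slot is filled in parallel each round, then the least
# confident predictions are re-masked for the next round.
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "dog"]
MASK = "<mask>"

def predict_probs(tokens):
    """Toy stub: one (token, confidence) guess per masked position."""
    return {i: (random.choice(VOCAB), random.random())
            for i, t in enumerate(tokens) if t == MASK}

def iterative_decode(length=6, rounds=4):
    tokens = [MASK] * length
    for r in range(rounds):
        guesses = predict_probs(tokens)
        for i, (tok, _) in guesses.items():
            tokens[i] = tok  # fill all masked slots in parallel
        if r < rounds - 1:
            # Re-mask a shrinking fraction of the least confident positions.
            k = max(1, int(length * (1 - (r + 1) / rounds)))
            for i in sorted(guesses, key=lambda i: guesses[i][1])[:k]:
                tokens[i] = MASK
    return tokens

print(" ".join(iterative_decode()))
```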
arXiv Detail & Related papers (2024-07-22T18:00:00Z) - Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore using less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z) - Vector-Quantized Prompt Learning for Paraphrase Generation [18.40940464497253]
This paper proposes to generate diverse and high-quality paraphrases by exploiting pre-trained models with instance-dependent prompts.
Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets.
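As a rough sketch of the idea (with a random codebook and toy encoder standing in for the paper's trained components), an instance embedding can be snapped to its nearest codebook entries, which then serve as an instance-dependent soft prompt:
```python
# Illustrative sketch of vector-quantized prompting: snap an instance encoding
# to its nearest codebook entries and use them as a soft prompt. The encoder
# and codebook are random stand-ins, not the paper's trained components.
import numpy as np

rng = np.random.default_rng(0)
DIM, NUM_CODES, PROMPT_LEN = 16, 32, 4

codebook = rng.normal(size=(NUM_CODES, DIM))  # learned in the real method

def encode(sentence: str) -> np.ndarray:
    """Toy encoder: deterministic pseudo-embedding per sentence."""
    seed = abs(hash(sentence)) % (2**32)
    return np.random.default_rng(seed).normal(size=(PROMPT_LEN, DIM))

def quantize(vectors: np.ndarray) -> np.ndarray:
    # Snap each vector to its nearest codebook entry by L2 distance.
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=-1)
    return codebook[dists.argmin(axis=1)]

soft_prompt = quantize(encode("The quick brown fox jumps."))
print(soft_prompt.shape)  # (PROMPT_LEN, DIM), prepended to the model's input
```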
arXiv Detail & Related papers (2023-11-25T07:13:06Z) - Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses drawn from massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z) - Tractable Control for Autoregressive Language Generation [82.79160918147852]
We propose GeLaTo, which uses tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models.
We show that GeLaTo achieves state-of-the-art performance on challenging benchmarks for constrained text generation.
Our work opens up new avenues for controlling large language models and also motivates the development of more expressive TPMs.
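A toy illustration of the core mechanism (not GeLaTo itself): the LM's next-token distribution is reweighted by the probability, under a constraint model, that the lexical constraint can still be satisfied. The uniform lm_probs() and single-keyword constraint here are simplifying assumptions.
```python
# Toy sketch: reweight next-token probabilities by the chance that the
# lexical constraint ("mat" must appear) remains satisfiable. lm_probs() is
# a uniform stand-in for a real autoregressive LM.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
KEYWORD, MAX_LEN = "mat", 6

def lm_probs(prefix):
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}  # toy uniform LM

def constraint_prob(prefix, tok):
    """P(keyword eventually appears | prefix + tok) under the toy LM."""
    seq = prefix + [tok]
    if KEYWORD in seq:
        return 1.0
    remaining = MAX_LEN - len(seq)
    return 1.0 - (1.0 - 1.0 / len(VOCAB)) ** remaining

prefix = []
while len(prefix) < MAX_LEN:
    scores = {t: p * constraint_prob(prefix, t)
              for t, p in lm_probs(prefix).items()}
    prefix.append(max(scores, key=scores.get))  # greedy decoding for brevity

print(" ".join(prefix))  # guaranteed to contain "mat"
```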
arXiv Detail & Related papers (2023-04-15T00:19:44Z) - Visually-Prompted Language Model for Fine-Grained Scene Graph Generation
in an Open World [67.03968403301143]
Scene Graph Generation (SGG) aims to extract <subject, predicate, object> relationships in images for vision understanding.
Existing re-balancing strategies try to handle the biased predicate distribution via prior rules but are still confined to pre-defined conditions.
We propose a Cross-modal prediCate boosting (CaCao) framework, where a visually-prompted language model is learned to generate diverse fine-grained predicates.
arXiv Detail & Related papers (2023-03-23T13:06:38Z) - Constrained Sampling from Language Models via Langevin Dynamics in
Embedding Spaces [34.375537557235724]
We propose a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function.
We evaluate our approach on different text generation tasks with soft and hard constraints as well as their combinations with competitive results for toxicity avoidance, sentiment control, and keyword-guided generation.
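A minimal sketch of the procedure under strong simplifying assumptions: the fluency term (standing in for the LM's negative log-likelihood) and the constraint penalty are both quadratic, so the energy gradient is analytic and the Langevin update fits in a few lines.
```python
# Minimal Langevin-dynamics sketch in a toy embedding space. The energy is
# E(e) = -log p_LM(e) + LAM * c(e), with both terms modeled as quadratics so
# that grad_energy() is exact; a real system would decode e back into tokens.
import numpy as np

rng = np.random.default_rng(0)
DIM, STEPS, ETA, LAM = 8, 500, 0.01, 2.0

mu_lm = rng.normal(size=DIM)          # toy optimum of the fluency term
mu_constraint = rng.normal(size=DIM)  # toy optimum of the constraint term

def grad_energy(e):
    # Gradient of the quadratic stand-ins for -log p_LM and the constraint.
    return (e - mu_lm) + LAM * (e - mu_constraint)

e = rng.normal(size=DIM)
for _ in range(STEPS):
    e = e - ETA * grad_energy(e) + np.sqrt(2 * ETA) * rng.normal(size=DIM)

# The sample settles between the two optima, trading fluency for constraint
# satisfaction as LAM increases.
print(np.linalg.norm(e - mu_lm), np.linalg.norm(e - mu_constraint))
```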
arXiv Detail & Related papers (2022-05-25T08:09:03Z) - Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models.
Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z) - ANLIzing the Adversarial Natural Language Inference Dataset [46.7480191735164]
We perform an in-depth error analysis of Adversarial NLI (ANLI), a recently introduced large-scale human-and-model-in-the-loop natural language inference dataset.
We propose a fine-grained annotation scheme of the different aspects of inference that are responsible for the gold classification labels, and use it to hand-code all three of the ANLI development sets.
arXiv Detail & Related papers (2020-10-24T01:03:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.