Bounding the Capabilities of Large Language Models in Open Text
Generation with Prompt Constraints
- URL: http://arxiv.org/abs/2302.09185v1
- Date: Fri, 17 Feb 2023 23:30:28 GMT
- Title: Bounding the Capabilities of Large Language Models in Open Text
Generation with Prompt Constraints
- Authors: Albert Lu, Hongxin Zhang, Yanzhe Zhang, Xuezhi Wang, Diyi Yang
- Abstract summary: We take a prompt-centric approach to analyzing and bounding the abilities of open-ended generative models.
We present a generic methodology of analysis with two challenging prompt constraint types: structural and stylistic.
Our results and our in-context mitigation strategies reveal open challenges for future research.
- Score: 38.69469206527995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The limits of open-ended generative models are unclear, yet increasingly
important. What causes them to succeed and what causes them to fail? In this
paper, we take a prompt-centric approach to analyzing and bounding the
abilities of open-ended generative models. We present a generic methodology of
analysis with two challenging prompt constraint types: structural and
stylistic. These constraint types are categorized into a set of well-defined
constraints that are analyzable by a single prompt. We then systematically
create a diverse set of simple, natural, and useful prompts to robustly analyze
each individual constraint. Using the GPT-3 text-davinci-002 model as a case
study, we generate outputs from our collection of prompts and analyze the
model's generative failures. We also show the generalizability of our proposed
method on other large models like BLOOM and OPT. Our results and our in-context
mitigation strategies reveal open challenges for future research. We have
publicly released our code at https://github.com/SALT-NLP/Bound-Cap-LLM.
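To make the methodology concrete, here is a minimal sketch of a prompt-centric evaluation loop in the spirit of the paper (not its released code): each prompt carries an explicit structural constraint, and every output is checked programmatically. The generate() function is a hypothetical stand-in for a call to any LLM, such as text-davinci-002.
```python
# Minimal sketch of a prompt-centric evaluation loop (illustrative only; see
# the paper's repository for the actual code). generate() is a hypothetical
# placeholder for an LLM API call.
import re

def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call such as text-davinci-002."""
    return "One sentence. Two sentences here."  # placeholder output

def check_word_count(text: str, n: int) -> bool:
    return len(text.split()) == n

def check_sentence_count(text: str, n: int) -> bool:
    # Naive splitting on terminal punctuation; a real analysis would use a
    # proper sentence tokenizer.
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()]) == n

# Structural constraints expressed directly in natural-language prompts.
prompts = {
    "Write a story in exactly 25 words.": lambda t: check_word_count(t, 25),
    "Write a story in exactly 2 sentences.": lambda t: check_sentence_count(t, 2),
}

for prompt, satisfied in prompts.items():
    output = generate(prompt)
    print(f"{prompt!r} -> constraint satisfied: {satisfied(output)}")
```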
Related papers
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving 2-3x speedup in machine translation with minimal sacrifice in quality.
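A rough sketch of the iteratively-refined parallel decoding idea follows; predict_probs() is a toy stub standing in for the trained model, and the shrinking re-masking schedule is an illustrative assumption rather than the paper's exact procedure.
```python
# Toy sketch of iteratively-refined parallel decoding for a generative masked
# LM. Every masked slot is filled in parallel each round, then the least
# confident predictions are re-masked for the next round.
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "dog"]
MASK = "<mask>"

def predict_probs(tokens):
    """Toy stub: one (token, confidence) guess per masked position."""
    return {i: (random.choice(VOCAB), random.random())
            for i, t in enumerate(tokens) if t == MASK}

def iterative_decode(length=6, rounds=4):
    tokens = [MASK] * length
    for r in range(rounds):
        guesses = predict_probs(tokens)
        for i, (tok, _) in guesses.items():
            tokens[i] = tok  # fill all masked slots in parallel
        if r < rounds - 1:
            # Re-mask a shrinking fraction of the least confident positions.
            k = max(1, int(length * (1 - (r + 1) / rounds)))
            for i in sorted(guesses, key=lambda i: guesses[i][1])[:k]:
                tokens[i] = MASK
    return tokens

print(" ".join(iterative_decode()))
```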
arXiv Detail & Related papers (2024-07-22T18:00:00Z) - Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore using less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z) - Vector-Quantized Prompt Learning for Paraphrase Generation [18.40940464497253]
This paper proposes to generate diverse and high-quality paraphrases by exploiting pre-trained models with instance-dependent prompts.
Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets.
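As a rough sketch of the idea (with a random codebook and toy encoder standing in for the paper's trained components), an instance embedding can be snapped to its nearest codebook entries, which then serve as an instance-dependent soft prompt:
```python
# Illustrative sketch of vector-quantized prompting: snap an instance encoding
# to its nearest codebook entries and use them as a soft prompt. The encoder
# and codebook are random stand-ins, not the paper's trained components.
import numpy as np

rng = np.random.default_rng(0)
DIM, NUM_CODES, PROMPT_LEN = 16, 32, 4

codebook = rng.normal(size=(NUM_CODES, DIM))  # learned in the real method

def encode(sentence: str) -> np.ndarray:
    """Toy encoder: deterministic pseudo-embedding per sentence."""
    seed = abs(hash(sentence)) % (2**32)
    return np.random.default_rng(seed).normal(size=(PROMPT_LEN, DIM))

def quantize(vectors: np.ndarray) -> np.ndarray:
    # Snap each vector to its nearest codebook entry by L2 distance.
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=-1)
    return codebook[dists.argmin(axis=1)]

soft_prompt = quantize(encode("The quick brown fox jumps."))
print(soft_prompt.shape)  # (PROMPT_LEN, DIM), prepended to the model's input
```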
arXiv Detail & Related papers (2023-11-25T07:13:06Z) - Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses drawn from massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z) - Tractable Control for Autoregressive Language Generation [82.79160918147852]
We propose GeLaTo, which uses tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models.
We show that GeLaTo achieves state-of-the-art performance on challenging benchmarks for constrained text generation.
Our work opens up new avenues for controlling large language models and also motivates the development of more expressive TPMs.
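A toy illustration of the core mechanism (not GeLaTo itself): the LM's next-token distribution is reweighted by the probability, under a constraint model, that the lexical constraint can still be satisfied. The uniform lm_probs() and single-keyword constraint here are simplifying assumptions.
```python
# Toy sketch: reweight next-token probabilities by the chance that the
# lexical constraint ("mat" must appear) remains satisfiable. lm_probs() is
# a uniform stand-in for a real autoregressive LM.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
KEYWORD, MAX_LEN = "mat", 6

def lm_probs(prefix):
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}  # toy uniform LM

def constraint_prob(prefix, tok):
    """P(keyword eventually appears | prefix + tok) under the toy LM."""
    seq = prefix + [tok]
    if KEYWORD in seq:
        return 1.0
    remaining = MAX_LEN - len(seq)
    return 1.0 - (1.0 - 1.0 / len(VOCAB)) ** remaining

prefix = []
while len(prefix) < MAX_LEN:
    scores = {t: p * constraint_prob(prefix, t)
              for t, p in lm_probs(prefix).items()}
    prefix.append(max(scores, key=scores.get))  # greedy decoding for brevity

print(" ".join(prefix))  # guaranteed to contain "mat"
```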
arXiv Detail & Related papers (2023-04-15T00:19:44Z) - Visually-Prompted Language Model for Fine-Grained Scene Graph Generation
in an Open World [67.03968403301143]
Scene Graph Generation (SGG) aims to extract <subject, predicate, object> relationships in images for vision understanding.
Existing re-balancing strategies try to handle the biased predicate distribution via prior rules but are still confined to pre-defined conditions.
We propose a Cross-modal prediCate boosting (CaCao) framework, where a visually-prompted language model is learned to generate diverse fine-grained predicates.
arXiv Detail & Related papers (2023-03-23T13:06:38Z) - Constrained Sampling from Language Models via Langevin Dynamics in
Embedding Spaces [34.375537557235724]
We propose a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function.
We evaluate our approach on different text generation tasks with soft and hard constraints as well as their combinations with competitive results for toxicity avoidance, sentiment control, and keyword-guided generation.
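A minimal sketch of the procedure under strong simplifying assumptions: the fluency term (standing in for the LM's negative log-likelihood) and the constraint penalty are both quadratic, so the energy gradient is analytic and the Langevin update fits in a few lines.
```python
# Minimal Langevin-dynamics sketch in a toy embedding space. The energy is
# E(e) = -log p_LM(e) + LAM * c(e), with both terms modeled as quadratics so
# that grad_energy() is exact; a real system would decode e back into tokens.
import numpy as np

rng = np.random.default_rng(0)
DIM, STEPS, ETA, LAM = 8, 500, 0.01, 2.0

mu_lm = rng.normal(size=DIM)          # toy optimum of the fluency term
mu_constraint = rng.normal(size=DIM)  # toy optimum of the constraint term

def grad_energy(e):
    # Gradient of the quadratic stand-ins for -log p_LM and the constraint.
    return (e - mu_lm) + LAM * (e - mu_constraint)

e = rng.normal(size=DIM)
for _ in range(STEPS):
    e = e - ETA * grad_energy(e) + np.sqrt(2 * ETA) * rng.normal(size=DIM)

# The sample settles between the two optima, trading fluency for constraint
# satisfaction as LAM increases.
print(np.linalg.norm(e - mu_lm), np.linalg.norm(e - mu_constraint))
```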
arXiv Detail & Related papers (2022-05-25T08:09:03Z) - Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models.
Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z) - ANLIzing the Adversarial Natural Language Inference Dataset [46.7480191735164]
We perform an in-depth error analysis of Adversarial NLI (ANLI), a recently introduced large-scale human-and-model-in-the-loop natural language inference dataset.
We propose a fine-grained annotation scheme of the different aspects of inference that are responsible for the gold classification labels, and use it to hand-code all three of the ANLI development sets.
arXiv Detail & Related papers (2020-10-24T01:03:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.