Stay on topic with Classifier-Free Guidance
- URL: http://arxiv.org/abs/2306.17806v1
- Date: Fri, 30 Jun 2023 17:07:02 GMT
- Title: Stay on topic with Classifier-Free Guidance
- Authors: Guillaume Sanchez, Honglu Fan, Alexander Spangher, Elad Levi, Pawan Sasanka Ammanamanchi, Stella Biderman
- Abstract summary: We show that CFG can be used broadly as an inference-time technique in pure language modeling.
We show that CFG improves the performance of Pythia, GPT-2 and LLaMA-family models across an array of tasks.
- Score: 57.28934343207042
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classifier-Free Guidance (CFG) has recently emerged in text-to-image
generation as a lightweight technique to encourage prompt-adherence in
generations. In this work, we demonstrate that CFG can be used broadly as an
inference-time technique in pure language modeling. We show that CFG (1)
improves the performance of Pythia, GPT-2 and LLaMA-family models across an
array of tasks: Q&A, reasoning, code generation, and machine translation,
achieving SOTA on LAMBADA with LLaMA-7B over PaLM-540B; (2) brings improvements
equivalent to a model with twice the parameter-count; (3) can stack alongside
other inference-time methods like Chain-of-Thought and Self-Consistency,
yielding further improvements in difficult tasks; (4) can be used to increase
the faithfulness and coherence of assistants in challenging form-driven and
content-driven prompts: in a human evaluation we show a 75% preference for
GPT4All using CFG over baseline.
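The abstract does not spell out the mechanics, so the following is only a minimal sketch of how CFG is commonly applied to next-token prediction: the model is run once with the prompt and once without it, and the two sets of log-probabilities are extrapolated with a guidance strength `gamma`. The function names and the toy logits are hypothetical and are not taken from the paper's code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cfg_log_probs(cond_logits, uncond_logits, gamma):
    """Blend prompt-conditioned and unconditional next-token scores.

    Assumed formulation: guided = uncond + gamma * (cond - uncond).
    gamma = 1 recovers the conditional distribution, gamma = 0 the
    unconditional one, and gamma > 1 pushes further toward the prompt.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    guided = [u + gamma * (c - u) for c, u in zip(cond, uncond)]
    return log_softmax(guided)  # renormalise so the result is a distribution


# Toy three-token vocabulary (hypothetical numbers).
cond_logits = [2.0, 1.0, 0.0]    # scores when the prompt is in context
uncond_logits = [1.0, 1.0, 1.0]  # scores when the prompt is dropped

plain = cfg_log_probs(cond_logits, uncond_logits, gamma=1.0)
guided = cfg_log_probs(cond_logits, uncond_logits, gamma=1.5)
```

In this toy case the token that the prompt favours receives more probability mass under `gamma=1.5` than under `gamma=1.0`, which is the prompt-adherence effect the abstract describes.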
Related papers
- Adaptable Logical Control for Large Language Models [68.27725600175013]
Ctrl-G is an adaptable framework that facilitates tractable and flexible control of model generation at inference time.
We show that Ctrl-G, when applied to a TULU2-7B model, outperforms GPT3.5 and GPT4 on the task of interactive text editing.
arXiv Detail & Related papers (2024-06-19T23:47:59Z)
- On Zero-Shot Counterspeech Generation by LLMs [23.39818166945086]
We present a comprehensive analysis of the performances of four Large Language Models (LLM) in zero-shot settings for counterspeech generation.
By model type, GPT-2 and FlanT5 models are significantly better in terms of counterspeech quality.
ChatGPT is much better at generating counterspeech than other models across all metrics.
arXiv Detail & Related papers (2024-03-22T04:13:10Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- Investigating the Efficacy of Large Language Models for Code Clone Detection [2.0749231618270803]
Large Language Models (LLMs) have demonstrated remarkable success in various natural language processing and software engineering tasks.
In this study, we investigated the applicability of LLMs for Code Clone Detection (CCD), a non-generative task.
ChatGPT surpasses the baselines in cross-language CCD attaining an F1-score of 0.877 and achieves comparable performance to fully fine-tuned models for mono-lingual CCD.
arXiv Detail & Related papers (2024-01-24T20:43:36Z)
- APoLLo: Unified Adapter and Prompt Learning for Vision Language Models [58.9772868980283]
We present APoLLo, a unified multi-modal approach that combines Adapter and Prompt learning for Vision-Language models.
APoLLo achieves a relative gain of up to 6.03% over MaPLe (SOTA) on novel classes across 10 diverse image recognition datasets.
arXiv Detail & Related papers (2023-12-04T01:42:09Z)
- Contrastive Decoding Improves Reasoning in Large Language Models [55.16503283583076]
We show that Contrastive Decoding achieves large out-of-the-box improvements over greedy decoding on a variety of reasoning tasks.
We show that Contrastive Decoding leads LLaMA-65B to outperform LLaMA 2, GPT-3.5 and PaLM 2-L on the HellaSwag commonsense reasoning benchmark.
arXiv Detail & Related papers (2023-09-17T00:29:32Z)
- Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance [35.38549845444575]
This work proposes Chain-of-Thought Hub, an open-source evaluation suite for the multi-step reasoning capabilities of large language models.
arXiv Detail & Related papers (2023-05-26T23:46:42Z)
- Elaboration-Generating Commonsense Question Answering at Scale [77.96137534751445]
In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge.
We finetune smaller language models to generate useful intermediate context, referred to here as elaborations.
Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other.
arXiv Detail & Related papers (2022-09-02T18:32:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.