Contrastive Decoding Improves Reasoning in Large Language Models
- URL: http://arxiv.org/abs/2309.09117v2
- Date: Fri, 29 Sep 2023 15:11:49 GMT
- Title: Contrastive Decoding Improves Reasoning in Large Language Models
- Authors: Sean O'Brien, Mike Lewis
- Abstract summary: We show that Contrastive Decoding achieves large out-of-the-box improvements over greedy decoding on a variety of reasoning tasks.
We show that Contrastive Decoding leads LLaMA-65B to outperform LLaMA 2, GPT-3.5 and PaLM 2-L on the HellaSwag commonsense reasoning benchmark.
- Score: 55.16503283583076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We demonstrate that Contrastive Decoding -- a simple, computationally light,
and training-free text generation method proposed by Li et al 2022 -- achieves
large out-of-the-box improvements over greedy decoding on a variety of
reasoning tasks. Originally shown to improve the perceived quality of long-form
text generation, Contrastive Decoding searches for strings that maximize a
weighted difference in likelihood between strong and weak models. We show that
Contrastive Decoding leads LLaMA-65B to outperform LLaMA 2, GPT-3.5 and PaLM
2-L on the HellaSwag commonsense reasoning benchmark, and to outperform LLaMA
2, GPT-3.5 and PaLM-540B on the GSM8K math word reasoning benchmark, in
addition to improvements on a collection of other tasks. Analysis suggests that
Contrastive Decoding improves over existing methods by preventing some abstract
reasoning errors, as well as by avoiding simpler modes such as copying sections
of the input during chain-of-thought. Overall, Contrastive Decoding outperforms
nucleus sampling for long-form generation and greedy decoding for reasoning
tasks, making it a powerful general purpose method for generating text from
language models.
Related papers
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - M2CVD: Enhancing Vulnerability Semantic through Multi-Model Collaboration for Code Vulnerability Detection [52.4455893010468]
Large Language Models (LLMs) have strong capabilities in code comprehension, but fine-tuning costs and semantic alignment issues limit their project-specific optimization.
Code models such CodeBERT are easy to fine-tune, but it is often difficult to learn vulnerability semantics from complex code languages.
This paper introduces the Multi-Model Collaborative Vulnerability Detection approach (M2CVD) to improve the detection accuracy of code models.
arXiv Detail & Related papers (2024-06-10T00:05:49Z) - Learnable Item Tokenization for Generative Recommendation [78.30417863309061]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity.
LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z) - Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens [15.566726645722657]
We propose a novel framework specifically designed for speculative sampling.
Within this framework, we introduce a lightweight draft model that effectively utilizes previously generated tokens to predict subsequent words.
We demonstrate impressive results, achieving an average latency speedup ratio of 2.7x compared to the vanilla auto-regressive decoding approach.
arXiv Detail & Related papers (2024-02-24T08:10:39Z) - Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding [46.485363806259265]
Speculative Decoding has emerged as a novel decoding paradigm for Large Language Models (LLMs) inference.
In each decoding step, this method first drafts several future tokens efficiently and then verifies them in parallel.
This paper presents a comprehensive overview and analysis of this promising decoding paradigm.
arXiv Detail & Related papers (2024-01-15T17:26:50Z) - Speculative Contrastive Decoding [55.378200871224074]
Large language models(LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited due to high computational requirements and is sub-optimal due to the exposure bias.
Inspired by speculative decoding and contrastive decoding, we introduce Speculative Contrastive Decoding(SCD), a straightforward yet powerful decoding approach.
arXiv Detail & Related papers (2023-11-15T14:15:30Z) - Test-Case-Driven Programming Understanding in Large Language Models for
Better Code Generation [15.166827643436346]
muFiX is a novel prompting technique to improve the code generation performance of large language models (LLMs)
It first exploits test case analysis to obtain specification understanding and enables a self-improvement process.
muFiX further fixes the specification understanding towards the direction reducing the gap between the provided understanding and the actual understanding.
arXiv Detail & Related papers (2023-09-28T02:58:07Z) - ContraCLM: Contrastive Learning For Causal Language Model [54.828635613501376]
We present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level.
We show that ContraCLM enhances discrimination of the representations and bridges the gap with the encoder-only models.
arXiv Detail & Related papers (2022-10-03T18:56:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.