$\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding
- URL: http://arxiv.org/abs/2409.05923v1
- Date: Mon, 9 Sep 2024 02:07:41 GMT
- Title: $\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding
- Authors: Shuai Wang, Liang Ding, Li Shen, Yong Luo, Zheng He, Wei Yu, Dacheng Tao
- Abstract summary: Large language models (LLMs) have shown remarkable capabilities in code generation.
The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass.
We propose a simple and effective uncertainty-aware selective contrastive decoding ($\mathbb{USCD}$) mechanism.
- Score: 64.00025564372095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have shown remarkable capabilities in code generation. However, the effects of hallucinations (e.g., output noise) make it particularly challenging for LLMs to generate high-quality code in one pass. In this work, we propose a simple and effective \textbf{u}ncertainty-aware \textbf{s}elective \textbf{c}ontrastive \textbf{d}ecoding ($\mathbb{USCD}$) mechanism to improve the quality of one-pass code generation in LLMs and reduce the impact of output noise. Specifically, we first carefully design a negative prompt (namely, a lame prompt) to induce output noise by removing the input-output examples from the standard few-shot prompt. Our preliminary study shows that the Jensen-Shannon divergence (JS divergence) between the token-distribution uncertainty and the output noise is relatively low (approximately $0.25$), indicating their high relevance. We then selectively eliminate the output noise induced by the lame prompt, based on the uncertainty of the prediction distribution under the standard prompt. Notably, the proposed plug-and-play mechanism is inference-only, enjoying appealing flexibility. Extensive experiments on widely used benchmarks, e.g., HumanEval, MBPP, and MultiPL-E, over several LLMs (i.e., InCoder-6b, CodeLlama-7b, WizardCoder-15b, StarCoder, and Llama2-7b), demonstrate that the proposed USCD significantly improves one-pass code generation, with an average \textit{pass@$1$} score increase of 16.59\%. We will release code and data on GitHub.
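To make the mechanism concrete, here is a minimal sketch of USCD-style decoding, assuming a Hugging Face-style causal LM that exposes per-step logits. Entropy is used here as the uncertainty measure, and the threshold `tau` and scale `alpha` are illustrative placeholders rather than values from the paper:

```python
import torch
import torch.nn.functional as F

def uscd_step(logits_std, logits_lame, tau=2.0, alpha=0.5):
    """One step of uncertainty-aware selective contrastive decoding.

    logits_std  -- next-token logits under the standard few-shot prompt
    logits_lame -- next-token logits under the lame prompt (examples removed)
    tau         -- uncertainty threshold: only uncertain steps are corrected
    alpha       -- strength of the contrastive penalty
    """
    probs_std = F.softmax(logits_std, dim=-1)
    entropy = -(probs_std * torch.log(probs_std + 1e-9)).sum(-1)
    if entropy > tau:
        # Uncertain step: subtract the noise the lame prompt induces.
        scores = logits_std - alpha * logits_lame
    else:
        # Confident step: keep the standard distribution untouched.
        scores = logits_std
    return scores.argmax(-1)

@torch.no_grad()
def generate(model, std_ids, lame_ids, max_new_tokens=128):
    # Greedy decoding; both prompts are extended with the same chosen token.
    for _ in range(max_new_tokens):
        logits_std = model(std_ids).logits[0, -1]
        logits_lame = model(lame_ids).logits[0, -1]
        next_id = uscd_step(logits_std, logits_lame).view(1, 1)
        std_ids = torch.cat([std_ids, next_id], dim=-1)
        lame_ids = torch.cat([lame_ids, next_id], dim=-1)
    return std_ids
```

The key design choice is the gate: the contrastive correction is applied only where the standard-prompt distribution is uncertain, so confident predictions are never perturbed.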
Related papers
- Selective Prompt Anchoring for Code Generation [11.60432173396084]
A small version of DeepSeek-Coder (6.7B) can outperform the original, much larger version (33B).
Our results demonstrate that SPA consistently improves Pass@1 rates by up to 9.7% across all settings.
arXiv Detail & Related papers (2024-08-17T07:11:02Z)
- Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused [44.37155553647802]
Large Language Models (LLMs) have demonstrated exceptional performance across various natural language processing tasks.
They occasionally yield content that is factually inaccurate or inconsistent with the expected output.
Recent works have investigated contrastive decoding between the original model and an amateur model with induced hallucination.
We introduce a novel contrastive decoding framework termed LOL (LOwer Layer Matters), which fuses contrastive signals from the model's final and lower layers.
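The lower-layer contrast can be sketched within a single model by "early-exiting" an intermediate hidden state through the LM head and contrasting it with the final layer. This is a rough sketch only, assuming the model exposes `lm_head`; it omits the paper's fusion and truthfulness-refocusing components, and the layer index and scale are illustrative:

```python
import torch

@torch.no_grad()
def layer_contrast_logits(model, input_ids, low_layer=8, beta=1.0):
    """Contrast final-layer logits with a lower layer's early-exit logits."""
    out = model(input_ids, output_hidden_states=True)
    final_logits = out.logits[0, -1]
    # Early exit: project a lower layer's hidden state through the LM head
    # (a full implementation would apply the model's final norm first).
    low_logits = model.lm_head(out.hidden_states[low_layer][0, -1])
    # Boost tokens the mature final layer favors over the immature layer.
    return (1 + beta) * final_logits - beta * low_logits
```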
arXiv Detail & Related papers (2024-08-16T14:23:59Z)
- Decoding Matters: Addressing Amplification Bias and Homogeneity Issue for LLM-based Recommendation [32.85339480783571]
We introduce a new decoding approach named Debiasing-Diversifying Decoding (D3).
D3 disables length normalization for ghost tokens to alleviate amplification bias.
Experiments on real-world datasets demonstrate the method's effectiveness.
arXiv Detail & Related papers (2024-06-21T06:47:28Z)
- Nearest Neighbor Speculative Decoding for LLM Generation and Attribution [87.3259169631789]
Nearest Neighbor Speculative Decoding (NEST) is capable of incorporating real-world text spans of arbitrary length into the LM's generations and providing attribution to their sources.
NEST significantly enhances the generation quality and attribution rate of the base LM across a variety of knowledge-intensive tasks.
In addition, NEST substantially improves the generation speed, achieving a 1.8x speedup in inference time when applied to Llama-2-Chat 70B.
arXiv Detail & Related papers (2024-05-29T17:55:03Z)
- FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping [49.66872823080736]
Autoregressive Large Language Models (e.g., LLaMA, GPTs) are omnipresent, achieving remarkable success in language understanding and generation.
To mitigate the computational overhead incurred during generation, several early-exit and layer-dropping strategies have been proposed.
We propose FFN-SkipLLM, an input-adaptive feed-forward skipping strategy.
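A toy illustration of input-adaptive FFN skipping under one simple rule (the class name and the skipping criterion are hypothetical, not the paper's exact recipe): track how close each layer's FFN is to an identity update on earlier tokens, and skip it for later tokens once it has been consistently redundant.

```python
import torch
import torch.nn.functional as F

class FFNSkipper:
    """Toy input-adaptive FFN skipping for autoregressive decoding."""

    def __init__(self, n_layers, threshold=0.99, momentum=0.9):
        self.sim = torch.zeros(n_layers)   # running input/output similarity
        self.threshold = threshold
        self.momentum = momentum

    def apply(self, layer_idx, hidden, ffn):
        if self.sim[layer_idx] > self.threshold:
            return hidden                  # skip: identity shortcut
        out = hidden + ffn(hidden)         # standard residual FFN block
        s = F.cosine_similarity(hidden, out, dim=-1).mean()
        self.sim[layer_idx] = (self.momentum * self.sim[layer_idx]
                               + (1 - self.momentum) * s)
        return out
```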
arXiv Detail & Related papers (2024-04-05T02:35:43Z)
- Augmenting Greybox Fuzzing with Generative AI [0.0]
We propose ChatFuzz, a greybox fuzzer augmented by generative AI.
We conduct extensive experiments to explore best practices for harnessing the power of generative LLMs.
Experimental results show that our approach improves edge coverage by 12.77% over the SOTA greybox fuzzer.
arXiv Detail & Related papers (2023-06-11T21:44:47Z)
- Contrastive Decoding: Open-ended Text Generation as Optimization [153.35961722855686]
We propose contrastive decoding (CD), a reliable decoding approach.
It is inspired by the observation that the failure modes of larger LMs (e.g., repetition, incoherence) are even more prevalent in smaller LMs.
CD requires no additional training and produces higher-quality text than decoding from the larger LM alone.
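A minimal sketch of one CD step, including the paper's adaptive plausibility constraint, which masks out tokens whose expert probability falls below `alpha` times the expert's maximum probability (model wiring is left out):

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def contrastive_decoding_step(expert_logits, amateur_logits, alpha=0.1):
    """Score tokens by the expert-amateur log-probability gap."""
    expert_logp = F.log_softmax(expert_logits, dim=-1)
    amateur_logp = F.log_softmax(amateur_logits, dim=-1)
    # Adaptive plausibility: keep tokens with expert prob >= alpha * max prob.
    cutoff = expert_logp.max() + math.log(alpha)
    scores = expert_logp - amateur_logp
    scores[expert_logp < cutoff] = float("-inf")
    return scores.argmax(-1)
```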
arXiv Detail & Related papers (2022-10-27T00:58:21Z) - Bridging the Gap Between Clean Data Training and Real-World Inference
for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean-data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms baseline models on a real-world (noisy) corpus but also enhances robustness, producing high-quality results in noisy environments.
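The abstract does not spell out the training objective, but one generic way to embed clean and noisy samples into a similar space is an auxiliary alignment penalty between paired utterances; a sketch under that assumption (the encoder, the pairing, and the weight `lam` are all illustrative):

```python
import torch
import torch.nn.functional as F

def alignment_penalty(z_clean, z_noisy):
    """Pull a noisy utterance's embedding toward its clean counterpart's."""
    return 1 - F.cosine_similarity(z_clean, z_noisy, dim=-1).mean()

# Hypothetical training step: task loss on clean data plus the alignment term.
# loss = task_loss(encoder(clean_batch), labels) \
#        + lam * alignment_penalty(encoder(clean_batch), encoder(noisy_batch))
```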
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
- Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization [93.95299500688286]
We focus on prediction problems with structured outputs subject to output validity constraints.
We propose composed fine-tuning, which fine-tunes a predictor composed with the pre-trained denoiser.
For two-layer ReLU networks, we prove that composed fine-tuning significantly reduces the complexity of the predictor.
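The composition itself is straightforward to sketch in PyTorch: freeze the pre-trained denoiser and fine-tune only the predictor that feeds into it (module internals are placeholders):

```python
import torch.nn as nn

class ComposedModel(nn.Module):
    """Predictor composed with a frozen pre-trained denoiser:
    forward(x) = denoiser(predictor(x))."""

    def __init__(self, predictor, denoiser):
        super().__init__()
        self.predictor = predictor
        self.denoiser = denoiser
        for p in self.denoiser.parameters():
            p.requires_grad = False       # freeze the denoiser

    def forward(self, x):
        # Gradients flow through the frozen denoiser into the predictor,
        # which learns to emit outputs the denoiser maps to valid structures.
        return self.denoiser(self.predictor(x))
```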
arXiv Detail & Related papers (2020-06-29T17:14:35Z)