Rethinking Repetition Problems of LLMs in Code Generation
- URL: http://arxiv.org/abs/2505.10402v1
- Date: Thu, 15 May 2025 15:26:32 GMT
- Title: Rethinking Repetition Problems of LLMs in Code Generation
- Authors: Yihong Dong, Yuchen Liu, Xue Jiang, Zhi Jin, Ge Li
- Abstract summary: We propose an efficient decoding approach called RPG, which stands for Repetition Penalization based on Grammar. RPG first leverages grammar rules to identify repetition problems during code generation, and then strategically decays the likelihood of critical tokens that contribute to repetitions. Extensive experimental results demonstrate that RPG substantially outperforms the best-performing baselines on the CodeRepetEval dataset.
- Score: 36.42947561896802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advent of neural language models, the performance of code generation has been significantly boosted. However, the problem of repetitions during the generation process continues to linger. Previous work has primarily focused on content repetition, which is merely a fraction of the broader repetition problem in code generation. A more prevalent and challenging problem is structural repetition. In structural repetition, the repeated code appears in various patterns but possesses a fixed structure, which can be inherently reflected in grammar. In this paper, we formally define structural repetition and propose an efficient decoding approach called RPG, which stands for Repetition Penalization based on Grammar, to alleviate the repetition problems in code generation for LLMs. Specifically, RPG first leverages grammar rules to identify repetition problems during code generation, and then strategically decays the likelihood of critical tokens that contribute to repetitions, thereby mitigating them in code generation. To facilitate this study, we construct a new dataset, CodeRepetEval, to comprehensively evaluate approaches for mitigating the repetition problems in code generation. Extensive experimental results demonstrate that RPG substantially outperforms the best-performing baselines on the CodeRepetEval dataset as well as the HumanEval and MBPP benchmarks, effectively reducing repetitions and enhancing the quality of generated code.
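To make the decoding idea concrete, below is a minimal, self-contained sketch of grammar-aware repetition penalization. It is not the authors' implementation: the line-level "structural signature" is a crude stand-in for the grammar rules RPG actually uses, and the function names and parameters (`window`, `threshold`, `decay`) are illustrative assumptions.

```python
# Minimal sketch of grammar-aware repetition penalization during decoding.
# Assumptions: the line-level structural signature is a crude proxy for real
# grammar rules, and window/threshold/decay are illustrative hyperparameters.
import re
from collections import Counter

# Tokens kept verbatim when masking (keywords plus the mask symbols themselves).
_KEEP = {"def", "return", "if", "else", "for", "while", "import", "class", "NUM", "STR"}

def structural_signature(line: str) -> str:
    """Collapse a code line to a coarse structural form: string literals,
    numbers, and identifiers are masked so lines that differ only in names
    or values compare as structurally equal."""
    sig = re.sub(r'"[^"]*"|\'[^\']*\'', "STR", line)
    sig = re.sub(r"\b\d+(?:\.\d+)?\b", "NUM", sig)
    sig = re.sub(r"\b[A-Za-z_]\w*\b",
                 lambda m: m.group(0) if m.group(0) in _KEEP else "ID", sig)
    return " ".join(sig.split())

def penalize_structural_repetition(code_so_far: str,
                                   token_scores: dict[str, float],
                                   window: int = 8,
                                   threshold: int = 3,
                                   decay: float = 0.5) -> dict[str, float]:
    """If the last `window` lines contain one structure repeated at least
    `threshold` times, decay the scores of candidate tokens that would begin
    that structure again (a crude proxy for the paper's 'critical tokens')."""
    lines = [l for l in code_so_far.splitlines() if l.strip()]
    signatures = [structural_signature(l) for l in lines[-window:]]
    repeated = {s for s, c in Counter(signatures).items() if c >= threshold}
    adjusted = {}
    for tok, score in token_scores.items():
        tok_sig = structural_signature(tok)
        is_critical = bool(tok.strip()) and any(s.startswith(tok_sig) for s in repeated)
        adjusted[tok] = score * decay if is_critical else score
    return adjusted

# After three structurally identical assignments, a token that would start a
# fourth ("x4") is down-weighted, while "return" is left untouched.
code = "x1 = f(1)\nx2 = f(2)\nx3 = f(3)\n"
print(penalize_structural_repetition(code, {"x4": 2.0, "return": 1.0}))
# -> {'x4': 1.0, 'return': 1.0}
```

In an actual decoder, an adjustment of this kind would be applied to the model's logits at every generation step before sampling, so that a structure already looping in the recent context becomes progressively less likely to continue.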
Related papers
- Turning the Tide: Repository-based Code Reflection [52.13709676656648]
We introduce LiveRepoReflection, a benchmark for evaluating code understanding and generation in multi-file repository contexts. It contains 1,888 rigorously filtered test cases across 6 programming languages to ensure diversity, correctness, and high difficulty. We also create RepoReflection-Instruct, a large-scale, quality-filtered instruction-tuning dataset derived from diverse sources.
arXiv Detail & Related papers (2025-07-14T02:36:27Z)
- Understanding the Repeat Curse in Large Language Models from a Feature Perspective [10.413608338398785]
Large language models (LLMs) often suffer from repetitive text generation. We propose a novel approach, "Duplicatus Charm", to induce and analyze the Repeat Curse.
arXiv Detail & Related papers (2025-04-19T07:53:37Z)
- Code Copycat Conundrum: Demystifying Repetition in LLM-based Code Generation [22.100878392437508]
We conduct the first empirical study to investigate the prevalence and nature of repetition across 19 state-of-the-art code LLMs. We propose DeRep, a rule-based technique designed to detect and mitigate repetition in generated code (a minimal illustration of such rule-based detection is sketched after this list).
arXiv Detail & Related papers (2025-04-17T03:13:39Z)
- Memorize or Generalize? Evaluating LLM Code Generation with Evolved Questions [33.58518352911762]
Large Language Models (LLMs) are known to exhibit a memorization phenomenon in code generation. In this paper, we investigate this phenomenon by designing three evolution strategies to create variants: mutation, paraphrasing, and code-rewriting. As expected, as supervised fine-tuning proceeds, the memorization score rises before overfitting, suggesting more severe memorization.
arXiv Detail & Related papers (2025-03-04T05:39:24Z)
- RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation [13.75248879205993]
We propose Adaptive Critique Refinement (ACR), which enables the model to refine itself with self-generated code and external critique. ACR includes a composite scoring system with LLM-as-a-Judge to evaluate the quality of code responses. We develop the RefineCoder series by iteratively applying ACR, achieving continuous performance improvement on multiple code generation benchmarks.
arXiv Detail & Related papers (2025-02-13T11:17:53Z)
- CodeXEmbed: A Generalist Embedding Model Family for Multilingual and Multi-task Code Retrieval [103.116634967815]
We introduce CodeXEmbed, a family of large-scale code embedding models ranging from 400M to 7B parameters.
Our novel training pipeline unifies multiple programming languages and transforms various code-related tasks into a common retrieval framework.
Our 7B model sets a new state-of-the-art (SOTA) in code retrieval, outperforming the previous leading model, Voyage-Code, by over 20% on the CoIR benchmark.
arXiv Detail & Related papers (2024-11-19T16:54:45Z)
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation. We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks. We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation [92.42032403795879]
We show that pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts.
We attribute their overestimation of token-level repetition probabilities to the learning bias.
We find that LMs use longer-range dependencies to predict repetitive tokens than non-repetitive ones, which may be the cause of sentence-level repetition loops.
arXiv Detail & Related papers (2023-07-04T07:53:55Z)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extend the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
- Pattern-aware Data Augmentation for Query Rewriting in Voice Assistant Systems [10.332550622090718]
We propose an augmentation framework that learns patterns from existing training pairs and inversely generates rewrite candidates from rewrite labels to compensate for insufficient query rewriting (QR) training data.
Our experimental results show its effectiveness compared with a fully trained QR baseline and demonstrate its potential for boosting QR performance in low-resource domains or locales.
arXiv Detail & Related papers (2020-12-21T16:36:32Z)
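As a companion to the rule-based detect-and-mitigate idea mentioned in the Code Copycat Conundrum entry above, here is a minimal, self-contained sketch of what such rules can look like. The specific rules (an exact-match check over a trailing block of repeated lines, followed by truncation) are illustrative assumptions, not DeRep's actual rule set.

```python
# A minimal, illustrative sketch of rule-based repetition detection and
# mitigation in generated code. The rules below (exact-match detection of a
# trailing block of repeated lines, then truncating the loop) are assumptions
# for illustration only, not the actual rule set of DeRep or RPG.

def find_repeated_suffix(items: list[str], max_n: int = 10, min_repeats: int = 3):
    """Return (n, repeats) if `items` ends with the same n-item block repeated
    at least `min_repeats` times in a row; otherwise return None."""
    for n in range(1, max_n + 1):
        if len(items) < n * min_repeats:
            break
        last = items[-n:]
        repeats = 1
        while (len(items) >= n * (repeats + 1)
               and items[-n * (repeats + 1):-n * repeats] == last):
            repeats += 1
        if repeats >= min_repeats:
            return n, repeats
    return None

def truncate_repetition(code: str, min_repeats: int = 3) -> str:
    """Keep only the first occurrence of a block of lines that repeats
    verbatim at the end of the generated code."""
    lines = code.splitlines()
    hit = find_repeated_suffix(lines, min_repeats=min_repeats)
    if hit is None:
        return code
    n, repeats = hit
    return "\n".join(lines[: len(lines) - n * (repeats - 1)])

# Example: six identical trailing lines are collapsed to one.
looping = "total = 0\n" + "total += step()\n" * 6
print(truncate_repetition(looping))
# total = 0
# total += step()
```

A post-processing pass of this kind operates on already-generated code; the RPG approach described in the abstract instead intervenes during decoding, before the repetition is emitted.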