An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention
- URL: http://arxiv.org/abs/2602.21800v1
- Date: Wed, 25 Feb 2026 11:27:34 GMT
- Title: An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention
- Authors: Madhusudan Ghosh, Rishabh Gupta,
- Abstract summary: Large language models (LLMs) have led to a significant increase in automated tools in the software engineering.<n>Despite these advancements, its effectiveness is constrained by fixed context lengths.<n>We investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms.
- Score: 4.307992396152879
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in the software engineering, capable of performing various code-related tasks such as code generation, completion, and translation. Despite these advancements, its effectiveness is constrained by fixed context lengths, limiting its ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms. Our goal is to provide a thorough analysis of current approaches that facilitate context length extrapolation in code, particularly in the context of long code completion tasks.
Related papers
- Modality Agnostic Efficient Long Range Encoder [14.705955027331674]
We address the challenge of long-context processing on a single device using generic implementations.<n>To overcome these limitations, we propose MAELRE, a unified and efficient transformer architecture.<n>We demonstrate that MAELRE achieves superior accuracy while reducing computational cost compared to existing long-context models.
arXiv Detail & Related papers (2025-07-25T16:19:47Z) - Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality.<n>We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z) - LSR-MCTS: Alleviating Long Range Dependency in Code Generation [42.10272627826627]
Large language models (LLMs) have significantly promoted the development of code generation task.<n>We propose the textbfLSR-MCTS algorithm, which leverages MCTS to determine the code line-by-line and select the optimal path.
arXiv Detail & Related papers (2025-04-10T04:03:25Z) - Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs [53.00384299879513]
In large language models (LLMs), code and reasoning reinforce each other.<n>Code provides verifiable execution paths, enforces logical decomposition, and enables runtime validation.<n>We identify key challenges and propose future research directions to strengthen this synergy.
arXiv Detail & Related papers (2025-02-26T18:55:42Z) - CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation [20.013757490442064]
We introduce CodeIF, the first benchmark designed to assess the abilities of Large Language Models (LLMs) to adhere to task-oriented instructions.<n>CodeIF encompasses a broad range of tasks, including function synthesis, algorithmic instructions, and code explanation.<n>We conduct extensive experiments with LLMs, analyzing their strengths and limitations in meeting the demands of these tasks.
arXiv Detail & Related papers (2025-02-26T14:19:49Z) - LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification [42.54363549922909]
LongSpec is a framework that addresses the challenges of efficient inference over long contexts.<n>LongSpec achieves up to a 3.26x speedup over strong Flash Attention baselines.<n>The code is available at https://github.com/sail-sg/LongSpec.
arXiv Detail & Related papers (2025-02-24T18:53:31Z) - Chain of Grounded Objectives: Bridging Process and Goal-oriented Prompting for Code Generation [21.084058098777803]
Chain of Grounded Objectives (CGO) is a method that embeds functional objectives into input prompts to enhance code generation.<n>By leveraging appropriately structured objectives as input and avoiding explicit sequential procedures, CGO adapts effectively to the structured nature of programming tasks.
arXiv Detail & Related papers (2025-01-23T01:45:09Z) - Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks [1.9198713957364215]
Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks.<n>They face significant limitations in handling complex, long-context programming challenges.<n>This paper introduces a novel agentic framework for guided code generation''
arXiv Detail & Related papers (2025-01-11T19:21:53Z) - Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows textcode-form plans -- pseudocode that outlines high-level, structured reasoning processes.
CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks.
It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z) - KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches [52.02764371205856]
Long context capability is a crucial competency for large language models (LLMs)
This work provides a taxonomy of current methods and evaluating 10+ state-of-the-art approaches across seven categories of long context tasks.
arXiv Detail & Related papers (2024-07-01T17:59:47Z) - A Thorough Examination of Decoding Methods in the Era of LLMs [72.65956436513241]
Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers.
This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of large language models.
Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization.
arXiv Detail & Related papers (2024-02-10T11:14:53Z) - ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval.
We evaluate our approach in the code completion task in Python and Java programming languages, achieving a state-of-the-art performance on CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.