Token-Aware Coding Flow: A Study with Nano Surge in Reasoning Model
- URL: http://arxiv.org/abs/2504.15989v1
- Date: Tue, 22 Apr 2025 15:51:00 GMT
- Title: Token-Aware Coding Flow: A Study with Nano Surge in Reasoning Model
- Authors: Junwei Hu, Weicheng Zheng, Yan Liu, Yihan Liu,
- Abstract summary: Token inflation during the reasoning process remains a formidable challenge to model performance and efficiency. This paper introduces an innovative Token-Aware Coding Flow method, aimed at addressing the token inflation caused by smelly code in the Chain of Thought (CoT) process.
- Score: 5.044393644778693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the widespread application of large-scale language models (LLMs) in software engineering, the Chain of Thought (CoT) approach has emerged as a crucial tool for driving automated code generation and optimization. However, despite the significant success of CoT methods in generating high-quality code, the issue of token inflation during the reasoning process remains a formidable challenge to model performance and efficiency, particularly when dealing with complex code smells. Code smells not only affect the maintainability and scalability of code but also significantly increase the computational burden during LLM inference, leading to excessive token consumption and, consequently, reduced reasoning efficiency. This paper introduces an innovative Token-Aware Coding Flow method, aimed at addressing the token inflation problem caused by smelly code in the CoT process. Through experimentation, we validate the synergistic effect of code refactoring and prompt engineering strategies, demonstrating that after eliminating code smells, token consumption during model inference is significantly reduced. The experimental results show that refactored code, while maintaining functional consistency, can reduce token consumption by up to 50%. Additionally, by explicitly prompting the type of code smells in the prompt and incorporating strategies such as context awareness and role constraints, we further optimize the reasoning process, achieving a 24.5% to 30% reduction in token consumption. These optimizations not only significantly enhance the model's reasoning efficiency and improve code generation quality but also provide new insights for addressing performance bottlenecks in complex code generation tasks.
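As a rough illustration of the two levers described in the abstract, the sketch below (a hedged example, not the authors' implementation; the smelly snippet, the smell label, and the use of the `tiktoken` tokenizer are assumptions) counts tokens for a smelly function versus its refactored equivalent and builds a prompt that explicitly names the code smell, adds a role constraint, and asks for brief reasoning.

```python
# Hedged sketch: compare token counts before/after refactoring and build a
# smell-aware prompt. Snippets, smell label, and tokenizer are illustrative
# assumptions, not the paper's actual setup.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # stand-in for the reasoning model's tokenizer

def count_tokens(text: str) -> int:
    """Number of tokens the tokenizer produces for `text`."""
    return len(enc.encode(text))

SMELLY = """
def process(data):
    result = []
    for d in data:
        if d is not None:
            if d > 0:
                if d % 2 == 0:
                    result.append(d * 2)
    return result
"""

REFACTORED = """
def process(data):
    return [d * 2 for d in data if d is not None and d > 0 and d % 2 == 0]
"""

def smell_aware_prompt(code: str, smell: str) -> str:
    # Role constraint + explicit smell type + brevity hint, mirroring the
    # prompt-engineering strategies sketched in the abstract.
    return (
        "You are a senior software engineer reviewing Python code.\n"
        f"The snippet below contains the code smell: {smell}.\n"
        "Refactor it while preserving behavior, and keep your reasoning brief.\n\n"
        f"{code}"
    )

if __name__ == "__main__":
    print("smelly tokens:    ", count_tokens(SMELLY))
    print("refactored tokens:", count_tokens(REFACTORED))
    print(smell_aware_prompt(SMELLY, "deeply nested conditionals"))
```

With a real reasoning model, the same counting would be applied to the generated Chain of Thought rather than the source text alone, since that is where the abstract locates the token inflation.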
Related papers
- LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness [38.399282089600284]
Large Language Models (LLMs) have demonstrated impressive performance in code generation. LLM4EFFI (Large Language Model for Code Efficiency) is a novel framework that enables LLMs to generate code that balances both efficiency and correctness.
arXiv Detail & Related papers (2025-02-17T07:01:18Z)
- Enhancing Large Language Model Efficiency via Symbolic Compression: A Formal Approach Towards Interpretability [3.9122242678047456]
Large language models (LLMs) face significant token efficiency bottlenecks in code generation and logical reasoning tasks. This paper proposes a formal framework based on symbolic compression, integrating logic, information-theoretic optimal encoding, and context-aware inference techniques.
arXiv Detail & Related papers (2025-01-30T06:40:52Z)
- Generating refactored code accurately using reinforcement learning [3.179831861897336]
We propose a novel reinforcement learning-based approach for fine-tuning and aligning code language models to perform automated, intelligent extract method refactoring on Java source code.
Our approach fine-tunes sequence-to-sequence generative models and aligns them using the Proximal Policy Optimization (PPO) algorithm.
Our experiments demonstrate that our approach significantly enhances the performance of large language models in code refactoring.
arXiv Detail & Related papers (2024-12-23T23:09:48Z)
- Less is More: Towards Green Code Large Language Models via Unified Structural Pruning [27.428983811427827]
We propose Flab-Pruner, an innovative unified structural pruning method that combines vocabulary, layer, and Feed-Forward Network (FFN) pruning. The results demonstrate that Flab-Pruner retains 97% of the original performance after pruning 22% of the parameters and achieves the same or even better performance after post-training.
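For intuition only, here is a small structural-pruning sketch (assumed layer sizes and scoring rule; Flab-Pruner's actual method also covers vocabulary and layer pruning): FFN hidden units of a PyTorch MLP are ranked by weight magnitude and only the top ~78% are kept, mirroring the 22% parameter reduction quoted above.

```python
# Illustrative magnitude-based FFN pruning; not Flab-Pruner itself.
import torch
import torch.nn as nn

def prune_ffn(up: nn.Linear, down: nn.Linear, keep_ratio: float = 0.78):
    """Keep the hidden neurons with the largest combined weight norm."""
    k = max(1, int(up.out_features * keep_ratio))
    # Score each hidden unit by the norm of its incoming and outgoing weights.
    scores = up.weight.norm(dim=1) + down.weight.norm(dim=0)
    idx = torch.topk(scores, k).indices.sort().values
    new_up, new_down = nn.Linear(up.in_features, k), nn.Linear(k, down.out_features)
    with torch.no_grad():
        new_up.weight.copy_(up.weight[idx])
        new_up.bias.copy_(up.bias[idx])
        new_down.weight.copy_(down.weight[:, idx])
        new_down.bias.copy_(down.bias)
    return new_up, new_down

up, down = nn.Linear(64, 256), nn.Linear(256, 64)
print(prune_ffn(up, down))  # hidden width shrinks from 256 to 199
```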
arXiv Detail & Related papers (2024-12-20T14:13:09Z)
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them.
This paper tackles this gap by conceptualizing the decoding problem via a Markov chain abstraction and studying its key properties, output quality and inference acceleration, from a theoretical perspective.
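A toy sketch of that draft-then-verify loop (the vocabulary and both distributions are illustrative stand-ins, not the paper's Markov-chain formalism): the draft distribution proposes tokens cheaply and each is accepted with probability min(1, p/q) under the target distribution, with a residual resample on rejection.

```python
# Toy speculative decoding step; distributions are hypothetical stand-ins.
import random

VOCAB = list(range(8))

def draft_dist(prefix):
    # Cheap proposal distribution of the small "draft" model (toy: uniform).
    return [1.0 / len(VOCAB)] * len(VOCAB)

def target_dist(prefix):
    # Distribution of the large "target" model (toy: favors even tokens).
    weights = [2.0 if t % 2 == 0 else 1.0 for t in VOCAB]
    total = sum(weights)
    return [w / total for w in weights]

def sample(dist):
    return random.choices(VOCAB, weights=dist, k=1)[0]

def speculative_step(prefix, k=4):
    """Propose up to k draft tokens; accept each with probability min(1, p/q)."""
    accepted = []
    for _ in range(k):
        q = draft_dist(prefix + accepted)
        p = target_dist(prefix + accepted)
        token = sample(q)                                   # draft model proposes
        if random.random() < min(1.0, p[token] / q[token]):
            accepted.append(token)                          # target model accepts
        else:
            # On rejection, resample from the residual distribution max(0, p - q).
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            z = sum(residual)
            accepted.append(sample([r / z for r in residual] if z > 0 else p))
            break
    return accepted

print(speculative_step(prefix=[], k=4))
```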
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
- SwiftCoder: Enhancing Code Generation in Large Language Models through Efficiency-Aware Fine-tuning [17.355845751737423]
Current methods primarily focus on correctness, often overlooking efficiency. The proposed dataset offers a scalable and effective solution for advancing AI-driven code generation.
arXiv Detail & Related papers (2024-10-14T07:05:51Z)
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency.
CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
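A minimal sketch of that self-generation-and-validation idea (the candidate solutions and tests below are hypothetical placeholders, not CodeDPO's pipeline): candidates are scored by how many self-generated tests they pass, and the best and worst form a preference pair for preference learning.

```python
# Hedged sketch: rank candidate solutions by self-generated tests to build a
# preference pair. Candidates and tests are illustrative placeholders.
CANDIDATE_SOLUTIONS = {
    "correct": "def add(a, b):\n    return a + b\n",
    "buggy":   "def add(a, b):\n    return a - b\n",
}
CANDIDATE_TESTS = [
    "assert add(1, 2) == 3",
    "assert add(-1, 1) == 0",
]

def score(solution_src: str) -> int:
    """Count how many candidate tests the solution passes."""
    passed = 0
    for test in CANDIDATE_TESTS:
        env = {}
        try:
            exec(solution_src, env)   # define the candidate function
            exec(test, env)           # run one self-generated test
            passed += 1
        except Exception:
            pass
    return passed

ranked = sorted(CANDIDATE_SOLUTIONS, key=lambda name: score(CANDIDATE_SOLUTIONS[name]), reverse=True)
preferred, rejected = ranked[0], ranked[-1]   # preference pair for DPO-style training
print(preferred, rejected)                    # -> correct buggy
```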
arXiv Detail & Related papers (2024-10-08T01:36:15Z)
- Measuring Code Efficiency Optimization Capabilities with ACEOB [7.4056083791645495]
We conduct an in-depth analysis of "code patterns" in the model training dataset, meticulously exploring human-written code.
We introduce the Automatic Code Efficiency Optimization Benchmark (ACEOB), which consists of 95,359 pairs of efficient-inefficient code.
To our knowledge, ACEOB is the first dataset specifically targeting Python code efficiency optimization.
arXiv Detail & Related papers (2024-08-23T10:10:37Z)
- Factor Graph Optimization of Error-Correcting Codes for Belief Propagation Decoding [62.25533750469467]
Low-Density Parity-Check (LDPC) codes possess several advantages over other families of codes.
The proposed approach is shown to outperform existing popular codes in decoding performance by orders of magnitude.
arXiv Detail & Related papers (2024-06-09T12:08:56Z)
- Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy.
Results indicate that MANGO significantly improves the code pass rate based on the strong baselines.
The robustness of the logical comment decoding strategy is notably higher than that of Chain-of-Thought prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z)
- A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that, despite its simplicity, the approach outperforms state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.