Related papers: Prompt engineering and framework: implementation to increase code reliability based guideline for LLMs

Prompt engineering and framework: implementation to increase code reliability based guideline for LLMs

URL: http://arxiv.org/abs/2506.10989v1
Date: Wed, 19 Mar 2025 18:33:08 GMT
Title: Prompt engineering and framework: implementation to increase code reliability based guideline for LLMs
Authors: Rogelio Cruz, Jonatan Contreras, Francisco Guerrero, Ezequiel Rodriguez, Carlos Valdez, Citlali Carrillo,
Abstract summary: We introduce a prompt template designed to improve the quality and correctness of generated code snippets.<n>We demonstrate that our approach outperforms widely studied zero-shot and Chain-of-Thought (CoT) methods in terms of the Pass@k metric.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we propose a novel prompting approach aimed at enhancing the ability of Large Language Models (LLMs) to generate accurate Python code. Specifically, we introduce a prompt template designed to improve the quality and correctness of generated code snippets, enabling them to pass tests and produce reliable results. Through experiments conducted on two state-of-the-art LLMs using the HumanEval dataset, we demonstrate that our approach outperforms widely studied zero-shot and Chain-of-Thought (CoT) methods in terms of the Pass@k metric. Furthermore, our method achieves these improvements with significantly reduced token usage compared to the CoT approach, making it both effective and resource-efficient, thereby lowering the computational demands and improving the eco-footprint of LLM capabilities. These findings highlight the potential of tailored prompting strategies to optimize code generation performance, paving the way for broader applications in AI-driven programming tasks.

Related papers

TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree [52.44403214958304]
In this paper, we introduce TreeLoRA, a novel approach that constructs layer-wise adapters by leveraging hierarchical gradient similarity.<n>To reduce the computational burden of task similarity estimation, we employ bandit techniques to develop an algorithm based on lower confidence bounds.<n> experiments on both vision transformers (ViTs) and large language models (LLMs) demonstrate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2025-06-12T05:25:35Z)
Less is More: Towards Green Code Large Language Models via Unified Structural Pruning [27.428983811427827]
We propose Flab-Pruner, an innovative unified structural pruning method that combines vocabulary, layer, and Feed-Forward Network (FFN) pruning.<n>The results demonstrate that Flab-Pruner retains 97% of the original performance after pruning 22% of the parameters and achieves the same or even better performance after post-training.
arXiv Detail & Related papers (2024-12-20T14:13:09Z)
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding [1.3479499607624648]
Speculative decoding addresses bottleneck by introducing a two-stage framework: drafting and verification.<n>A smaller, efficient model generates a preliminary draft, which is then refined by a larger, more sophisticated model.<n>This paper provides a comprehensive survey of speculative decoding methods, categorizing them into draft-centric and model-centric approaches.
arXiv Detail & Related papers (2024-11-20T09:46:30Z)
An Effective Approach to Embedding Source Code by Combining Large Language and Sentence Embedding Models [6.976968804436321]
This paper proposes a novel approach to embedding source code by combining large language and sentence embedding models.<n>To evaluate the performance of our proposed approach, we conducted a series of experiments on three datasets with different programming languages.
arXiv Detail & Related papers (2024-09-23T01:03:15Z)
EPiC: Cost-effective Search-based Prompt Engineering of LLMs for Code Generation [8.009881267479189]
Large Language Models (LLMs) have seen increasing use in various software development tasks, especially in code generation. We propose an alternative approach named Evolutionary Prompt Engineering for Code (EPiC) to evolve the original prompts toward better ones that produce high-quality code. Our evaluation against state-of-the-art (SOTA) LLM-based code generation models shows that EPiC outperforms all the baselines in terms of cost-effectiveness.
arXiv Detail & Related papers (2024-08-20T21:15:36Z)
Adaptive Draft-Verification for Efficient Large Language Model Decoding [24.347886232342862]
Large language model (LLM) decoding involves generating a sequence of tokens based on a given context. The typical autoregressive decoding method requires a separate forward pass through the model for each token generated. We introduce ADED, which accelerates LLM decoding without requiring fine-tuning.
arXiv Detail & Related papers (2024-06-27T22:20:39Z)
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs)<n>We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback. Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities. We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks. FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE [62.13435256279566]
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks. However, their large size makes their inference slow and computationally expensive. We show that it enables these layers to acquire 'good' generation ability without affecting the generation ability of the final layer.
arXiv Detail & Related papers (2023-10-28T04:07:58Z)
Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase. We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs. To explore the versatility of our approach, we develop a novel method to automatically generate step-level reward dataset for coding tasks and observed similar improved performance in the code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs. Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.