Related papers: Studying How Configurations Impact Code Generation in LLMs: the Case of ChatGPT

URL: http://arxiv.org/abs/2502.17450v1
Date: Fri, 07 Feb 2025 18:04:14 GMT
Title: Studying How Configurations Impact Code Generation in LLMs: the Case of ChatGPT
Authors: Benedetta Donato, Leonardo Mariani, Daniela Micucci, Oliviero Riganelli
Abstract summary: This paper systematically studies the impact of temperature and top-p parameters on code generation models. We show how creativity can enhance code generation tasks. We provide concrete recommendations for addressing the non-determinism of the model.
Score: 4.8748194765816955
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Leveraging LLMs for code generation is becoming increasingly common, as tools like ChatGPT can suggest method implementations with minimal input, such as a method signature and brief description. Empirical studies further highlight the effectiveness of LLMs in handling such tasks, demonstrating notable performance in code generation scenarios. However, LLMs are inherently non-deterministic, with their output influenced by parameters such as temperature, which regulates the model's level of creativity, and top-p, which controls the choice of the tokens that shall appear in the output. Despite their significance, the role of these parameters is often overlooked. This paper systematically studies the impact of these parameters, as well as the number of prompt repetitions required to account for non-determinism, in the context of 548 Java methods. We observe significantly different performances across different configurations of ChatGPT, with temperature having a marginal impact compared to the more prominent influence of the top-p parameter. Additionally, we show how creativity can enhance code generation tasks. Finally, we provide concrete recommendations for addressing the non-determinism of the model.
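
The two parameters named in the abstract can be illustrated with a minimal sketch of the textbook definitions: temperature rescales the logits before the softmax, and top-p (nucleus) sampling restricts the choice to the smallest set of most-probable tokens whose cumulative probability reaches p. This is not ChatGPT's actual implementation, which is not public; the function names and the toy logits are hypothetical and exist only for illustration.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (lower values sharpen the distribution)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Sample one token index using temperature scaling followed by top-p filtering."""
    probs = apply_temperature(logits, temperature)
    nucleus = top_p_filter(probs, top_p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical toy vocabulary of three tokens.
logits = [2.0, 1.0, 0.1]
probs = apply_temperature(logits, 1.0)
print(sorted(top_p_filter(probs, 0.5)))  # only the most probable token survives
print(sorted(top_p_filter(probs, 0.8)))  # the two most probable tokens survive
```

In this sketch a small top-p removes low-probability tokens from the candidate set entirely, whereas temperature only reshapes the probabilities of tokens that all remain eligible.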

Related papers

An Empirical Study of Conformal Prediction in LLM with ASP Scaffolds for Robust Reasoning [52.29223403698673]
This paper examines the use of Conformal Language Modelling (CLM) alongside Answer Set Programming (ASP). We apply CLM to generate sets of ASP programs from an LLM, providing statistical guarantees on the correctness of the outputs. Experimental results show that CLM significantly outperforms baseline models that use standard sampling methods.
arXiv Detail & Related papers (2025-03-07T14:10:10Z)
Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language Models (LLMs). This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs' decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Aligning Language Models with Demonstrated Feedback [58.834937450242975]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors. We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs). We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
SED: Self-Evaluation Decoding Enhances Large Language Models for Better Generation [35.10931307279044]
This paper proposes Self-Evaluation Decoding, SED, a decoding method for enhancing model generation. It integrates speculation and evaluation steps into the decoding process, allowing LLMs to make more careful decisions.
arXiv Detail & Related papers (2024-05-26T12:43:18Z)
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs [23.766782325052418]
This paper introduces Sample Design Engineering (SDE), a methodical approach to enhancing Large Language Models' post-tuning performance. We conduct a series of in-domain (ID) and out-of-domain (OOD) experiments to assess the impact of various design options on LLMs' downstream performance. We propose an integrated SDE strategy, combining the most effective options, and validate its consistent superiority over sample designs in complex downstream tasks.
arXiv Detail & Related papers (2024-04-19T17:47:02Z)
A Thorough Examination of Decoding Methods in the Era of LLMs [72.65956436513241]
Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers. This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of large language models. Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization.
arXiv Detail & Related papers (2024-02-10T11:14:53Z)
Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models [11.845239346943067]
Parameter-efficient fine-tuning (PEFT) is a promising approach to efficiently specialize large language models (LLMs) to task-specific data. Our study highlights the potential for tuning larger LLMs and significant reductions in memory usage by combining PEFT with quantization.
arXiv Detail & Related papers (2023-08-21T04:31:06Z)
Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs. Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.