Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
- URL: http://arxiv.org/abs/2503.15129v1
- Date: Wed, 19 Mar 2025 11:44:47 GMT
- Title: Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
- Authors: Man Fai Wong, Chee Wei Tan
- Abstract summary: We study how AI-assisted programming and large language models (LLMs) improve software developers' ability via AI tools such as GitHub Copilot and Amazon CodeWhisperer. We show that our Bayesian optimization framework supports AI alignment in code generation by distributing the feedback collection burden.
- Score: 2.6641834518599308
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper studies how AI-assisted programming and large language models (LLMs) improve software developers' ability via AI tools (LLM agents) such as GitHub Copilot and Amazon CodeWhisperer, while integrating human feedback into reinforcement learning (RLHF) through crowd-sourced computation to enhance text-to-code generation. Additionally, we demonstrate that our Bayesian optimization framework supports AI alignment in code generation by distributing the feedback collection burden, highlighting the value of collecting high-quality human feedback. Our empirical evaluations demonstrate the efficacy of this approach, showcasing how LLM agents can be effectively trained for improved text-to-code generation. Our Bayesian optimization framework can be designed for general domain-specific languages, promoting the alignment of large language model capabilities with human feedback in AI-assisted programming for code generation.
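To make the setup concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of the two ingredients the abstract describes: a reward model fit to crowd-sourced pairwise preferences over code candidates, and an uncertainty-based acquisition rule deciding which pair to send to annotators next, standing in for the Bayesian optimization component. All function names, features, and constants are illustrative assumptions.

```python
# Hypothetical sketch: fit a Bradley-Terry reward model to crowd-sourced pairwise
# feedback on code candidates, and pick which pair to annotate next with an
# uncertainty-based acquisition rule (a stand-in for the paper's Bayesian
# optimization framework). Features, names, and constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def features(code: str) -> np.ndarray:
    """Toy feature map for a code candidate (length, line count, #functions)."""
    return np.array([len(code), code.count("\n") + 1, code.count("def ")], float)

def preference_prob(w, xa, xb):
    """Bradley-Terry probability that candidate A is preferred over candidate B."""
    return 1.0 / (1.0 + np.exp(-(w @ (xa - xb))))

def update(w, xa, xb, label, lr=1e-3):
    """One logistic-regression step on a pairwise label (1 = A preferred)."""
    p = preference_prob(w, xa, xb)
    return w + lr * (label - p) * (xa - xb)

def most_informative_pair(w, pairs):
    """Acquisition rule: query the pair the current reward model is least sure about."""
    probs = [preference_prob(w, features(a), features(b)) for a, b in pairs]
    return int(np.argmin([abs(p - 0.5) for p in probs]))

# Candidate completions for one prompt (placeholders).
candidates = [
    "def add(a, b):\n    return a + b",
    "def add(a, b): return a+b  # terse",
    "def add(*args):\n    return sum(args)",
]
pairs = [(a, b) for i, a in enumerate(candidates) for b in candidates[i + 1:]]

w = np.zeros(3)
for _ in range(5):                        # simulated crowd-sourcing rounds
    i = most_informative_pair(w, pairs)   # decide which comparison to pay for
    a, b = pairs[i]
    label = 1.0 if rng.random() < 0.7 else 0.0   # simulated annotator vote
    w = update(w, features(a), features(b), label)

print("learned reward weights:", w)
```

In practice the toy feature map would be replaced by an LLM-derived representation and the acquisition rule by the paper's Bayesian optimization procedure; the sketch only illustrates how a limited annotation budget and reward-model fitting fit together before any RL fine-tuning step.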
Related papers
- Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design [0.0]
This research aims to develop reliable, LM-powered methods for code optimization that effectively integrate human feedback. This work aligns with the broader objectives of advancing cooperative and human-centric aspects of software engineering.
arXiv Detail & Related papers (2025-02-10T18:48:45Z)
- Optimizing AI-Assisted Code Generation [0.8901073744693314]
AI-assisted code-generation tools have significantly transformed software development.
The security, reliability, functionality, and quality of the generated code must be guaranteed.
This paper examines the implementation of these goals to date and explores strategies to optimize them.
arXiv Detail & Related papers (2024-12-14T20:14:44Z)
- Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing [0.9668407688201359]
Generative artificial intelligence (GenAI) is poised to transform productivity in scientific computing. We developed a tool, CodeScribe, which combines prompt engineering with user supervision to establish an efficient process for code conversion. We also address the challenges of AI-driven code translation and highlight its benefits for enhancing productivity in scientific computing.
arXiv Detail & Related papers (2024-10-31T16:48:41Z)
- TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings [61.9257731511557]
We propose Text Guided LLaVA (TG-LLaVA) to optimize vision-language models (VLMs).
We use learnable latent embeddings as a bridge to analyze textual instruction and add the analysis results to the vision encoder as guidance.
With the guidance of text, the vision encoder can extract text-related features, similar to how humans focus on the most relevant parts of an image when considering a question.
arXiv Detail & Related papers (2024-09-15T00:38:34Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks based on their control flow and data flow to bridge the gap between programming languages and natural language.
Experiments and ablations on four datasets covering both C++ and Python validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- AI-powered Code Review with LLMs: Early Results [10.37036924997437]
We present a novel approach to improving software quality and efficiency through a Large Language Model (LLM)-based model.
Our proposed LLM-based AI agent model is trained on large code repositories.
It aims to detect code smells, identify potential bugs, provide suggestions for improvement, and optimize the code.
arXiv Detail & Related papers (2024-04-29T08:27:50Z)
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z)
- Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations [63.19448893196642]
We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs.
By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users.
arXiv Detail & Related papers (2023-07-10T11:29:41Z)
- Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review [9.355153561673855]
This paper focuses on transformer-based large language models (LLMs) trained using Big Code.
LLMs have played a crucial role in facilitating AI-assisted programming applications, including code generation, code completion, code translation, code refinement, code summarization, defect detection, and clone detection.
It explores the challenges and opportunities associated with incorporating NLP techniques with software naturalness in these applications.
arXiv Detail & Related papers (2023-07-04T21:26:51Z)
- Improving Code Generation by Training with Natural Language Feedback [69.52985513422381]
We formalize an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF).
ILF requires only a small amount of human-written feedback during training and does not require the same feedback at test time, making it both user-friendly and sample-efficient.
We use ILF to improve a CodeGen-Mono 6.1B model's pass@1 rate by 38% relative (and 10% absolute) on the Mostly Basic Python Problems (MBPP) benchmark. (A minimal sketch of this training loop follows the list below.)
arXiv Detail & Related papers (2023-03-28T16:15:31Z)
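As referenced in the ILF entry above, the following is a minimal, hypothetical sketch of that training loop under stated assumptions: the model drafts a solution, a human writes natural-language feedback on failing drafts, the model refines its draft conditioned on the feedback, and only refinements that pass the task's tests are kept as supervised fine-tuning data. The generate/refine/feedback calls are stubs, not the paper's actual models.

```python
# Hypothetical ILF-style loop: draft -> human language feedback -> refinement ->
# keep test-passing refinements as fine-tuning pairs. Stubs replace real LLM calls.
from typing import Callable

def ilf_round(tasks: list[dict],
              generate: Callable[[str], str],
              refine: Callable[[str, str, str], str],
              get_feedback: Callable[[str, str], str]) -> list[tuple[str, str]]:
    """One round of imitation learning from language feedback.

    tasks: [{"prompt": str, "test": callable(code) -> bool}, ...]
    Returns (prompt, refinement) pairs for supervised fine-tuning.
    """
    finetune_pairs = []
    for task in tasks:
        draft = generate(task["prompt"])               # initial model attempt
        if task["test"](draft):
            continue                                   # already correct: nothing to learn here
        feedback = get_feedback(task["prompt"], draft) # human-written critique
        fixed = refine(task["prompt"], draft, feedback)
        if task["test"](fixed):                        # keep only verified refinements
            finetune_pairs.append((task["prompt"], fixed))
    return finetune_pairs

# Toy usage with stand-in functions (no real model calls).
def fake_generate(prompt): return "def add(a, b): return a - b"
def fake_refine(prompt, draft, feedback): return "def add(a, b): return a + b"
def fake_feedback(prompt, draft): return "The function subtracts instead of adding."

def add_test(code):
    ns = {}
    exec(code, ns)                 # run the candidate in an isolated namespace
    return ns["add"](2, 3) == 5

tasks = [{"prompt": "Write add(a, b) returning the sum.", "test": add_test}]
print(ilf_round(tasks, fake_generate, fake_refine, fake_feedback))
```

The pairs produced by such a round would then feed an ordinary supervised fine-tuning step; the paper reports the resulting pass@1 gains on MBPP.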