Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design
- URL: http://arxiv.org/abs/2502.06769v2
- Date: Tue, 18 Mar 2025 11:12:46 GMT
- Title: Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design
- Authors: Jingzhi Gong
- Abstract summary: This research aims to develop reliable, LM-powered methods for code optimization that effectively integrate human feedback. This work aligns with the broader objectives of advancing cooperative and human-centric aspects of software engineering.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid advancement of AI, software engineering increasingly relies on AI-driven approaches, particularly language models (LMs), to enhance code performance. However, the trustworthiness and reliability of LMs remain significant challenges due to the potential for hallucinations - unreliable or incorrect responses. To fill this gap, this research aims to develop reliable, LM-powered methods for code optimization that effectively integrate human feedback. This work aligns with the broader objectives of advancing cooperative and human-centric aspects of software engineering, contributing to the development of trustworthy AI-driven solutions.
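As a concrete illustration of the RLHF ingredient this proposal builds on, the sketch below trains a reward model on human preferences between pairs of code variants via the Bradley-Terry objective. The feature vectors, network shape, and training data are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch (assumptions, not the paper's method): learn a reward
# model from human preference pairs over code variants, the standard first
# stage of an RLHF pipeline.
import torch
import torch.nn as nn

class CodeRewardModel(nn.Module):
    """Scores a feature vector describing a code variant (higher = preferred)."""
    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def preference_loss(model, chosen, rejected):
    # Bradley-Terry: maximize log sigmoid(r(chosen) - r(rejected)).
    return -torch.nn.functional.logsigmoid(model(chosen) - model(rejected)).mean()

# Toy preference data: random vectors standing in for code-variant features.
model = CodeRewardModel(n_features=16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
chosen, rejected = torch.randn(64, 16), torch.randn(64, 16)
for _ in range(100):
    optimizer.zero_grad()
    loss = preference_loss(model, chosen, rejected)
    loss.backward()
    optimizer.step()
print(f"final preference loss: {loss.item():.3f}")
```

A reward model trained this way would then guide a policy-optimization stage over the code-generating LM; that second stage is omitted here for brevity.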
Related papers
- Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
We study how AI-assisted programming with large language models (LLMs) improves software developers' abilities via tools like GitHub Copilot and Amazon CodeWhisperer.
We show that our Bayesian optimization framework supports AI alignment in code generation by distributing the feedback collection burden.
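The Bayesian-optimization framing suggests an acquisition function that decides which generations deserve scarce human labels. The sketch below uses an upper-confidence-bound rule over an assumed surrogate; the means, uncertainties, and budget are placeholders, not the paper's framework.

```python
# A hedged sketch of acquisition-driven feedback collection: route only the
# most informative samples to human annotators, spreading the labeling burden.
import numpy as np

rng = np.random.default_rng(0)

# Surrogate belief about each candidate generation: mean quality + uncertainty.
means = rng.uniform(0, 1, size=20)       # predicted quality per sample
stds = rng.uniform(0.05, 0.4, size=20)   # surrogate uncertainty per sample

def ucb(mean, std, kappa=1.5):
    # Upper-confidence-bound acquisition: prefer good *or* uncertain samples.
    return mean + kappa * std

scores = ucb(means, stds)
budget = 5                               # human labels we can afford this round
to_label = np.argsort(-scores)[:budget]
print("Route these sample indices to human reviewers:", to_label)
```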
arXiv Detail & Related papers (2025-03-19T11:44:47Z)
- A Survey on Post-training of Large Language Models
Large Language Models (LLMs) have fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration.
Remaining shortcomings, such as restricted reasoning capacities, ethical uncertainties, and suboptimal domain-specific performance, necessitate advanced post-training language models (PoLMs).
This paper presents the first comprehensive survey of PoLMs, systematically tracing their evolution across five core paradigms.
arXiv Detail & Related papers (2025-03-08T05:41:42Z)
- Improving Retrospective Language Agents via Joint Policy Gradient Optimization
RetroAct is a framework that jointly optimizes both the task-planning and self-reflective evolution capabilities of language agents.
We develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning.
We conduct extensive experiments across various testing environments, demonstrating that RetroAct achieves substantial improvements in task performance and decision-making processes.
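A toy sketch of that two-stage recipe, imitation first and reinforcement learning second; the tabular policy and bandit-style reward below are illustrative assumptions, not RetroAct's actual objectives or environments.

```python
# Stage 1: imitation learning on expert demonstrations.
# Stage 2: REINFORCE fine-tuning against an (assumed) reward signal.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
logits = np.zeros((n_states, n_actions))   # tabular policy parameters

def policy(state):
    p = np.exp(logits[state] - logits[state].max())
    return p / p.sum()

# Stage 1: behavior cloning on expert (state, action) pairs.
demos = [(s, s % n_actions) for s in range(n_states) for _ in range(50)]
for s, a in demos:
    grad = -policy(s)
    grad[a] += 1.0                          # gradient of log pi(a|s)
    logits[s] += 0.1 * grad

# Stage 2: policy-gradient refinement with a toy reward.
def reward(s, a):
    return 1.0 if a == s % n_actions else 0.0

for _ in range(500):
    s = rng.integers(n_states)
    a = rng.choice(n_actions, p=policy(s))
    grad = -policy(s)
    grad[a] += 1.0
    logits[s] += 0.05 * reward(s, a) * grad  # REINFORCE step

print("Greedy actions per state:", logits.argmax(axis=1))
```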
arXiv Detail & Related papers (2025-03-03T12:54:54Z)
- Language Models for Code Optimization: Survey, Challenges and Future Directions
Language models (LMs) built upon deep neural networks (DNNs) have recently demonstrated breakthrough effectiveness in software engineering tasks.
This study aims to provide actionable insights and references for both researchers and practitioners in this rapidly evolving field.
arXiv Detail & Related papers (2025-01-02T14:20:36Z)
- Optimizing AI-Assisted Code Generation
AI-assisted code-generation tools have significantly transformed software development.
The security, reliability, functionality, and quality of the generated code must be guaranteed.
This paper examines the implementation of these goals to date and explores strategies to optimize them.
arXiv Detail & Related papers (2024-12-14T20:14:44Z)
- The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents: A Roadmap
This paper outlines a roadmap for advancing the next generation of trustworthy AI systems.
We show how formal methods (FMs) can help LLMs generate more reliable and formally certified outputs.
We acknowledge that this integration has the potential to enhance both the trustworthiness and efficiency of software engineering practices.
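In that spirit, here is a minimal generate-then-verify loop: an LLM stand-in proposes code, and a formal-checker stand-in either accepts it or returns a counterexample fed back into the next attempt. Both `llm_propose` and `formal_verify` are hypothetical placeholders, not tools from the paper.

```python
# A hedged sketch of LLM + formal-methods integration: propose, verify,
# and refine with counterexamples until the checker is satisfied.
from typing import Optional

def llm_propose(spec: str, feedback: Optional[str]) -> str:
    # Hypothetical stand-in for an LLM call proposing an implementation.
    if feedback is None:
        return "def abs_val(x): return x"   # first (wrong) attempt
    return "def abs_val(x): return x if x >= 0 else -x"

def formal_verify(code: str) -> Optional[str]:
    # Hypothetical stand-in for a formal tool (e.g., a bounded model checker);
    # here we simply check the spec abs_val(x) >= 0 on sample inputs.
    ns: dict = {}
    exec(code, ns)
    for x in (-3, 0, 5):
        if ns["abs_val"](x) < 0:
            return f"counterexample: x = {x}"
    return None

feedback = None
for attempt in range(3):
    code = llm_propose("abs_val(x) >= 0 for all x", feedback)
    feedback = formal_verify(code)
    if feedback is None:
        print(f"verified on attempt {attempt + 1}: {code}")
        break
```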
arXiv Detail & Related papers (2024-12-09T14:14:21Z)
- Agent-Driven Automatic Software Improvement
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned with the task of automated software improvement.
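A hedged sketch of such an iterative loop: a canned "agent" proposes patches, tests supply the feedback signal, and the collected trajectory is the kind of data the proposal suggests could later fine-tune the underlying LLM. The test harness and attempts are illustrative assumptions.

```python
# Toy agent loop: generate -> test -> feed failures back, keeping the
# (code, feedback) trajectory as potential fine-tuning data.
def run_tests(code: str) -> list:
    # Stand-in feedback signal: run the patch against a tiny test suite.
    ns: dict = {}
    exec(code, ns)
    failures = []
    for a, b, want in [(2, 2, 4), (-1, 1, 0)]:
        got = ns["add"](a, b)
        if got != want:
            failures.append(f"add({a}, {b}) == {got}, expected {want}")
    return failures

# Canned sequence of attempts standing in for an LLM-powered agent.
attempts = ["def add(a, b): return a - b", "def add(a, b): return a + b"]
trajectory = []   # (code, feedback) pairs, reusable as fine-tuning data
for code in attempts:
    feedback = run_tests(code)
    trajectory.append((code, feedback))
    if not feedback:
        print("tests pass with:", code)
        break
```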
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
- Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Traditional alignment strategies rely heavily on human intervention, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
We propose a novel self-alignment method that utilizes a Chain of Thought (CoT) approach, termed AlignCoT.
We introduce the Mixture of insighTful Experts (MoTE) architecture, which applies a mixture of experts to enhance each component of the AlignCoT process, markedly increasing alignment efficiency.
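As a toy illustration of the expert-mixture component: a learned router softmax-weights several expert transformations of one input. The dimensions, experts, and router below are assumptions; MoTE applies this idea to the stages of its AlignCoT pipeline, not to raw vectors.

```python
# Minimal mixture-of-experts layer: gate = softmax(router(x)),
# output = gate-weighted sum of expert outputs.
import numpy as np

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # expert weights
router = rng.standard_normal((d, n_experts))                       # gating weights

def moe_layer(x: np.ndarray) -> np.ndarray:
    gate_logits = x @ router
    gates = np.exp(gate_logits - gate_logits.max())
    gates /= gates.sum()                       # softmax over experts
    outputs = np.stack([x @ W for W in experts])
    return (gates[:, None] * outputs).sum(axis=0)

x = rng.standard_normal(d)
print(moe_layer(x).shape)                      # -> (8,)
```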
arXiv Detail & Related papers (2024-05-01T15:06:05Z)
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving
We introduce the problem of Large Language Models (LLMs)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
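The gist admits a compact sketch: a policy that weighs the expected benefit of human help against its cost at each step. The difficulty feature, cost, and threshold rule below are illustrative assumptions; ReHAC learns its policy with reinforcement learning rather than hard-coding it.

```python
# Toy intervention policy: call in the human only when the expected gain
# from human help outweighs its (assumed) cost.
import numpy as np

rng = np.random.default_rng(1)

def intervention_policy(difficulty: float, human_cost: float = 0.3) -> str:
    expected_gain = difficulty    # assume gain grows with task difficulty
    return "human" if expected_gain - human_cost > 0 else "agent"

for step, difficulty in enumerate(rng.uniform(0, 1, size=5)):
    actor = intervention_policy(difficulty)
    print(f"step {step}: difficulty={difficulty:.2f} -> {actor} acts")
```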
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
- DeAL: Decoding-time Alignment for Large Language Models
Large Language Models (LLMs) are nowadays expected to generate content aligned with human preferences.
We propose DeAL, a framework that allows the user to customize reward functions and enables Decoding-time Alignment of LLMs.
Our experiments show that we can DeAL with fine-grained trade-offs, improve adherence to alignment objectives, and address residual gaps in LLMs.
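A minimal decoding-time alignment sketch in this spirit: candidate tokens are rescored with a user-supplied reward added to the model score at decode time, with no retraining. The toy vocabulary, log-probabilities, and reward are assumptions, not DeAL's implementation.

```python
# Reward-guided greedy decoding: pick the token maximizing
# model log-probability + weight * user reward.
vocab = ["safe", "fast", "risky"]

def model_logprob(token: str) -> float:
    # Stand-in for an LM's next-token log-probability.
    return {"safe": -1.2, "fast": -0.9, "risky": -0.7}[token]

def user_reward(token: str) -> float:
    # User-customized alignment objective: penalize "risky" continuations.
    return -2.0 if token == "risky" else 0.0

def aligned_step(weight: float = 1.0) -> str:
    return max(vocab, key=lambda t: model_logprob(t) + weight * user_reward(t))

print("plain decode:  ", max(vocab, key=model_logprob))   # -> "risky"
print("aligned decode:", aligned_step())                  # -> "fast"
```

The `weight` knob is what makes the trade-off fine-grained: raising it trades raw model likelihood for adherence to the alignment objective.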
arXiv Detail & Related papers (2024-02-05T06:12:29Z)
- Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks.
However, their large size makes their inference slow and computationally expensive.
We show that instruction tuning with LITE enables intermediate layers to acquire 'good' generation ability without affecting the generation ability of the final layer.
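A toy early-exit sketch of the underlying idea: give intermediate layers their own output heads and stop at the first confident one. The tiny numpy network below is an assumption; LITE itself instruction-tunes language-model heads attached to LLaMA layers.

```python
# Early-exit decoding: run layers one at a time and return the first
# head prediction whose confidence clears a threshold.
import numpy as np

rng = np.random.default_rng(0)
d, vocab, n_layers = 16, 10, 6
layers = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_layers)]
heads = [rng.standard_normal((d, vocab)) / np.sqrt(d) for _ in range(n_layers)]

def early_exit_decode(x: np.ndarray, threshold: float = 0.5):
    h = x
    for i, (layer, head) in enumerate(zip(layers, heads)):
        h = np.tanh(h @ layer)
        logits = h @ head
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        if probs.max() >= threshold:           # confident: exit early
            return probs.argmax(), i
    return probs.argmax(), n_layers - 1        # fall back to the final layer

token, exit_layer = early_exit_decode(rng.standard_normal(d))
print(f"predicted token {token} at layer {exit_layer}")
```

Skipping the remaining layers on confident steps is where the inference speedup comes from.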
arXiv Detail & Related papers (2023-10-28T04:07:58Z)
- Data-Driven and SE-assisted AI Model Signal-Awareness Enhancement and Introspection
We propose a data-driven approach to enhance models' signal-awareness.
We combine the SE concept of code complexity with the AI technique of curriculum learning.
We achieve up to 4.8x improvement in model signal awareness.
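That combination admits a small sketch: rank training samples by a code-complexity measure and present them easy-to-hard. The branch-keyword count below is a crude assumed proxy, not the paper's complexity metric.

```python
# Curriculum by code complexity: order samples by a rough
# cyclomatic-style count of branching constructs.
import re

def complexity(code: str) -> int:
    return 1 + len(re.findall(r"\b(if|for|while|and|or)\b", code))

samples = [
    "def f(x):\n    return x + 1",
    "def g(x):\n    if x > 0 and x < 10:\n        return x\n    return 0",
    "def h(xs):\n    for x in xs:\n        if x:\n            yield x",
]
curriculum = sorted(samples, key=complexity)   # easy -> hard training order
for sample in curriculum:
    print(complexity(sample), "->", sample.splitlines()[0])
```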
arXiv Detail & Related papers (2021-11-10T17:58:18Z)