INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
- URL: http://arxiv.org/abs/2407.02518v2
- Date: Tue, 29 Oct 2024 08:20:28 GMT
- Title: INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
- Authors: Hung Le, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo,
- Abstract summary: We introduce INDICT: a new framework that empowers large language models with Internal Dialogues of Critiques for both safety and helpfulness guidance.
The internal dialogue is a dual cooperative system between a safety-driven critic and a helpfulness-driven critic.
We observed that our approach can provide an advanced level of critiques of both safety and helpfulness analysis, significantly improving the quality of output codes.
- Score: 110.6921470281479
- License:
- Abstract: Large language models (LLMs) for code are typically trained to align with natural language instructions to closely follow their intentions and requirements. However, in many practical scenarios, it becomes increasingly challenging for these models to navigate the intricate boundary between helpfulness and safety, especially against highly complex yet potentially malicious instructions. In this work, we introduce INDICT: a new framework that empowers LLMs with Internal Dialogues of Critiques for both safety and helpfulness guidance. The internal dialogue is a dual cooperative system between a safety-driven critic and a helpfulness-driven critic. Each critic provides analysis against the given task and corresponding generated response, equipped with external knowledge queried through relevant code snippets and tools like web search and code interpreter. We engage the dual critic system in both code generation stage as well as code execution stage, providing preemptive and post-hoc guidance respectively to LLMs. We evaluated INDICT on 8 diverse tasks across 8 programming languages from 5 benchmarks, using LLMs from 7B to 70B parameters. We observed that our approach can provide an advanced level of critiques of both safety and helpfulness analysis, significantly improving the quality of output codes ($+10\%$ absolute improvements in all models).
Related papers
- CodEv: An Automated Grading Framework Leveraging Large Language Models for Consistent and Constructive Feedback [0.0]
This study presents an automated grading framework, CodEv, which leverages Large Language Models (LLMs) to provide consistent and constructive feedback.
Our framework also integrates LLM ensembles to improve the accuracy and consistency of scores, along with agreement tests to deliver reliable feedback and code review comments.
arXiv Detail & Related papers (2025-01-10T03:09:46Z) - Can Large Language Models Serve as Evaluators for Code Summarization? [47.21347974031545]
Large Language Models (LLMs) serve as effective evaluators for code summarization methods.
LLMs prompt an agent to play diverse roles, such as code reviewer, code author, code editor, and system analyst.
CODERPE achieves an 81.59% Spearman correlation with human evaluations, outperforming the existing BERTScore metric by 17.27%.
arXiv Detail & Related papers (2024-12-02T09:56:18Z) - Athena: Safe Autonomous Agents with Verbal Contrastive Learning [3.102303947219617]
Large language models (LLMs) have been utilized as language-based agents to perform a variety of tasks.
In this study, we introduce the Athena framework which leverages the concept of verbal contrastive learning.
The framework also incorporates a critiquing mechanism to guide the agent to prevent risky actions at every step.
arXiv Detail & Related papers (2024-08-20T17:21:10Z) - How Far Have We Gone in Binary Code Understanding Using Large Language Models [51.527805834378974]
We propose a benchmark to evaluate the effectiveness of Large Language Models (LLMs) in binary code understanding.
Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis.
arXiv Detail & Related papers (2024-04-15T14:44:08Z) - ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming [64.86326523181553]
ALERT is a large-scale benchmark to assess safety based on a novel fine-grained risk taxonomy.
It aims to identify vulnerabilities, inform improvements, and enhance the overall safety of the language models.
arXiv Detail & Related papers (2024-04-06T15:01:47Z) - CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs.
Our studies reveal a new and universal safety vulnerability of these models against code input.
We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z) - SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models [107.82336341926134]
SALAD-Bench is a safety benchmark specifically designed for evaluating Large Language Models (LLMs)
It transcends conventional benchmarks through its large scale, rich diversity, intricate taxonomy spanning three levels, and versatile functionalities.
arXiv Detail & Related papers (2024-02-07T17:33:54Z) - Ocassionally Secure: A Comparative Analysis of Code Generation
Assistants [8.573156248244695]
This paper focuses on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed.
We conducted a comparative analysis of four advanced LLMs--GPT-3.5 and GPT-4 using ChatGPT and Bard and Gemini from Google--using 9 separate tasks to assess each model's code generation capabilities.
We collected 61 code outputs and analyzed them across several aspects: functionality, security, performance, complexity, and reliability.
arXiv Detail & Related papers (2024-02-01T15:49:47Z) - Through the Lens of Core Competency: Survey on Evaluation of Large
Language Models [27.271533306818732]
Large language model (LLM) has excellent performance and wide practical uses.
Existing evaluation tasks are difficult to keep up with the wide range of applications in real-world scenarios.
We summarize 4 core competencies of LLM, including reasoning, knowledge, reliability, and safety.
Under this competency architecture, similar tasks are combined to reflect corresponding ability, while new tasks can also be easily added into the system.
arXiv Detail & Related papers (2023-08-15T17:40:34Z) - Safety Assessment of Chinese Large Language Models [51.83369778259149]
Large language models (LLMs) may generate insulting and discriminatory content, reflect incorrect social values, and may be used for malicious purposes.
To promote the deployment of safe, responsible, and ethical AI, we release SafetyPrompts including 100k augmented prompts and responses by LLMs.
arXiv Detail & Related papers (2023-04-20T16:27:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.