A Hazard Analysis Framework for Code Synthesis Large Language Models
- URL: http://arxiv.org/abs/2207.14157v1
- Date: Mon, 25 Jul 2022 20:44:40 GMT
- Title: A Hazard Analysis Framework for Code Synthesis Large Language Models
- Authors: Heidy Khlaaf, Pamela Mishkin, Joshua Achiam, Gretchen Krueger, Miles
Brundage
- Abstract summary: Codex, a large language model (LLM) trained on a variety of Codes, exceeds the previous state of the art in its capacity to synthesize and generate code.
This paper outlines a hazard analysis framework constructed at OpenAI to uncover hazards or safety risks that the deployment of models like Codex may impose technically, socially, politically, and economically.
- Score: 2.535935501467612
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Codex, a large language model (LLM) trained on a variety of codebases,
exceeds the previous state of the art in its capacity to synthesize and
generate code. Although Codex provides a plethora of benefits, models that may
generate code on such scale have significant limitations, alignment problems,
the potential to be misused, and the possibility to increase the rate of
progress in technical fields that may themselves have destabilizing impacts or
have misuse potential. Yet such safety impacts are not yet known or remain to
be explored. In this paper, we outline a hazard analysis framework constructed
at OpenAI to uncover hazards or safety risks that the deployment of models like
Codex may impose technically, socially, politically, and economically. The
analysis is informed by a novel evaluation framework that determines the
capacity of advanced code generation techniques against the complexity and
expressivity of specification prompts, and their capability to understand and
execute them relative to human ability.
Related papers
- Toward Risk Thresholds for AI-Enabled Cyber Threats: Enhancing Decision-Making Under Uncertainty with Bayesian Networks [0.3151064009829256]
We propose a structured approach to developing and evaluating AI cyber risk thresholds.<n>First, we analyze existing industry cyber thresholds and identify common threshold elements.<n>Second, we propose the use of Bayesian networks as a tool for modeling AI-enabled cyber risk.
arXiv Detail & Related papers (2026-01-23T23:23:12Z) - Toward Quantitative Modeling of Cybersecurity Risks Due to AI Misuse [50.87630846876635]
We develop nine detailed cyber risk models.<n>Each model decomposes attacks into steps using the MITRE ATT&CK framework.<n>Individual estimates are aggregated through Monte Carlo simulation.
arXiv Detail & Related papers (2025-12-09T17:54:17Z) - Code Vulnerability Detection Across Different Programming Languages with AI Models [0.0]
This paper presents the implementations of transformer-based models like CodeBERT and CodeLlama.<n>It shows how off-the-shelf models can successfully produce predictive capacity in models through dynamic fine-tuning of the models on vulnerable and safe code fragments.<n>Experiments show that a well-trained CodeBERT can be as good as or even better than some existing static analyzers in terms of accuracy greater than 97%.
arXiv Detail & Related papers (2025-08-14T05:41:58Z) - Towards Safety and Security Testing of Cyberphysical Power Systems by Shape Validation [42.350737545269105]
complexity of cyberphysical power systems leads to larger attack surfaces to be exploited by malicious actors.<n>We propose to meet those risks with a declarative approach to describe cyber power systems and automatically evaluate security and safety controls.
arXiv Detail & Related papers (2025-06-14T12:07:44Z) - Systematic Hazard Analysis for Frontier AI using STPA [0.0]
frontier AI companies currently do not describe in detail any structured approach to identifying and analysing hazards.<n>A (Systems-Theoretic Process Analysis) is a systematic methodology for identifying how complex systems can become unsafe, leading to hazards.<n>We evaluateA's ability to broaden the scope, improve traceability and strengthen the robustness of safety assurance for frontier AI systems.
arXiv Detail & Related papers (2025-06-02T15:28:34Z) - Advancing Neural Network Verification through Hierarchical Safety Abstract Interpretation [52.626086874715284]
We introduce a novel problem formulation called Abstract DNN-Verification, which verifies a hierarchical structure of unsafe outputs.<n>By leveraging abstract interpretation and reasoning about output reachable sets, our approach enables assessing multiple safety levels during the formal verification process.<n>Our contributions include a theoretical exploration of the relationship between our novel abstract safety formulation and existing approaches.
arXiv Detail & Related papers (2025-05-08T13:29:46Z) - Improving Automated Secure Code Reviews: A Synthetic Dataset for Code Vulnerability Flaws [0.0]
We propose the creation of a synthetic dataset consisting of vulnerability-focused reviews that specifically comment on security flaws.
Our approach leverages Large Language Models (LLMs) to generate human-like code review comments for vulnerabilities.
arXiv Detail & Related papers (2025-04-22T23:07:24Z) - Computational Safety for Generative AI: A Signal Processing Perspective [65.268245109828]
computational safety is a mathematical framework that enables the quantitative assessment, formulation, and study of safety challenges in GenAI.
We show how sensitivity analysis and loss landscape analysis can be used to detect malicious prompts with jailbreak attempts.
We discuss key open research challenges, opportunities, and the essential role of signal processing in computational AI safety.
arXiv Detail & Related papers (2025-02-18T02:26:50Z) - SOK: Exploring Hallucinations and Security Risks in AI-Assisted Software Development with Insights for LLM Deployment [0.0]
Large Language Models (LLMs) such as GitHub Copilot, ChatGPT, Cursor AI, and Codeium AI have revolutionized the coding landscape.
This paper provides a comprehensive analysis of the benefits and risks associated with AI-powered coding tools.
arXiv Detail & Related papers (2025-01-31T06:00:27Z) - Sycophancy in Large Language Models: Causes and Mitigations [0.0]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of natural language processing tasks.
Their tendency to exhibit sycophantic behavior poses significant risks to their reliability and ethical deployment.
This paper provides a technical survey of sycophancy in LLMs, analyzing its causes, impacts, and potential mitigation strategies.
arXiv Detail & Related papers (2024-11-22T16:56:49Z) - HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment [1.8843687952462742]
This paper aims to address gaps in the current literature on jailbreaking techniques and the evaluation of LLM vulnerabilities.
Our contributions include the creation of a novel dataset designed to assess the harmfulness of model outputs across multiple harm levels.
We provide a comprehensive benchmark of state-of-the-art jailbreaking attacks, specifically targeting the Vicuna 13B v1.5 model.
arXiv Detail & Related papers (2024-11-11T10:02:49Z) - SeCodePLT: A Unified Platform for Evaluating the Security of Code GenAI [58.29510889419971]
Existing benchmarks for evaluating the security risks and capabilities of code-generating large language models (LLMs) face several key limitations.<n>We introduce a general and scalable benchmark construction framework that begins with manually validated, high-quality seed examples and expands them via targeted mutations.<n>Applying this framework to Python, C/C++, and Java, we build SeCodePLT, a dataset of more than 5.9k samples spanning 44 CWE-based risk categories and three security capabilities.
arXiv Detail & Related papers (2024-10-14T21:17:22Z) - HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation.
Recent studies highlight that many LLM-generated code contains serious security vulnerabilities.
We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure codes.
arXiv Detail & Related papers (2024-09-10T12:01:43Z) - Generative AI Models: Opportunities and Risks for Industry and Authorities [1.3914994102950027]
Generative AI models are capable of performing a wide range of tasks that traditionally require creativity and human understanding.
They learn patterns from existing data during training and can subsequently generate new content.
The use of generative AI models introduces novel IT security risks that need to be considered.
arXiv Detail & Related papers (2024-06-07T08:34:30Z) - Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z) - CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs.
Our studies reveal a new and universal safety vulnerability of these models against code input.
We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z) - Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning [54.56905063752427]
Neuro-Symbolic AI (NeSy) holds promise to ensure the safe deployment of AI systems.
Existing pipelines that train the neural and symbolic components sequentially require extensive labelling.
New architecture, NeSyGPT, fine-tunes a vision-language foundation model to extract symbolic features from raw data.
arXiv Detail & Related papers (2024-02-02T20:33:14Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security
Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - A Simple, Yet Effective Approach to Finding Biases in Code Generation [16.094062131137722]
This work shows that current code generation systems exhibit undesired biases inherited from their large language model backbones.
We propose the "block of influence" concept, which enables a modular decomposition and analysis of the coding challenges.
arXiv Detail & Related papers (2022-10-31T15:06:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.