A Novel Approach to Identify Security Controls in Source Code
- URL: http://arxiv.org/abs/2307.05605v1
- Date: Mon, 10 Jul 2023 21:14:39 GMT
- Title: A Novel Approach to Identify Security Controls in Source Code
- Authors: Ahmet Okutan, Ali Shokri, Viktoria Koscinski, Mohamad Fazelinia, Mehdi
Mirakhorli
- Abstract summary: This paper enumerates a comprehensive list of commonly used security controls and creates a dataset for each one of them.
It uses the state-of-the-art NLP technique Bidirectional Encoder Representations from Transformers (BERT) and the Tactic Detector from our prior work to show that security controls could be identified with high confidence.
- Score: 4.598579706242066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Secure by Design has become the mainstream development approach ensuring that
software systems are not vulnerable to cyberattacks. Architectural security
controls need to be carefully monitored over the software development life
cycle to avoid critical design flaws. Unfortunately, functional requirements
usually get in the way of the security features, and the development team may
not correctly address critical security requirements. Identifying
tactic-related code pieces in a software project enables an efficient review of
the security controls' implementation as well as a resilient software
architecture. This paper enumerates a comprehensive list of commonly used
security controls and creates a dataset for each one of them by pulling related
and unrelated code snippets from the open API of the StackOverflow question and
answer platform. It uses the state-of-the-art NLP technique Bidirectional
Encoder Representations from Transformers (BERT) and the Tactic Detector from
our prior work to show that code pieces that implement security controls could
be identified with high confidence. The results show that our model trained on
tactic-related and unrelated code snippets derived from StackOverflow is able
to identify tactic-related code pieces with F-Measure values above 0.9.
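The abstract describes two concrete steps: building a per-control dataset of related and unrelated code snippets via the StackOverflow (StackExchange) API, and classifying code pieces with a BERT-based model. The sketch below illustrates that pipeline under stated assumptions; it is not the authors' released code, and the tag name ("authentication"), label mapping, and base model ("bert-base-uncased") are illustrative choices rather than details taken from the paper.

```python
# Minimal sketch of the described pipeline: fetch candidate snippets for one
# security control from the StackExchange API, then score them with a BERT
# sequence classifier. Assumed details: the tag, label mapping, and base model.
import requests
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

STACKEXCHANGE_API = "https://api.stackexchange.com/2.3/search/advanced"

def fetch_snippets(tag: str, pages: int = 1) -> list[str]:
    """Fetch question bodies tagged with a security-control keyword from StackOverflow."""
    bodies = []
    for page in range(1, pages + 1):
        resp = requests.get(
            STACKEXCHANGE_API,
            params={
                "site": "stackoverflow",
                "tagged": tag,          # e.g. "authentication", "encryption"
                "order": "desc",
                "sort": "votes",
                "page": page,
                "filter": "withbody",   # include the body of each question
            },
            timeout=30,
        )
        resp.raise_for_status()
        bodies.extend(item["body"] for item in resp.json().get("items", []))
    return bodies

def score_snippets(snippets: list[str]) -> list[float]:
    """Return P(tactic-related) per snippet from a BERT sequence classifier."""
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2  # assumed labels: 1 = tactic-related, 0 = unrelated
    )
    model.eval()
    probs = []
    with torch.no_grad():
        for text in snippets:
            inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
            logits = model(**inputs).logits
            probs.append(torch.softmax(logits, dim=-1)[0, 1].item())
    return probs

if __name__ == "__main__":
    snippets = fetch_snippets("authentication", pages=1)
    for prob in score_snippets(snippets[:5]):
        print(f"tactic-related probability: {prob:.2f}")
```

In the approach the paper reports, the classifier would first be fine-tuned on the labeled related/unrelated snippets; the untrained classification head above only illustrates the inference path.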
Related papers
- HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation.
Recent studies highlight that much LLM-generated code contains serious security vulnerabilities.
We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure code.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
- Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval [20.959848710829878]
Large language models (LLMs) have brought significant advancements to code generation and code repair.
However, their training using unsanitized data from open-source repositories, like GitHub, raises the risk of inadvertently propagating security vulnerabilities.
We present a comprehensive study aimed at precisely evaluating and enhancing the security aspects of code LLMs.
arXiv Detail & Related papers (2024-07-02T16:13:21Z)
- An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection [17.948513691133037]
We introduce CodeBreaker, a pioneering LLM-assisted backdoor attack framework on code completion models.
By integrating malicious payloads directly into the source code with minimal transformation, CodeBreaker challenges current security measures.
arXiv Detail & Related papers (2024-06-10T22:10:05Z)
- A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols [92.81385447582882]
The Controller Area Network (CAN) bus provides no native security mechanisms, leaving in-vehicle communications inherently non-secure.
This paper reviews and compares the 15 most prominent authentication protocols for the CAN bus.
We evaluate protocols based on essential operational criteria that contribute to ease of implementation.
arXiv Detail & Related papers (2024-01-19T14:52:04Z)
- LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward [3.729516018513228]
We introduce a multipurpose code vulnerability analysis system, SecRepair, powered by the large language model CodeGen2.
Inspired by how humans fix code issues, we propose an instruction-based dataset suitable for vulnerability analysis with LLMs.
We identify zero-day and N-day vulnerabilities in 6 Open Source IoT Operating Systems on GitHub.
arXiv Detail & Related papers (2024-01-07T02:46:39Z)
- Finding Software Vulnerabilities in Open-Source C Projects via Bounded Model Checking [2.9129603096077332]
We advocate that bounded model-checking techniques can efficiently detect vulnerabilities in general software systems.
We have developed and evaluated a methodology to verify large software systems using a state-of-the-art bounded model checker.
arXiv Detail & Related papers (2023-11-09T11:25:24Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - "Yeah, it does have a...Windows `98 Vibe'': Usability Study of Security
Features in Programmable Logic Controllers [19.08543677650948]
Programmable Logic Controllers (PLCs) are often left exposed to the Internet due to misconfiguration.
We explore the usability of PLC connection configurations and two key security mechanisms.
We find that the use of unfamiliar labels, layouts and misleading terminology exacerbates an already complex process.
arXiv Detail & Related papers (2022-08-04T07:20:00Z)
- Safe RAN control: A Symbolic Reinforcement Learning Approach [62.997667081978825]
We present a Symbolic Reinforcement Learning (SRL) based architecture for safety control of Radio Access Network (RAN) applications.
We provide a purely automated procedure in which a user can specify high-level logical safety specifications for a given cellular network topology.
We introduce a user interface (UI) developed to help users provide intent specifications to the system and inspect differences in agent-proposed actions.
arXiv Detail & Related papers (2021-06-03T16:45:40Z)
- Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)