Detecting Vulnerabilities in Encrypted Software Code while Ensuring Code Privacy
- URL: http://arxiv.org/abs/2501.09191v1
- Date: Wed, 15 Jan 2025 22:39:50 GMT
- Title: Detecting Vulnerabilities in Encrypted Software Code while Ensuring Code Privacy
- Authors: Jorge Martins, David Dantas, Rafael Ramires, Bernardo Ferreira, Ibéria Medeiros,
- Abstract summary: Testing companies can perform code analysis tasks on encrypted software code provided by software companies while code privacy is preserved. The approach combines Static Code Analysis and Searchable Symmetric Encryption to build an encrypted inverted index of the code, which is then used to discover vulnerabilities by carrying out static analysis tasks in a confidential way.
- Score: 1.6291732599945337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software vulnerabilities continue to be the main cause of cyber attacks. In an attempt to reduce them and improve software quality, software code analysis has emerged as a service offered by companies specialising in software testing. However, this service requires software companies to provide access to their software's code, which raises concerns about code privacy and intellectual property theft. This paper presents a novel approach to Software Quality and Privacy, in which testing companies can perform code analysis tasks on encrypted software code provided by software companies while code privacy is preserved. The approach combines Static Code Analysis and Searchable Symmetric Encryption to process the source code and build an encrypted inverted index that represents its data and control flows. The index is then used to discover vulnerabilities by carrying out static analysis tasks in a confidential way. With this approach, this paper also defines a new research field -- Confidential Code Analysis -- from which other types of code analysis tasks and approaches can be derived. We implemented the approach in a new tool called CoCoA and evaluated it experimentally with synthetic and real PHP web applications. The results show that the tool achieves precision similar to that of standard (non-confidential) static analysis tools, with a modest average performance overhead of 42.7%.
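The core mechanism described above is an encrypted inverted index queried through Searchable Symmetric Encryption. The minimal Python sketch below illustrates that general idea only; the function names, the trapdoor construction, and the stmt_* identifiers are illustrative assumptions, not CoCoA's actual design.

```python
# Minimal SSE-style encrypted inverted index sketch (illustrative assumptions,
# not the paper's implementation). The software company encrypts an index that
# maps code tokens (e.g. '$_GET', 'mysql_query') to the statement ids where
# they occur; the testing company can query it only via per-token trapdoors.
import hmac
import hashlib
import secrets


def prf(key: bytes, data: str) -> bytes:
    """Keyed pseudorandom function (HMAC-SHA256)."""
    return hmac.new(key, data.encode(), hashlib.sha256).digest()


def keystream(key: bytes, label: str, length: int) -> bytes:
    """Expand a key into a pad used to hide one posting-list entry."""
    out = b""
    counter = 0
    while len(out) < length:
        out += prf(key, f"{label}:{counter}")
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def build_index(master_key: bytes, postings: dict) -> dict:
    """Software-company side: encrypt the inverted index."""
    index = {}
    for token, locations in postings.items():
        label = prf(master_key, "label:" + token)      # hides which token this is
        value_key = prf(master_key, "value:" + token)  # hides where it occurs
        index[label] = [
            xor(loc.encode(), keystream(value_key, str(i), len(loc.encode())))
            for i, loc in enumerate(locations)
        ]
    return index


def trapdoor(master_key: bytes, token: str):
    """Software-company side: release a search token for one keyword only."""
    return prf(master_key, "label:" + token), prf(master_key, "value:" + token)


def search(index: dict, td) -> list:
    """Testing-company side: recover only the postings for the queried token;
    everything else in the index stays opaque."""
    label, value_key = td
    return [
        xor(ct, keystream(value_key, str(i), len(ct))).decode()
        for i, ct in enumerate(index.get(label, []))
    ]


if __name__ == "__main__":
    k = secrets.token_bytes(32)
    idx = build_index(k, {"$_GET": ["stmt_3", "stmt_17"], "echo": ["stmt_17"]})
    print(search(idx, trapdoor(k, "$_GET")))   # ['stmt_3', 'stmt_17']
```

In this toy scheme the software company keeps the master key and releases one trapdoor per queried token, so the testing company learns only the postings it explicitly searches for, which is the confidentiality property the paper builds its static analysis tasks on.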
Related papers
- Improving Automated Secure Code Reviews: A Synthetic Dataset for Code Vulnerability Flaws [0.0]
We propose the creation of a synthetic dataset consisting of vulnerability-focused reviews that specifically comment on security flaws.
Our approach leverages Large Language Models (LLMs) to generate human-like code review comments for vulnerabilities.
arXiv Detail & Related papers (2025-04-22T23:07:24Z) - A Comprehensive Study of LLM Secure Code Generation [19.82291066720634]
Previous research primarily relies on a single static analyzer, CodeQL, to detect vulnerabilities in generated code.
We apply both security inspection and functionality validation to the same generated code and evaluate these two aspects together.
Our study reveals that existing techniques often compromise the functionality of generated code to enhance security.
arXiv Detail & Related papers (2025-03-18T20:12:50Z) - The Good, the Bad, and the (Un)Usable: A Rapid Literature Review on Privacy as Code [4.479352653343731]
Privacy and security are central to the design of information systems endowed with sound data protection and cyber resilience capabilities.
Developers often struggle to incorporate these properties into software projects as they either lack proper cybersecurity training or do not consider them a priority.
arXiv Detail & Related papers (2024-12-21T15:30:17Z) - RedCode: Risky Code Execution and Generation Benchmark for Code Agents [50.81206098588923]
RedCode is a benchmark for risky code execution and generation.
RedCode-Exec provides challenging prompts that could lead to risky code execution.
RedCode-Gen provides 160 prompts with function signatures and docstrings as input to assess whether code agents will follow instructions.
arXiv Detail & Related papers (2024-11-12T13:30:06Z) - Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z) - LLMs in Web Development: Evaluating LLM-Generated PHP Code Unveiling Vulnerabilities and Limitations [0.0]
This study evaluates the security of web application code generated by Large Language Models, analyzing 2,500 GPT-4 generated PHP websites.
Our investigation focuses on identifying Insecure File Upload, SQL Injection, Stored XSS, and Reflected XSS in GPT-4 generated PHP code.
According to Burp's Scan, 11.56% of the sites can be outright compromised. Adding static scan results, 26% had at least one vulnerability that can be exploited through web interaction.
arXiv Detail & Related papers (2024-04-21T20:56:02Z) - Towards Efficient Verification of Constant-Time Cryptographic Implementations [5.433710892250037]
Constant-time programming discipline is an effective software-based countermeasure against timing side-channel attacks.
We put forward practical verification approaches based on a novel synergy of taint analysis and safety verification of self-composed programs.
Our approach is implemented as a cross-platform and fully automated tool CT-Prover.
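As a minimal illustration of the discipline being verified (not of CT-Prover itself), the sketch below contrasts a secret-dependent early-exit comparison with a constant-time alternative from Python's standard library; the function names are assumptions made for illustration.

```python
# Illustrative only: the kind of secret-dependent early exit that the
# constant-time discipline forbids, next to a constant-time comparison.
import hmac


def leaky_equals(secret: bytes, guess: bytes) -> bool:
    # Returns as soon as a byte differs, so running time depends on how many
    # leading bytes of the secret an attacker has already guessed correctly.
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False
    return True


def constant_time_equals(secret: bytes, guess: bytes) -> bool:
    # hmac.compare_digest examines every byte regardless of where the first
    # mismatch occurs, removing the timing signal.
    return hmac.compare_digest(secret, guess)
```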
arXiv Detail & Related papers (2024-02-21T03:39:14Z) - Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
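As a rough illustration of that keyword-extraction step (not the paper's exact setup), the sketch below applies TF-IDF followed by truncated SVD, the usual latent semantic analysis recipe, to a few toy requirement sentences; the example texts and parameters are assumptions.

```python
# Illustrative only: latent semantic analysis over requirement texts, yielding
# low-dimensional features and candidate keywords that a classifier could map
# to CWE categories. Toy data; not the PROMISE_exp pipeline itself.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

requirements = [
    "The system shall validate all user input before storing it in the database.",
    "Passwords shall be hashed with a salted algorithm before being persisted.",
    "The application shall log every failed authentication attempt.",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(requirements)

lsa = TruncatedSVD(n_components=2, random_state=0)
features = lsa.fit_transform(X)  # low-dimensional inputs for a classifier

# Top-weighted terms per latent topic, usable as candidate mapping keywords.
terms = tfidf.get_feature_names_out()
for i, component in enumerate(lsa.components_):
    top = component.argsort()[::-1][:3]
    print(f"topic {i}:", [terms[j] for j in top])
```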
arXiv Detail & Related papers (2023-08-10T13:19:10Z) - SecureFalcon: Are We There Yet in Automated Software Vulnerability Detection with LLMs? [3.566250952750758]
We introduce SecureFalcon, an innovative model architecture with only 121 million parameters derived from the Falcon-40B model.
SecureFalcon achieves 94% accuracy in binary classification and up to 92% in multiclassification, with instant CPU inference times.
arXiv Detail & Related papers (2023-07-13T08:34:09Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - A Smart and Defensive Human-Machine Approach to Code Analysis [0.0]
We propose a method that uses virtual assistants to work with programmers to ensure that software is as safe as possible.
The proposed method employs a recommender system that uses various metrics to help programmers select the most appropriate code analysis tool for their project.
arXiv Detail & Related papers (2021-08-06T20:42:07Z)