Using AI/ML to Find and Remediate Enterprise Secrets in Code & Document Sharing Platforms
- URL: http://arxiv.org/abs/2401.01754v1
- Date: Wed, 3 Jan 2024 14:15:25 GMT
- Title: Using AI/ML to Find and Remediate Enterprise Secrets in Code & Document Sharing Platforms
- Authors: Gregor Kerr, David Algorry, Senad Ibraimoski, Peter Maciver, Sean
Moran
- Abstract summary: We introduce a new challenge to the software development community: 1) leveraging AI to accurately detect and flag up secrets in code and on popular document sharing platforms.
We introduce two baseline AI models that have good detection performance and propose an automatic mechanism for remediating secrets found in code.
- Score: 2.9248916859490173
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce a new challenge to the software development community: 1)
leveraging AI to accurately detect and flag up secrets in code and on popular
document sharing platforms that are frequently used by developers, such as
Confluence, and 2) automatically remediating the detections (e.g. by suggesting
password vault functionality). This is a challenging, and mostly unaddressed,
task. Existing methods rely on heuristics and regular expressions, which can be
very noisy and therefore increase toil for developers. The next step, modifying
the code itself to automatically remediate a detection, is a complex task. We
introduce two baseline AI models with good detection performance and propose an
automatic mechanism for remediating secrets found in code, opening up the study
of this task to the wider community.
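The noisy heuristic baselines the abstract criticises can be made concrete. The sketch below is illustrative and not taken from the paper: it combines two classic techniques, regex rules for well-known key formats and a Shannon-entropy filter to suppress low-randomness false positives. The rule names and the 3.0-bit threshold are assumptions.

```python
import math
import re

# Hypothetical rule set; production scanners ship hundreds of such patterns.
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_assignment": re.compile(
        r"(?i)(password|secret|api[_-]?key)\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
}

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; random-looking tokens score high."""
    freq = {ch: s.count(ch) / len(s) for ch in set(s)}
    return -sum(p * math.log2(p) for p in freq.values())

def scan_line(line: str, threshold: float = 3.0):
    """Return (rule_name, candidate) pairs whose entropy exceeds the threshold."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(line):
            # Use the captured value if the rule has groups, else the full match.
            candidate = match.groups()[-1] if match.groups() else match.group(0)
            if shannon_entropy(candidate) >= threshold:
                hits.append((name, candidate))
    return hits
```

Even with the entropy filter, detectors like this remain noisy (dictionary-word passwords slip through, random-looking non-secrets trigger alerts), which is the gap the paper's learned baselines aim to close.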
Related papers
- Development of an automatic modification system for generated programs using ChatGPT [0.12233362977312943]
OpenAI's ChatGPT excels at natural language processing tasks and can also generate source code.
We developed a system that tests the code generated by ChatGPT, automatically corrects it if it is inappropriate, and presents the appropriate code to the user.
arXiv Detail & Related papers (2024-07-10T08:54:23Z)
- Code Compass: A Study on the Challenges of Navigating Unfamiliar Codebases [2.808331566391181]
We propose a novel tool, Code Compass, to address these issues.
Our study highlights a significant gap in current tools and methodologies.
Our formative study demonstrates how effectively the tool reduces the time developers spend navigating documentation.
arXiv Detail & Related papers (2024-05-10T06:58:31Z)
- Enhancing Security of AI-Based Code Synthesis with GitHub Copilot via Cheap and Efficient Prompt-Engineering [1.7702475609045947]
One of the reasons developers and companies avoid harnessing their full potential is the questionable security of the generated code.
This paper first reviews the current state-of-the-art and identifies areas for improvement on this issue.
We propose a systematic approach based on prompt-altering methods to achieve better code security of AI-based code generators such as GitHub Copilot.
arXiv Detail & Related papers (2024-03-19T12:13:33Z)
- CodeAgent: Collaborative Agents for Software Engineering [11.476666454138021]
Code review aims at ensuring the overall quality and reliability of software.
Existing automated methods rely on single input-output generative models.
This work introduces CodeAgent, a novel multi-agent Large Language Model (LLM) system for code review automation.
arXiv Detail & Related papers (2024-02-03T14:43:14Z)
- CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules [51.82044734879657]
We propose CodeChain, a novel framework for inference that elicits modularized code generation through a chain of self-revisions.
We find that CodeChain can significantly boost both the modularity and the correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests.
arXiv Detail & Related papers (2023-10-13T10:17:48Z)
- Zero-Shot Detection of Machine-Generated Codes [83.0342513054389]
This work proposes a training-free approach for the detection of LLM-generated code.
We find that existing training-based or zero-shot text detectors are ineffective in detecting code.
Our method exhibits robustness against revision attacks and generalizes well to Java code.
arXiv Detail & Related papers (2023-10-08T10:08:21Z)
- FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now face an increasing risk of containing factual errors when handled by generative models.
We propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z)
- Who Wrote this Code? Watermarking for Code Generation [53.24895162874416]
We propose Selective WatErmarking via Entropy Thresholding (SWEET) to detect machine-generated text.
Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines.
arXiv Detail & Related papers (2023-05-24T11:49:52Z)
- OpenAGI: When LLM Meets Domain Experts [51.86179657467822]
Human Intelligence (HI) excels at combining basic skills to solve complex tasks.
This capability is vital for Artificial Intelligence (AI) and should be embedded in comprehensive AI Agents.
We introduce OpenAGI, an open-source platform designed for solving multi-step, real-world tasks.
arXiv Detail & Related papers (2023-04-10T03:55:35Z)
- Chatbots As Fluent Polyglots: Revisiting Breakthrough Code Snippets [0.0]
The research applies AI-driven code assistants to analyze a selection of influential computer code that has shaped modern technology.
The original contribution of this study was to examine half of the most significant code advances in the last 50 years.
arXiv Detail & Related papers (2023-01-05T23:17:17Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
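Several entries above quote pass@1 figures (CodeChain's relative improvements, GPT-Neo passing roughly 15% of introductory APPS test cases). As background, pass@k is conventionally computed with the unbiased estimator popularised by the HumanEval evaluation: given n samples of which c are correct, pass@k = 1 - C(n-c, k) / C(n, k). This snippet is a generic illustration of that formula, not code from any paper listed here:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn (without replacement) from n, of which c are correct,
    passes all tests."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k = 1 this reduces to the raw pass rate c / n; larger k rewards models whose occasional samples solve the problem even when most fail.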
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.