GWP-ASan: Sampling-Based Detection of Memory-Safety Bugs in Production
- URL: http://arxiv.org/abs/2311.09394v2
- Date: Sat, 13 Jan 2024 14:42:26 GMT
- Title: GWP-ASan: Sampling-Based Detection of Memory-Safety Bugs in Production
- Authors: Kostya Serebryany, Chris Kennelly, Mitch Phillips, Matt Denton, Marco
Elver, Alexander Potapenko, Matt Morehouse, Vlad Tsyrklevich, Christian
Holler, Julian Lettner, David Kilzer, Lander Brandt
- Abstract summary: Heap-use-after-free and heap-buffer-overflow bugs remain the primary problem for security, reliability, and developer productivity in applications written in C or C++.
This paper describes a family of tools that detect these two classes of memory-safety bugs, while running in production, at near-zero overhead.
- Score: 30.534320345970286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the recent advances in pre-production bug detection,
heap-use-after-free and heap-buffer-overflow bugs remain the primary problem
for security, reliability, and developer productivity for applications written
in C or C++, across all major software ecosystems. Memory-safe languages solve
this problem when they are used, but the existing code bases consisting of
billions of lines of C and C++ continue to grow, and we need additional bug
detection mechanisms.
This paper describes a family of tools that detect these two classes of
memory-safety bugs, while running in production, at near-zero overhead. These
tools combine page-granular guarded allocation and low-rate sampling. In other
words, we added an "if" statement to a 36-year-old idea and made it work at
scale.
We describe the basic algorithm, several of its variants and implementations,
and the results of multi-year deployments across mobile, desktop, and server
applications.
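To make the mechanism concrete, here is a minimal C++ sketch of the "page-granular guarded allocation plus low-rate sampling" idea on POSIX. It is an illustration under stated assumptions, not the actual GWP-ASan implementation: the names (SampledMalloc, GuardedAlloc, GuardedFree) and the sampling rate are hypothetical, and a real deployment would recycle guarded slots, track which pointers are guarded so free() dispatches correctly, and report the trapping stack traces.

```cpp
// Minimal sketch of sampling-based guard-page allocation (POSIX).
// All names and parameters here are illustrative, not GWP-ASan's API.
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

static constexpr std::size_t kSampleRate = 5000;  // assumed: ~1 in 5000 allocations
static thread_local std::uint64_t alloc_counter = 0;

// Place the allocation at the end of a writable page, immediately followed
// by an inaccessible guard page, so a heap-buffer-overflow traps at once.
void* GuardedAlloc(std::size_t size) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  if (size == 0 || size > page) return nullptr;  // oversized: use regular path
  void* raw = mmap(nullptr, 2 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED) return nullptr;
  char* mem = static_cast<char*>(raw);
  mprotect(mem + page, page, PROT_NONE);  // right-hand guard page
  // Align down to 16 bytes; the few slack bytes before the guard page are a
  // known blind spot of right-aligned guarded allocations.
  return mem + ((page - size) & ~static_cast<std::size_t>(15));
}

// On free, make the whole data page inaccessible so that any later
// use-after-free access traps with a SIGSEGV the runtime can report.
void GuardedFree(void* ptr) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  char* base = reinterpret_cast<char*>(
      reinterpret_cast<std::uintptr_t>(ptr) & ~(page - 1));
  mprotect(base, page, PROT_NONE);
  // A real implementation would quarantine and later recycle this slot.
}

// The "if statement" from the abstract: divert a tiny, sampled fraction of
// allocations into the guarded pool; everything else takes the fast path.
void* SampledMalloc(std::size_t size) {
  if (++alloc_counter % kSampleRate == 0) {
    if (void* p = GuardedAlloc(size)) return p;
  }
  return std::malloc(size);
}
```

Right-aligning the allocation against a trailing guard page catches overflows; a variant that left-aligns it against a leading guard page catches underflows instead, which is why implementations of this idea typically randomize the placement. Because only a sampled fraction of allocations pays the mmap/mprotect cost, the aggregate overhead stays near zero.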
Related papers
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate whether ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- Automated Repair of AI Code with Large Language Models and Formal Verification [4.9975496263385875]
The next generation of AI systems requires strong safety guarantees.
This report looks at the software implementation of neural networks and related memory safety properties.
We detect such memory-safety vulnerabilities and automatically repair them with the help of large language models.
arXiv Detail & Related papers (2024-05-14T11:52:56Z)
- DebugBench: Evaluating Debugging Capability of Large Language Models [80.73121177868357]
DebugBench is a benchmark for evaluating the debugging capability of Large Language Models (LLMs).
It covers four major bug categories and 18 minor types in C++, Java, and Python.
We evaluate two commercial and four open-source models in a zero-shot scenario.
arXiv Detail & Related papers (2024-01-09T15:46:38Z)
- Large Language Models of Code Fail at Completing Code with Potential Bugs [30.80172644795715]
We study the buggy-code completion problem inspired by real-time code suggestion.
We find that the presence of potential bugs significantly degrades the generation performance of the high-performing Code-LLMs.
arXiv Detail & Related papers (2023-06-06T06:35:27Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- What to Prioritize? Natural Language Processing for the Development of a Modern Bug Tracking Solution in Hardware Development [0.0]
We present an approach to predict the time to fix, the risk and the complexity of a bug report using different supervised machine learning algorithms.
The evaluation shows that a combination of text embeddings generated with the Universal Sentence Encoder model outperforms all other methods.
arXiv Detail & Related papers (2021-09-28T15:55:10Z)
- Generating Bug-Fixes Using Pretrained Transformers [11.012132897417592]
We introduce a data-driven program repair approach which learns to detect and fix bugs in Java methods mined from real-world GitHub repositories.
We show that pretraining on source code programs improves the number of patches found by 33% as compared to supervised training from scratch.
We refine the standard accuracy evaluation metric into non-deletion and deletion-only fixes, and show that our best model generates 75% more non-deletion fixes than the previous state of the art.
arXiv Detail & Related papers (2021-04-16T05:27:04Z)
- Predicting Vulnerability In Large Codebases With Deep Code Representation [6.357681017646283]
As software engineers write code for various modules, various types of errors often get introduced.
The same or similar issues/bugs that were fixed in the past (although in different modules) tend to get reintroduced in production code.
We developed a novel AI-based system that uses a deep representation of the Abstract Syntax Tree (AST) created from the source code, together with an active feedback loop.
arXiv Detail & Related papers (2020-04-24T13:18:35Z)