Automated Static Warning Identification via Path-based Semantic
Representation
- URL: http://arxiv.org/abs/2306.15568v1
- Date: Tue, 27 Jun 2023 15:46:45 GMT
- Title: Automated Static Warning Identification via Path-based Semantic
Representation
- Authors: Yuwei Zhang and Ying Xing and Ge Li and Zhi Jin
- Abstract summary: This paper employs deep neural networks' powerful feature extraction and representation abilities to generate code semantics from control flow graph paths for warning identification.
We fine-tune the pre-trained language model to encode the path sequences and capture the semantic representations for model building.
- Score: 37.70518599085676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their ability to aid developers in detecting potential defects early
in the software development life cycle, static analysis tools often suffer from
precision issues (i.e., high false positive rates of reported alarms). To
improve the availability of these tools, many automated warning identification
techniques have been proposed to assist developers in classifying false
positive alarms. However, existing approaches mainly focus on using
hand-engineered features or statement-level abstract syntax tree token
sequences to represent the defective code, failing to capture semantics from
the reported alarms. To overcome the limitations of traditional approaches,
this paper employs deep neural networks' powerful feature extraction and
representation abilities to generate code semantics from control flow graph
paths for warning identification. The control flow graph abstractly represents
the execution process of a given program. Thus, the generated path sequences of
the control flow graph can guide the deep neural networks to learn semantic
information about the potential defect more accurately. In this paper, we
fine-tune the pre-trained language model to encode the path sequences and
capture the semantic representations for model building. Finally, this paper
conducts extensive experiments on eight open-source projects to verify the
effectiveness of the proposed approach by comparing it with the
state-of-the-art baselines.
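As a minimal sketch of the path-generation idea described in the abstract (not the authors' implementation), the snippet below enumerates acyclic entry-to-exit paths in a toy control flow graph and flattens each path into a statement sequence that could later be tokenized and passed to a pre-trained language model. The graph, the statement text, and the names `CFG`, `STATEMENTS`, `enumerate_paths`, and `path_to_sequence` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy CFG: node id -> list of successor node ids.
CFG: Dict[str, List[str]] = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# Hypothetical statement text attached to each CFG node.
STATEMENTS: Dict[str, str] = {
    "entry": "x = read()",
    "cond": "if x > 0",
    "then": "y = 1 / x",
    "else": "y = 0",
    "exit": "return y",
}


def enumerate_paths(cfg: Dict[str, List[str]], start: str, end: str,
                    max_paths: int = 100) -> List[List[str]]:
    """Collect acyclic paths from start to end with an iterative DFS."""
    paths: List[List[str]] = []
    stack = [(start, [start])]
    while stack and len(paths) < max_paths:
        node, path = stack.pop()
        if node == end:
            paths.append(path)
            continue
        # Push successors in reverse so they are explored in listed order.
        for succ in reversed(cfg.get(node, [])):
            if succ not in path:  # skip back edges to keep paths acyclic
                stack.append((succ, path + [succ]))
    return paths


def path_to_sequence(path: List[str], statements: Dict[str, str]) -> str:
    """Join the statements along one path into a single text sequence."""
    return " ; ".join(statements[node] for node in path)


if __name__ == "__main__":
    for p in enumerate_paths(CFG, "entry", "exit"):
        print(path_to_sequence(p, STATEMENTS))
```

Capping the number of paths and skipping revisited nodes are simplifying assumptions that keep the enumeration finite on graphs with loops; the paper may handle loops and path selection differently.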
Related papers
- Beyond Fidelity: Explaining Vulnerability Localization of Learning-based
Detectors [10.316819421902363]
Vulnerability detectors based on deep learning (DL) models have proven their effectiveness in recent years.
The shroud of opacity surrounding the decision-making process of these detectors makes it difficult for security analysts to comprehend.
We evaluate the performance of ten explanation approaches for vulnerability detectors based on graph and sequence representations.
(2024-01-05T07:37:35Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures
and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
(2023-11-20T03:17:21Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
(2023-07-31T10:22:33Z)
- An Unbiased Transformer Source Code Learning with Semantic Vulnerability
Graph [3.3598755777055374]
Current vulnerability screening techniques are ineffective at identifying novel vulnerabilities or providing developers with code vulnerability and classification.
To address these issues, we propose a joint multitasked unbiased vulnerability classifier comprising a transformer, "RoBERTa", and a graph convolutional neural network (GCN).
We present a training process utilizing a semantic vulnerability graph (SVG) representation from source code, created by integrating edges from a sequential flow, control flow, and data flow, as well as a novel flow dubbed Poacher Flow (PF).
(2023-04-17T20:54:14Z)
- A Hierarchical Deep Neural Network for Detecting Lines of Codes with
Vulnerabilities [6.09170287691728]
Software vulnerabilities, caused by unintentional flaws in source codes, are the main root cause of cyberattacks.
We propose a deep learning approach to detect vulnerabilities from their LLVM IR representations based on the techniques that have been used in natural language processing.
(2022-11-15T21:21:27Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
(2021-12-15T01:45:32Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
(2021-09-07T21:24:36Z)
- Certifying Decision Trees Against Evasion Attacks by Program Analysis [9.290879387995401]
We propose a novel technique to verify the security of machine learning models against evasion attacks.
Our approach exploits the interpretability property of decision trees to transform them into imperative programs.
Our experiments show that our technique is both precise and efficient, yielding only a minimal number of false positives.
(2020-07-06T14:18:10Z)
- Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
(2020-06-21T19:45:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.