FineWAVE: Fine-Grained Warning Verification of Bugs for Automated Static Analysis Tools
- URL: http://arxiv.org/abs/2403.16032v2
- Date: Sat, 6 Apr 2024 06:50:12 GMT
- Title: FineWAVE: Fine-Grained Warning Verification of Bugs for Automated Static Analysis Tools
- Authors: Han Liu, Jian Zhang, Cen Zhang, Xiaohan Zhang, Kaixuan Li, Sen Chen, Shang-Wei Lin, Yixiang Chen, Xinhua Li, Yang Liu
- Abstract summary: Automated Static Analysis Tools (ASATs) have evolved over time to assist in detecting bugs.
Previous research efforts have explored learning-based methods to validate the reported warnings.
We propose FineWAVE, a learning-based approach that verifies bug-sensitive warnings at a fine-grained granularity.
- Score: 18.927121513404924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated Static Analysis Tools (ASATs) have evolved over time to assist in detecting bugs. However, excessive false warnings can impede developers' productivity and confidence in the tools. Previous research efforts have explored learning-based methods to validate the reported warnings. Nevertheless, they operate at a coarse granularity, focusing on either long-term warnings or function-level alerts, which makes them insensitive to individual bugs. Also, they rely on manually crafted features or solely on source code semantics, which is inadequate for effective learning. In this paper, we propose FineWAVE, a learning-based approach that verifies bug-sensitive warnings at a fine-grained granularity. Specifically, we design a novel LSTM-based model that captures multi-modal semantics of source code and warnings from ASATs and highlights their correlations with cross-attention. To tackle the data scarcity of training and evaluation, we collected a large-scale dataset of 280,273 warnings. We conducted extensive experiments on this dataset to evaluate FineWAVE. The experimental results demonstrate the effectiveness of our approach, with an F1-score of 97.79% for reducing false alarms and 67.06% for confirming actual warnings, significantly outperforming all baselines. Moreover, we applied FineWAVE to filter out about 92% of the warnings in four popular real-world projects, and found 25 new bugs with minimal manual effort.
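As a rough sketch of the architecture the abstract describes, the snippet below pairs two bidirectional LSTM encoders (one over source-code tokens, one over the ASAT warning text) with cross-attention and a binary true-bug/false-alarm head. All sizes, the shared embedding, and the mean-pooling are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal PyTorch sketch of an LSTM + cross-attention warning verifier.
# Hyperparameters and pooling are assumptions for illustration only.
import torch
import torch.nn as nn

class WarningVerifier(nn.Module):
    def __init__(self, vocab_size=50_000, dim=256, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.code_lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.warn_lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        # Cross-attention: warning tokens attend over code tokens, which is
        # one way to "highlight correlations" between the two modalities.
        self.cross_attn = nn.MultiheadAttention(2 * dim, heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, 2)  # true bug vs. false alarm

    def forward(self, code_ids, warn_ids):
        code_h, _ = self.code_lstm(self.embed(code_ids))
        warn_h, _ = self.warn_lstm(self.embed(warn_ids))
        fused, _ = self.cross_attn(query=warn_h, key=code_h, value=code_h)
        return self.classifier(fused.mean(dim=1))  # pooled class logits
```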
Related papers
- Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - Robust Tiny Object Detection in Aerial Images amidst Label Noise [50.257696872021164]
This study addresses the issue of tiny object detection under noisy label supervision.
We propose a DeNoising Tiny Object Detector (DN-TOD), which incorporates a Class-aware Label Correction scheme.
Our method can be seamlessly integrated into both one-stage and two-stage object detection pipelines.
arXiv Detail & Related papers (2024-01-16T02:14:33Z) - Quieting the Static: A Study of Static Analysis Alert Suppressions [7.324969824727792]
We examine 1,425 open-source Java-based projects that use FindBugs or SpotBugs, analyzing their warning-suppressing configurations and source code annotations.
We find that although most warnings are suppressed, only a small portion of them are frequently suppressed.
The findings underscore the need for better communication and education related to the use of static analysis tools.
arXiv Detail & Related papers (2023-11-13T17:16:25Z) - ACWRecommender: A Tool for Validating Actionable Warnings with Weak Supervision [10.040337069728569]
Static analysis tools have gained popularity among developers for finding potential bugs, but their widespread adoption is hindered by high false alarm rates.
Previous studies proposed the concept of actionable warnings and applied machine-learning methods to distinguish them from false alarms.
We propose a two-stage framework called ACWRecommender to automatically identify actionable warnings and recommend those with a high probability of being real bugs.
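The summary implies a simple control flow: filter first, then rank. A hedged sketch of such a two-stage pipeline follows; `stage1` and `stage2` are hypothetical trained models, and the paper's weak-supervision training is not reproduced here.

```python
# Hypothetical two-stage pipeline: stage 1 filters actionable warnings,
# stage 2 ranks the survivors by estimated probability of being a real bug.
def recommend(warnings, stage1, stage2, top_k=10):
    actionable = [w for w in warnings if stage1.predict(w) == 1]
    ranked = sorted(actionable, key=stage2.predict_proba, reverse=True)
    return ranked[:top_k]  # warnings most likely to be real bugs
```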
arXiv Detail & Related papers (2023-09-18T12:35:28Z) - Infrared: A Meta Bug Detector [10.541969253100815]
We propose a new approach, called meta bug detection, which offers three crucial advantages over existing learning-based bug detectors.
Our evaluation shows our meta bug detector (MBD) is effective in catching a variety of bugs including null pointer dereference, array index out-of-bound, file handle leak, and even data races in concurrent programs.
arXiv Detail & Related papers (2022-09-18T09:08:51Z) - Improving the Adversarial Robustness of NLP Models by Information Bottleneck [112.44039792098579]
Non-robust features can be easily manipulated by adversaries to fool NLP models.
In this study, we explore the feasibility of capturing task-specific robust features, while eliminating the non-robust ones by using the information bottleneck theory.
We show that the models trained with our information bottleneck-based method are able to achieve a significant improvement in robust accuracy.
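For context, the information bottleneck objective alluded to here is commonly written as below, where Z is the learned representation of input X with label Y and beta trades prediction against compression; this is the standard formulation, not necessarily the paper's exact variant.

```latex
\max_{p(z \mid x)} \; I(Z; Y) - \beta \, I(Z; X)
```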
arXiv Detail & Related papers (2022-06-11T12:12:20Z) - Learning to Reduce False Positives in Analytic Bug Detectors [12.733531603080674]
We propose a Transformer-based learning approach to identify false positive bug warnings.
We demonstrate that our models can improve the precision of static analysis by 17.5%.
arXiv Detail & Related papers (2022-03-08T04:26:26Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
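A small numeric sketch of that idea, assuming the threshold is chosen so that the fraction of source examples above it matches the source accuracy (one common way to instantiate ATC; names are illustrative):

```python
# Sketch of ATC with a maximum-confidence score; names are illustrative.
import numpy as np

def atc_predict(src_conf, src_correct, tgt_conf):
    # Pick t so that mean(src_conf > t) matches source accuracy.
    t = np.quantile(src_conf, 1.0 - src_correct.mean())
    return (tgt_conf > t).mean()  # predicted target-domain accuracy
```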
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
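A toy sketch of that graph-plus-sequence fusion is below; the one-layer adjacency-based message passing and the GRU are simplified stand-ins for the paper's actual encoders, not VELVET itself.

```python
# Toy fusion of local (graph) and global (sequence) context per statement.
import torch
import torch.nn as nn

class GraphSeqFusion(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.msg = nn.Linear(dim, dim)        # graph message transform
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.score = nn.Linear(2 * dim, 1)    # per-statement vulnerability score

    def forward(self, x, adj):
        # x: (batch, n_stmts, dim) statement embeddings
        # adj: (batch, n_stmts, n_stmts) program-graph adjacency
        local = torch.relu(adj @ self.msg(x))  # aggregate neighbor messages
        global_ctx, _ = self.gru(x)            # sequential context
        fused = torch.cat([local, global_ctx], dim=-1)
        return self.score(fused).squeeze(-1)   # higher = more likely vulnerable
```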
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - Tracking the risk of a deployed model and detecting harmful distribution shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially.
We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
arXiv Detail & Related papers (2021-10-12T17:21:41Z) - Assessing Validity of Static Analysis Warnings using Ensemble Learning [4.05739885420409]
Static Analysis (SA) tools are used to identify potential weaknesses in code so that they can be fixed in advance, while the code is still being developed.
These rule-based static analysis tools generally report many false warnings along with the actual ones.
We propose a Machine Learning (ML)-based process that uses source code, historic commit data, and classifier ensembles to prioritize true warnings.
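A minimal sketch of the classifier-ensemble step, using scikit-learn voting over features extracted from source code and commit history; the feature set and model choices here are assumptions for illustration.

```python
# Soft-voting ensemble to score warnings; models/features are illustrative.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier()),
    ],
    voting="soft",  # average class probabilities across classifiers
)
# After ensemble.fit(X_train, y_train), ensemble.predict_proba(X_new)[:, 1]
# ranks each warning by its probability of being a true warning.
```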
arXiv Detail & Related papers (2021-04-21T19:39:20Z)