Actionable Warning Is Not Enough: Recommending Valid Actionable Warnings with Weak Supervision
- URL: http://arxiv.org/abs/2511.12229v1
- Date: Sat, 15 Nov 2025 14:10:56 GMT
- Title: Actionable Warning Is Not Enough: Recommending Valid Actionable Warnings with Weak Supervision
- Authors: Zhipeng Xue, Zhipeng Gao, Tongtong Xu, Xing Hu, Xin Xia, Shanping Li,
- Abstract summary: This study builds the first large actionable warning dataset by mining 68,274 reversions from Top-500 GitHub C repositories.<n>We then take one step further by assigning each actionable warning a weak label regarding its likelihood of being a real bug.<n>Following that, we propose a two-stage framework called ACWRecommender to automatically recommend the actionable warnings with high probability to be real bugs.
- Score: 15.030122326395977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of static analysis tools has gained increasing popularity among developers in the last few years. However, the widespread adoption of static analysis tools is hindered by their high false alarm rates. Previous studies have introduced the concept of actionable warnings and built a machine-learning method to distinguish actionable warnings from false alarms. However, according to our empirical observation, the current assumption used for actionable warning(s) collection is rather shaky and inaccurate, leading to a large number of invalid actionable warnings. To address this problem, in this study, we build the first large actionable warning dataset by mining 68,274 reversions from Top-500 GitHub C repositories, we then take one step further by assigning each actionable warning a weak label regarding its likelihood of being a real bug. Following that, we propose a two-stage framework called ACWRecommender to automatically recommend the actionable warnings with high probability to be real bugs (AWHB). Our approach warms up the pre-trained model UniXcoder by identifying actionable warnings task (coarse-grained detection stage) and rerank AWHB to the top by weakly supervised learning (fine-grained reranking stage). Experimental results show that our proposed model outperforms several baselines by a large margin in terms of nDCG and MRR for AWHB recommendation. Moreover, we ran our tool on 6 randomly selected projects and manually checked the top-ranked warnings from 2,197 reported warnings, we reported top-10 recommended warnings to developers, 27 of them were already confirmed by developers as real bugs. Developers can quickly find real bugs among the massive amount of reported warnings, which verifies the practical usage of our tool.
Related papers
- Steering Externalities: Benign Activation Steering Unintentionally Increases Jailbreak Risk for Large Language Models [62.16655896700062]
Activation steering is a technique to enhance the utility of Large Language Models (LLMs)<n>We show that it unintentionally introduces critical and under-explored safety risks.<n>Experiments reveal that these interventions act as a force multiplier, creating new vulnerabilities to jailbreaks and increasing attack success rates to over 80% on standard benchmarks.
arXiv Detail & Related papers (2026-02-03T12:32:35Z) - Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain [82.98626829232899]
Fine-tuning AI agents on data from their own interactions introduces a critical security vulnerability within the AI supply chain.<n>We show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors.
arXiv Detail & Related papers (2025-10-03T12:47:21Z) - A Granular Study of Safety Pretraining under Model Abliteration [64.24346997570275]
We study model abliteration, a lightweight projection technique designed to remove refusal-sensitive directions.<n>We issue 100 prompts with balanced harmful and harmless cases, classify responses as **Refusal** or **Non-Refusal** using multiple judges, and validate judge fidelity.<n>Our study produces a checkpoint-level characterization of which data-centric safety components remain robust under abliteration.
arXiv Detail & Related papers (2025-10-03T07:01:45Z) - FineWAVE: Fine-Grained Warning Verification of Bugs for Automated Static Analysis Tools [18.927121513404924]
Automated Static Analysis Tools (ASATs) have evolved over time to assist in detecting bugs.
Previous research efforts have explored learning-based methods to validate the reported warnings.
We propose FineWAVE, a learning-based approach that verifies bug-sensitive warnings at a fine-grained granularity.
arXiv Detail & Related papers (2024-03-24T06:21:35Z) - ACWRecommender: A Tool for Validating Actionable Warnings with Weak
Supervision [10.040337069728569]
Static analysis tools have gained popularity among developers for finding potential bugs, but their widespread adoption is hindered by the high false alarm rates.
Previous studies proposed the concept of actionable warnings, and apply machine-learning methods to distinguish actionable warnings from false alarms.
We propose a two-stage framework called ACWRecommender to automatically identify actionable warnings and recommend those with a high probability of being real bugs.
arXiv Detail & Related papers (2023-09-18T12:35:28Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Robustness of Unsupervised Representation Learning without Labels [92.90480374344777]
We propose a family of unsupervised robustness measures, which are model- and task-agnostic and label-free.
We validate our results against a linear probe and show that, for MOCOv2, adversarial training results in 3 times higher certified accuracy.
arXiv Detail & Related papers (2022-10-08T18:03:28Z) - Tracking the Evolution of Static Code Warnings: the State-of-the-Art and
a Better Approach [18.350023994564904]
Static bug detection tools help developers detect problems in the code, including bad programming practices and potential defects.
Recent efforts to integrate static bug detectors in modern software development, such as in code review and continuous integration, are shown to better motivate developers to fix the reported warnings on the fly.
arXiv Detail & Related papers (2022-10-06T03:02:32Z) - How to Find Actionable Static Analysis Warnings [28.866251060033537]
We show that effective predictors of such warnings can be created by methods that adjust the decision boundary.
For eight open-source Java projects (CASSANDRA, JMETER, COMMONS, LUCENE-SOLR, ANT, TOMCAT, DERBY) we achieve perfect test results on 4/8 datasets.
arXiv Detail & Related papers (2022-05-21T04:47:02Z) - Sample-Efficient Safety Assurances using Conformal Prediction [57.92013073974406]
Early warning systems can provide alerts when an unsafe situation is imminent.
To reliably improve safety, these warning systems should have a provable false negative rate.
We present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics.
arXiv Detail & Related papers (2021-09-28T23:00:30Z) - Assessing Validity of Static Analysis Warnings using Ensemble Learning [4.05739885420409]
Static Analysis (SA) tools are used to identify potential weaknesses in code and fix them in advance, while the code is being developed.
These rules-based static analysis tools generally report a lot of false warnings along with the actual ones.
We propose a Machine Learning (ML)-based learning process that uses source codes, historic commit data, and classifier-ensembles to prioritize the True warnings.
arXiv Detail & Related papers (2021-04-21T19:39:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.