Probing Model Signal-Awareness via Prediction-Preserving Input
Minimization
- URL: http://arxiv.org/abs/2011.14934v2
- Date: Tue, 22 Jun 2021 21:44:44 GMT
- Title: Probing Model Signal-Awareness via Prediction-Preserving Input
Minimization
- Authors: Sahil Suneja, Yunhui Zheng, Yufan Zhuang, Jim Laredo, Alessandro
Morari
- Abstract summary: We evaluate models' ability to capture the correct vulnerability signals to produce their predictions.
We measure the signal awareness of models using a new metric we propose: Signal-aware Recall (SAR).
The results show a sharp drop in the model's Recall from the high 90s to sub-60s with the new metric.
- Score: 67.62847721118142
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work explores the signal awareness of AI models for source code
understanding. Using a software vulnerability detection use case, we evaluate
the models' ability to capture the correct vulnerability signals to produce
their predictions. Our prediction-preserving input minimization (P2IM) approach
systematically reduces the original source code to a minimal snippet which a
model needs to maintain its prediction. The model's reliance on incorrect
signals is then uncovered when the vulnerability in the original code is
missing in the minimal snippet, both of which the model however predicts as
being vulnerable. We measure the signal awareness of models using a new metric
we propose: Signal-aware Recall (SAR). We apply P2IM on three different neural
network architectures across multiple datasets. The results show a sharp drop
in the model's Recall from the high 90s to sub-60s with the new metric,
highlighting that the models are presumably picking up a lot of noise or
dataset nuances while learning their vulnerability detection logic. Although
the drop in model performance may be perceived as an adversarial attack,
this isn't P2IM's objective. The idea is rather to uncover the signal-awareness
of a black-box model in a data-driven manner via controlled queries. SAR's
purpose is to measure the impact of task-agnostic model training, and not to
suggest a shortcoming in the Recall metric. The expectation, in fact, is for
SAR to match Recall in the ideal scenario where the model truly captures
task-specific signals.
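
The P2IM loop and the SAR metric described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the reduction algorithm, so a simple greedy one-token-at-a-time reduction is swapped in here; `predict` stands in for a black-box model query, `has_vulnerability` stands in for an oracle that checks whether the vulnerability survives in the minimal snippet, and the toy models and samples are hypothetical. SAR is computed under the assumed reading that a true positive counts only when the minimal snippet still contains the vulnerability, with the same denominator as Recall.

```python
from typing import Callable, List, Tuple

Tokens = List[str]
Model = Callable[[Tokens], bool]   # hypothetical black-box query: True = "vulnerable"
Oracle = Callable[[Tokens], bool]  # hypothetical check: is the vulnerability still present?


def minimize(tokens: Tokens, predict: Model) -> Tokens:
    """Greedily drop one token at a time while the prediction is preserved.

    Illustrative stand-in for prediction-preserving input minimization;
    not claimed to be the authors' reduction algorithm.
    """
    target = predict(tokens)
    current = list(tokens)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            # Keep the smaller input only if it is non-empty and the
            # model's prediction is unchanged.
            if candidate and predict(candidate) == target:
                current = candidate
                changed = True
                break
    return current


def recall_and_sar(
    vulnerable_samples: List[Tokens], predict: Model, has_vulnerability: Oracle
) -> Tuple[float, float]:
    """Return (Recall, SAR) over samples that are all truly vulnerable."""
    if not vulnerable_samples:
        raise ValueError("need at least one vulnerable sample")
    true_pos = 0
    signal_aware_tp = 0
    for tokens in vulnerable_samples:
        if not predict(tokens):
            continue  # false negative: counts against both metrics
        true_pos += 1
        # Assumed SAR rule: the true positive counts only if the minimal
        # snippet still contains the vulnerability.
        if has_vulnerability(minimize(tokens, predict)):
            signal_aware_tp += 1
    n = len(vulnerable_samples)
    return true_pos / n, signal_aware_tp / n


# Toy setup (entirely hypothetical): the real flaw is the `strcpy` call.
def oracle(tokens: Tokens) -> bool:
    return "strcpy" in tokens


def noisy_model(tokens: Tokens) -> bool:
    return "buf" in tokens      # keys on a spurious token


def aware_model(tokens: Tokens) -> bool:
    return "strcpy" in tokens   # keys on the actual signal


samples = [["char", "buf", "strcpy", "buf"], ["buf", "strcpy", "x"]]
noisy_scores = recall_and_sar(samples, noisy_model, oracle)
aware_scores = recall_and_sar(samples, aware_model, oracle)
```

In this toy setup both models reach the same Recall, but only the model that keys on the actual vulnerability keeps its score under SAR. That mirrors the abstract's expectation that SAR matches Recall when the model captures task-specific signals.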