LIVABLE: Exploring Long-Tailed Classification of Software Vulnerability
Types
- URL: http://arxiv.org/abs/2306.06935v1
- Date: Mon, 12 Jun 2023 08:14:16 GMT
- Title: LIVABLE: Exploring Long-Tailed Classification of Software Vulnerability
Types
- Authors: Xin-Cheng Wen, Cuiyun Gao, Feng Luo, Haoyu Wang, Ge Li, and Qing Liao
- Abstract summary: We propose a Long-taIled software VulnerABiLity typE classification approach, called LIVABLE.
LIVABLE consists of two modules: (1) a vulnerability representation learning module, which improves the propagation steps in the GNN and uses a sequence-to-sequence model to enhance the vulnerability representations, and (2) an adaptive re-weighting module, which adjusts the learning weights for different vulnerability types.
- Score: 18.949810432641772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prior studies generally focus on software vulnerability detection and have
demonstrated the effectiveness of Graph Neural Network (GNN)-based approaches
for the task. Given the various types of software vulnerabilities and their
differing degrees of severity, it is also beneficial for developers to determine
the type of each piece of vulnerable code. In this paper, we observe that
the distribution of vulnerability types is long-tailed in practice: a
small portion of classes have a large number of samples (i.e., head classes),
while the others contain only a few samples (i.e., tail classes). Directly adopting
previous vulnerability detection approaches tends to result in poor detection
performance, mainly due to two reasons. First, it is difficult to effectively
learn the vulnerability representation due to the over-smoothing issue of GNNs.
Second, vulnerability types in the tail are hard to predict due to the
extremely few associated samples. To alleviate these issues, we propose a
Long-taIled software VulnerABiLity typE classification approach, called
LIVABLE. LIVABLE mainly consists of two modules: (1) a vulnerability
representation learning module, which improves the propagation steps in the GNN
to distinguish node representations through a differentiated propagation method
and also uses a sequence-to-sequence model to enhance the vulnerability
representations; and (2) an adaptive re-weighting module, which uses a novel
training loss to adjust the learning weights for different types according to
the training epoch and the number of associated samples.
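
The abstract does not give the form of the adaptive re-weighting loss, so the following is only a minimal sketch of the general idea: a cross-entropy loss whose per-class weight depends on both the class sample count and the training epoch. The function names `class_weights` and `weighted_cross_entropy`, and the linear schedule from uniform weights to inverse-frequency weights, are assumptions made for illustration and are not taken from the paper.

```python
import math


def class_weights(counts, epoch, total_epochs):
    """Hypothetical schedule: move linearly from uniform weights (epoch 0)
    to normalised inverse-frequency weights (final epoch)."""
    progress = epoch / max(total_epochs - 1, 1)
    inverse = [1.0 / c for c in counts]
    mean_inverse = sum(inverse) / len(inverse)
    # Normalise so that the average weight across classes is 1.
    inverse = [w / mean_inverse for w in inverse]
    return [(1.0 - progress) * 1.0 + progress * w for w in inverse]


def weighted_cross_entropy(logits, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return weights[label] * (log_sum_exp - logits[label])


# Example: one head class, one medium class, one tail class.
counts = [1000, 100, 10]
early = class_weights(counts, epoch=0, total_epochs=10)
late = class_weights(counts, epoch=9, total_epochs=10)
loss_tail_late = weighted_cross_entropy([0.0, 0.0, 0.0], label=2, weights=late)
```

Under this assumed schedule, all classes are weighted equally at the start of training, and tail classes contribute progressively more to the loss as training proceeds.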