A Survey on Data-driven Software Vulnerability Assessment and
Prioritization
- URL: http://arxiv.org/abs/2107.08364v1
- Date: Sun, 18 Jul 2021 04:49:22 GMT
- Title: A Survey on Data-driven Software Vulnerability Assessment and
Prioritization
- Authors: Triet H. M. Le, Huaming Chen, M. Ali Babar
- Abstract summary: Software Vulnerabilities (SVs) are increasing in complexity and scale, posing great security risks to many software systems.
Data-driven techniques such as Machine Learning and Deep Learning have taken SV assessment and prioritization to the next level.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software Vulnerabilities (SVs) are increasing in complexity and scale, posing
great security risks to many software systems. Given the limited resources in
practice, SV assessment and prioritization help practitioners devise optimal SV
mitigation plans based on various SV characteristics. The surge in SV data
sources and data-driven techniques such as Machine Learning and Deep Learning
have taken SV assessment and prioritization to the next level. Our survey
provides a taxonomy of the past research efforts and highlights the best
practices for data-driven SV assessment and prioritization. We also discuss the
current limitations and propose potential solutions to address such issues.
Related papers
- Mitigating Data Imbalance for Software Vulnerability Assessment: Does Data Augmentation Help? [0.0]
We show that mitigating data imbalance can significantly improve the predictive performance of models for all the Common Vulnerability Scoring System (CVSS) tasks.
We also discover that simple text augmentation like combining random text insertion, deletion, and replacement can outperform the baseline across the board.
arXiv Detail & Related papers (2024-07-15T13:47:55Z)
- A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research.
Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
- What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases [87.65903426052155]
We perform a large-scale transfer learning experiment aimed at discovering latent vision-language skills from data.
We show that generation tasks suffer from a length bias, suggesting benchmarks should balance tasks with varying output lengths.
We present a new dataset, OLIVE, which simulates user instructions in the wild and presents challenges dissimilar to all datasets we tested.
arXiv Detail & Related papers (2024-04-03T02:40:35Z)
- Are Latent Vulnerabilities Hidden Gems for Software Vulnerability Prediction? An Empirical Study [4.830367174383139]
Latent vulnerable functions can increase the number of SVs by 4x on average and correct up to 5k mislabeled functions.
Despite the noise, we show that the state-of-the-art SV prediction model can significantly benefit from such latent SVs.
arXiv Detail & Related papers (2024-01-20T03:36:01Z)
- Data Management For Large Language Models: A Survey [66.59562797566163]
Data plays a fundamental role in the training of Large Language Models (LLMs).
This survey provides a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs.
arXiv Detail & Related papers (2023-12-04T07:42:16Z)
- A Survey of Federated Unlearning: A Taxonomy, Challenges and Future Directions [71.16718184611673]
The evolution of privacy-preserving Federated Learning (FL) has led to an increasing demand for implementing the right to be forgotten.
The implementation of selective forgetting is particularly challenging in FL due to its decentralized nature.
Federated Unlearning (FU) emerges as a strategic solution to address the increasing need for data privacy.
arXiv Detail & Related papers (2023-10-30T01:34:33Z)
- A Note on "Towards Efficient Data Valuation Based on the Shapley Value" [7.4011772612133475]
The Shapley value (SV) has emerged as a promising method for data valuation.
The Group Testing-based SV estimator achieves favorable sample complexity.
arXiv Detail & Related papers (2023-02-22T15:13:45Z)
- On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models [0.0]
We use large-scale data from 1,782 functions of 429 SVs in 200 real-world projects to develop Machine Learning models for function-level SV assessment tasks.
We show that vulnerable statements are 5.8 times smaller in size, yet exhibit 7.5-114.5% stronger assessment performance.
arXiv Detail & Related papers (2022-03-16T06:29:40Z)
- DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning [0.0]
We propose a novel Deep multi-task learning model, DeepCVA, to automate seven Commit-level Vulnerability Assessment tasks simultaneously.
We conduct large-scale experiments on 1,229 vulnerability-contributing commits containing 542 different SVs in 246 real-world software projects.
DeepCVA is the best-performing model with 38% to 59.8% higher Matthews Correlation Coefficient than many supervised and unsupervised baseline models.
arXiv Detail & Related papers (2021-08-18T08:43:36Z)
- A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources.
The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion.
This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.