OutCenTR: A novel semi-supervised framework for predicting exploits of
vulnerabilities in high-dimensional datasets
- URL: http://arxiv.org/abs/2304.10511v1
- Date: Mon, 3 Apr 2023 00:34:41 GMT
- Title: OutCenTR: A novel semi-supervised framework for predicting exploits of
vulnerabilities in high-dimensional datasets
- Authors: Hadi Eskandari, Michael Bewong, Sabih ur Rehman
- Abstract summary: We make use of outlier detection techniques to predict vulnerabilities that are likely to be exploited.
We propose a dimensionality reduction technique, OutCenTR, that enhances the baseline outlier detection models.
The results of our experiments show on average a 5-fold improvement of F1 score in comparison with state-of-the-art dimensionality reduction techniques.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: An ever-growing number of vulnerabilities are reported every day. Yet these
vulnerabilities are not all the same; some are more targeted than others.
Correctly estimating the likelihood of a vulnerability being exploited is a
critical task for system administrators, as it helps them prioritize and patch the
right vulnerabilities. Our work makes use of
outlier detection techniques to predict vulnerabilities that are likely to be
exploited in highly imbalanced and high-dimensional datasets such as the
National Vulnerability Database. We propose a dimensionality reduction
technique, OutCenTR, that enhances the baseline outlier detection models. We
further demonstrate the effectiveness and efficiency of OutCenTR empirically
with 4 benchmark and 12 synthetic datasets. The results of our experiments show
on average a 5-fold improvement of F1 score in comparison with state-of-the-art
dimensionality reduction techniques such as PCA and GRP.
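As a rough illustration of the pipeline described in the abstract, the sketch below runs a baseline outlier detector on a synthetic, highly imbalanced feature matrix, with PCA standing in for the dimensionality-reduction step; OutCenTR itself is not reproduced here, and the data and parameters are assumptions.

```python
# Sketch: outlier detection as exploit prediction on an imbalanced,
# high-dimensional vulnerability dataset. PCA stands in for the
# dimensionality-reduction step; OutCenTR itself is not reproduced here.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 100))            # hypothetical vulnerability features
y = (rng.random(5000) < 0.02).astype(int)   # ~2% exploited -> highly imbalanced
X[y == 1] += 3.0                            # exploited samples drift from the bulk

X_red = PCA(n_components=10).fit_transform(X)    # baseline reduction (PCA/GRP-style)
iso = IsolationForest(contamination=0.02, random_state=0).fit(X_red)
pred = (iso.predict(X_red) == -1).astype(int)    # -1 = outlier -> predicted exploit

print("F1:", round(f1_score(y, pred), 3))
```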
Related papers
- Enhancing Pre-Trained Language Models for Vulnerability Detection via Semantic-Preserving Data Augmentation [4.374800396968465]
We propose a data augmentation technique aimed at enhancing the performance of pre-trained language models for vulnerability detection.
Fine-tuning a series of representative pre-trained code models on the augmented dataset yields up to a 10.1% increase in accuracy and a 23.6% increase in F1.
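A minimal sketch of one common semantic-preserving transformation (consistent identifier renaming) is shown below; the paper's actual transformation set may differ, and the example function is hypothetical.

```python
# Sketch: semantic-preserving augmentation by consistent identifier renaming.
# A generic example of the idea, not the paper's exact transformation set.
import re

def rename_identifiers(code: str, mapping: dict) -> str:
    """Rename whole-word identifiers per `mapping`; program behavior is unchanged."""
    for old, new in mapping.items():
        code = re.sub(rf"\b{re.escape(old)}\b", new, code)
    return code

original = "int add(int count, int step) { return count + step; }"
augmented = rename_identifiers(original, {"count": "a", "step": "b"})
print(augmented)  # int add(int a, int b) { return a + b; }
```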
arXiv Detail & Related papers (2024-09-30T21:44:05Z)
- A Relevance Model for Threat-Centric Ranking of Cybersecurity Vulnerabilities [0.29998889086656577]
The relentless process of tracking and remediating vulnerabilities is a top concern for cybersecurity professionals.
We provide a framework for vulnerability management specifically focused on mitigating threats using adversary criteria derived from MITRE ATT&CK.
Our results show an average 71.5% - 91.3% improvement in identifying vulnerabilities likely to be targeted and exploited by cyber threat actors.
arXiv Detail & Related papers (2024-06-09T23:29:12Z)
- Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation [29.72520866016839]
Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks.
Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task.
FGVulDet employs multiple classifiers to discern characteristics of various vulnerability types and combines their outputs to identify the specific type of vulnerability.
FGVulDet is trained on a large-scale dataset from GitHub, encompassing five different types of vulnerabilities.
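A minimal sketch of the multi-classifier idea (one binary detector per vulnerability type, outputs combined to pick the most likely type) follows; the features, type names, and combination rule are placeholders, not FGVulDet's actual architecture.

```python
# Sketch: one binary classifier per vulnerability type; combine their scores to
# decide whether code is vulnerable and, if so, of which type. Placeholders only.
import numpy as np
from sklearn.linear_model import LogisticRegression

VULN_TYPES = ["buffer_overflow", "use_after_free", "integer_overflow",
              "null_deref", "double_free"]           # five types, as in the summary

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))                      # placeholder code embeddings
y = rng.integers(0, len(VULN_TYPES) + 1, size=1000)  # 0 = not vulnerable

detectors = {vt: LogisticRegression(max_iter=1000).fit(X, (y == i).astype(int))
             for i, vt in enumerate(VULN_TYPES, start=1)}

def classify(x):
    scores = {vt: clf.predict_proba(x.reshape(1, -1))[0, 1]
              for vt, clf in detectors.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.5 else "not_vulnerable"

print(classify(X[0]))
```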
arXiv Detail & Related papers (2024-04-15T09:10:52Z)
- Vulnerability Detection with Code Language Models: How Far Are We? [40.455600722638906]
PrimeVul is a new dataset for training and evaluating code LMs for vulnerability detection.
It incorporates a novel set of data labeling techniques that achieve comparable label accuracy to human-verified benchmarks.
It also implements a rigorous data de-duplication and chronological data splitting strategy to mitigate data leakage issues.
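Below is a minimal sketch of de-duplication plus a chronological train/test split to avoid leakage; the field names and the 80/20 ratio are assumptions, not PrimeVul's exact procedure.

```python
# Sketch: de-duplicate samples by a content hash, then split chronologically so
# that test commits are strictly newer than training commits (no data leakage).
import hashlib

def dedup_and_split(samples, test_frac=0.2):
    seen, unique = set(), []
    for s in samples:
        h = hashlib.sha256(s["code"].encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(s)
    unique.sort(key=lambda s: s["commit_date"])       # oldest first
    cut = int(len(unique) * (1 - test_frac))
    return unique[:cut], unique[cut:]                 # train, test

samples = [
    {"code": "void f(){}", "commit_date": "2019-05-01", "label": 0},
    {"code": "void f(){}", "commit_date": "2020-01-01", "label": 0},  # duplicate
    {"code": "void g(char*s){strcpy(buf,s);}", "commit_date": "2021-03-10", "label": 1},
]
train, test = dedup_and_split(samples)
print(len(train), len(test))
```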
arXiv Detail & Related papers (2024-03-27T14:34:29Z)
- FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks.
We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness.
Our model outperforms the state of the art in resilient fault prediction benchmarks, with an accuracy of up to 0.958.
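A minimal sketch of one standard online adversarial-training step (FGSM-style perturbations mixed into each batch) is shown below in PyTorch; this is a generic robustness recipe with an assumed model and budget, not FaultGuard's specific technique.

```python
# Sketch: one FGSM-style adversarial training step in PyTorch.
# A generic robustness recipe, not FaultGuard's exact method or model.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))  # fault types
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
eps = 0.05  # perturbation budget (assumed)

def train_step(x, y):
    x_req = x.clone().requires_grad_(True)
    loss_fn(model(x_req), y).backward()
    x_adv = (x_req + eps * x_req.grad.sign()).detach()   # FGSM perturbation
    opt.zero_grad()
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)  # clean + adversarial
    loss.backward()
    opt.step()
    return loss.item()

x, y = torch.randn(32, 16), torch.randint(0, 4, (32,))
print(train_step(x, y))
```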
arXiv Detail & Related papers (2024-03-26T08:51:23Z)
- DefectHunter: A Novel LLM-Driven Boosted-Conformer-based Code Vulnerability Detection Mechanism [3.9377491512285157]
DefectHunter is an innovative model for vulnerability identification that employs the Conformer mechanism.
This mechanism fuses self-attention with convolutional networks to capture both local, position-wise features and global, content-based interactions.
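A minimal sketch of a Conformer-style block (self-attention for global context followed by a depthwise 1-D convolution for local features) follows, in PyTorch; the layer sizes and ordering are assumptions, not DefectHunter's implementation.

```python
# Sketch: a Conformer-style block combining self-attention (global, content-based
# interactions) with a depthwise 1-D convolution (local, position-wise features).
import torch
import torch.nn as nn

class ConformerishBlock(nn.Module):
    def __init__(self, dim=128, heads=4, kernel=7):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = nn.Conv1d(dim, dim, kernel, padding=kernel // 2, groups=dim)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):                        # x: (batch, seq_len, dim)
        a, _ = self.attn(x, x, x)
        x = self.norm1(x + a)                    # global context via attention
        c = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return self.norm2(x + c)                 # local features via convolution

tokens = torch.randn(2, 50, 128)                 # e.g. embedded code tokens
print(ConformerishBlock()(tokens).shape)         # torch.Size([2, 50, 128])
```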
arXiv Detail & Related papers (2023-09-27T00:10:29Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
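Below is a much-simplified sketch of the "relaxed target" idea: stop pushing the training loss below a preset target so the model does not overfit its training members. The target value and gating rule are assumptions; this is not RelaxLoss's full algorithm.

```python
# Sketch of a relaxed loss target: once the batch loss falls below alpha,
# stop decreasing it (here: a gentle ascent back toward the target).
import torch
import torch.nn as nn

model = nn.Linear(20, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
alpha = 0.4  # target loss value (assumed)

def relaxed_step(x, y):
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    if loss.item() > alpha:
        loss.backward()        # normal descent toward the target
    else:
        (-loss).backward()     # push the loss back up toward alpha
    opt.step()
    return loss.item()

x, y = torch.randn(64, 20), torch.randint(0, 2, (64,))
print(relaxed_step(x, y))
```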
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence on labeled source data, then predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
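A minimal sketch of ATC as summarized above: choose the threshold so that the fraction of source points above it matches source accuracy, then report the fraction of unlabeled target points above it. Max softmax probability is used here as the confidence score, and the data is synthetic.

```python
# Sketch of ATC: learn a confidence threshold on labeled source data, then
# predict target accuracy as the fraction of unlabeled target points above it.
import numpy as np

def atc_predict_accuracy(source_probs, source_labels, target_probs):
    src_conf = source_probs.max(axis=1)
    src_acc = (source_probs.argmax(axis=1) == source_labels).mean()
    t = np.quantile(src_conf, 1.0 - src_acc)   # P(src_conf > t) ~= source accuracy
    return float((target_probs.max(axis=1) > t).mean())

rng = np.random.default_rng(0)
source_probs = rng.dirichlet(np.ones(3) * 2, size=1000)   # placeholder predictions
source_labels = rng.integers(0, 3, size=1000)
target_probs = rng.dirichlet(np.ones(3), size=1000)       # shifted, unlabeled target
print(atc_predict_accuracy(source_probs, source_labels, target_probs))
```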
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
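A minimal sketch of the ensemble idea (combining per-statement scores from a graph-based and a sequence-based model and ranking statements) follows; the two scorers are stubs and simple averaging stands in for VELVET's learned combination.

```python
# Sketch: ensemble two per-statement vulnerability scorers (graph-based and
# sequence-based) and rank statements by the combined score. Scorers are stubs.
import numpy as np

def graph_model_scores(statements):      # stub for the GNN over the program graph
    return np.array([0.1, 0.7, 0.2, 0.9])

def sequence_model_scores(statements):   # stub for the sequence (token) model
    return np.array([0.2, 0.6, 0.1, 0.8])

statements = ["int n = read();", "char buf[8];", "log(n);", "strcpy(buf, src);"]
combined = 0.5 * graph_model_scores(statements) + 0.5 * sequence_model_scores(statements)
print(statements[int(np.argmax(combined))])   # statement most likely vulnerable
```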
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
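As a generic stand-in for such a defensive mechanism (not necessarily the one proposed in the paper), the sketch below aggregates client updates with a coordinate-wise median instead of a plain average, so a few low-quality updates cannot dominate the global model.

```python
# Sketch: robust federated aggregation via a coordinate-wise median, limiting the
# influence of unreliable clients. A generic defense, not the paper's mechanism.
import numpy as np

def aggregate_median(client_updates):
    """client_updates: list of 1-D parameter vectors, one per client."""
    return np.median(np.stack(client_updates), axis=0)

good = [np.random.normal(0.0, 0.1, size=5) for _ in range(8)]
bad = [np.full(5, 50.0)]                      # one unreliable/poisoned client
print(aggregate_median(good + bad))           # stays close to the honest updates
```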
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
- Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in terms of accuracy, precision, false-alarm rate, and recall.
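A minimal sketch of the general pattern (Bayesian optimization tuning an anomaly detector's hyperparameters) follows, assuming scikit-optimize is available; the detector, search space, and synthetic data are placeholders rather than the paper's setup on ISCX 2012.

```python
# Sketch: Bayesian optimization (scikit-optimize, assumed available) tuning an
# IsolationForest's hyperparameters to maximize F1 on labeled validation data.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score
from skopt import gp_minimize
from skopt.space import Integer, Real

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = (rng.random(2000) < 0.05).astype(int)    # ~5% anomalies (synthetic stand-in)
X[y == 1] += 2.5

def objective(params):
    n_estimators, contamination = params
    iso = IsolationForest(n_estimators=int(n_estimators),
                          contamination=float(contamination),
                          random_state=0).fit(X)
    pred = (iso.predict(X) == -1).astype(int)
    return -f1_score(y, pred)                # minimize the negative F1

space = [Integer(50, 300), Real(0.01, 0.1)]  # assumed search space
result = gp_minimize(objective, space, n_calls=20, random_state=0)
print("best params:", result.x, "best F1:", -result.fun)
```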
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.