Unsupervised User-Based Insider Threat Detection Using Bayesian Gaussian
Mixture Models
- URL: http://arxiv.org/abs/2211.14437v1
- Date: Wed, 23 Nov 2022 13:45:19 GMT
- Title: Unsupervised User-Based Insider Threat Detection Using Bayesian Gaussian
Mixture Models
- Authors: Simon Bertrand, Nadia Tawbi, Josée Desharnais
- Abstract summary: In this paper, we propose an unsupervised insider threat detection system based on audit data.
The proposed approach leverages a user-based model to better capture each user's specific behavior, together with an automatic feature extraction system based on Word2Vec.
Results indicate that the proposed method competes with state-of-the-art approaches, achieving a recall of 88%, an accuracy and true negative rate of 93%, and a false positive rate of 6.9%.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Insider threats are a growing concern for organizations due to the
amount of damage that their members can inflict by combining their privileged
access and domain knowledge. Nonetheless, detecting such threats is
challenging, precisely because authorized personnel can easily conduct
malicious actions, and because the few malicious footprints are hidden within
the immense volume and diversity of audit data that organizations produce. In
this paper, we propose an unsupervised insider threat detection system based on
audit data using Bayesian Gaussian Mixture Models. The proposed approach
leverages a user-based model to better capture each user's specific behavior,
together with an automatic feature extraction system based on Word2Vec for ease
of use in a real-life scenario. The solution distinguishes itself by requiring
neither data balancing nor training exclusively on normal instances, and by the
little domain knowledge needed to implement it. Still, results indicate that
the proposed method competes with state-of-the-art approaches, achieving a
recall of 88%, an accuracy and true negative rate of 93%, and a false positive
rate of 6.9%. For our experiments, we used the benchmark dataset CERT
version 4.2.
Related papers
- BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning [26.714674251814586]
Federated learning is susceptible to poisoning attacks due to its decentralized nature.
We propose a novel distribution-aware anomaly detection mechanism, BoBa, to address this problem.
arXiv Detail & Related papers (2024-07-12T19:38:42Z)
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods, which design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- Bayesian Learned Models Can Detect Adversarial Malware For Free [28.498994871579985]
Adversarial training is an effective method but is computationally expensive to scale up to large datasets.
In particular, a Bayesian formulation can capture the model parameters' distribution and quantify uncertainty without sacrificing model performance.
We find that quantifying uncertainty through Bayesian learning methods can defend against adversarial malware.
arXiv Detail & Related papers (2024-03-27T07:16:48Z)
- Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
The current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z)
- Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples [1.1852406625172216]
We propose a new framework to estimate model accuracy on unlabeled target data without access to source data.
Our approach measures the disagreement rate between the source hypothesis and the target pseudo-labeling function.
Our proposed source-free framework effectively addresses the challenging distribution shift scenarios and outperforms existing methods requiring source data and labels for training.
arXiv Detail & Related papers (2023-07-19T15:33:11Z)
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries [53.222218035435006]
We use adversarial tools to optimize for queries that are discriminative and diverse.
Our improvements achieve significantly more accurate membership inference than existing methods.
arXiv Detail & Related papers (2022-10-19T17:46:50Z)
- DAD: Data-free Adversarial Defense at Test Time [21.741026088202126]
Deep models are highly susceptible to adversarial attacks.
Privacy has become an important concern, restricting access to only trained models but not the training data.
We propose the completely novel problem of 'test-time adversarial defense in the absence of training data and even their statistics'.
arXiv Detail & Related papers (2022-04-04T15:16:13Z)
- Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning, that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)