Improving the Reliability of Network Intrusion Detection Systems through
Dataset Integration
- URL: http://arxiv.org/abs/2112.02080v1
- Date: Thu, 2 Dec 2021 09:30:18 GMT
- Title: Improving the Reliability of Network Intrusion Detection Systems through
Dataset Integration
- Authors: Roberto Magán-Carrión, Daniel Urda, Ignacio Díaz-Cano, Bernabé Dorronsoro
- Abstract summary: This work presents Reliable-NIDS (R-NIDS), a novel methodology for Machine Learning (ML) based Network Intrusion Detection Systems (NIDSs).
R-NIDS allows ML models to work on integrated datasets, empowering the learning process with diverse information from different datasets.
In this work we propose to build two well-known ML models based on the information of three of the most common datasets in the literature for NIDS evaluation.
- Score: 0.20646127669654826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents Reliable-NIDS (R-NIDS), a novel methodology for Machine
Learning (ML) based Network Intrusion Detection Systems (NIDSs) that allows ML
models to work on integrated datasets, empowering the learning process with
diverse information from different datasets. R-NIDS therefore targets the
design of more robust models that generalize better than traditional
approaches. We also propose a new dataset, called UNK21. It is built from three
of the most well-known network datasets (UGR'16, UNSW-NB15 and NSL-KDD), each
one gathered from its own network environment, with different features and
classes, by using a data aggregation approach present in R-NIDS. Following
R-NIDS, in this work we propose to build two well-known ML models (a linear and
a non-linear one) based on the information of three of the most common datasets
in the literature for NIDS evaluation, those integrated in UNK21. The results
show that these two ML models, trained as a NIDS solution, could benefit from
this approach, generalizing better when trained on the newly proposed UNK21
dataset. Furthermore, these results are carefully analyzed with statistical
tools that provide high confidence in our conclusions.
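The abstract does not spell out how R-NIDS aggregates datasets that have different features and classes. The following is only a minimal sketch of one way such an integration could work, not the authors' method. All dataset names, feature names, label names, and mappings are hypothetical, and filling a missing feature with 0.0 is an assumption made for illustration.

```python
# Hypothetical illustration only: none of these names come from UNK21 or R-NIDS.

# Shared feature schema that every integrated record is projected onto.
COMMON_FEATURES = ["duration", "bytes_sent", "bytes_received"]

# Per-dataset mapping from a native column name to a shared feature name.
FEATURE_MAPS = {
    "dataset_a": {"flow_seconds": "duration", "out_bytes": "bytes_sent",
                  "in_bytes": "bytes_received"},
    "dataset_b": {"length_s": "duration", "tx": "bytes_sent",
                  "rx": "bytes_received"},
}

# Per-dataset mapping from a native class label to a shared class label.
LABEL_MAPS = {
    "dataset_a": {"normal": "background", "flood": "dos"},
    "dataset_b": {"benign": "background", "syn_flood": "dos"},
}


def integrate(datasets):
    """Merge records from several datasets into one list with a shared schema."""
    merged = []
    for name, records in datasets.items():
        feature_map = FEATURE_MAPS[name]
        label_map = LABEL_MAPS[name]
        for record in records:
            # Assumption: a feature missing from a record defaults to 0.0.
            row = {feature: 0.0 for feature in COMMON_FEATURES}
            for native, shared in feature_map.items():
                if native in record:
                    row[shared] = float(record[native])
            # Labels without a mapping fall into a catch-all class.
            row["label"] = label_map.get(record["label"], "other")
            # Keep the origin so models can later be evaluated per dataset.
            row["source"] = name
            merged.append(row)
    return merged


datasets = {
    "dataset_a": [{"flow_seconds": 1.5, "out_bytes": 100, "in_bytes": 40,
                   "label": "flood"}],
    "dataset_b": [{"length_s": 0, "tx": 20, "label": "benign"}],
}
unified = integrate(datasets)
```

Keeping a `source` field on each row makes it possible to train on the merged data and still report results separately for each original dataset, which is one way to check generalization.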
Related papers
- On the Cross-Dataset Generalization of Machine Learning for Network Intrusion Detection [50.38534263407915]
Network Intrusion Detection Systems (NIDS) are a fundamental tool in cybersecurity.
Their ability to generalize across diverse networks is a critical factor in their effectiveness and a prerequisite for real-world applications.
In this study, we conduct a comprehensive analysis of the generalization of machine-learning-based NIDS through extensive experimentation in a cross-dataset framework.
arXiv Detail & Related papers (2024-02-15T14:39:58Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Neural Attentive Circuits [93.95502541529115]
We introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Better Modelling Out-of-Distribution Regression on Distributed Acoustic Sensor Data Using Anchored Hidden State Mixup [0.7455546102930911]
Generalizing the application of machine learning models to situations where the statistical distributions of training and test data differ has been a complex problem.
We introduce an anchor-based Out of Distribution (OOD) Regression Mixup algorithm, leveraging manifold hidden state mixup and observation similarities to form a novel regularization penalty.
We demonstrate with an extensive evaluation the generalization performance of the proposed method against existing approaches, then show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-02-23T03:12:21Z)
- Bridging the gap to real-world for network intrusion detection systems with data-centric approach [1.4699455652461724]
This paper presents a systematic data-centric approach to address the current limitations of NIDS research.
It generates NIDS datasets composed of the most recent network traffic and attacks, with the labeling process integrated by design.
arXiv Detail & Related papers (2021-10-25T04:50:12Z)
- Feature Extraction for Machine Learning-based Intrusion Detection in IoT Networks [6.6147550436077776]
This paper aims to discover whether Feature Reduction (FR) and Machine Learning (ML) techniques can be generalised across various datasets.
The detection accuracy of three Feature Extraction (FE) algorithms, namely Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA), is evaluated.
arXiv Detail & Related papers (2021-08-28T23:52:18Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.