The Cross-evaluation of Machine Learning-based Network Intrusion
Detection Systems
- URL: http://arxiv.org/abs/2203.04686v1
- Date: Wed, 9 Mar 2022 12:59:44 GMT
- Title: The Cross-evaluation of Machine Learning-based Network Intrusion
Detection Systems
- Authors: Giovanni Apruzzese and Luca Pajola and Mauro Conti
- Abstract summary: ML-NIDS must be trained and evaluated, operations requiring data where benign and malicious samples are clearly labelled.
We propose the first framework, XeNIDS, for reliable cross-evaluations based on Network Flows.
By using XeNIDS on six well-known datasets, we demonstrate the concealed potential, but also the risks of cross-evaluations of ML-NIDS.
- Score: 23.652608408269366
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Enhancing Network Intrusion Detection Systems (NIDS) with supervised Machine
Learning (ML) is tough. ML-NIDS must be trained and evaluated, operations
requiring data where benign and malicious samples are clearly labelled. Such
labels demand costly expert knowledge, resulting in a lack of real deployments,
as well as in papers always relying on the same outdated data. The situation
improved recently, as some efforts disclosed their labelled datasets. However,
most past works used such datasets just as 'yet another' testbed, overlooking
the added potential provided by such availability.
In contrast, we promote using such existing labelled data to cross-evaluate
ML-NIDS. Such an approach has received only limited attention and, due to its
complexity, requires a dedicated treatment. We hence propose the first
cross-evaluation model. Our model highlights the broader range of realistic
use-cases that can be assessed via cross-evaluations, allowing the discovery of
still unknown qualities of state-of-the-art ML-NIDS. For instance, their
detection surface can be extended--at no additional labelling cost. However,
conducting such cross-evaluations is challenging. Hence, we propose the first
framework, XeNIDS, for reliable cross-evaluations based on Network Flows. By
using XeNIDS on six well-known datasets, we demonstrate the concealed
potential, but also the risks, of cross-evaluations of ML-NIDS.
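The idea of a cross-evaluation can be illustrated with a toy sketch: train a detector on labelled flows from one network, then test it on malicious flows from another. The code below is not XeNIDS and does not use any of the six datasets; the synthetic two-feature "flows", the nearest-centroid detector, and all function names (`make_flows`, `train`, `predict`, `detection_rate`) are assumptions made purely for illustration.

```python
import random


def make_flows(rng, n, benign_mean, malicious_mean):
    """Return n benign (label 0) and n malicious (label 1) synthetic flows."""
    flows = []
    for _ in range(n):
        flows.append(([rng.gauss(m, 1.0) for m in benign_mean], 0))
        flows.append(([rng.gauss(m, 1.0) for m in malicious_mean], 1))
    return flows


def train(flows):
    """Fit a nearest-centroid detector: one mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in flows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=dist2)


def detection_rate(centroids, flows):
    """Fraction of malicious flows that the detector flags as malicious."""
    malicious = [f for f, label in flows if label == 1]
    return sum(predict(centroids, f) == 1 for f in malicious) / len(malicious)


rng = random.Random(0)
# Two hypothetical networks: similar benign traffic, different attacks.
train_a = make_flows(rng, 500, [0.0, 0.0], [6.0, 6.0])
test_a = make_flows(rng, 500, [0.0, 0.0], [6.0, 6.0])
train_b = make_flows(rng, 500, [0.0, 0.0], [6.0, -6.0])
test_b = make_flows(rng, 500, [0.0, 0.0], [6.0, -6.0])

model_a = train(train_a)
baseline = detection_rate(model_a, test_a)  # train on A, test on A
cross = detection_rate(model_a, test_b)     # train on A, test on B's attacks

# Reuse B's already-labelled malicious flows to extend A's training set.
extra = [(f, label) for f, label in train_b if label == 1]
model_ab = train(train_a + extra)
extended = detection_rate(model_ab, test_b)

print(f"baseline={baseline:.2f} cross={cross:.2f} extended={extended:.2f}")
```

On this synthetic data, the detector trained only on network A should miss nearly all of B's attacks, while adding B's existing labelled malicious flows to the training set should recover them. This mirrors, in a very simplified way, both the risk and the "extended detection surface" that the abstract mentions; real flow datasets differ in features and labelling, which this sketch does not model.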
Related papers
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
- Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Detecting out-of-distribution (OOD) samples is crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLM) to envision potential Outlier Exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-06-02T17:09:48Z)
- Self-Supervised Learning for User Localization [8.529237718266042]
Machine learning techniques have shown remarkable accuracy in localization tasks.
Their dependency on vast amounts of labeled data, particularly Channel State Information (CSI) and corresponding coordinates, remains a bottleneck.
We propose a pioneering approach that leverages self-supervised pretraining on unlabeled data to boost the performance of supervised learning for user localization based on CSI.
arXiv Detail & Related papers (2024-04-19T21:49:10Z)
- Enhancing Trustworthiness in ML-Based Network Intrusion Detection with Uncertainty Quantification [0.0]
Intrusion Detection Systems (IDSs) are security devices designed to identify and mitigate attacks on modern networks.
Data-driven approaches based on Machine Learning (ML) have gained more and more popularity for executing the classification tasks.
However, typical ML models adopted for this purpose do not properly take into account the uncertainty associated with their prediction.
arXiv Detail & Related papers (2023-09-05T13:52:41Z)
- Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability [70.72426887518517]
Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.
We propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
Our method uses a mask to identify the memorized atypical samples, and then fine-tunes or prunes the model with the introduced mask to forget them.
arXiv Detail & Related papers (2023-06-06T14:23:34Z)
- VOS: Learning What You Don't Know by Virtual Outlier Synthesis [23.67449949146439]
Out-of-distribution (OOD) detection has received much attention lately due to its importance in the safe deployment of neural networks.
Previous approaches rely on real outlier datasets for model regularization.
We present VOS, a novel framework for OOD detection by adaptively synthesizing virtual outliers.
arXiv Detail & Related papers (2022-02-02T18:43:01Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Bridging the gap to real-world for network intrusion detection systems with data-centric approach [1.4699455652461724]
This paper presents a systematic data-centric approach to address the current limitations of NIDS research.
It generates NIDS datasets composed of the most recent network traffic and attacks, with the labeling process integrated by design.
arXiv Detail & Related papers (2021-10-25T04:50:12Z)
- Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.