Feature Extraction for Machine Learning-based Intrusion Detection in IoT
Networks
- URL: http://arxiv.org/abs/2108.12722v1
- Date: Sat, 28 Aug 2021 23:52:18 GMT
- Title: Feature Extraction for Machine Learning-based Intrusion Detection in IoT
Networks
- Authors: Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, Marcus Gallagher,
Marius Portmann
- Abstract summary: This paper aims to discover whether Feature Reduction (FR) and Machine Learning (ML) techniques can be generalised across various datasets.
The detection accuracy of three Feature Extraction (FE) algorithms, Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA), is evaluated.
- Score: 6.6147550436077776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The tremendous numbers of network security breaches that have occurred in IoT
networks have demonstrated the unreliability of current Network Intrusion
Detection Systems (NIDSs). Consequently, network interruptions and loss of
sensitive data have occurred, which has led to an active research area for improving
NIDS technologies. During an analysis of related works, it was observed that
most researchers aimed to obtain better classification results by using a set
of untried combinations of Feature Reduction (FR) and Machine Learning (ML)
techniques on NIDS datasets. However, these datasets are different in feature
sets, attack types, and network design. Therefore, this paper aims to discover
whether these techniques can be generalised across various datasets. Six ML
models are utilised: a Deep Feed Forward, Convolutional Neural Network,
Recurrent Neural Network, Decision Tree, Logistic Regression, and Naive Bayes.
The detection accuracy of three Feature Extraction (FE) algorithms, Principal
Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis
(LDA), is evaluated using three benchmark datasets: UNSW-NB15, ToN-IoT and
CSE-CIC-IDS2018. Although PCA and AE algorithms have been widely used,
determining their optimal number of extracted dimensions has been overlooked.
The results obtained indicate that there is no clear FE method or ML model that
can achieve the best scores for all datasets. The optimal number of extracted
dimensions has been identified for each dataset and LDA decreases the
performance of the ML models on two datasets. The variance is used to analyse
the extracted dimensions of LDA and PCA. Finally, this paper concludes that the
choice of datasets significantly alters the performance of the applied
techniques and we argue for the need for a universal (benchmark) feature set to
facilitate further advancement and progress in this field of research.
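The abstract notes that the variance is used to analyse the extracted dimensions of PCA. As a rough illustration only, the following Python sketch computes the share of variance captured by each principal component and reports how many dimensions are needed to reach a chosen cumulative share. It assumes NumPy and uses synthetic data in place of the UNSW-NB15, ToN-IoT and CSE-CIC-IDS2018 datasets; the function names `pca_explained_variance` and `dims_for_variance` and the 0.95 threshold are hypothetical choices, not the authors' procedure.

```python
import numpy as np


def pca_explained_variance(X):
    """Return the fraction of total variance captured by each principal
    component, ordered from largest to smallest."""
    X_centred = X - X.mean(axis=0)
    # Singular values of the centred data are returned in descending order;
    # squaring them gives values proportional to the component variances.
    singular_values = np.linalg.svd(X_centred, compute_uv=False)
    variances = singular_values ** 2 / (X.shape[0] - 1)
    return variances / variances.sum()


def dims_for_variance(ratios, threshold=0.95):
    """Smallest number of leading components whose cumulative variance
    share reaches `threshold` (hypothetical selection rule)."""
    cumulative = np.cumsum(ratios)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ratios))


# Synthetic stand-in for flow features: 10 observed features driven by
# 3 latent factors plus a little noise (an assumption for illustration).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 10))

ratios = pca_explained_variance(X)
k = dims_for_variance(ratios, threshold=0.95)
print("explained variance ratios:", np.round(ratios, 4))
print("dimensions needed for 95% of the variance:", k)
```

A cumulative-variance threshold is only one heuristic. The abstract reports an optimal number of dimensions per dataset without stating how it was chosen, so the threshold rule here should be read as a stand-in rather than a reproduction.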
Related papers
- Efficient Network Traffic Feature Sets for IoT Intrusion Detection [0.0]
This work evaluates the feature sets provided by a combination of different feature selection methods, namely Information Gain, Chi-Squared Test, Recursive Feature Elimination, Mean Absolute Deviation, and Dispersion Ratio, in multiple IoT network datasets.
The influence of the smaller feature sets on both the classification performance and the training time of ML models is compared, with the aim of increasing the computational efficiency of IoT intrusion detection.
arXiv Detail & Related papers (2024-06-12T09:51:29Z)
- On the Cross-Dataset Generalization of Machine Learning for Network
Intrusion Detection [50.38534263407915]
Network Intrusion Detection Systems (NIDS) are a fundamental tool in cybersecurity.
Their ability to generalize across diverse networks is a critical factor in their effectiveness and a prerequisite for real-world applications.
In this study, we conduct a comprehensive analysis on the generalization of machine-learning-based NIDS through an extensive experimentation in a cross-dataset framework.
arXiv Detail & Related papers (2024-02-15T14:39:58Z)
- Machine learning-based network intrusion detection for big and
imbalanced data using oversampling, stacking feature embedding and feature
extraction [6.374540518226326]
Intrusion Detection Systems (IDS) play a critical role in protecting interconnected networks by detecting malicious actors and activities.
This paper introduces a novel ML-based network intrusion detection model that uses Random Oversampling (RO) to address data imbalance and Stacking Feature Embedding together with Principal Component Analysis (PCA) for dimension reduction.
Using the CIC-IDS 2017 dataset, DT, RF, and ET models reach 99.99% accuracy, while DT and RF models obtain 99.94% accuracy on CIC-IDS 2018 dataset.
arXiv Detail & Related papers (2024-01-22T05:49:41Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Systematic Evaluation of Deep Learning Models for Log-based Failure Prediction [3.3810628880631226]
This paper systematically investigates the combination of log data embedding strategies and Deep Learning (DL) types for failure prediction.
To that end, we propose a modular architecture to accommodate various configurations of embedding strategies and DL-based encoders.
Using the F1 score metric, our results show that the best overall performing configuration is a CNN-based encoder with Logkey2vec.
arXiv Detail & Related papers (2023-03-13T16:04:14Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution
Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Feature Analysis for ML-based IIoT Intrusion Detection [0.0]
Powerful Machine Learning models have been adopted to implement Network Intrusion Detection Systems (NIDSs).
It is important to select the right set of data features, which maximise the detection accuracy as well as computational efficiency.
This paper provides an extensive analysis of the optimal feature sets in terms of the importance and predictive power of network attacks.
arXiv Detail & Related papers (2021-08-29T02:19:37Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- An Explainable Machine Learning-based Network Intrusion Detection System
for Enabling Generalisability in Securing IoT Networks [0.0]
Machine Learning (ML)-based network intrusion detection systems bring many benefits for enhancing the security posture of an organisation.
Many systems have been designed and developed in the research community, often achieving a perfect detection rate when evaluated using certain datasets.
This paper narrows the gap by evaluating the generalisability of a common feature set to different network environments and attack types.
arXiv Detail & Related papers (2021-04-15T00:44:45Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed
Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural networks (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delay.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
- Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing [71.86955275376604]
We propose an adaptive anomaly detection approach for hierarchical edge computing (HEC) systems to solve this problem.
We design an adaptive scheme to select one of the models based on the contextual information extracted from input data, to perform anomaly detection.
We evaluate our proposed approach using a real IoT dataset, and demonstrate that it reduces detection delay by 84% while maintaining almost the same accuracy as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-01-10T05:29:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.