Machine learning-based network intrusion detection for big and
imbalanced data using oversampling, stacking feature embedding and feature
extraction
- URL: http://arxiv.org/abs/2401.12262v1
- Date: Mon, 22 Jan 2024 05:49:41 GMT
- Title: Machine learning-based network intrusion detection for big and
imbalanced data using oversampling, stacking feature embedding and feature
extraction
- Authors: Md. Alamin Talukder, Md. Manowarul Islam, Md Ashraf Uddin, Khondokar
Fida Hasan, Selina Sharmin, Salem A. Alyami and Mohammad Ali Moni
- Abstract summary: Intrusion Detection Systems (IDS) play a critical role in protecting interconnected networks by detecting malicious actors and activities.
This paper introduces a novel ML-based network intrusion detection model that uses Random Oversampling (RO) to address data imbalance, and Stacking Feature Embedding based on clustering results together with Principal Component Analysis (PCA) for dimension reduction.
Using the CIC-IDS 2017 dataset, DT, RF, and ET models reach 99.99% accuracy, while DT and RF models obtain 99.94% accuracy on the CIC-IDS 2018 dataset.
- Score: 6.374540518226326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cybersecurity has emerged as a critical global concern. Intrusion Detection
Systems (IDS) play a critical role in protecting interconnected networks by
detecting malicious actors and activities. Machine Learning (ML)-based behavior
analysis within the IDS has considerable potential for detecting dynamic cyber
threats, identifying abnormalities, and identifying malicious conduct within
the network. However, as the volume of data grows, dimension reduction becomes
an increasingly difficult task when training ML models. Addressing this, our
paper introduces a novel ML-based network intrusion detection model that uses
Random Oversampling (RO) to address data imbalance and Stacking Feature
Embedding based on clustering results, as well as Principal Component Analysis
(PCA) for dimension reduction, and is specifically designed for large and
imbalanced datasets. This model's performance is carefully evaluated using
three cutting-edge benchmark datasets: UNSW-NB15, CIC-IDS-2017, and
CIC-IDS-2018. On the UNSW-NB15 dataset, our trials show that the RF and ET
models achieve accuracy rates of 99.59% and 99.95%, respectively. Furthermore,
using the CIC-IDS2017 dataset, DT, RF, and ET models reach 99.99% accuracy,
while DT and RF models obtain 99.94% accuracy on CIC-IDS2018. These performance
results consistently outperform the state of the art, indicating significant
progress in the field of network intrusion detection. This achievement
demonstrates the efficacy of the suggested methodology, which can be used
practically to accurately monitor and identify network traffic intrusions,
thereby blocking possible threats.
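The Random Oversampling (RO) step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_oversample`, the fixed seed, and the toy flow features are all hypothetical, and the sketch covers only class balancing, not the stacking feature embedding or PCA stages. It assumes the common form of the technique, in which minority-class rows are duplicated at random until every class matches the majority count.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equally sized.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, made-up flow features: 6 benign rows and 2 attack rows.
rows = [[0.1, 5], [0.2, 4], [0.1, 6], [0.3, 5],
        [0.2, 5], [0.1, 4], [9.0, 90], [8.5, 80]]
labels = ["benign"] * 6 + ["attack"] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, oversampling of this kind is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.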
Related papers
- Enhanced Convolution Neural Network with Optimized Pooling and Hyperparameter Tuning for Network Intrusion Detection [0.0]
We propose an Enhanced Convolutional Neural Network (EnCNN) for Network Intrusion Detection Systems (NIDS).
We compare EnCNN with various machine learning algorithms, including Logistic Regression, Decision Trees, Support Vector Machines (SVM), and ensemble methods like Random Forest, AdaBoost, and Voting Ensemble.
The results show that EnCNN significantly improves detection accuracy, with a notable 10% increase over state-of-the-art approaches.
arXiv Detail & Related papers (2024-09-27T11:20:20Z)
- Optimizing Intrusion Detection System Performance Through Synergistic Hyperparameter Tuning and Advanced Data Processing [3.3148772440755527]
Intrusion detection is vital for securing computer networks against malicious activities.
To address this issue, we propose a system combining deep learning, data balancing, and high-dimensional reduction.
By training on extensive datasets like CIC IDS 2018 and CIC IDS 2017, our models demonstrate robust performance and generalization.
arXiv Detail & Related papers (2024-08-03T14:09:28Z)
- Strengthening Network Intrusion Detection in IoT Environments with Self-Supervised Learning and Few Shot Learning [1.0678175996321808]
The Internet of Things (IoT) has been introduced as a breakthrough technology that integrates intelligence into everyday objects.
As the IoT networks grow and expand, they become more susceptible to cybersecurity attacks.
This paper introduces a novel intrusion detection approach designed to address these challenges.
arXiv Detail & Related papers (2024-06-04T06:30:22Z)
- Enhancing IoT Security with CNN and LSTM-Based Intrusion Detection Systems [0.23408308015481666]
Our proposed model consists of a combination of convolutional neural network (CNN) and long short-term memory (LSTM) deep learning (DL) models.
This fusion facilitates the detection and classification of IoT traffic into binary categories, benign and malicious activities.
Our proposed model achieves an accuracy rate of 98.42%, accompanied by a minimal loss of 0.0275.
arXiv Detail & Related papers (2024-05-28T22:12:15Z)
- Evaluating ML-Based Anomaly Detection Across Datasets of Varied Integrity: A Case Study [0.0]
We introduce two refined versions of the CICIDS-2017 dataset, processed using NFStream to ensure methodologically sound flow expiration and labeling.
Our research contrasts the performance of the Random Forest (RF) algorithm across the original CICIDS-2017, its refined counterparts WTMC-2021 and CRiSIS-2022, and our NFStream-generated datasets.
We observe that the RF model exhibits exceptional robustness, achieving consistent high-performance metrics irrespective of the underlying dataset quality.
arXiv Detail & Related papers (2024-01-30T09:34:15Z)
- A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks [52.09243852066406]
Adversarial Converging Time Score (ACTS) measures the converging time as an adversarial robustness metric.
We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2023-10-10T09:39:38Z)
- From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks [82.21746840893658]
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
arXiv Detail & Related papers (2022-04-14T15:14:08Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Uncertainty-Aware Deep Calibrated Salient Object Detection [74.58153220370527]
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.
These methods overlook the gap between network accuracy and prediction confidence, known as the confidence uncalibration problem.
We introduce an uncertainty-aware deep SOD network, and propose two strategies to prevent deep SOD networks from being overconfident.
arXiv Detail & Related papers (2020-12-10T23:28:36Z)
- Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing the Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in terms of accuracy rate, precision, low false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.