Related papers: Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction

Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction

URL: http://arxiv.org/abs/2401.12262v1
Date: Mon, 22 Jan 2024 05:49:41 GMT
Title: Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction
Authors: Md. Alamin Talukder, Md. Manowarul Islam, Md Ashraf Uddin, Khondokar Fida Hasan, Selina Sharmin, Salem A. Alyami and Mohammad Ali Moni
Abstract summary: Intrusion Detection Systems (IDS) play a critical role in protecting interconnected networks by detecting malicious actors and activities. This paper introduces a novel ML-based network intrusion detection model that uses Random Oversampling (RO) to address data imbalance and Stacking Feature Embedding (PCA) for dimension reduction. Using the CIC-IDS 2017 dataset, DT, RF, and ET models reach 99.99% accuracy, while DT and RF models obtain 99.94% accuracy on CIC-IDS 2018 dataset.
Score: 6.374540518226326
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Cybersecurity has emerged as a critical global concern. Intrusion Detection Systems (IDS) play a critical role in protecting interconnected networks by detecting malicious actors and activities. Machine Learning (ML)-based behavior analysis within the IDS has considerable potential for detecting dynamic cyber threats, identifying abnormalities, and identifying malicious conduct within the network. However, as the number of data grows, dimension reduction becomes an increasingly difficult task when training ML models. Addressing this, our paper introduces a novel ML-based network intrusion detection model that uses Random Oversampling (RO) to address data imbalance and Stacking Feature Embedding based on clustering results, as well as Principal Component Analysis (PCA) for dimension reduction and is specifically designed for large and imbalanced datasets. This model's performance is carefully evaluated using three cutting-edge benchmark datasets: UNSW-NB15, CIC-IDS-2017, and CIC-IDS-2018. On the UNSW-NB15 dataset, our trials show that the RF and ET models achieve accuracy rates of 99.59% and 99.95%, respectively. Furthermore, using the CIC-IDS2017 dataset, DT, RF, and ET models reach 99.99% accuracy, while DT and RF models obtain 99.94% accuracy on CIC-IDS2018. These performance results continuously outperform the state-of-art, indicating significant progress in the field of network intrusion detection. This achievement demonstrates the efficacy of the suggested methodology, which can be used practically to accurately monitor and identify network traffic intrusions, thereby blocking possible threats.

Related papers

Network Anomaly Detection for IoT Using Hyperdimensional Computing on NSL-KDD [0.2399911126932527]
This paper presents a novel approach to network anomaly detection using hyperdimensional computing (HDC) techniques. The proposed method leverages the efficiency of HDC in processing large-scale data to identify both known and unknown attack patterns. The model achieved an accuracy of 91.55% on the KDDTrain+ subset, outperforming traditional approaches.
arXiv Detail & Related papers (2025-03-04T22:19:26Z)
Enhancing Internet of Things Security throughSelf-Supervised Graph Neural Networks [1.0678175996321808]
New types of attacks often have significantly fewer samples than more common attacks, leading to unbalanced datasets. We suggest a new approach to IoT intrusion detection using Self-Supervised Learning (SSL) with a Markov Graph Convolutional Network (MarkovGCN) Our approach leverages the inherent structure of IoT networks to pre-train a GCN, which is then fine-tuned for the intrusion detection task.
arXiv Detail & Related papers (2024-12-17T17:40:14Z)
Enhanced Convolution Neural Network with Optimized Pooling and Hyperparameter Tuning for Network Intrusion Detection [0.0]
We propose an Enhanced Convolutional Neural Network (EnCNN) for Network Intrusion Detection Systems (NIDS) We compare EnCNN with various machine learning algorithms, including Logistic Regression, Decision Trees, Support Vector Machines (SVM), and ensemble methods like Random Forest, AdaBoost, and Voting Ensemble. The results show that EnCNN significantly improves detection accuracy, with a notable 10% increase over state-of-art approaches.
arXiv Detail & Related papers (2024-09-27T11:20:20Z)
Optimizing Intrusion Detection System Performance Through Synergistic Hyperparameter Tuning and Advanced Data Processing [3.3148772440755527]
Intrusion detection is vital for securing computer networks against malicious activities. To address this issue, we propose a system combining deep learning, data balancing, and high-dimensional reduction. By training on extensive datasets like CIC IDS 2018 and CIC IDS 2017, our models demonstrate robust performance and generalization.
arXiv Detail & Related papers (2024-08-03T14:09:28Z)
Strengthening Network Intrusion Detection in IoT Environments with Self-Supervised Learning and Few Shot Learning [1.0678175996321808]
The Internet of Things (IoT) has been introduced as a breakthrough technology that integrates intelligence into everyday objects. As the IoT networks grow and expand, they become more susceptible to cybersecurity attacks. This paper introduces a novel intrusion detection approach designed to address these challenges.
arXiv Detail & Related papers (2024-06-04T06:30:22Z)
Enhancing IoT Security with CNN and LSTM-Based Intrusion Detection Systems [0.23408308015481666]
Our proposed model consists on a combination of convolutional neural network (CNN) and long short-term memory (LSTM) deep learning (DL) models. This fusion facilitates the detection and classification of IoT traffic into binary categories, benign and malicious activities. Our proposed model achieves an accuracy rate of 98.42%, accompanied by a minimal loss of 0.0275.
arXiv Detail & Related papers (2024-05-28T22:12:15Z)
Evaluating ML-Based Anomaly Detection Across Datasets of Varied Integrity: A Case Study [0.0]
We introduce two refined versions of the CICIDS-2017 dataset, processed using NFStream to ensure methodologically sound flow expiration and labeling. Our research contrasts the performance of the Random Forest (RF) algorithm across the original CICIDS-2017, its refined counterparts WTMC-2021 and CRiSIS-2022, and our NFStream-generated datasets. We observe that the RF model exhibits exceptional robustness, achieving consistent high-performance metrics irrespective of the underlying dataset quality.
arXiv Detail & Related papers (2024-01-30T09:34:15Z)
A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks [52.09243852066406]
Adversarial Converging Time Score (ACTS) measures the converging time as an adversarial robustness metric. We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2023-10-10T09:39:38Z)
From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks [82.21746840893658]
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
arXiv Detail & Related papers (2022-04-14T15:14:08Z)
Robust Self-Ensembling Network for Hyperspectral Image Classification [38.84831094095329]
We propose a robust self-ensembling network (RSEN) to address this problem. The proposed RSEN consists of twoworks including a base network and an ensemble network. We show that the proposed algorithm can yield competitive performance compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-08T13:33:14Z)
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks. Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair. A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
Uncertainty-Aware Deep Calibrated Salient Object Detection [74.58153220370527]
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy. These methods overlook the gap between network accuracy and prediction confidence, known as the confidence uncalibration problem. We introduce an uncertaintyaware deep SOD network, and propose two strategies to prevent deep SOD networks from being overconfident.
arXiv Detail & Related papers (2020-12-10T23:28:36Z)
Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique. The performance of the considered algorithms is evaluated using the ISCX 2012 dataset. Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass. We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.