Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems
- URL: http://arxiv.org/abs/2407.11105v1
- Date: Mon, 15 Jul 2024 14:30:25 GMT
- Title: Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems
- Authors: Mateus Guimarães Lima, Antony Carvalho, João Gabriel Álvares, Clayton Escouper das Chagas, Ronaldo Ribeiro Goldschmidt
- Abstract summary: Intrusion Detection Systems (IDS) have been continuously improved.
Many of them incorporate machine learning (ML) techniques to identify threats.
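This article presents a study that fills a research gap: the lack of work focused exclusively on how data preprocessing and hyperparameter optimization affect these models.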
- Score: 0.8388591755871736
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In the context of the cybersecurity of modern communication networks, Intrusion Detection Systems (IDS) have been continuously improved, and many of them incorporate machine learning (ML) techniques to identify threats. Although there is research on these techniques applied to IDS, the state of the art lacks work focused exclusively on evaluating the impact of data preprocessing and of hyperparameter optimization of the ML algorithms on the construction of threat identification models. This article presents a study that fills this research gap. To that end, experiments were carried out with two data sets, comparing attack scenarios under different preprocessing techniques and hyperparameter optimization settings. The results confirm that the proper application of these techniques generally makes the resulting classification models more robust and greatly reduces the execution time of their training and testing processes.
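The kind of comparison the abstract describes can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic two-feature data, the min-max scaling step, the k-nearest-neighbours classifier, the candidate values of k, and all function names. It trains the same classifier on raw and on scaled features, tunes k on a validation split in both cases, and reports test accuracy.

```python
import random
from collections import Counter


def make_data(n, rng):
    """Synthetic two-class records (hypothetical, not the paper's data):
    one informative feature on a small scale and one pure-noise feature
    on a much larger scale."""
    rows = []
    for i in range(n):
        label = i % 2
        informative = rng.gauss(3.0 * label, 1.0)
        noise = rng.uniform(0.0, 10000.0)
        rows.append(([informative, noise], label))
    return rows


def fit_min_max(rows):
    """Learn per-feature minimum and range from the training rows only."""
    columns = list(zip(*(x for x, _ in rows)))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return lows, spans


def apply_min_max(rows, lows, spans):
    """Rescale every feature with the statistics learned on the training set."""
    return [([(v - lo) / sp for v, lo, sp in zip(x, lows, spans)], y)
            for x, y in rows]


def knn_predict(train, x, k):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def accuracy(train, test, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def grid_search_k(train, valid, candidates):
    """Hyperparameter optimization by exhaustive search on a validation split."""
    return max(candidates, key=lambda k: accuracy(train, valid, k))


def run_experiment(seed=0):
    rng = random.Random(seed)
    train, valid, test = make_data(200, rng), make_data(100, rng), make_data(100, rng)
    lows, spans = fit_min_max(train)
    scaled = [apply_min_max(part, lows, spans) for part in (train, valid, test)]
    results = {}
    for name, (tr, va, te) in (("raw", (train, valid, test)), ("scaled", scaled)):
        best_k = grid_search_k(tr, va, candidates=(1, 3, 5, 7, 9))
        results[name] = {"best_k": best_k, "accuracy": accuracy(tr, te, best_k)}
    return results


if __name__ == "__main__":
    for name, outcome in run_experiment().items():
        print(name, outcome)
```

Because the noise feature dominates Euclidean distance on the raw data, the scaled variant should score noticeably higher here. That effect is a property of this toy setup and says nothing about the magnitude of the results reported in the paper.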
Related papers
- Extending Network Intrusion Detection with Enhanced Particle Swarm Optimization Techniques [0.0]
The present research investigates how to improve Network Intrusion Detection Systems (NIDS) by combining Machine Learning (ML) and Deep Learning (DL) techniques.
The study uses the CSE-CIC-IDS 2018 and LITNET-2020 datasets to compare ML methods (Decision Trees, Random Forest, XGBoost) and DL models (CNNs, RNNs, DNNs) against key performance metrics.
The Decision Tree model performed better across all measures after being fine-tuned with Enhanced Particle Swarm Optimization (EPSO), demonstrating the model's ability to detect network breaches effectively.
arXiv Detail & Related papers (2024-08-14T17:11:36Z)
- Learning Long-Horizon Predictions for Quadrotor Dynamics [48.08477275522024]
We study the key design choices for efficiently learning long-horizon prediction dynamics for quadrotors.
We show that sequential modeling techniques showcase their advantage in minimizing compounding errors compared to other types of solutions.
We propose a novel decoupled dynamics learning approach, which further simplifies the learning process while also enhancing the approach modularity.
arXiv Detail & Related papers (2024-07-17T19:06:47Z)
- Intrusion Detection System with Machine Learning and Multiple Datasets [0.0]
In this paper, an enhanced intrusion detection system (IDS) that utilizes machine learning (ML) is explored.
Ultimately, this improved system can be used to combat the attacks made by unethical hackers.
arXiv Detail & Related papers (2023-12-04T14:58:19Z)
- Latent Alignment with Deep Set EEG Decoders [44.128689862889715]
We introduce the Latent Alignment method that won the Benchmarks for EEG Transfer Learning competition.
We present its formulation as a deep set applied on the set of trials from a given subject.
Our experimental results show that performing statistical distribution alignment at later stages in a deep learning model is beneficial to the classification accuracy.
arXiv Detail & Related papers (2023-11-29T12:40:45Z)
- Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
The investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Predictive machine learning for prescriptive applications: a coupled training-validating approach [77.34726150561087]
We propose a new method for training predictive machine learning models for prescriptive applications.
This approach is based on tweaking the validation step in the standard training-validating-testing scheme.
Several experiments with synthetic data demonstrate promising results in reducing the prescription costs in both deterministic and real models.
arXiv Detail & Related papers (2021-10-22T15:03:20Z)
- Real-World Anomaly Detection by using Digital Twin Systems and Weakly-Supervised Learning [3.0100975935933567]
We present novel weakly-supervised approaches to anomaly detection for industrial settings.
The approaches make use of a Digital Twin to generate a training dataset which simulates the normal operation of the machinery.
The performance of the proposed methods is compared against various state-of-the-art anomaly detection algorithms on an application to a real-world dataset.
arXiv Detail & Related papers (2020-11-12T10:15:56Z)
- Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection [8.26773636337474]
This paper proposes a novel multi-stage optimized ML-based NIDS framework.
It reduces computational complexity while maintaining its detection performance.
The proposed framework significantly reduces the required training sample size (up to 74%) and feature set size (up to 50%).
arXiv Detail & Related papers (2020-08-09T03:18:00Z)
- Predictive modeling approaches in laser-based material processing [59.04160452043105]
This study aims to automate and forecast the effect of laser processing on material structures.
The focus is centred on the performance of representative statistical and machine learning algorithms.
Results can set the basis for a systematic methodology towards reducing material design, testing and production cost.
arXiv Detail & Related papers (2020-06-13T17:28:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.