Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems
- URL: http://arxiv.org/abs/2407.11105v1
- Date: Mon, 15 Jul 2024 14:30:25 GMT
- Title: Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems
- Authors: Mateus Guimarães Lima, Antony Carvalho, João Gabriel Álvares, Clayton Escouper das Chagas, Ronaldo Ribeiro Goldschmidt,
- Abstract summary: Intrusion Detection Systems (IDS) have been continuously improved.
Many of them incorporate machine learning (ML) techniques to identify threats.
This article aims to present a study that fills this research gap.
- Score: 0.8388591755871736
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In the context of cybersecurity of modern communications networks, Intrusion Detection Systems (IDS) have been continuously improved, many of them incorporating machine learning (ML) techniques to identify threats. Although there are researches focused on the study of these techniques applied to IDS, the state-of-the-art lacks works concentrated exclusively on the evaluation of the impacts of data pre-processing actions and the optimization of the values of the hyperparameters of the ML algorithms in the construction of the models of threat identification. This article aims to present a study that fills this research gap. For that, experiments were carried out with two data sets, comparing attack scenarios with variations of pre-processing techniques and optimization of hyperparameters. The results confirm that the proper application of these techniques, in general, makes the generated classification models more robust and greatly reduces the execution times of these models' training and testing processes.
Related papers
- Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection [0.5475886285082937]
This study conducts a thorough examination of malware detection using machine learning techniques.
The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively.
arXiv Detail & Related papers (2024-03-04T17:22:43Z) - Intrusion Detection System with Machine Learning and Multiple Datasets [0.0]
In this paper, an enhanced intrusion detection system (IDS) that utilizes machine learning (ML) is explored.
Ultimately, this improved system can be used to combat the attacks made by unethical hackers.
arXiv Detail & Related papers (2023-12-04T14:58:19Z) - Robustness and Generalization Performance of Deep Learning Models on
Cyber-Physical Systems: A Comparative Study [71.84852429039881]
Investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Predictive machine learning for prescriptive applications: a coupled
training-validating approach [77.34726150561087]
We propose a new method for training predictive machine learning models for prescriptive applications.
This approach is based on tweaking the validation step in the standard training-validating-testing scheme.
Several experiments with synthetic data demonstrate promising results in reducing the prescription costs in both deterministic and real models.
arXiv Detail & Related papers (2021-10-22T15:03:20Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from
Observation [57.358212277226315]
In imitation learning from observation IfO, a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Using Data Assimilation to Train a Hybrid Forecast System that Combines
Machine-Learning and Knowledge-Based Components [52.77024349608834]
We consider the problem of data-assisted forecasting of chaotic dynamical systems when the available data is noisy partial measurements.
We show that by using partial measurements of the state of the dynamical system, we can train a machine learning model to improve predictions made by an imperfect knowledge-based model.
arXiv Detail & Related papers (2021-02-15T19:56:48Z) - Real-World Anomaly Detection by using Digital Twin Systems and
Weakly-Supervised Learning [3.0100975935933567]
We present novel weakly-supervised approaches to anomaly detection for industrial settings.
The approaches make use of a Digital Twin to generate a training dataset which simulates the normal operation of the machinery.
The performance of the proposed methods is compared against various state-of-the-art anomaly detection algorithms on an application to a real-world dataset.
arXiv Detail & Related papers (2020-11-12T10:15:56Z) - Multi-Stage Optimized Machine Learning Framework for Network Intrusion
Detection [8.26773636337474]
This paper proposes a novel multi-stage optimized ML-based NIDS framework.
It reduces computational complexity while maintaining its detection performance.
The proposed framework significantly reduces the required training sample size (up to 74%) and feature set size (up to 50%)
arXiv Detail & Related papers (2020-08-09T03:18:00Z) - Predictive modeling approaches in laser-based material processing [59.04160452043105]
This study aims to automate and forecast the effect of laser processing on material structures.
The focus is centred on the performance of representative statistical and machine learning algorithms.
Results can set the basis for a systematic methodology towards reducing material design, testing and production cost.
arXiv Detail & Related papers (2020-06-13T17:28:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.