NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability
- URL: http://arxiv.org/abs/2410.18658v2
- Date: Fri, 25 Oct 2024 10:05:17 GMT
- Title: NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability
- Authors: Anton Raskovalov, Nikita Gabdullin, Ilya Androsov,
- Abstract summary: This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window.
The method requires only eleven features that do not rely on deep packet inspection, are found in most NIDS datasets, and are easily obtained from conventional flow collectors.
The reported training accuracy exceeds 99% for the proposed method with as few as twenty neural network input features.
- Abstract: This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window. The method requires only eleven features which do not rely on deep packet inspection, can be found in most NIDS datasets, and are easily obtained from conventional flow collectors. The time window aggregates information with respect to hosts, facilitating the identification of flow signatures that are missed by other aggregation methods. Several network architectures are studied, and the use of Kolmogorov-Arnold Network (KAN)-inspired trainable activation functions is proposed, which helps to achieve higher accuracy with a simpler network structure. The reported training accuracy exceeds 99% for the proposed method with as few as twenty neural network input features. This work also studies the generalization capability of NIDS, a crucial aspect that has not been adequately addressed in previous studies. The generalization experiments are conducted using the CICIDS2017 dataset and a custom dataset collected as part of this study. It is shown that the performance metrics decline significantly when changing datasets, and that this reduction can be attributed to differences between the signatures of same-type flows in different datasets, which in turn can be attributed to differences between the underlying networks. It is also shown that the generalization accuracy of some neural networks can be very unstable and sensitive to random initialization, and that neural networks with fewer parameters and well-tuned activations are more stable and achieve higher accuracy.
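Below is a minimal sketch of the two ideas described in the abstract: per-host aggregation over a sliding time window and a KAN-inspired trainable activation. The column names, window length, basis functions, and layer sizes are illustrative assumptions, not the authors' published implementation.

```python
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F

def window_aggregate(flows: pd.DataFrame, window: str = "30s") -> pd.DataFrame:
    """Aggregate flow records per source host over a sliding time window.

    Assumes a DatetimeIndex and NetFlow-style columns 'src_ip', 'bytes',
    'packets'; the paper's eleven-feature set is not reproduced here.
    """
    roll = flows.sort_index().groupby("src_ip").rolling(window)
    feats = roll[["bytes", "packets"]].sum()
    feats.columns = [f"{c}_sum" for c in feats.columns]
    feats["flow_count"] = roll["bytes"].count()  # flows seen within the window
    return feats.reset_index()

class TrainableActivation(nn.Module):
    """Element-wise activation with a learnable shape: a SiLU skip path plus
    a per-feature weighted sum of Gaussian bumps, in the spirit of KAN's
    learnable univariate functions (one plausible reading, not necessarily
    the paper's exact parameterisation)."""
    def __init__(self, num_features: int, num_basis: int = 8):
        super().__init__()
        self.register_buffer("centres", torch.linspace(-2.0, 2.0, num_basis))
        self.coeffs = nn.Parameter(torch.zeros(num_features, num_basis))
        self.skip = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features) -> same shape
        bumps = torch.exp(-(x.unsqueeze(-1) - self.centres) ** 2)  # (B, F, K)
        return self.skip * F.silu(x) + (bumps * self.coeffs).sum(dim=-1)

# A small classifier over twenty window features, matching the input size
# mentioned in the abstract; hidden width and depth are guesses.
model = nn.Sequential(
    nn.Linear(20, 32),
    TrainableActivation(32),
    nn.Linear(32, 2),  # benign vs. attack
)
```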
Related papers
- Out-of-Distribution Detection using Neural Activation Prior [15.673290330356194]
Out-of-distribution (OOD) detection is a crucial technique for deploying machine learning models in the real world.
We propose a simple yet effective Neural Activation Prior (NAP) for OOD detection.
Our method achieves state-of-the-art performance on the CIFAR benchmarks and the ImageNet dataset.
arXiv Detail & Related papers (2024-02-28T08:45:07Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of large-kernel convolutional neural network (LKCNN) models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Investigation and rectification of NIDS datasets and standardized feature set derivation for network attack detection with graph neural networks [0.0]
Graph Neural Networks (GNNs) provide an opportunity to analyze network topology along with flow features.
In this paper we inspect different versions of the ToN-IoT dataset and point out inconsistencies in some of them.
We propose a new standardized and compact set of flow features derived solely from NetFlowv5-compatible data (a feature-derivation sketch follows the list below).
arXiv Detail & Related papers (2022-12-26T07:42:25Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Hybridization of Capsule and LSTM Networks for unsupervised anomaly detection on multivariate data [0.0]
This paper introduces a novel NN architecture which hybridises Long Short-Term Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the difficulty of obtaining large volumes of labelled training data (a simplified reconstruction-based sketch follows the list below).
arXiv Detail & Related papers (2022-02-11T10:33:53Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Wireless Localisation in WiFi using Novel Deep Architectures [4.541069830146568]
This paper studies the indoor localisation of WiFi devices based on a commodity chipset and standard channel sounding.
We present a novel shallow neural network (SNN) in which features are extracted from the channel state information corresponding to WiFi subcarriers received on different antennas.
arXiv Detail & Related papers (2020-10-16T22:48:29Z)
- Self-Challenging Improves Cross-Domain Generalization [81.99554996975372]
Convolutional Neural Networks (CNNs) conduct image classification by activating dominant features that correlate with labels.
We introduce a simple training heuristic, Representation Self-Challenging (RSC), that significantly improves the generalization of CNNs to out-of-domain data.
RSC iteratively challenges the dominant features activated on the training data and forces the network to activate the remaining features that correlate with labels (a sketch of the feature-muting step follows the list below).
arXiv Detail & Related papers (2020-07-05T21:42:26Z)
- Ensembled sparse-input hierarchical networks for high-dimensional datasets [8.629912408966145]
We show that dense neural networks can be a practical data analysis tool in settings with small sample sizes.
The proposed method prunes the network structure by tuning only two L1-penalty parameters (a two-penalty training sketch follows the list below).
On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
arXiv Detail & Related papers (2020-05-11T02:08:53Z)
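For the NIDS-dataset rectification entry above, here is a hedged sketch of deriving compact per-flow features from standard NetFlow v5 fields; the helper and feature list are illustrative, and the paper's actual standardized set may differ.

```python
def flow_features(rec: dict) -> dict:
    """Derive simple features from one NetFlow v5 record. `rec` holds the
    standard v5 fields: dPkts, dOctets, First/Last (router uptime, ms),
    prot, srcport, dstport, tcp_flags."""
    duration_ms = max(rec["Last"] - rec["First"], 1)  # guard zero-length flows
    return {
        "duration_s": duration_ms / 1000.0,
        "bytes_per_pkt": rec["dOctets"] / max(rec["dPkts"], 1),
        "pkts_per_s": rec["dPkts"] * 1000.0 / duration_ms,
        "bytes_per_s": rec["dOctets"] * 1000.0 / duration_ms,
        "proto": rec["prot"],          # 6 = TCP, 17 = UDP
        "src_port": rec["srcport"],
        "dst_port": rec["dstport"],
        "tcp_flags": rec["tcp_flags"],
    }
```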
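For the Capsule/LSTM hybrid entry above, the following is a deliberately simpler LSTM autoencoder that illustrates only the unsupervised, reconstruction-error approach to anomaly detection on multivariate sequences; it is not the authors' hybrid architecture.

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    """Encode a multivariate window into a single state, then reconstruct it;
    a large reconstruction error flags an anomaly."""
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.enc = nn.LSTM(n_features, hidden, batch_first=True)
        self.dec = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features)
        _, (h, _) = self.enc(x)                          # summary state (1, B, H)
        z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)   # repeat along time axis
        dec_out, _ = self.dec(z)
        return self.out(dec_out)

# Anomaly score per window, assuming training data is mostly benign:
# score = ((model(x) - x) ** 2).mean(dim=(1, 2))
```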
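For the RSC entry above, a hedged sketch of the feature-muting step: the feature dimensions to which the ground-truth class score is most sensitive are zeroed, forcing the classifier to rely on the remaining ones. Batch and percentile handling are simplified relative to the paper.

```python
import torch

def rsc_forward(backbone, head, x, y, drop_pct=0.33):
    """One RSC-style training forward pass over pooled features of shape (B, D)."""
    feats = backbone(x)
    score = head(feats).gather(1, y.unsqueeze(1)).sum()  # true-class scores
    grads = torch.autograd.grad(score, feats, retain_graph=True)[0]
    k = max(1, int(feats.size(1) * drop_pct))
    cutoff = grads.topk(k, dim=1).values[:, -1:]         # per-sample threshold
    mask = (grads < cutoff).float()                      # mute the dominant dims
    return head(feats * mask)                            # grads still reach backbone

# Training step: loss = F.cross_entropy(rsc_forward(net, fc, x, y), y)
```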
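For the EASIER-net entry above, a hedged sketch of a loss with the two L1 knobs: one penalty drives input-feature sparsity, the other prunes the remaining structure. The actual method's hierarchical architecture and ensembling are omitted.

```python
import torch.nn as nn
import torch.nn.functional as F

def two_penalty_loss(model: nn.Sequential, logits, y, lam_in=1e-3, lam_hidden=1e-4):
    """Cross-entropy plus two L1 penalties; assumes an MLP built as
    nn.Sequential so the linear layers can be found by iteration."""
    linears = [m for m in model if isinstance(m, nn.Linear)]
    pen_in = linears[0].weight.abs().sum()                       # input sparsity
    pen_hidden = sum(l.weight.abs().sum() for l in linears[1:])  # structure pruning
    return F.cross_entropy(logits, y) + lam_in * pen_in + lam_hidden * pen_hidden
```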
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.