Network Security Modelling with Distributional Data
- URL: http://arxiv.org/abs/2211.13419v1
- Date: Thu, 24 Nov 2022 05:18:17 GMT
- Title: Network Security Modelling with Distributional Data
- Authors: Subhabrata Majumdar, Ganesh Subramaniam
- Abstract summary: We investigate the detection of botnet command and control (C2) hosts in massive IP traffic using machine learning methods.
We use NetFlow data -- the industry standard for monitoring of IP traffic -- and ML models using two sets of features.
We use quantiles of the IP-level distributions of NetFlow features as input features in predictive models to predict whether an IP belongs to known botnet families.
- Score: 4.133655523622441
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the detection of botnet command and control (C2) hosts in
massive IP traffic using machine learning methods. To this end, we use NetFlow
data -- the industry standard for monitoring of IP traffic -- and ML models
using two sets of features: conventional NetFlow variables and distributional
features based on NetFlow variables. In addition to using static summaries of
NetFlow features, we use quantiles of their IP-level distributions as input
features in predictive models to predict whether an IP belongs to known botnet
families. These models are used to develop intrusion detection systems to
predict traffic traces identified with malicious attacks. The results are
validated by matching predictions to existing denylists of published malicious
IP addresses and deep packet inspection. The usage of our proposed novel
distributional features, combined with techniques that enable modelling complex
input feature spaces, results in highly accurate predictions by our trained
models.
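The distributional features described in the abstract can be illustrated with a minimal sketch. It assumes flow records are dictionaries with hypothetical `src_ip` and `bytes` fields and uses an illustrative set of quantile levels; the abstract does not specify the paper's actual variables, quantile levels, or pipeline, so this is only one plausible reading of "quantiles of IP-level distributions".

```python
from collections import defaultdict


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def ip_quantile_features(flows, variable, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Map each source IP to the quantiles of one flow variable.

    `flows` is an iterable of dicts; the keys "src_ip" and `variable`
    are hypothetical field names chosen for this sketch.
    """
    per_ip = defaultdict(list)
    for flow in flows:
        per_ip[flow["src_ip"]].append(flow[variable])
    return {
        ip: [quantile(sorted(values), q) for q in probs]
        for ip, values in per_ip.items()
    }


# Toy records standing in for NetFlow data (made-up values).
flows = [
    {"src_ip": "10.0.0.1", "bytes": 100},
    {"src_ip": "10.0.0.1", "bytes": 300},
    {"src_ip": "10.0.0.1", "bytes": 500},
    {"src_ip": "10.0.0.2", "bytes": 40},
]
features = ip_quantile_features(flows, "bytes", probs=(0.0, 0.5, 1.0))
```

Each IP then has a fixed-length vector (one entry per quantile level) that could be fed to a classifier alongside static summaries such as means or counts.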
Related papers
- Distributional GFlowNets with Quantile Flows [73.73721901056662]
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a policy for generating complex structure through a series of decision-making steps.
In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training.
Our proposed quantile matching GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty.
arXiv Detail & Related papers (2023-02-11T22:06:17Z)
- Leveraging a Probabilistic PCA Model to Understand the Multivariate Statistical Network Monitoring Framework for Network Security Anomaly Detection [64.1680666036655]
We revisit anomaly detection techniques based on PCA from a probabilistic generative model point of view.
We have evaluated the mathematical model using two different datasets.
arXiv Detail & Related papers (2023-02-02T13:41:18Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- ENCODE: Encoding NetFlows for Network Anomaly Detection [17.94733537757708]
Many works have used machine learning to detect network attacks using NetFlow data.
We propose an encoding algorithm that takes the frequency and context of the feature values into account.
We train several machine learning models for anomaly detection using the data encoded with our algorithm.
arXiv Detail & Related papers (2022-07-08T13:25:06Z)
- TFDPM: Attack detection for cyber-physical systems with diffusion probabilistic models [10.389972581904999]
We propose TFDPM, a general framework for attack detection tasks in CPSs.
It simultaneously extracts temporal pattern and feature pattern given the historical data.
The noise scheduling network increases the detection speed by three times.
arXiv Detail & Related papers (2021-12-20T13:13:29Z)
- SOME/IP Intrusion Detection using Deep Learning-based Sequential Models in Automotive Ethernet Networks [2.3204135551124407]
Intrusion Detection Systems are widely used to detect cyberattacks.
We present a deep learning-based sequential model for offline intrusion detection on SOME/IP protocol.
arXiv Detail & Related papers (2021-08-04T09:58:06Z)
- An Explainable Machine Learning-based Network Intrusion Detection System for Enabling Generalisability in Securing IoT Networks [0.0]
Machine Learning (ML)-based network intrusion detection systems bring many benefits for enhancing the security posture of an organisation.
Many systems have been designed and developed in the research community, often achieving a perfect detection rate when evaluated using certain datasets.
This paper tightens the gap by evaluating the generalisability of a common feature set to different network environments and attack types.
arXiv Detail & Related papers (2021-04-15T00:44:45Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Superiority of Simplicity: A Lightweight Model for Network Device Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations.
It consists of a heterogeneous ensemble method composed of two models - a neural network and a mean predictor.
It achieves an overall $R^2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
- File Classification Based on Spiking Neural Networks [0.5065947993017157]
We propose a system for file classification in large data sets based on spiking neural networks (SNNs).
The proposed system may represent a valid alternative to classical machine learning algorithms for inference tasks.
arXiv Detail & Related papers (2020-04-08T11:50:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.