Related papers: HTTP2vec: Embedding of HTTP Requests for Detection of Anomalous Traffic

HTTP2vec: Embedding of HTTP Requests for Detection of Anomalous Traffic

URL: http://arxiv.org/abs/2108.01763v1
Date: Tue, 3 Aug 2021 21:53:31 GMT
Title: HTTP2vec: Embedding of HTTP Requests for Detection of Anomalous Traffic
Authors: Mateusz Gniewkowski, Henryk Maciejewski, Tomasz R. Surmacz, Wiktor Walentynowicz
Abstract summary: We propose an unsupervised language representation model for embedding HTTP requests and then using it to classify anomalies in the traffic. The solution is motivated by methods used in Natural Language Processing (NLP) such as Doc2Vec. To verify how the solution would work in real word conditions, we train the model using only legitimate traffic.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hypertext transfer protocol (HTTP) is one of the most widely used protocols on the Internet. As a consequence, most attacks (i.e., SQL injection, XSS) use HTTP as the transport mechanism. Therefore, it is crucial to develop an intelligent solution that would allow to effectively detect and filter out anomalies in HTTP traffic. Currently, most of the anomaly detection systems are either rule-based or trained using manually selected features. We propose utilizing modern unsupervised language representation model for embedding HTTP requests and then using it to classify anomalies in the traffic. The solution is motivated by methods used in Natural Language Processing (NLP) such as Doc2Vec which could potentially capture the true understanding of HTTP messages, and therefore improve the efficiency of Intrusion Detection System. In our work, we not only aim at generating a suitable embedding space, but also at the interpretability of the proposed model. We decided to use the current state-of-the-art RoBERTa, which, as far as we know, has never been used in a similar problem. To verify how the solution would work in real word conditions, we train the model using only legitimate traffic. We also try to explain the results based on clusters that occur in the vectorized requests space and a simple logistic regression classifier. We compared our approach with the similar, previously proposed methods. We evaluate the feasibility of our method on three different datasets: CSIC2010, CSE-CIC-IDS2018 and one that we prepared ourselves. The results we show are comparable to others or better, and most importantly - interpretable.

Related papers

Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs. Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction. We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
Detecting unknown HTTP-based malicious communication behavior via generated adversarial flows and hierarchical traffic features [6.418271335117575]
Experienced adversaries often hide malicious information in HTTP traffic to evade detection. We propose an HTTP-based Malicious Communication traffic Detection Model based on generated adversarial flows and hierarchical traffic features.
arXiv Detail & Related papers (2023-09-07T14:28:31Z)
Classification and Explanation of Distributed Denial-of-Service (DDoS) Attack Detection using Machine Learning and Shapley Additive Explanation (SHAP) Methods [4.899818550820576]
Distinguishing between legitimate traffic and malicious traffic is a challenging task. An inter-model explanation implemented to classify a traffic flow whether is benign or malicious is an important investigation of the inner working theory of the model. We propose a framework that can not only classify legitimate traffic and malicious traffic of DDoS attacks but also use SHAP to explain the decision-making of the model.
arXiv Detail & Related papers (2023-06-27T04:51:29Z)
Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models. We show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology. While ML tree-based cannot handle large tasks but fits well small tasks, by means of reusing learned representations, DL methods are reaching tree-based models performance also for small tasks.
arXiv Detail & Related papers (2023-05-21T11:20:49Z)
Utilizing Background Knowledge for Robust Reasoning over Traffic Situations [63.45021731775964]
We focus on a complementary research aspect of Intelligent Transportation: traffic understanding. We scope our study to text-based methods and datasets given the abundant commonsense knowledge. We adopt three knowledge-driven approaches for zero-shot QA over traffic situations.
arXiv Detail & Related papers (2022-12-04T09:17:24Z)
Instance Attack:An Explanation-based Vulnerability Analysis Framework Against DNNs for Malware Detection [0.0]
We propose the notion of the instance-based attack. Our scheme is interpretable and can work in a black-box environment. Our method operates in black-box settings and the results can be validated with domain knowledge.
arXiv Detail & Related papers (2022-09-06T12:41:20Z)
Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection. Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior. Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z)
DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL) Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services. Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary. We develop A2Log, which is an unsupervised anomaly detection method consisting of two steps: Anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z)
The Devil Is in the Details: An Efficient Convolutional Neural Network for Transport Mode Detection [3.008051369744002]
Transport mode detection is a classification problem aiming to design an algorithm that can infer the transport mode of a user given multimodal signals. We show that a small, optimized model can perform as well as a current deep model.
arXiv Detail & Related papers (2021-09-16T08:05:47Z)
DoS and DDoS Mitigation Using Variational Autoencoders [15.23225419183423]
We explore the potential of Variational Autoencoders to serve as a component within an intelligent security solution. Two methods based on the ability of Variational Autoencoders to learn latent representations from network traffic flows are proposed.
arXiv Detail & Related papers (2021-05-14T15:38:40Z)
Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural networks (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delay. We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems. We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.