Related papers: Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline

URL: http://arxiv.org/abs/2010.11627v2
Date: Sat, 7 Nov 2020 11:51:07 GMT
Title: Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline
Authors: Syed Muhammad Kumail Raza and Juan Caballero
Abstract summary: We propose an automated packet data-labeling pipeline to generate ground-truth data. We explore and test different kinds of clustering approaches which make use of a unique and diverse set of features extracted from observable flow meta-data.
Score: 8.779666771357029
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Identifying threats in an encrypted network traffic flow is uniquely challenging. On the one hand, it is extremely difficult to simply decrypt the traffic due to modern encryption algorithms. On the other hand, passing such an encrypted stream through pattern-matching algorithms is useless because encryption ensures there are no patterns to match. Moreover, evaluating such models is also difficult due to the lack of labeled benign and malware datasets. Other approaches have tried to tackle this problem by employing observable meta-data gathered from the flow. We try to augment this approach by extending it to a semi-supervised malware classification pipeline using this observable meta-data. To this end, we explore and test different kinds of clustering approaches which make use of a unique and diverse set of features extracted from this observable meta-data. We also propose an automated packet data-labeling pipeline to generate ground-truth data which can serve as a baseline to evaluate the classifiers mentioned above in particular, or any other detection model in general.
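The semi-supervised idea in the abstract (cluster flows by observable meta-data, then let a few ground-truth labels name each cluster) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two scaled features per flow, the toy data, the use of k-means with farthest-point initialization, and the function names. The abstract does not say which clustering algorithms or features the authors actually use.

```python
from collections import Counter
from math import dist


def kmeans(points, k, max_iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far (illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    assign = None
    for _ in range(max_iters):
        new_assign = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assign


def propagate_labels(assign, known):
    """Give every flow the majority ground-truth label of its cluster."""
    votes = {}
    for idx, label in known.items():
        votes.setdefault(assign[idx], Counter())[label] += 1
    return [
        votes[a].most_common(1)[0][0] if a in votes else "unknown"
        for a in assign
    ]


# Hypothetical flows: (scaled mean packet size, scaled duration).
flows = [
    (0.10, 0.20), (0.12, 0.22), (0.11, 0.18),
    (0.80, 0.90), (0.82, 0.85), (0.79, 0.88),
]
# Hypothetical ground truth for two flows only (index -> label).
known = {0: "benign", 3: "malware"}

assign = kmeans(flows, k=2)
labels = propagate_labels(assign, known)
print(labels)
```

In this toy setup two labeled flows are enough to name both clusters; clusters that contain no labeled flow are reported as "unknown" rather than guessed.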

Related papers

Learning to Localize Leakage of Cryptographic Sensitive Variables [13.98875599619791]
We develop a principled deep learning framework for determining the relative leakage due to measurements recorded at different points in time. This information is invaluable to cryptographic hardware designers for understanding *why* their hardware leaks.
arXiv Detail & Related papers (2025-03-10T15:42:30Z)
Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust. Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model. We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
An Unforgeable Publicly Verifiable Watermark for Large Language Models [84.2805275589553]
Current watermark detection algorithms require the secret key used in the watermark generation process, making them susceptible to security breaches and counterfeiting during public detection. We propose an unforgeable publicly verifiable watermark algorithm named UPV that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages.
arXiv Detail & Related papers (2023-07-30T13:43:27Z)
Feature Mining for Encrypted Malicious Traffic Detection with Deep Learning and Other Machine Learning Algorithms [7.404682407709988]
The popularity of encryption mechanisms poses a great challenge to malicious traffic detection. Traditional detection techniques cannot work without the decryption of encrypted traffic. In this paper, we provide an in-depth analysis of traffic features and compare different state-of-the-art traffic feature creation approaches. We propose a novel concept for encrypted traffic feature which is specifically designed for encrypted malicious traffic analysis.
arXiv Detail & Related papers (2023-04-07T15:25:36Z)
Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed. While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster. We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding public-available data. By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders. This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)
Using Topological Data Analysis to classify Encrypted Bits [0.0]
Persistent homology is applied to generate topological features of a point cloud obtained from sets of encryptions. We see that this machine learning pipeline is able to classify our data successfully where classical models of machine learning fail to perform the task.
arXiv Detail & Related papers (2023-01-18T09:43:00Z)
Detection and Evaluation of Clusters within Sequential Data [58.720142291102135]
Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees. In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets. It is found that the Block Markov Chain model assumption can indeed produce meaningful insights in exploratory data analyses.
arXiv Detail & Related papers (2022-10-04T15:22:39Z)
Unsupervised Abnormal Traffic Detection through Topological Flow Analysis [1.933681537640272]
The topological connectivity component of a malicious flow is less exploited. We present a simple method that facilitates the use of connectivity graph features in unsupervised anomaly detection algorithms.
arXiv Detail & Related papers (2022-05-14T18:52:49Z)
Machine Learning for Encrypted Malicious Traffic Detection: Approaches, Datasets and Comparative Study [6.267890584151111]
In the post-COVID-19 environment, malicious traffic encryption is growing rapidly. We formulate a universal framework of machine learning based encrypted malicious traffic detection techniques. We implement and compare 10 encrypted malicious traffic detection algorithms.
arXiv Detail & Related papers (2022-03-17T14:00:55Z)
Voice-Face Homogeneity Tells Deepfake [56.334968246631725]
Existing detection approaches contribute to exploring the specific artifacts in deepfake videos. We propose to perform the deepfake detection from an unexplored voice-face matching view. Our model obtains significantly improved performance as compared to other state-of-the-art competitors.
arXiv Detail & Related papers (2022-03-04T09:08:50Z)
MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos. We present a novel approach, termed as MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
DNS Covert Channel Detection via Behavioral Analysis: a Machine Learning Approach [0.09176056742068815]
We propose an effective covert channel detection method based on the analysis of DNS network data passively extracted from a network monitoring system. The proposed solution has been evaluated over a 15-day-long experimental session with the injection of traffic that covers the most relevant exfiltration and tunneling attacks.
arXiv Detail & Related papers (2020-10-04T13:28:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.