Related papers: Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis

URL: http://arxiv.org/abs/2410.03728v2
Date: Thu, 7 Nov 2024 17:19:26 GMT
Title: Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis
Authors: Barak Gahtan, Robert J. Shahla, Alex M. Bronstein, Reuven Cohen
Abstract summary: We introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs). These traces provide the foundation for generating more than seven million images, with parameters of window length, pixel resolution, normalization, and labels. To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 response/request pairs in a given QUIC connection.
Score: 7.795761092358769
License: http://creativecommons.org/licenses/by/4.0/
Abstract: QUIC, a new and increasingly used transport protocol, addresses the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs), collected over a four-month period. These traces provide the foundation for generating more than seven million images, with configurable parameters of window length, pixel resolution, normalization, and labels. These images enable an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 response/request pairs in a given QUIC connection, which can reveal server behavior, client-server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and evaluate the model using the proposed dataset.
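To make the image-generation idea concrete, here is a minimal sketch of how one window of a trace might be rasterized. The abstract names only the configurable parameters (window length, pixel resolution, normalization); the layout used here is an assumption for illustration, not the authors' stated method: time on the horizontal axis, packet length on the vertical axis, one channel per direction, and per-channel max normalization. The function name `trace_to_image`, the packet tuple format, and the constant `MAX_PACKET_LEN` are likewise hypothetical.

```python
from typing import List, Tuple

# Hypothetical packet record: (timestamp in seconds, length in bytes,
# direction), where direction 0 = client-to-server, 1 = server-to-client.
Packet = Tuple[float, int, int]

MAX_PACKET_LEN = 1500  # assumed upper bound on packet length, in bytes


def trace_to_image(
    packets: List[Packet],
    window_start: float,
    window_length: float,
    resolution: int,
    normalize: bool = True,
) -> List[List[List[float]]]:
    """Rasterize one time window of a trace into a 2 x R x R grid of counts.

    Assumed layout: image[direction][length_bin][time_bin].
    """
    image = [[[0.0] * resolution for _ in range(resolution)] for _ in range(2)]

    for timestamp, length, direction in packets:
        offset = timestamp - window_start
        if not (0 <= offset < window_length):
            continue  # packet falls outside this window
        # Map time and length to pixel indices, clamping to the last bin.
        col = min(int(offset / window_length * resolution), resolution - 1)
        row = min(int(length / MAX_PACKET_LEN * resolution), resolution - 1)
        image[direction][row][col] += 1.0

    if normalize:
        # Assumed scheme: scale each channel by its own maximum count.
        for channel in image:
            peak = max(max(row) for row in channel)
            if peak > 0:
                for row in channel:
                    for i in range(resolution):
                        row[i] /= peak
    return image


if __name__ == "__main__":
    sample = [(0.00, 1200, 0), (0.01, 1350, 1), (0.012, 1350, 1), (0.09, 80, 0)]
    img = trace_to_image(sample, window_start=0.0, window_length=0.1, resolution=4)
    for name, channel in zip(("client->server", "server->client"), img):
        print(name, channel)
```

Sliding `window_start` along a trace would yield a sequence of such images, each of which could be paired with a label such as the number of responses observed in that window.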

Related papers

MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification [59.96233305733875]
Classifying traffic is essential for detecting security threats and optimizing network management. We propose a Multi-Instance Encrypted Traffic Transformer (MIETT) to capture both token-level and packet-level relationships. MIETT achieves results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors.
arXiv Detail & Related papers (2024-12-19T12:52:53Z)
Estimating the Number of HTTP/3 Responses in QUIC Using Deep Learning [7.795761092358769]
This paper proposes a novel solution for estimating the number of HTTP/3 responses in a given QUIC connection by an observer. The proposed scheme transforms QUIC connection traces into a sequence of images and trains machine learning (ML) models to predict the number of responses. The scheme achieves up to 97% cumulative accuracy in both known and unknown web server settings and 92% accuracy in estimating the total number of responses in unseen QUIC traces.
arXiv Detail & Related papers (2024-10-08T15:40:22Z)
Cross-domain Learning Framework for Tracking Users in RIS-aided Multi-band ISAC Systems with Sparse Labeled Data [55.70071704247794]
Integrated sensing and communications (ISAC) is pivotal for 6G communications and is boosted by the rapid development of reconfigurable intelligent surfaces (RISs). This paper proposes the X2Track framework, where we model the tracking function by a hierarchical architecture, jointly utilizing multi-modal CSI indicators across multiple bands, and optimize it in a cross-domain manner. Under X2Track, we design an efficient deep learning algorithm to minimize tracking errors, based on transformer neural networks and adversarial learning techniques.
arXiv Detail & Related papers (2024-05-10T08:04:27Z)
On the Cross-Dataset Generalization of Machine Learning for Network Intrusion Detection [50.38534263407915]
Network Intrusion Detection Systems (NIDS) are a fundamental tool in cybersecurity. Their ability to generalize across diverse networks is a critical factor in their effectiveness and a prerequisite for real-world applications. In this study, we conduct a comprehensive analysis of the generalization of machine-learning-based NIDS through extensive experimentation in a cross-dataset framework.
arXiv Detail & Related papers (2024-02-15T14:39:58Z)
Lens: A Foundation Model for Network Traffic [19.3652490585798]
Lens is a foundation model for network traffic that leverages the T5 architecture to learn pre-trained representations from large-scale unlabeled data. We design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP).
arXiv Detail & Related papers (2024-02-06T02:45:13Z)
Application-layer Characterization and Traffic Analysis for Encrypted QUIC Transport Protocol [14.40132345175898]
We propose a novel rule-based approach to estimate the application-level traffic attributes without decrypting QUIC packets. Based on the size, timing, and direction information, our proposed algorithm analyzes the associated network traffic. The inferred HTTP attributes can be used to evaluate the QoE of application-layer services and identify the service categories for traffic classification in the encrypted QUIC connections.
arXiv Detail & Related papers (2023-10-10T20:09:46Z)
LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning. However, the promising results achieved on current public datasets may not be applicable to practical scenarios. We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z)
Efficient Federated Learning with Spike Neural Networks for Traffic Sign Recognition [70.306089187104]
We introduce powerful Spike Neural Networks (SNNs) into traffic sign recognition for energy-efficient and fast model training. Numerical results indicate that the proposed federated SNN outperforms traditional federated convolutional neural networks in terms of accuracy, noise immunity, and energy efficiency.
arXiv Detail & Related papers (2022-05-28T03:11:48Z)
Machine Learning for Encrypted Malicious Traffic Detection: Approaches, Datasets and Comparative Study [6.267890584151111]
In the post-COVID-19 environment, malicious traffic encryption is growing rapidly. We formulate a universal framework of machine-learning-based encrypted malicious traffic detection techniques. We implement and compare 10 encrypted malicious traffic detection algorithms.
arXiv Detail & Related papers (2022-03-17T14:00:55Z)
DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL). Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification [9.180725486824118]
We propose a new traffic representation model called Encrypted Traffic Bidirectional Representations from Transformer (ET-BERT). The pre-trained model can be fine-tuned on a small amount of task-specific labeled data and achieves state-of-the-art performance across five encrypted traffic classification tasks.
arXiv Detail & Related papers (2022-02-13T14:54:48Z)
FENXI: Deep-learning Traffic Analytics at the Edge [69.34903175081284]
We present FENXI, a system to run complex analytics by leveraging a TPU. FENXI decouples operations and traffic analytics, which operate at different granularities. Our analysis shows that FENXI can sustain forwarding line-rate traffic processing while requiring only limited resources.
arXiv Detail & Related papers (2021-05-25T08:02:44Z)
Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI. Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
NetML: A Challenge for Network Traffic Analytics [16.8001000840057]
We release three open datasets containing almost 1.3M labeled flows in total. We focus on broad aspects of network traffic analysis, including both malware detection and application classification. As we continue to grow NetML, we expect the datasets to serve as a common platform for AI-driven, reproducible research on network flow analytics.
arXiv Detail & Related papers (2020-04-25T01:12:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.