Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis
- URL: http://arxiv.org/abs/2410.03728v2
- Date: Thu, 7 Nov 2024 17:19:26 GMT
- Title: Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis
- Authors: Barak Gahtan, Robert J. Shahla, Alex M. Bronstein, Reuven Cohen
- Abstract summary: We introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs).
These traces provide the foundation for generating more than seven million images, with parameters of window length, pixel resolution, normalization, and labels.
To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 request/response pairs in a given QUIC connection.
- Score: 7.795761092358769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: QUIC, a new and increasingly used transport protocol, addresses the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs), collected over a four-month period. These traces provide the foundation for generating more than seven million images, with configurable parameters of window length, pixel resolution, normalization, and labels. These images enable an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 request/response pairs in a given QUIC connection, which can reveal server behavior, client-server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and evaluate the model on the proposed dataset.
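As a rough illustration of how a trace might become an image under the configurable parameters named in the abstract (window length, pixel resolution, normalization), the sketch below bins the packets of one window into a two-channel 2-D histogram. The function name `trace_to_image`, the choice of axes (packet size by time), the one-channel-per-direction layout, and the 1500-byte size cap are all assumptions made for this example; the listing does not describe the exact construction used by VisQUIC.

```python
import numpy as np


def trace_to_image(times, sizes, directions, window=0.1, resolution=32,
                   max_size=1500, normalize=True):
    """Bin one window of packets into a (2, resolution, resolution) image.

    Hypothetical layout: channel 0 counts client-to-server packets
    (direction 0), channel 1 counts server-to-client packets (direction 1).
    Rows index packet-size bins, columns index time bins.
    """
    times = np.asarray(times, dtype=float)        # seconds from window start
    sizes = np.asarray(sizes, dtype=float)        # packet lengths in bytes
    directions = np.asarray(directions, dtype=int)

    time_edges = np.linspace(0.0, window, resolution + 1)
    size_edges = np.linspace(0.0, max_size, resolution + 1)

    image = np.zeros((2, resolution, resolution), dtype=float)
    for channel in (0, 1):
        mask = directions == channel
        # Packets outside the window or above max_size are silently dropped.
        hist, _, _ = np.histogram2d(sizes[mask], times[mask],
                                    bins=[size_edges, time_edges])
        image[channel] = hist

    if normalize and image.max() > 0:
        image /= image.max()  # scale pixel values into [0, 1]
    return image
```

A longer `window` or a coarser `resolution` trades temporal detail for smaller images; in a labeled dataset each such image would additionally carry its label, for example the number of responses observed in that window.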
Related papers
- NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics [72.95483148058378]
We propose to pre-train a general-purpose machine learning model to capture traffic dynamics with only traffic data from NetFlow records.
We address challenges such as unifying network feature representations, learning from large unlabeled traffic data volume, and testing on real downstream tasks in DDoS attack detection.
arXiv Detail & Related papers (2024-12-30T00:47:49Z)
- MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification [59.96233305733875]
Classifying traffic is essential for detecting security threats and optimizing network management.
We propose a Multi-Instance Encrypted Traffic Transformer (MIETT) to capture both token-level and packet-level relationships.
MIETT achieves state-of-the-art results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors.
arXiv Detail & Related papers (2024-12-19T12:52:53Z)
- Estimating the Number of HTTP/3 Responses in QUIC Using Deep Learning [7.795761092358769]
This paper proposes a novel method to estimate the number of HTTP/3 responses in a given QUIC connection by an observer.
The proposed scheme transforms QUIC connection traces into image sequences and uses machine learning (ML) models, guided by a tailored loss function, to predict response counts.
arXiv Detail & Related papers (2024-10-08T15:40:22Z)
- Cross-domain Learning Framework for Tracking Users in RIS-aided Multi-band ISAC Systems with Sparse Labeled Data [55.70071704247794]
Integrated sensing and communications (ISAC) is pivotal for 6G communications and is boosted by the rapid development of reconfigurable intelligent surfaces (RISs).
This paper proposes the X2Track framework, where we model the tracking function by a hierarchical architecture, jointly utilizing multi-modal CSI indicators across multiple bands, and optimize it in a cross-domain manner.
Under X2Track, we design an efficient deep learning algorithm to minimize tracking errors, based on transformer neural networks and adversarial learning techniques.
arXiv Detail & Related papers (2024-05-10T08:04:27Z)
- Lens: A Foundation Model for Network Traffic [19.3652490585798]
Lens is a foundation model for network traffic that leverages the T5 architecture to learn the pre-trained representations from large-scale unlabeled data.
We design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP).
arXiv Detail & Related papers (2024-02-06T02:45:13Z)
- Application-layer Characterization and Traffic Analysis for Encrypted QUIC Transport Protocol [14.40132345175898]
We propose a novel rule-based approach to estimate the application-level traffic attributes without decrypting QUIC packets.
Based on the size, timing, and direction information, our proposed algorithm analyzes the associated network traffic.
The inferred HTTP attributes can be used to evaluate the QoE of application-layer services and identify the service categories for traffic classification in the encrypted QUIC connections.
arXiv Detail & Related papers (2023-10-10T20:09:46Z)
- LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z)
- Machine Learning for Encrypted Malicious Traffic Detection: Approaches, Datasets and Comparative Study [6.267890584151111]
In the post-COVID-19 environment, malicious traffic encryption is growing rapidly.
We formulate a universal framework of machine learning based encrypted malicious traffic detection techniques.
We implement and compare 10 encrypted malicious traffic detection algorithms.
arXiv Detail & Related papers (2022-03-17T14:00:55Z)
- DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL).
Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
- FENXI: Deep-learning Traffic Analytics at the Edge [69.34903175081284]
We present FENXI, a system to run complex analytics by leveraging TPU.
FENXI decouples operations and traffic analytics, which operate at different granularities.
Our analysis shows that FENXI can sustain line-rate traffic processing while requiring only limited resources.
arXiv Detail & Related papers (2021-05-25T08:02:44Z)
- NetML: A Challenge for Network Traffic Analytics [16.8001000840057]
We release three open datasets containing almost 1.3M labeled flows in total.
We focus on broad aspects in network traffic analysis, including both malware detection and application classification.
As we continue to grow NetML, we expect the datasets to serve as a common platform for AI driven, reproducible research on network flow analytics.
arXiv Detail & Related papers (2020-04-25T01:12:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.