Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis
- URL: http://arxiv.org/abs/2410.03728v2
- Date: Thu, 7 Nov 2024 17:19:26 GMT
- Title: Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis
- Authors: Barak Gahtan, Robert J. Shahla, Alex M. Bronstein, Reuven Cohen
- Abstract summary: We introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs).
These traces provide the foundation for generating more than seven million images, with parameters of window length, pixel resolution, normalization, and labels.
To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 response/request pairs in a given QUIC connection.
- Score: 7.795761092358769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: QUIC, a new and increasingly used transport protocol, addresses the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs), collected over a four-month period. These traces provide the foundation for generating more than seven million images, with configurable parameters of window length, pixel resolution, normalization, and labels. These images enable an observer looking at the interactions between a client and a server to analyze and gain insights about encrypted QUIC connections. To illustrate the dataset's potential, we offer a use-case example of an observer estimating the number of HTTP/3 response/request pairs in a given QUIC connection, which can reveal server behavior, client-server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and evaluate it on the proposed dataset.
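The abstract names window length, pixel resolution, and normalization as the image parameters but does not spell out the construction. The following is a minimal sketch of one plausible way to bin a trace window into an image, not the authors' actual pipeline: the `trace_to_image` function, the `(timestamp, length, direction)` packet tuple, the time-by-packet-length binning, the two direction channels, and the 1500-byte length cap are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical packet record: (timestamp_seconds, length_bytes, direction),
# where direction is 0 for client-to-server and 1 for server-to-client.
Packet = Tuple[float, int, int]


def trace_to_image(
    packets: List[Packet],
    window_start: float,
    window_length: float,
    resolution: int,
    max_packet_len: int = 1500,  # assumed cap on packet length
    normalize: bool = True,
) -> List[List[List[float]]]:
    """Bin one time window of a trace into a 2 x resolution x resolution grid.

    Axis 0 is the direction, axis 1 the packet-length bin, axis 2 the time bin.
    Each cell counts the packets that fall into it.
    """
    image = [[[0.0] * resolution for _ in range(resolution)] for _ in range(2)]
    for ts, length, direction in packets:
        offset = ts - window_start
        if not 0.0 <= offset < window_length:
            continue  # packet falls outside this window
        time_bin = min(int(offset / window_length * resolution), resolution - 1)
        clipped = min(max(length, 0), max_packet_len)
        len_bin = min(int(clipped / max_packet_len * resolution), resolution - 1)
        image[direction][len_bin][time_bin] += 1.0
    if normalize:
        # Scale by the largest cell so all values lie in [0, 1].
        peak = max(v for ch in image for row in ch for v in row)
        if peak > 0:
            image = [[[v / peak for v in row] for row in ch] for ch in image]
    return image


# Example: a 0.1 s window rendered at 4 x 4 pixels per direction.
example_packets = [(0.01, 1200, 1), (0.02, 1200, 1), (0.06, 80, 0), (0.5, 100, 0)]
example_image = trace_to_image(example_packets, 0.0, 0.1, 4)
```

Sliding `window_start` along a trace and varying `window_length` and `resolution` would yield many images per trace, which is consistent with the dataset producing millions of images from about 100,000 traces.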
Related papers
- Estimating the Number of HTTP/3 Responses in QUIC Using Deep Learning [7.795761092358769]
This paper proposes a novel solution for estimating the number of HTTP/3 responses in a given QUIC connection by an observer.
The proposed scheme transforms QUIC connection traces into a sequence of images and trains machine learning (ML) models to predict the number of responses.
The scheme achieves up to 97% cumulative accuracy in both known and unknown web server settings and 92% accuracy in estimating the total number of responses in unseen QUIC traces.
arXiv Detail & Related papers (2024-10-08T15:40:22Z)
- Cross-domain Learning Framework for Tracking Users in RIS-aided Multi-band ISAC Systems with Sparse Labeled Data [55.70071704247794]
Integrated sensing and communications (ISAC) is pivotal for 6G communications and is boosted by the rapid development of reconfigurable intelligent surfaces (RISs).
This paper proposes the X2Track framework, where we model the tracking function by a hierarchical architecture, jointly utilizing multi-modal CSI indicators across multiple bands, and optimize it in a cross-domain manner.
Under X2Track, we design an efficient deep learning algorithm to minimize tracking errors, based on transformer neural networks and adversarial learning techniques.
arXiv Detail & Related papers (2024-05-10T08:04:27Z)
- Generic Multi-modal Representation Learning for Network Traffic Analysis [6.372999570085887]
Network traffic analysis is fundamental for network management, troubleshooting, and security.
We propose a flexible Multi-modal Autoencoder (MAE) pipeline that can solve different use cases.
We argue that the MAE architecture is generic and can be used to learn representations useful in multiple scenarios.
arXiv Detail & Related papers (2024-05-04T12:24:29Z)
- Lens: A Foundation Model for Network Traffic [19.3652490585798]
Lens is a foundation model for network traffic that leverages the T5 architecture to learn the pre-trained representations from large-scale unlabeled data.
We design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP).
arXiv Detail & Related papers (2024-02-06T02:45:13Z)
- Application-layer Characterization and Traffic Analysis for Encrypted QUIC Transport Protocol [14.40132345175898]
We propose a novel rule-based approach to estimate the application-level traffic attributes without decrypting QUIC packets.
Based on the size, timing, and direction information, our proposed algorithm analyzes the associated network traffic.
The inferred HTTP attributes can be used to evaluate the QoE of application-layer services and identify the service categories for traffic classification in the encrypted QUIC connections.
arXiv Detail & Related papers (2023-10-10T20:09:46Z)
- Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection [54.041049052843604]
We present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection.
First, to model the inter-object spatial interaction and complex temporal dependencies, we introduce the spatial-temporal graph attention network.
Finally, it poses a challenge for the network to distinguish between the positive query and other highly similar queries that are not the best match.
arXiv Detail & Related papers (2023-07-01T13:53:14Z)
- Traffic Scene Parsing through the TSP6K Dataset [109.69836680564616]
We introduce a specialized traffic monitoring dataset, termed TSP6K, with high-quality pixel-level and instance-level annotations.
The dataset captures more crowded traffic scenes with several times more traffic participants than the existing driving scenes.
We propose a detail refining decoder for scene parsing, which recovers the details of different semantic regions in traffic scenes.
arXiv Detail & Related papers (2023-03-06T02:05:14Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Federated Multi-Target Domain Adaptation [99.93375364579484]
Federated learning methods enable us to train machine learning models on distributed user data while preserving its privacy.
We consider a more practical scenario where the distributed client data is unlabeled, and a centralized labeled dataset is available on the server.
We propose an effective DualAdapt method to address the new challenges.
arXiv Detail & Related papers (2021-08-17T17:53:05Z)
- FENXI: Deep-learning Traffic Analytics at the Edge [69.34903175081284]
We present FENXI, a system to run complex analytics by leveraging TPU.
FENXI decouples operations and traffic analytics, which operate at different granularities.
Our analysis shows that FENXI can sustain forwarding line rate traffic processing requiring only limited resources.
arXiv Detail & Related papers (2021-05-25T08:02:44Z)
- Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring [0.4342241136871849]
We present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool.
We propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive.
We apply the extended MBDA to two case studies: UGR'16, a benchmark flow-based real-traffic dataset for anomaly detection, and Dartmouth'18, the longest and largest Wi-Fi trace known to date.
arXiv Detail & Related papers (2019-07-05T04:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.