AutoFlow: An Autoencoder-based Approach for IP Flow Record Compression with Minimal Impact on Traffic Classification
- URL: http://arxiv.org/abs/2410.00030v2
- Date: Fri, 31 Jan 2025 10:20:25 GMT
- Title: AutoFlow: An Autoencoder-based Approach for IP Flow Record Compression with Minimal Impact on Traffic Classification
- Authors: Adrian Pekar
- Abstract summary: This paper presents a novel deep learning-based approach to compressing IP flow records using autoencoders.
Our approach reduces data volume while retaining the utility of compressed data for downstream analysis tasks.
The implications of this work extend to more efficient network monitoring and scalable, real-time network management solutions.
- Abstract: Network monitoring generates massive volumes of IP flow records, posing significant challenges for storage and analysis. This paper presents a novel deep learning-based approach to compressing these records using autoencoders, enabling direct analysis of compressed data without requiring decompression. Unlike traditional compression methods, our approach reduces data volume while retaining the utility of compressed data for downstream analysis tasks, including distinguishing modern application protocols and encrypted traffic from popular services. Through extensive experiments on a real-world network traffic dataset, we demonstrate that our autoencoder-based compression achieves a 1.313x reduction in data size while maintaining 99.27% accuracy in a multi-class traffic classification task, compared to 99.77% accuracy with uncompressed data. This marginal decrease in performance is offset by substantial gains in storage and processing efficiency. The implications of this work extend to more efficient network monitoring and scalable, real-time network management solutions.
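The core idea can be illustrated with a minimal sketch: a linear autoencoder that maps fixed-size flow-record feature vectors to a slightly smaller latent code, so that downstream tasks (such as classification) operate on the code directly without decompression. This is an illustrative toy, not the paper's implementation: the feature count (8), latent size (6, giving a ~1.33x reduction close to the reported 1.313x), synthetic data, and training setup are all assumptions for demonstration.

```python
# Illustrative sketch (NOT the paper's implementation): a tiny linear
# autoencoder compressing 8-feature "flow records" to a 6-dim latent code.
# All dimensions and the synthetic data are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flow records: 8 numeric features per record
# (e.g. duration, byte/packet counts) -- not the paper's actual feature set.
X = rng.normal(size=(500, 8))
X[:, 1] = 0.9 * X[:, 0] + 0.1 * X[:, 1]  # correlated features compress well

n_in, n_lat = 8, 6                        # 8 -> 6 dims: ~1.33x reduction
W_enc = rng.normal(scale=0.1, size=(n_in, n_lat))
W_dec = rng.normal(scale=0.1, size=(n_lat, n_in))

lr = 0.01
for _ in range(3000):
    Z = X @ W_enc                         # encode: compressed representation
    X_hat = Z @ W_dec                     # decode: reconstruction
    err = X_hat - X
    # gradient descent on mean squared reconstruction error
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# After training, Z is what would be stored/analyzed; a classifier
# would consume Z directly, with no decompression step.
Z = X @ W_enc
X_hat = Z @ W_dec
mse = float(np.mean((X_hat - X) ** 2))
print(round(n_in / n_lat, 3))             # -> 1.333 (compression ratio)
```

In this toy setup the correlated feature pair is what makes the 6-dim code nearly lossless; the paper's contribution is showing that a (nonlinear) autoencoder achieves an analogous trade-off on real flow records while keeping multi-class classification accuracy within half a percentage point of the uncompressed baseline.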
Related papers
- NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics [72.95483148058378]
We propose to pre-train a general-purpose machine learning model to capture traffic dynamics with only traffic data from NetFlow records.
We address challenges such as unifying network feature representations, learning from large unlabeled traffic data volume, and testing on real downstream tasks in DDoS attack detection.
arXiv Detail & Related papers (2024-12-30T00:47:49Z) - Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling [7.838980097597047]
Large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates.
We propose a novel approach using implicit neural representations for data learning and compression.
We also introduce an importance sampling technique to accelerate the network training process.
arXiv Detail & Related papers (2024-12-02T17:50:49Z) - Lightweight Correlation-Aware Table Compression [58.50312417249682]
Virtual is a framework that integrates seamlessly with existing open formats.
Experiments on data-gov datasets show that Virtual reduces file sizes by up to 40% compared to Apache Parquet.
arXiv Detail & Related papers (2024-10-17T22:28:07Z) - Enabling robust sensor network design with data processing and optimization making use of local beehive image and video files [0.0]
We offer a revolutionary paradigm that uses cutting-edge edge computing techniques to optimize data transmission and storage.
Our approach encompasses data compression for images and videos, coupled with a data aggregation technique for numerical data.
A key aspect of our approach is its ability to operate in resource-constrained environments.
arXiv Detail & Related papers (2024-02-26T15:27:47Z) - Edge Storage Management Recipe with Zero-Shot Data Compression for Road Anomaly Detection [1.4563998247782686]
We consider efficient storage management methods that preserve high-fidelity audio.
A computational file compression approach is needed that encodes collected high-resolution audio into a compact code.
Motivated by this, we propose a simple yet effective pre-trained autoencoder-based data compression method.
arXiv Detail & Related papers (2023-07-10T01:30:21Z) - Task-aware Distributed Source Coding under Dynamic Bandwidth [23.610303860657588]
We propose a distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA)
NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead.
Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14%.
arXiv Detail & Related papers (2023-05-24T19:20:59Z) - Unrolled Compressed Blind-Deconvolution [77.88847247301682]
Sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging.
We propose a compression method that enables blind recovery from much fewer measurements with respect to the full received signal in time.
arXiv Detail & Related papers (2022-09-28T15:16:58Z) - Supervised Compression for Resource-constrained Edge Computing Systems [26.676557573171618]
Full-scale deep neural networks are often too resource-intensive in terms of energy and storage.
This paper adopts ideas from knowledge distillation and neural image compression to compress intermediate feature representations more efficiently.
It achieves better supervised rate-distortion performance while also maintaining smaller end-to-end latency.
arXiv Detail & Related papers (2021-08-21T11:10:29Z) - Attribution Preservation in Network Compression for Reliable Network Interpretation [81.84564694303397]
Neural networks embedded in safety-sensitive applications rely on input attribution for hindsight analysis and on network compression to reduce their size for edge computing.
We show that these seemingly unrelated techniques conflict with each other as network compression deforms the produced attributions.
This phenomenon arises due to the fact that conventional network compression methods only preserve the predictions of the network while ignoring the quality of the attributions.
arXiv Detail & Related papers (2020-10-28T16:02:31Z) - ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z) - OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression [77.8842824702423]
We present a novel deep compression algorithm to reduce the memory footprint of LiDAR point clouds.
Our method exploits the sparsity and structural redundancy between points to reduce the memory footprint.
Our algorithm can be used to reduce the onboard and offboard storage of LiDAR points for applications such as self-driving cars.
arXiv Detail & Related papers (2020-05-14T17:48:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.