AutoFlow: An Autoencoder-based Approach for IP Flow Record Compression with Minimal Impact on Traffic Classification
- URL: http://arxiv.org/abs/2410.00030v2
- Date: Fri, 31 Jan 2025 10:20:25 GMT
- Title: AutoFlow: An Autoencoder-based Approach for IP Flow Record Compression with Minimal Impact on Traffic Classification
- Authors: Adrian Pekar
- Abstract summary: This paper presents a novel deep learning-based approach to compressing IP flow records using autoencoders. Our approach reduces data volume while retaining the utility of compressed data for downstream analysis tasks. The implications of this work extend to more efficient network monitoring and scalable, real-time network management solutions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Network monitoring generates massive volumes of IP flow records, posing significant challenges for storage and analysis. This paper presents a novel deep learning-based approach to compressing these records using autoencoders, enabling direct analysis of compressed data without requiring decompression. Unlike traditional compression methods, our approach reduces data volume while retaining the utility of compressed data for downstream analysis tasks, including distinguishing modern application protocols and encrypted traffic from popular services. Through extensive experiments on a real-world network traffic dataset, we demonstrate that our autoencoder-based compression achieves a 1.313x reduction in data size while maintaining 99.27% accuracy in a multi-class traffic classification task, compared to 99.77% accuracy with uncompressed data. This marginal decrease in performance is offset by substantial gains in storage and processing efficiency. The implications of this work extend to more efficient network monitoring and scalable, real-time network management solutions.
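The figures quoted in the abstract can be turned into more familiar terms with a little arithmetic. The sketch below is not from the paper. It assumes that the "1.313x reduction" means original size divided by compressed size, and the 100 GB archive size is a hypothetical value chosen only for illustration.

```python
# Figures quoted in the abstract above.
COMPRESSION_RATIO = 1.313   # assumed to mean original size / compressed size
ACC_UNCOMPRESSED = 99.77    # percent, classification on uncompressed data
ACC_COMPRESSED = 99.27      # percent, classification on compressed data


def compressed_fraction(ratio: float) -> float:
    """Fraction of the original size that remains after compression."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 / ratio


def space_saving(ratio: float) -> float:
    """Fraction of the original size that is saved (1 - 1/ratio)."""
    return 1.0 - compressed_fraction(ratio)


def accuracy_drop(before: float, after: float) -> float:
    """Absolute drop in accuracy, in percentage points."""
    return before - after


if __name__ == "__main__":
    # Hypothetical archive size, used only to make the numbers concrete.
    original_gb = 100.0
    remaining_gb = original_gb * compressed_fraction(COMPRESSION_RATIO)
    print(f"compressed size: {remaining_gb:.1f} GB")
    print(f"space saving:    {space_saving(COMPRESSION_RATIO):.1%}")
    print(f"accuracy drop:   {accuracy_drop(ACC_UNCOMPRESSED, ACC_COMPRESSED):.2f} pp")
```

Under that reading, a 1.313x reduction leaves roughly 76% of the original volume (a saving of about 24%), at a cost of 0.50 percentage points of classification accuracy.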
Related papers
- Efficient Token Compression for Vision Transformer with Spatial Information Preserved [59.79302182800274]
Token compression is essential for reducing the computational and memory requirements of transformer models.
We propose an efficient and hardware-compatible token compression method called Prune and Merge.
arXiv Detail & Related papers (2025-03-30T14:23:18Z) - Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis.
We propose a Compression Distortion Representation Embedding (CDRE) framework, which extracts machine-perception-related distortion representation and embeds it into downstream models.
Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in terms of execution time and number of parameters.
arXiv Detail & Related papers (2025-03-27T13:01:53Z) - Highly Efficient Direct Analytics on Semantic-aware Time Series Data Compression [15.122371541057339]
We propose a novel method for direct analytics on time series data compressed by the SHRINK compression algorithm.
Our approach offers reliable, high-speed outlier detection analytics for diverse IoT applications.
arXiv Detail & Related papers (2025-03-17T14:58:22Z) - Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling [7.838980097597047]
Large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates.
We propose a novel approach using implicit neural representations for data learning and compression.
We also introduce an importance sampling technique to accelerate the network training process.
arXiv Detail & Related papers (2024-12-02T17:50:49Z) - Lightweight Correlation-Aware Table Compression [58.50312417249682]
$\texttt{Virtual}$ is a framework that integrates seamlessly with existing open formats.
Experiments on data-gov datasets show that $\texttt{Virtual}$ reduces file sizes by up to 40% compared to Apache Parquet.
arXiv Detail & Related papers (2024-10-17T22:28:07Z) - Channel-Aware Throughput Maximization for Cooperative Data Fusion in CAV [17.703608985129026]
Connected and autonomous vehicles (CAVs) have garnered significant attention due to their extended perception range and enhanced sensing coverage.
To address challenges such as blind spots and obstructions, CAVs employ vehicle-to-vehicle communications to aggregate data from surrounding vehicles.
We propose a channel-aware throughput approach to facilitate CAV data fusion, leveraging a self-supervised autoencoder for adaptive data compression.
arXiv Detail & Related papers (2024-10-06T00:43:46Z) - Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning [29.727339562140653]
Current data compression methods, such as sparsification in Federated Averaging (FedAvg), effectively enhance the communication efficiency of Federated Learning (FL).
These methods encounter challenges such as the straggler problem and diminished model performance due to heterogeneous bandwidth and non-IID data.
We introduce a bandwidth-aware compression framework for FL, aimed at improving communication efficiency while mitigating the problems associated with non-IID data.
arXiv Detail & Related papers (2024-08-27T02:28:27Z) - Enabling robust sensor network design with data processing and
optimization making use of local beehive image and video files [0.0]
We of er a revolutionary paradigm that uses cutting-edge edge computing techniques to optimize data transmission and storage.
Our approach encompasses data compression for images and videos, coupled with a data aggregation technique for numerical data.
A key aspect of our approach is its ability to operate in resource-constrained environments.
arXiv Detail & Related papers (2024-02-26T15:27:47Z) - Accelerating Distributed Deep Learning using Lossless Homomorphic
Compression [17.654138014999326]
We introduce a novel compression algorithm that effectively merges worker-level compression with in-network aggregation.
We show up to a 6.33$\times$ improvement in aggregation throughput and a 3.74$\times$ increase in per-iteration training speed.
arXiv Detail & Related papers (2024-02-12T09:57:47Z) - Edge Storage Management Recipe with Zero-Shot Data Compression for Road
Anomaly Detection [1.4563998247782686]
We consider an approach for efficient storage management methods while preserving high-fidelity audio.
A computational file compression approach that encodes collected high-resolution audio into a compact code should be recommended.
Motivated by this, we propose a simple yet effective pre-trained autoencoder-based data compression method.
arXiv Detail & Related papers (2023-07-10T01:30:21Z) - Task-aware Distributed Source Coding under Dynamic Bandwidth [24.498190179263837]
We propose a distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA).
NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead.
Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14%.
arXiv Detail & Related papers (2023-05-24T19:20:59Z) - Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal and the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
arXiv Detail & Related papers (2023-03-02T17:32:11Z) - Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z) - Correlating sparse sensing for large-scale traffic speed estimation: A Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor completion (LETC) framework featuring both low-rankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z) - Unrolled Compressed Blind-Deconvolution [77.88847247301682]
Sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging.
We propose a compression method that enables blind recovery from much fewer measurements with respect to the full received signal in time.
arXiv Detail & Related papers (2022-09-28T15:16:58Z) - Attribution Preservation in Network Compression for Reliable Network Interpretation [81.84564694303397]
Neural networks embedded in safety-sensitive applications rely on input attribution for hindsight analysis and network compression to reduce its size for edge-computing.
We show that these seemingly unrelated techniques conflict with each other as network compression deforms the produced attributions.
This phenomenon arises due to the fact that conventional network compression methods only preserve the predictions of the network while ignoring the quality of the attributions.
arXiv Detail & Related papers (2020-10-28T16:02:31Z) - ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z) - OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression [77.8842824702423]
We present a novel deep compression algorithm to reduce the memory footprint of LiDAR point clouds.
Our method exploits the sparsity and structural redundancy between points to reduce the memory footprint.
Our algorithm can be used to reduce the onboard and offboard storage of LiDAR points for applications such as self-driving cars.
arXiv Detail & Related papers (2020-05-14T17:48:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.