Data Augmentation for Traffic Classification
- URL: http://arxiv.org/abs/2401.10754v2
- Date: Tue, 23 Jan 2024 12:53:33 GMT
- Title: Data Augmentation for Traffic Classification
- Authors: Chao Wang, Alessandro Finamore, Pietro Michiardi, Massimo Gallo, Dario
Rossi
- Abstract summary: Data Augmentation (DA) is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks.
DA has struggled to gain traction in networking contexts, particularly in Traffic Classification (TC) tasks.
- Score: 54.92823760790628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data Augmentation (DA) -- enriching training data by adding synthetic samples
-- is a technique widely adopted in Computer Vision (CV) and Natural Language
Processing (NLP) tasks to improve models performance. Yet, DA has struggled to
gain traction in networking contexts, particularly in Traffic Classification
(TC) tasks. In this work, we fulfill this gap by benchmarking 18 augmentation
functions applied to 3 TC datasets using packet time series as input
representation and considering a variety of training conditions. Our results
show that (i) DA can reap benefits previously unexplored, (ii) augmentations
acting on time series sequence order and masking are better suited for TC than
amplitude augmentations and (iii) basic models latent space analysis can help
understanding the positive/negative effects of augmentations on classification
performance.
Related papers
- Toward Generative Data Augmentation for Traffic Classification [54.92823760790628]
Data Augmentation (DA)-augmenting training data with synthetic samples-is wildly adopted in Computer Vision (CV) to improve models performance.
DA has not been yet popularized in networking use cases, including Traffic Classification (TC)
arXiv Detail & Related papers (2023-10-21T08:08:37Z) - Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z) - Instance-Conditioned GAN Data Augmentation for Representation Learning [29.36473147430433]
We introduce DA_IC-GAN, a learnable data augmentation module that can be used off-the-shelf in conjunction with most state-of-the-art training recipes.
We show that DA_IC-GAN can boost accuracy to between 1%p and 2%p with the highest capacity models.
We additionally couple DA_IC-GAN with a self-supervised training recipe and show that we can also achieve an improvement of 1%p in accuracy in some settings.
arXiv Detail & Related papers (2023-03-16T22:45:43Z) - Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z) - Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator with an on-the-fly adjustment on its capacity can better accommodate such a time-varying task.
A comprehensive empirical study confirms that the proposed training strategy, termed as DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple upstream'' and downstream'' tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple upstream'' datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - On Automatic Data Augmentation for 3D Point Cloud Classification [19.338266486983176]
We propose to automatically learn a data augmentation strategy using bilevel optimization.
An augmentor is designed in a similar fashion to a conditional generator and is optimized by minimizing a base model's loss on a validation set.
We evaluate our approach on standard point cloud classification tasks and a more challenging setting with pose misalignment between training and validation/test sets.
arXiv Detail & Related papers (2021-12-11T17:14:16Z) - Adaptive Weighting Scheme for Automatic Time-Series Data Augmentation [79.47771259100674]
We present two sample-adaptive automatic weighting schemes for data augmentation.
We validate our proposed methods on a large, noisy financial dataset and on time-series datasets from the UCR archive.
On the financial dataset, we show that the methods in combination with a trading strategy lead to improvements in annualized returns of over 50$%$, and on the time-series data we outperform state-of-the-art models on over half of the datasets, and achieve similar performance in accuracy on the others.
arXiv Detail & Related papers (2021-02-16T17:50:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.