Related papers: Active Learning Framework to Automate Network Traffic Classification

Active Learning Framework to Automate Network Traffic Classification

URL: http://arxiv.org/abs/2211.08399v1
Date: Wed, 26 Oct 2022 10:15:18 GMT
Title: Active Learning Framework to Automate Network Traffic Classification
Authors: Jaroslav Pešek, Dominik Soukup, Tomáš Čejka
Abstract summary: The paper presents a novel Active Learning Framework (ALF) for ML-based network traffic classification. ALF provides components that can be used to deploy an active learning loop and maintain an ALF instance that continuously evolves a dataset and ML model. The resulting solution is deployable for IP flow-based analysis of high-speed (100 Gb/s) networks.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent network traffic classification methods benefit from machine learning (ML) technology. However, there are many challenges due to the use of ML, such as: lack of high-quality annotated datasets, data drifts and other effects causing aging of datasets and ML models, high volumes of network traffic, etc. This paper argues that it is necessary to augment traditional workflows of ML training & deployment and adapt the Active Learning concept to network traffic analysis. The paper presents a novel Active Learning Framework (ALF) to address this topic. ALF provides prepared software components that can be used to deploy an active learning loop and maintain an ALF instance that continuously evolves a dataset and ML model automatically. The resulting solution is deployable for IP flow-based analysis of high-speed (100 Gb/s) networks, and also supports research experiments on different strategies and methods for annotation, evaluation, dataset optimization, etc. Finally, the paper lists some research challenges that emerge from the first experiments with ALF in practice.
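
Illustration: the abstract refers to an active learning loop that keeps evolving a dataset and a model. The sketch below is not ALF's code and makes no claim about its components. It is a generic pool-based loop with margin-based uncertainty sampling, assuming a single numeric flow feature, a toy nearest-centroid model, and a simulated annotator. All names (train, uncertainty, predict, active_learning_loop, oracle, the "web"/"video" classes) are hypothetical.

```python
def train(labeled):
    """Fit a toy nearest-centroid model: mean feature value per class."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))


def uncertainty(model, x):
    """Negative margin between the two nearest centroids.

    A higher value means the sample is closer to the decision boundary.
    Requires at least two classes in the model.
    """
    d = sorted(abs(x - c) for c in model.values())
    return d[0] - d[1]


def active_learning_loop(labeled, pool, oracle, rounds, batch):
    """Repeat: train, pick the most uncertain samples, annotate, extend."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        # Query strategy: most uncertain samples first.
        pool.sort(key=lambda x: uncertainty(model, x), reverse=True)
        queries, pool = pool[:batch], pool[batch:]
        # Annotation step: here a simulated oracle stands in for a human
        # or an automated annotator.
        labeled.extend((x, oracle(x)) for x in queries)
    return train(labeled), labeled


# Hypothetical setup: one feature per flow, two traffic classes.
seed = [(0.0, "web"), (10.0, "video")]
pool = [i * 0.25 for i in range(1, 40)]          # unlabeled flows
oracle = lambda x: "video" if x >= 6.0 else "web"  # simulated annotator

model, labeled = active_learning_loop(seed, pool, oracle, rounds=3, batch=2)
print(len(labeled), predict(model, 1.0), predict(model, 9.5))
```

Each round spends the annotation budget on the samples nearest the current decision boundary, which is the usual motivation for uncertainty sampling. A real deployment would replace the toy model, the feature, and the oracle with its own components.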

Related papers

From Selection to Generation: A Survey of LLM-based Active Learning [153.8110509961261]
Large Language Models (LLMs) have been employed for generating entirely new data instances and providing more cost-effective annotations. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques.
arXiv Detail & Related papers (2025-02-17T12:58:17Z)
ORIS: Online Active Learning Using Reinforcement Learning-based Inclusive Sampling for Robust Streaming Analytics System [2.985426781886815]
We propose ORIS, a method to perform Online active learning using Reinforcement learning-based Inclusive Sampling of documents for labeling. ORIS aims to create a novel Deep Q-Network-based strategy to sample incoming documents that minimize human errors in labeling. We evaluate the ORIS method on emotion recognition tasks, and it outperforms traditional baselines in terms of both human labeling performance and the ML model performance.
arXiv Detail & Related papers (2024-11-27T05:11:37Z)
A Survey of Machine Learning-based Physical-Layer Authentication in Wireless Communications [17.707450193500698]
Physical-Layer Authentication (PLA) is emerging as a promising complement due to its exploitation of unique properties in wireless environments. This paper presents a comprehensive survey of characteristics and technologies that can be used in the ML-based PLA.
arXiv Detail & Related papers (2024-11-15T03:01:23Z)
Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey [51.87875066383221]
This paper introduces fundamental concepts, traditional methods, and benchmark datasets, then examines the various roles Machine Learning plays in improving CFD. We highlight real-world applications of ML for CFD in critical scientific and engineering disciplines, including aerodynamics, combustion, atmosphere & ocean science, biology fluid, plasma, symbolic regression, and reduced order modeling. We draw the conclusion that ML is poised to significantly transform CFD research by enhancing simulation accuracy, reducing computational time, and enabling more complex analyses of fluid dynamics.
arXiv Detail & Related papers (2024-08-22T07:33:11Z)
Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting. We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem. Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
Active Transfer Prototypical Network: An Efficient Labeling Algorithm for Time-Series Data [1.7205106391379026]
This paper proposes a novel Few-Shot Learning (FSL)-based AL framework, which addresses the trade-off problem by incorporating a Prototypical Network (ProtoNet) in the AL iterations. This framework was validated on the UCI HAR/HAPT dataset and a real-world braking maneuver dataset. The learning performance significantly surpasses traditional AL algorithms on both datasets, achieving 90% classification accuracy with 10% and 5% labeling effort, respectively.
arXiv Detail & Related papers (2022-09-28T16:14:40Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time. The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
Active Learning for Network Traffic Classification: A Technical Survey [1.942265343737899]
This study investigates the applicability of an active form of ML, called Active Learning (AL), which reduces the need for a high number of labeled examples. The study first provides an overview of NTC and its fundamental challenges along with surveying the literature in the field of using ML techniques in NTC. Further, challenges and open issues in the use of AL for NTC are discussed.
arXiv Detail & Related papers (2021-06-13T06:37:50Z)
Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification. Our strategy enables important aspects of the base learner objective to be learned during meta-training. We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques. We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process. We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and a larger query size.
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
Adaptation Strategies for Automated Machine Learning on Evolving Data [7.843067454030999]
This study aims to understand the effect of data stream challenges such as concept drift on the performance of AutoML methods. We propose 6 concept drift adaptation strategies and evaluate their effectiveness on different AutoML approaches.
arXiv Detail & Related papers (2020-06-09T14:29:16Z)
Federated Learning in Vehicular Networks [41.89469856322786]
The federated learning (FL) framework has been introduced as an efficient tool with the goal of reducing transmission overhead. In this paper, we investigate the usage of FL over centralized learning (CL) in vehicular network applications to develop intelligent transportation systems. We identify the major challenges from both the learning perspective, i.e., data labeling and model training, and the communications point of view, i.e., data rate, reliability, transmission overhead, privacy and resource management.
arXiv Detail & Related papers (2020-06-02T06:32:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.