NetML: A Challenge for Network Traffic Analytics
- URL: http://arxiv.org/abs/2004.13006v1
- Date: Sat, 25 Apr 2020 01:12:17 GMT
- Title: NetML: A Challenge for Network Traffic Analytics
- Authors: Onur Barut, Yan Luo, Tong Zhang, Weigang Li, Peilong Li
- Abstract summary: We release three open datasets containing almost 1.3M labeled flows in total.
We focus on broad aspects in network traffic analysis, including both malware detection and application classification.
As we continue to grow NetML, we expect the datasets to serve as a common platform for AI-driven, reproducible research on network flow analytics.
- Score: 16.8001000840057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classifying network traffic is the basis for important network applications.
Prior research in this area has faced challenges with the availability of
representative datasets, and many of the results cannot be readily reproduced.
Such a problem is exacerbated by emerging data-driven, machine-learning-based
approaches. To address this issue, we provide three open datasets containing
almost 1.3M labeled flows in total, with flow features and anonymized raw
packets, for the research community. We focus on broad aspects in network
traffic analysis, including both malware detection and application
classification. We release the datasets in the form of an open challenge called
NetML and implement several machine learning methods, including random forest,
SVM, and MLP. As we continue to grow NetML, we expect the datasets to serve as a
common platform for AI-driven, reproducible research on network flow analytics.
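The abstract mentions labeled flows that come with flow features derived from raw packets. As a rough illustration of what "flow features" can mean, the sketch below groups packets into bidirectional flows by 5-tuple and computes a few simple statistics. The `Packet` record, the `flow_key` and `flow_features` helpers, and the chosen features are all hypothetical and for illustration only; they are not the NetML schema or the authors' feature extraction.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical minimal packet record (not the NetML schema)."""
    ts: float    # capture timestamp in seconds
    src: str     # source IP address
    dst: str     # destination IP address
    sport: int   # source port
    dport: int   # destination port
    proto: str   # transport protocol name
    size: int    # packet size in bytes


def flow_key(p: Packet) -> tuple:
    """Direction-independent 5-tuple so both directions map to one flow."""
    a, b = (p.src, p.sport), (p.dst, p.dport)
    return (min(a, b), max(a, b), p.proto)


def flow_features(packets):
    """Group packets into flows and compute a few illustrative features."""
    flows = defaultdict(list)
    for p in packets:
        flows[flow_key(p)].append(p)

    features = {}
    for key, pkts in flows.items():
        times = [p.ts for p in pkts]
        sizes = [p.size for p in pkts]
        features[key] = {
            "num_pkts": len(pkts),
            "total_bytes": sum(sizes),
            "mean_pkt_size": sum(sizes) / len(sizes),
            "duration": max(times) - min(times),
        }
    return features


# Made-up example packets: one TCP exchange and one UDP packet.
packets = [
    Packet(0.00, "10.0.0.1", "10.0.0.2", 50000, 443, "TCP", 60),
    Packet(0.05, "10.0.0.2", "10.0.0.1", 443, 50000, "TCP", 1500),
    Packet(0.20, "10.0.0.1", "10.0.0.2", 50000, 443, "TCP", 120),
    Packet(0.10, "10.0.0.3", "10.0.0.4", 53000, 53, "UDP", 80),
]
features = flow_features(packets)
```

Per-flow feature dictionaries like these, paired with labels, are the kind of input a classifier such as a random forest, SVM, or MLP would be trained on.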
Related papers
- Systematic review, analysis, and characterisation of malicious industrial network traffic datasets for aiding Machine Learning algorithm performance testing [0.0]
This paper systematically reviews publicly available network traffic capture-based datasets.
It includes categorisation of contained attack types, review of metadata, and statistical as well as complexity analysis.
It provides researchers with metadata that can be used to select the best dataset for their research question.
arXiv Detail & Related papers (2024-05-08T07:48:40Z)
- NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models [15.452625276982987]
In computer networking, network traffic refers to the amount of data transmitted in the form of packets between internetworked computers or Cyber-Physical Systems.
We introduce NetBench, a large-scale and comprehensive benchmark dataset for assessing machine learning models, especially foundation models, in both network traffic classification and generation tasks.
arXiv Detail & Related papers (2024-03-15T14:09:54Z)
- LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z)
- A Survey of Label-Efficient Deep Learning for 3D Point Clouds [109.07889215814589]
This paper presents the first comprehensive survey of label-efficient learning of point clouds.
We propose a taxonomy that organizes label-efficient learning methods based on the data prerequisites provided by different types of labels.
For each approach, we outline the problem setup and provide an extensive literature review that showcases relevant progress and challenges.
arXiv Detail & Related papers (2023-05-31T12:54:51Z)
- Multi-view Multi-label Anomaly Network Traffic Classification based on MLP-Mixer Neural Network [55.21501819988941]
Existing network traffic classification based on convolutional neural networks (CNNs) often emphasizes local patterns of traffic data while ignoring global information associations.
We propose an end-to-end network traffic classification method.
arXiv Detail & Related papers (2022-10-30T01:52:05Z)
- Active Learning Framework to Automate Network Traffic Classification [0.0]
The paper presents a novel Active Learning Framework (ALF) to address this topic.
ALF provides components that can be used to deploy an active learning loop and maintain an ALF instance that continuously evolves a dataset and ML model.
The resulting solution is deployable for IP flow-based analysis of high-speed (100 Gb/s) networks.
arXiv Detail & Related papers (2022-10-26T10:15:18Z)
- Machine Learning Empowered Intelligent Data Center Networking: A Survey [35.55535885962517]
This paper comprehensively investigates the application of machine learning to data center networking.
It covers flow prediction, flow classification, load balancing, resource management, routing optimization, and congestion control.
We design a set of quality assessment criteria called REBEL-3S to impartially measure the strengths and weaknesses of these research works.
arXiv Detail & Related papers (2022-02-28T05:27:22Z)
- Multi-Task Hierarchical Learning Based Network Traffic Analytics [18.04195092141071]
We present three open datasets containing nearly 1.3M labeled flows in total.
We focus on broad aspects in network traffic analysis, including both malware detection and application classification.
As we continue to grow them, we expect the datasets to serve as a common ground for AI-driven, reproducible research on network flow analytics.
arXiv Detail & Related papers (2021-06-05T02:25:59Z)
- Distributed Learning in Wireless Networks: Recent Progress and Future Challenges [170.35951727508225]
Next-generation wireless networks will enable many machine learning (ML) tools and applications to analyze various types of data collected by edge devices.
Distributed learning and inference techniques have been proposed as a means to enable edge devices to collaboratively train ML models without raw data exchanges.
This paper provides a comprehensive study of how distributed learning can be efficiently and effectively deployed over wireless edge networks.
arXiv Detail & Related papers (2021-04-05T20:57:56Z)
- Wireless for Machine Learning [91.13476340719087]
We give an exhaustive review of the state-of-the-art wireless methods that are specifically designed to support machine learning services over distributed datasets.
There are two clear themes within the literature: analog over-the-air computation and digital radio resource management optimized for ML.
This survey gives a comprehensive introduction to these methods, reviews the most important works, highlights open problems, and discusses application scenarios.
arXiv Detail & Related papers (2020-08-31T11:09:49Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.