SPEC5G: A Dataset for 5G Cellular Network Protocol Analysis
- URL: http://arxiv.org/abs/2301.09201v2
- Date: Thu, 14 Sep 2023 22:25:52 GMT
- Title: SPEC5G: A Dataset for 5G Cellular Network Protocol Analysis
- Authors: Imtiaz Karim, Kazi Samin Mubasshir, Mirza Masfiqur Rahman, and Elisa Bertino
- Abstract summary: SPEC5G is the first-ever public 5G dataset for NLP research.
The dataset contains 3,547,586 sentences with 134M words, from 13,094 cellular network specifications and 13 websites.
Our results show the value of our 5G-centric dataset in 5G protocol analysis automation.
- Score: 12.073927880523305
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 5G is the 5th generation cellular network protocol. It is the
state-of-the-art global wireless standard that enables an advanced kind of
network designed to connect virtually everyone and everything with increased
speed and reduced latency. Therefore, its development, analysis, and security
are critical. However, all approaches to 5G protocol development and
security analysis, e.g., property extraction, protocol summarization, and
semantic analysis of the protocol specifications and implementations, are
completely manual. To reduce this manual effort, we curate
SPEC5G, the first-ever public 5G dataset for NLP research. The dataset contains
3,547,586 sentences with 134M words, from 13,094 cellular network specifications
and 13 websites. By leveraging large-scale pre-trained language models
that have achieved state-of-the-art results on NLP tasks, we use this dataset
for security-related text classification and summarization. Security-related
text classification can be used to extract relevant security-related properties
for protocol testing. Summarization, in turn, can help developers and
practitioners gain a high-level understanding of the protocol, which is itself a
daunting task. Our results show the value of our 5G-centric dataset in 5G
protocol analysis automation. We believe that SPEC5G will enable a new research
direction into automatic analyses for the 5G cellular network protocol and
numerous related downstream tasks. Our data and code are publicly available.
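To make the security-related text classification task concrete, the following is a minimal sketch of its input/output shape: specification text is split into sentences and each sentence receives a binary label. This is not the authors' method; the abstract states that they use large-scale pre-trained language models. A simple keyword heuristic stands in for the model here, and the keyword list, function names, and sample sentences are all hypothetical.

```python
import re

# Hypothetical keyword list for illustration; not taken from the paper.
SECURITY_TERMS = (
    "authentication",
    "integrity",
    "ciphering",
    "encryption",
    "confidentiality",
    "replay",
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def is_security_related(sentence):
    """Stand-in classifier: True if any keyword occurs in the sentence."""
    lowered = sentence.lower()
    return any(term in lowered for term in SECURITY_TERMS)


def label_sentences(text):
    """Return (sentence, label) pairs for every sentence in the text."""
    return [(s, is_security_related(s)) for s in split_sentences(text)]


# Made-up sentences written in the style of a specification.
sample = (
    "The UE shall start the registration timer. "
    "The AMF shall initiate the authentication procedure. "
    "Integrity protection is applied to NAS messages."
)

for sentence, flag in label_sentences(sample):
    print(flag, sentence)
```

A pre-trained language model fine-tuned on labeled sentences would replace `is_security_related`; the rest of the pipeline keeps the same shape.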
Related papers
- CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications [12.370608043864944]
We introduce a semi-automatic framework for inconsistency detection within the standards of 4G and 5G.
Our proposed method uses a revamped few-shot learning mechanism on domain-adapted large language models.
In our investigation, we focused on the Non-Access Stratum (NAS) and the security specifications of 4G and 5G networks, ultimately uncovering 157 inconsistencies with 82.67% accuracy.
arXiv Detail & Related papers (2024-07-18T17:48:46Z)
- Penetration Testing of 5G Core Network Web Technologies [53.89039878885825]
We present the first security assessment of the 5G core from a web security perspective.
We use the STRIDE threat modeling approach to define a complete list of possible threat vectors and associated attacks.
Our analysis shows that all these cores are vulnerable to at least two of our identified attack vectors.
arXiv Detail & Related papers (2024-03-04T09:27:11Z)
- Evaluation of EAP Usage for Authenticating Eduroam Users in 5G Networks [0.0]
This paper highlights key findings resulting from an analysis on the subject conducted through a test environment.
The implementation of this connectivity requires specific protocols to ensure authentication and reliability.
arXiv Detail & Related papers (2024-02-16T18:44:57Z)
- Towards Semantic Communication Protocols for 6G: From Protocol Learning to Language-Oriented Approaches [60.6632432485476]
6G systems are expected to address a wide range of non-stationary tasks. This poses challenges to traditional medium access control (MAC) protocols that are static and predefined.
Data-driven MAC protocols have recently emerged, offering the ability to tailor their signaling messages to specific tasks.
This article presents a novel categorization of these data-driven MAC protocols into three levels: Level 1 MAC. task-oriented neural protocols constructed using multi-agent deep reinforcement learning (MADRL); Level 2 MAC. neural network-oriented symbolic protocols developed by converting Level 1 MAC outputs into explicit symbols; and Level 3 MAC. language-oriented semantic protocols harnessing
arXiv Detail & Related papers (2023-10-14T06:28:50Z)
- Towards Supporting Intelligence in 5G/6G Core Networks: NWDAF Implementation and Initial Analysis [3.5573601621032935]
The work presented in this paper incorporates a functional NWDAF into a 5G network developed using open source software.
The expected limitations of 5G networks are discussed as motivation for the development of 6G networks.
arXiv Detail & Related papers (2022-05-30T14:15:46Z)
- Neuro-Symbolic Artificial Intelligence (AI) for Intent based Semantic Communication [85.06664206117088]
6G networks must consider semantics and effectiveness (at end-user) of the data transmission.
NeSy AI is proposed as a pillar for learning causal structure behind the observed data.
GFlowNet is leveraged for the first time in a wireless system to learn the probabilistic structure which generates the data.
arXiv Detail & Related papers (2022-05-22T07:11:57Z)
- Machine Learning Assisted Security Analysis of 5G-Network-Connected Systems [5.918387680589584]
5G networks have transitioned to software-defined infrastructures.
New technologies, like network function virtualization and software-defined networking, have been incorporated in the 5G core network (5GCN) architecture to enable this transition.
This article presents a comprehensive security analysis framework for the 5GCN.
arXiv Detail & Related papers (2021-08-07T20:07:08Z)
- Generative Conversational Networks [67.13144697969501]
We propose a framework called Generative Conversational Networks, in which conversational agents learn to generate their own labelled training data.
We show an average improvement of 35% in intent detection and 21% in slot tagging over a baseline model trained from the seed data.
arXiv Detail & Related papers (2021-06-15T23:19:37Z)
- True-data Testbed for 5G/B5G Intelligent Network [46.09035008165811]
We build the world's first true-data testbed for 5G/B5G intelligent networks (TTIN).
TTIN comprises 5G/B5G on-site experimental networks, data acquisition & data warehouse, and AI engine & network optimization.
This paper elaborates on the system architecture and module design of TTIN.
arXiv Detail & Related papers (2020-11-26T06:42:36Z)
- Regularized Densely-connected Pyramid Network for Salient Instance Segmentation [73.17802158095813]
We propose a new pipeline for end-to-end salient instance segmentation (SIS).
To better use the rich feature hierarchies in deep networks, we propose the regularized dense connections.
A novel multi-level RoIAlign based decoder is introduced to adaptively aggregate multi-level features for better mask predictions.
arXiv Detail & Related papers (2020-08-28T00:13:30Z)
- 5G Security and Privacy: A Research Roadmap [24.802753928579477]
5G - the latest generation of cellular networks - combines different technologies to increase capacity, reduce latency, and save energy.
We outline recent approaches supporting systematic analyses of 4G LTE and 5G protocols and their related defenses.
arXiv Detail & Related papers (2020-03-30T16:36:43Z)