netFound: Foundation Model for Network Security
- URL: http://arxiv.org/abs/2310.17025v3
- Date: Mon, 07 Oct 2024 23:07:07 GMT
- Title: netFound: Foundation Model for Network Security
- Authors: Satyandra Guthula, Roman Beltiukov, Navya Battula, Wenbo Guo, Arpit Gupta,
- Abstract summary: This paper introduces a novel transformer-based network foundation model, netFound.
We employ self-supervised learning techniques on abundant, unlabeled network telemetry data for pre-training.
Our results demonstrate that netFound effectively captures the hidden networking context in production settings.
- Score: 11.38388749887112
- Abstract: Developing generalizable ML-based solutions for disparate learning problems in network security is highly desirable. However, despite a rich history of applying ML to network security, most existing solutions lack generalizability. This lack of progress can be attributed to an overreliance on supervised learning techniques and the associated challenges of curating well-specified labeled training data. This paper addresses a fundamental gap by introducing a novel transformer-based network foundation model, netFound. We employ self-supervised learning techniques on abundant, unlabeled network telemetry data for pre-training. This pretrained model can subsequently be fine-tuned to create generalizable learning artifacts for disparate learning tasks, even when using commonly available but challenging labeled datasets that are sparse, noisy, and skewed. To realize this goal, netFound leverages various domain-specific attributes and constraints unique to network data (packet traces) by developing multi-modal embeddings, protocol-aware tokenization, data-driven token composition, and hierarchical transformers. Our results demonstrate that netFound's domain-specific design choices ensure that it (1) effectively captures the hidden networking context in production settings, (2) outperforms four different SOTA methods on five different learning tasks, and (3) is robust to both noisy labels and learning shortcuts -- critical for developing generalizable ML models in practical settings.
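To make the tokenization idea concrete, the following minimal Python sketch aligns tokens to protocol header fields instead of fixed-size byte chunks, which is the essence of protocol-aware tokenization. The field table, token format, and function names below are our own illustration under that assumption, not netFound's actual tokenizer or vocabulary.

```python
# Hypothetical sketch of protocol-aware tokenization: rather than splitting a
# packet into fixed-size byte chunks, tokens are aligned to protocol header
# fields so each token carries one semantic unit. The field widths follow the
# standard IPv4/TCP header layout; everything else is illustrative.
import struct

# (field name, byte offset, byte width) for a minimal IPv4 + TCP header view.
IPV4_TCP_FIELDS = [
    ("ip_ttl",    8, 1),
    ("ip_proto",  9, 1),
    ("ip_src",   12, 4),
    ("ip_dst",   16, 4),
    ("tcp_sport", 20, 2),
    ("tcp_dport", 22, 2),
    ("tcp_flags", 33, 1),
]

def tokenize_packet(header: bytes) -> list[tuple[str, int]]:
    """Map each protocol field to one (name, value) token."""
    tokens = []
    for name, offset, width in IPV4_TCP_FIELDS:
        value = int.from_bytes(header[offset:offset + width], "big")
        tokens.append((name, value))
    return tokens

# Toy 40-byte IPv4+TCP header (no options), just to exercise the tokenizer.
example = struct.pack(
    "!BBHHHBBH4s4sHHLLBBHHH",
    0x45, 0, 40, 0, 0, 64, 6, 0,
    bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    443, 51234, 0, 0, 0x50, 0x18, 65535, 0, 0,
)
print(tokenize_packet(example))
```

In a full model along the lines the abstract describes, each (field, value) token would be mapped to an embedding and combined with burst- and flow-level context by the hierarchical transformer; this sketch covers only the field-aligned tokenization step.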
Related papers
- Deep Internal Learning: Deep Learning from a Single Input [88.59966585422914]
In many cases there is value in training a network just from the input at hand.
This is particularly relevant in many signal and image processing problems where training data is scarce and diversity is large.
This survey aims to cover the deep internal-learning techniques that have been proposed in the past few years for such settings.
arXiv Detail & Related papers (2023-12-12T16:48:53Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Mitigating Catastrophic Forgetting in Long Short-Term Memory Networks [7.291687946822538]
Continual learning on sequential data is critical for many machine learning (ML) deployments.
LSTM networks suffer from catastrophic forgetting and are limited in their ability to learn multiple tasks continually.
We discover that catastrophic forgetting in LSTM networks can be overcome in two novel and readily implementable ways.
arXiv Detail & Related papers (2023-05-26T20:17:18Z)
- Federated Learning and Meta Learning: Approaches, Applications, and Directions [94.68423258028285]
In this tutorial, we present a comprehensive review of FL, meta learning, and federated meta learning (FedMeta).
Unlike other tutorials, we explore how FL, meta learning, and FedMeta methodologies can be designed, optimized, and evolved, and how they can be applied over wireless networks.
arXiv Detail & Related papers (2022-10-24T10:59:29Z)
- Transfer Learning with Pre-trained Conditional Generative Models [29.43740987925133]
We propose a transfer learning method that uses deep generative models and consists of two stages: pseudo pre-training and pseudo semi-supervised learning.
Our experimental results indicate that our method outperforms scratch-training and knowledge-distillation baselines.
arXiv Detail & Related papers (2022-04-27T10:36:32Z)
- Training Deep Networks from Zero to Hero: avoiding pitfalls and going beyond [59.94347858883343]
This tutorial covers the basic steps as well as more recent options to improve models.
It can be particularly useful for datasets that are not as well prepared as those in challenges.
arXiv Detail & Related papers (2021-09-06T21:31:42Z)
- Federated Learning: A Signal Processing Perspective [144.63726413692876]
Federated learning is an emerging machine learning paradigm for training models across multiple edge devices holding local datasets, without explicitly exchanging the data.
This article provides a unified systematic framework for federated learning in a manner that encapsulates and highlights the main challenges that are natural to treat using signal processing tools.
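As a concrete illustration of this paradigm (not of this particular article's framework), below is a minimal sketch of one FedAvg-style round (McMahan et al.), in which clients share only model weights while raw data stays local; the function names and the toy least-squares model are our own.

```python
# Minimal sketch of one FedAvg-style round, the canonical instantiation of
# federated learning. Models are plain weight vectors and local "training"
# is sequential SGD on a least-squares loss; all names here are illustrative.
def local_step(weights, data, lr=0.1):
    """One gradient step of 0.5*(w.x - y)^2 on each local example."""
    w = list(weights)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def fedavg_round(global_weights, client_datasets):
    """Each client trains locally; the server averages the resulting weights,
    weighted by local dataset size. Raw data never leaves a client."""
    updates, sizes = [], []
    for data in client_datasets:
        updates.append(local_step(global_weights, data))
        sizes.append(len(data))
    total = sum(sizes)
    return [
        sum(n * u[i] for n, u in zip(sizes, updates)) / total
        for i in range(len(global_weights))
    ]

# Two clients with private (x, y) samples; only weights cross the network.
clients = [
    [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0)],
    [([1.0, 1.0], 3.0)],
]
w = [0.0, 0.0]
for _ in range(100):
    w = fedavg_round(w, clients)
print(w)  # converges toward [1.0, 2.0]
```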
arXiv Detail & Related papers (2021-03-31T15:14:39Z)
- Applying Graph-based Deep Learning To Realistic Network Scenarios [5.453745629140304]
This paper presents a new graph-based deep learning model that accurately estimates the per-path mean delay in networks.
The proposed model generalizes successfully over topologies, routing configurations, queue scheduling policies, and traffic matrices unseen during training.
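For intuition about how such models work, here is a toy sketch of the alternating path/link message-passing pattern used by graph-based delay models in this vein (e.g., RouteNet); the hand-written update and readout functions are placeholders for the learned components of a real model, and all names are our own.

```python
# Toy sketch of path/link message passing for delay estimation. Real models
# learn the update and readout functions (e.g., with GRUs); here they are
# hand-written placeholders, so treat this purely as the message-flow shape.
def estimate_path_delays(paths, link_load, iterations=3):
    """paths: list of link-id lists; link_load: initial utilization per link."""
    link_state = dict(link_load)                # hidden state per link
    path_state = {i: 0.0 for i in range(len(paths))}

    for _ in range(iterations):
        # Path update: aggregate the states of the links each path traverses.
        for i, links in enumerate(paths):
            path_state[i] = sum(link_state[l] for l in links)
        # Link update: aggregate the states of the paths crossing each link.
        for l in link_state:
            crossing = [path_state[i] for i, p in enumerate(paths) if l in p]
            link_state[l] = link_load[l] + 0.1 * sum(crossing)

    # Readout: map each path's final state to a mean-delay estimate.
    return {i: 1.0 + s for i, s in path_state.items()}

# Three links, two paths sharing link "b"; the shared load raises both delays.
paths = [["a", "b"], ["b", "c"]]
print(estimate_path_delays(paths, {"a": 0.2, "b": 0.7, "c": 0.1}))
```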
arXiv Detail & Related papers (2020-10-13T20:58:59Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC, along with the open problems these approaches face.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)