Classification of Home Network Problems with Transformers
- URL: http://arxiv.org/abs/2312.01445v1
- Date: Sun, 3 Dec 2023 16:27:06 GMT
- Title: Classification of Home Network Problems with Transformers
- Authors: Jeremias Dötterl, Zahra Hemmati Fard
- Abstract summary: We propose a model that can identify ten common home network problems based on the raw textual output of networking tools such as ping, dig, and ip.
Our deep learning model uses an encoder-only transformer architecture with a particular pre-tokenizer that we propose for splitting the tool output into token sequences.
Our model achieves high accuracy in our experiments, demonstrating the high potential of transformer-based problem classification for the home network.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a classifier that can identify ten common home network problems
based on the raw textual output of networking tools such as ping, dig, and ip.
Our deep learning model uses an encoder-only transformer architecture with a
particular pre-tokenizer that we propose for splitting the tool output into
token sequences. The use of transformers distinguishes our approach from
related work on network problem classification, which still primarily relies on
non-deep-learning methods. Our model achieves high accuracy in our experiments,
demonstrating the high potential of transformer-based problem classification
for the home network.
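To make the described pipeline concrete, here is a minimal sketch, assuming a PyTorch implementation: a regex-based pre-tokenizer splits raw tool output into tokens, and an encoder-only transformer with a mean-pooled classification head predicts one of the ten problem classes. The splitting rule, vocabulary handling, and all hyperparameters are illustrative assumptions, not the authors' actual choices.

```python
# Minimal sketch (not the authors' code): classify raw tool output from
# ping/dig/ip into one of ten home network problem classes with an
# encoder-only transformer. Tokenizer rule and hyperparameters are assumed.
import re
import torch
import torch.nn as nn

NUM_CLASSES = 10   # ten common home network problems (from the abstract)
MAX_LEN = 256      # assumed maximum token-sequence length

def pre_tokenize(tool_output: str) -> list[str]:
    """Split tool output into word, number, and punctuation tokens
    (an illustrative splitting rule, not the proposed pre-tokenizer)."""
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", tool_output)

class ToolOutputClassifier(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 128,
                 nhead: int = 4, num_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(MAX_LEN, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)  # encoder-only
        self.head = nn.Linear(d_model, NUM_CLASSES)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer ids from a vocabulary lookup
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.embed(token_ids) + self.pos(positions)
        x = self.encoder(x)
        return self.head(x.mean(dim=1))   # mean-pool tokens, then classify
```

In use, one would run pre_tokenize on a ping or dig transcript, map the tokens to ids with a vocabulary built from training data, pad or truncate to MAX_LEN, and train with a cross-entropy loss over the ten labels.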
Related papers
- Generalization emerges from local optimization in a self-organized learning network [0.0]
We design and analyze a new paradigm for building supervised learning networks, driven only by local optimization rules without relying on a global error function.
Our network stores new knowledge in the nodes accurately and instantaneously, in the form of a lookup table.
We show on numerous examples of classification tasks that the networks generated by our algorithm systematically reach a state of perfect generalization once the number of learned examples becomes sufficiently large.
We report on the dynamics of the change of state and show that it is abrupt and has the distinctive characteristics of a first order phase transition, a phenomenon already observed for traditional learning networks and known as grokking.
arXiv Detail & Related papers (2024-10-03T15:32:08Z)
- Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot [50.16171384920963]
The transformer architecture has prevailed in various deep learning settings.
A one-layer transformer trained with gradient descent provably learns the sparse token selection task.
arXiv Detail & Related papers (2024-06-11T02:15:53Z) - Training toward significance with the decorrelated event classifier transformer neural network [0.0]
In natural language processing, one of the leading neural network architectures is the transformer.
It is found that the trained transformer network can perform better than boosted decision trees and feed-forward networks.
arXiv Detail & Related papers (2023-12-31T08:57:29Z)
- Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers [5.356051655680145]
This work presents an analysis of the effectiveness of using standard shallow feed-forward networks to mimic the behavior of the attention mechanism in the original Transformer model.
We substitute key elements of the attention mechanism in the Transformer with simple feed-forward networks, trained using the original components via knowledge distillation.
Our experiments, conducted on the IWSLT 2017 dataset, reveal the capacity of these "attentionless Transformers" to rival the performance of the original architecture.
arXiv Detail & Related papers (2023-11-17T16:58:52Z)
- Centered Self-Attention Layers [89.21791761168032]
The self-attention mechanism in transformers and the message-passing mechanism in graph neural networks are repeatedly applied.
We show that this repeated application inevitably leads to oversmoothing, i.e., to similar representations at the deeper layers.
We present a correction term to the aggregating operator of these mechanisms.
arXiv Detail & Related papers (2023-06-02T15:19:08Z)
- Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation [105.22961467028234]
Skip connections and normalisation layers are ubiquitous in the training of Deep Neural Networks (DNNs).
Recent approaches such as Deep Kernel Shaping have made progress towards reducing our reliance on them, but these approaches are incompatible with the self-attention layers present in transformers.
arXiv Detail & Related papers (2023-02-20T21:26:25Z)
- Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks [6.525090891505941]
We show how a causal transformer can perform a set of algorithmic tasks, including copying, sorting, and hierarchical compositions.
We show that two-layer transformers learn generalizable solutions to multi-level problems and develop signs of systematic task decomposition.
These results provide key insights into how transformer models may be capable of decomposing complex decisions into reusable, multi-level policies.
arXiv Detail & Related papers (2022-10-02T00:46:36Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper that applies transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
- DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
arXiv Detail & Related papers (2020-03-30T17:59:18Z)
- Robustness Verification for Transformers [165.25112192811764]
We develop the first robustness verification algorithm for Transformers.
The certified robustness bounds computed by our method are significantly tighter than those obtained by naive Interval Bound Propagation (IBP; see the sketch after this list).
These bounds also shed light on interpreting Transformers as they consistently reflect the importance of different words in sentiment analysis.
arXiv Detail & Related papers (2020-02-16T17:16:31Z)
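The last entry above compares against naive Interval Bound Propagation; as a point of reference, here is a minimal sketch of that baseline (my own illustration, not the paper's tighter verification method) propagating elementwise input bounds through a single affine layer.

```python
# Naive interval bound propagation (IBP) through an affine layer y = W x + b:
# the baseline that the robustness-verification paper reportedly tightens.
import torch

def ibp_affine(W: torch.Tensor, b: torch.Tensor,
               lower: torch.Tensor, upper: torch.Tensor):
    """Given elementwise bounds lower <= x <= upper, return sound
    elementwise bounds on W @ x + b."""
    center = (upper + lower) / 2
    radius = (upper - lower) / 2
    out_center = W @ center + b
    out_radius = W.abs() @ radius   # worst-case spread of the interval
    return out_center - out_radius, out_center + out_radius
```

Composing such bounds layer by layer yields certified but often loose robustness regions, which is why tighter Transformer-specific bounds are of interest.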