Impact of Latent Space Dimension on IoT Botnet Detection Performance: VAE-Encoder Versus ViT-Encoder
- URL: http://arxiv.org/abs/2504.14879v1
- Date: Mon, 21 Apr 2025 06:15:07 GMT
- Title: Impact of Latent Space Dimension on IoT Botnet Detection Performance: VAE-Encoder Versus ViT-Encoder
- Authors: Hassan Wasswa, Aziida Nanyonga, Timothy Lynar
- Abstract summary: This study investigates how the latent dimension impacts the performance of different deep learning classifiers when they are trained on latent vector representations of the training dataset. The encoder components are employed to project high-dimensional structured .csv IoT botnet traffic datasets to various latent sizes.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid evolution of Internet of Things (IoT) technology has led to a significant increase in the number of IoT devices, applications, and services. This surge, along with the widespread presence of IoT devices, has made them a prime target for cyber-attacks, particularly through IoT botnets, so security has become a major concern within the IoT ecosystem. This study investigates how the latent dimension impacts the performance of different deep learning classifiers when they are trained on latent vector representations of the training dataset. The primary objective is to compare model outcomes when the encoder components of two cutting-edge architectures, the Vision Transformer (ViT) and the Variational Auto-Encoder (VAE), are used to project the high-dimensional training dataset onto a learned low-dimensional latent space. The encoder components project high-dimensional structured .csv IoT botnet traffic datasets to various latent sizes. Evaluated on the N-BaIoT and CICIoT2022 datasets, the findings reveal that VAE-encoder-based dimensionality reduction outperforms ViT-encoder-based dimensionality reduction on both datasets, across all models, in terms of four performance metrics: accuracy, precision, recall, and F1-score. This can be attributed to the absence, in the tabular traffic data, of the spatial patterns that the ViT model attempts to learn and extract from image instances.
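To make the compared pipeline concrete, below is a minimal PyTorch sketch of the two dimension-reduction paths the abstract describes: a VAE encoder that maps a flat traffic-feature vector to a latent vector via the reparameterization trick, and a ViT-style encoder that slices the same flat vector into "patch" tokens before a Transformer encoder pools them to the same latent size. All class names, layer widths, the patch length, and the latent-size sweep are illustrative assumptions rather than the authors' configuration, and in practice both encoders would first be trained (the VAE with a reconstruction-plus-KL objective) before their latents are handed to the downstream classifiers.

```python
# Minimal sketch (PyTorch) of the two encoder paths. Layer sizes, patch
# length, and the latent-size sweep are illustrative assumptions.
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    """VAE path: tabular feature vector -> latent vector of size latent_dim."""
    def __init__(self, n_features: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )
        self.mu = nn.Linear(64, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(64, latent_dim)  # log-variance of q(z|x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.backbone(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

class ViTEncoder(nn.Module):
    """ViT-style path: slice the flat feature vector into 'patch' tokens,
    run a Transformer encoder, and pool to a latent vector."""
    def __init__(self, n_features: int, latent_dim: int, patch_size: int = 5,
                 d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        assert n_features % patch_size == 0, "pad features to a multiple of patch_size"
        self.patch_size = patch_size
        n_patches = n_features // patch_size
        self.embed = nn.Linear(patch_size, d_model)                  # patch embedding
        self.pos = nn.Parameter(torch.zeros(1, n_patches, d_model))  # learned positions
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, latent_dim)                   # project to latent size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.view(x.size(0), -1, self.patch_size)  # (B, n_patches, patch_size)
        h = self.encoder(self.embed(tokens) + self.pos)
        return self.head(h.mean(dim=1))                  # mean-pool the tokens

n_features = 115                      # N-BaIoT rows carry 115 statistical features
x = torch.randn(32, n_features)       # dummy batch standing in for scaled .csv rows
for latent_dim in (2, 4, 8, 16, 32):  # example sweep, not the paper's exact grid
    z_vae = VAEEncoder(n_features, latent_dim)(x)  # (32, latent_dim)
    z_vit = ViTEncoder(n_features, latent_dim)(x)  # (32, latent_dim)
    # The study then trains deep learning classifiers on such latent vectors
    # and compares accuracy, precision, recall, and F1-score per latent size.
```

The point of the sketch is the interface both paths share: each consumes a (batch, n_features) tabular batch and emits a (batch, latent_dim) representation, so the same classifiers and metrics can be swept over latent_dim for either encoder.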
Related papers
- FLARE: Feature-based Lightweight Aggregation for Robust Evaluation of IoT Intrusion Detection [0.0]
Internet of Things (IoT) devices have expanded the attack surface, necessitating efficient intrusion detection systems (IDSs) for network protection.
This paper presents FLARE, a feature-based lightweight aggregation for robust evaluation of IoT intrusion detection.
We employ four supervised learning models and two deep learning models to classify attacks in IoT IDS.
arXiv Detail & Related papers (2025-04-21T18:33:53Z) - LaVin-DiT: Large Vision Diffusion Transformer [99.98106406059333]
LaVin-DiT is a scalable and unified foundation model designed to tackle over 20 computer vision tasks in a generative framework.
We introduce key innovations to optimize generative performance for vision tasks.
The model is scaled from 0.1B to 3.4B parameters, demonstrating substantial scalability and state-of-the-art performance across diverse vision tasks.
arXiv Detail & Related papers (2024-11-18T12:05:27Z) - Optimizing Vision Transformers with Data-Free Knowledge Transfer [8.323741354066474]
Vision transformers (ViTs) have excelled in various computer vision tasks due to their superior ability to capture long-distance dependencies.
We propose compressing large ViT models using Knowledge Distillation (KD), implemented in a data-free manner to circumvent limitations related to data availability.
arXiv Detail & Related papers (2024-08-12T07:03:35Z) - Leveraging Foundation Models for Zero-Shot IoT Sensing [5.319176383069102]
Deep learning models are increasingly deployed on edge Internet of Things (IoT) devices.
ZSL aims to classify data of unseen classes with the help of semantic information.
In this work, we align the IoT data embeddings with the semantic embeddings generated by an FM's text encoder for zero-shot IoT sensing.
arXiv Detail & Related papers (2024-07-29T11:16:48Z) - LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection [63.780355815743135]
We present a light-weight detection transformer, LW-DETR, which outperforms YOLOs for real-time object detection.
The architecture is a simple stack of a ViT encoder, a projector, and a shallow DETR decoder.
arXiv Detail & Related papers (2024-06-05T17:07:24Z) - Energy-Efficient Edge Learning via Joint Data Deepening-and-Prefetching [9.468399367975984]
We propose a novel offloading architecture called joint data deepening-and-prefetching (JD2P).
JD2P performs feature-by-feature offloading and comprises two key techniques.
We evaluate the effectiveness of JD2P through experiments using the MNIST dataset.
arXiv Detail & Related papers (2024-02-19T08:12:47Z) - Constrained Twin Variational Auto-Encoder for Intrusion Detection in IoT Systems [30.16714420093091]
Intrusion detection systems (IDSs) play a critical role in protecting billions of IoT devices from malicious attacks.
This article proposes a novel deep neural network architecture called Constrained Twin Variational Auto-Encoder (CTVAE).
CTVAE boosts accuracy and F-score in attack detection by around 1% compared to state-of-the-art machine learning and representation learning methods.
arXiv Detail & Related papers (2023-12-05T04:42:04Z) - Enhancing IoT Security via Automatic Network Traffic Analysis: The Transition from Machine Learning to Deep Learning [0.0]
This work provides a comparative analysis illustrating how Deep Learning (DL) surpasses Machine Learning (ML) in addressing tasks within Internet of Things (IoT) environments.
Our approach involves training and evaluating a DL model using a range of diverse IoT-related datasets.
Experiments showcase the ability of DL to surpass the constraints tied to manually engineered features, achieving superior results in attack detection and maintaining comparable outcomes in device-type identification.
arXiv Detail & Related papers (2023-11-20T16:48:50Z) - An Extendable, Efficient and Effective Transformer-based Object Detector [95.06044204961009]
We integrate Vision and Detection Transformers (ViDT) to construct an effective and efficient object detector.
ViDT introduces a reconfigured attention module to extend the recent Swin Transformer to be a standalone object detector.
We extend it to ViDT+ to support joint-task learning for object detection and instance segmentation.
arXiv Detail & Related papers (2022-04-17T09:27:45Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z) - ViDT: An Efficient and Effective Fully Transformer-based Object Detector [97.71746903042968]
Detection transformers are the first fully end-to-end learning systems for object detection.
Vision transformers are the first fully transformer-based architecture for image classification.
In this paper, we integrate Vision and Detection Transformers (ViDT) to build an effective and efficient object detector.
arXiv Detail & Related papers (2021-10-08T06:32:05Z) - Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z)