Detecting Vulnerabilities from Issue Reports for Internet-of-Things
- URL: http://arxiv.org/abs/2511.01941v1
- Date: Mon, 03 Nov 2025 05:59:34 GMT
- Title: Detecting Vulnerabilities from Issue Reports for Internet-of-Things
- Authors: Sogol Masoumzadeh,
- Abstract summary: We propose two approaches to detect vulnerability-indicating issues of 21 Eclipse IoT projects. We fine-tune a pre-trained BERT Masked Language Model (MLM) on 11,000 GitHub issues for classifying vulnerability-indicating issues. Our contributions set the stage for accurately detecting IoT vulnerabilities from issue reports, similar to non-IoT systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Timely identification of issue reports reflecting software vulnerabilities is crucial, particularly for Internet-of-Things (IoT) where analysis is slower than in non-IoT systems. While Machine Learning (ML) and Large Language Models (LLMs) detect vulnerability-indicating issues in non-IoT systems, their IoT use remains unexplored. We are the first to tackle this problem by proposing two approaches: (1) combining ML and LLMs with Natural Language Processing (NLP) techniques to detect vulnerability-indicating issues of 21 Eclipse IoT projects and (2) fine-tuning a pre-trained BERT Masked Language Model (MLM) on 11,000 GitHub issues for classifying vulnerability-indicating issues. Our best performance belongs to a Support Vector Machine (SVM) trained on BERT NLP features, achieving an Area Under the receiver operating characteristic Curve (AUC) of 0.65. The fine-tuned BERT achieves 0.26 accuracy, emphasizing the importance of exposing all data during training. Our contributions set the stage for accurately detecting IoT vulnerabilities from issue reports, similar to non-IoT systems.
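To make the AUC figure quoted in the abstract concrete, the following is a minimal sketch of how an AUC value can be computed from classifier scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration. It uses the pairwise (Mann-Whitney) interpretation, in which AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 marks a vulnerability-indicating issue, 0 any other issue.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so a value of 0.65 indicates a ranking that is better than chance but far from perfect.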
Related papers
- Benchmarking Machine Learning Models for IoT Malware Detection under Data Scarcity and Drift [0.5735035463793007]
Internet of Things (IoT) devices are prime targets for cyberattacks and malware applications. Machine learning (ML) offers a promising approach to automated malware detection and classification. This study investigates the effectiveness of four supervised learning models for malware detection and classification.
arXiv Detail & Related papers (2026-01-26T17:59:33Z) - Multi-Agent Collaborative Intrusion Detection for Low-Altitude Economy IoT: An LLM-Enhanced Agentic AI Framework [60.72591149679355]
The rapid expansion of low-altitude economy Internet of Things (LAE-IoT) networks has created unprecedented security challenges. Traditional intrusion detection systems fail to tackle the unique characteristics of aerial IoT environments. We introduce a large language model (LLM)-enabled agentic AI framework for enhancing intrusion detection in LAE-IoT networks.
arXiv Detail & Related papers (2026-01-25T12:47:25Z) - ParaVul: A Parallel Large Language Model and Retrieval-Augmented Framework for Smart Contract Vulnerability Detection [43.41293570032631]
ParaVul is a retrieval-augmented framework to improve the reliability and accuracy of smart contract vulnerability detection. We develop Sparse Low-Rank Adaptation (SLoRA) for LLM fine-tuning. We construct a vulnerability contract dataset and develop a hybrid Retrieval-Augmented Generation (RAG) system.
arXiv Detail & Related papers (2025-10-20T03:23:41Z) - Large Language Models for Real-World IoT Device Identification [5.841950328636518]
We introduce a semantic inference pipeline that reframes device identification as a language modeling task over heterogeneous network metadata. To construct reliable supervision, we generate high-fidelity vendor labels for the IoT Inspector dataset. We then instruction-tune a quantized LLaMA3.1 8B model with curriculum learning to support generalization under sparsity and long-tail vendor distributions.
arXiv Detail & Related papers (2025-09-24T05:33:48Z) - Towards Lifecycle Unlearning Commitment Management: Measuring Sample-level Unlearning Completeness [30.596695293390415]
Interpolated Approximate Measurement (IAM) is a framework designed for unlearning inference. IAM quantifies sample-level unlearning completeness by interpolating the model's generalization-fitting behavior gap on queried samples. We apply IAM to recent approximate unlearning algorithms, revealing general risks of both over-unlearning and under-unlearning.
arXiv Detail & Related papers (2025-06-06T14:22:18Z) - Enhancing IoT-Botnet Detection using Variational Auto-encoder and Cost-Sensitive Learning: A Deep Learning Approach for Imbalanced Datasets [0.0]
The work in this study leveraged Variational Auto-encoder (VAE) and cost-sensitive learning to develop models for IoT-botnet detection. The aim is to enhance the detection of minority class attack traffic instances which are often missed by machine learning models.
arXiv Detail & Related papers (2025-04-26T02:04:30Z) - Smart IoT Security: Lightweight Machine Learning Techniques for Multi-Class Attack Detection in IoT Networks [0.0]
The Internet of Things (IoT) is expanding at an accelerated pace, making it critical to have secure networks to mitigate a variety of cyber threats. This study addresses the limitation of multi-class attack detection of IoT devices and presents new machine learning-based lightweight ensemble methods.
arXiv Detail & Related papers (2025-02-06T13:17:03Z) - Federated Learning framework for LoRaWAN-enabled IIoT communication: A case study [41.831392507864415]
Anomaly detection plays a crucial role in preventive maintenance and spotting irregularities in industrial components.
Traditional Machine Learning faces challenges in deploying anomaly detection models in resource-constrained environments like LoRaWAN.
Federated Learning (FL) solves this problem by enabling distributed model training, addressing privacy concerns, and minimizing data transmission.
arXiv Detail & Related papers (2024-10-15T13:48:04Z) - Lightweight CNN-BiLSTM based Intrusion Detection Systems for Resource-Constrained IoT Devices [38.16309790239142]
Intrusion Detection Systems (IDSs) have played a significant role in detecting and preventing cyber-attacks within traditional computing systems.
The limited computational resources available on Internet of Things (IoT) devices make it challenging to deploy conventional computing-based IDSs.
We propose a hybrid CNN architecture composed of a lightweight CNN and bidirectional LSTM (BiLSTM) to enhance the performance of IDS on the UNSW-NB15 dataset.
arXiv Detail & Related papers (2024-06-04T20:36:21Z) - Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning [52.6706505729803]
We introduce Federated Learning (FL) to collaboratively train a decentralized shared model of Intrusion Detection Systems (IDS).
FLEKD enables a more flexible aggregation method than conventional model fusion techniques.
Experiment results show that the proposed approach outperforms local training and traditional FL in terms of both speed and performance.
arXiv Detail & Related papers (2024-01-22T14:16:37Z) - Efficient Attack Detection in IoT Devices using Feature Engineering-Less Machine Learning [0.0]
This research proposes a way to overcome the barrier by bypassing feature engineering in the deep learning pipeline and using raw packet data as input.
We introduce a feature engineering-less machine learning (ML) process to perform malware detection on IoT devices.
Our proposed model, "Feature engineering-less-ML (FEL-ML)," is a lighter-weight detection algorithm that expends no extra computations on "engineered" features.
arXiv Detail & Related papers (2023-01-09T17:26:37Z) - Automated Identification of Vulnerable Devices in Networks using Traffic Data and Deep Learning [30.536369182792516]
Device-type identification combined with data from vulnerability databases can pinpoint vulnerable IoT devices in a network.
We present and evaluate two deep learning approaches to reliable IoT device-type identification.
arXiv Detail & Related papers (2021-02-16T14:49:34Z) - Lightweight Collaborative Anomaly Detection for the IoT using Blockchain [40.52854197326305]
Internet of things (IoT) devices tend to have many vulnerabilities which can be exploited by an attacker.
Unsupervised techniques, such as anomaly detection, can be used to secure these devices in a plug-and-protect manner.
We present a distributed IoT simulation platform, which consists of 48 Raspberry Pis.
arXiv Detail & Related papers (2020-06-18T14:50:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.