Related papers: KD-GAT: Combining Knowledge Distillation and Graph Attention Transformer for a Controller Area Network Intrusion Detection System

KD-GAT: Combining Knowledge Distillation and Graph Attention Transformer for a Controller Area Network Intrusion Detection System

URL: http://arxiv.org/abs/2507.19686v1
Date: Fri, 25 Jul 2025 21:45:58 GMT
Title: KD-GAT: Combining Knowledge Distillation and Graph Attention Transformer for a Controller Area Network Intrusion Detection System
Authors: Robert Frenken, Sidra Ghayour Bhatti, Hanqin Zhang, Qadeer Ahmed,
Abstract summary: Controller Area Network (CAN) protocol is widely adopted for in-vehicle communication but lacks inherent security mechanisms.<n>This paper introduces KD-GAT, an intrusion detection framework that combines Graph Attention Networks (GATs) with knowledge distillation.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Controller Area Network (CAN) protocol is widely adopted for in-vehicle communication but lacks inherent security mechanisms, making it vulnerable to cyberattacks. This paper introduces KD-GAT, an intrusion detection framework that combines Graph Attention Networks (GATs) with knowledge distillation (KD) to enhance detection accuracy while reducing computational complexity. In our approach, CAN traffic is represented as graphs using a sliding window to capture temporal and relational patterns. A multi-layer GAT with jumping knowledge aggregation acting as the teacher model, while a compact student GAT--only 6.32% the size of the teacher--is trained via a two-phase process involving supervised pretraining and knowledge distillation with both soft and hard label supervision. Experiments on three benchmark datasets--Car-Hacking, Car-Survival, and can-train-and-test demonstrate that both teacher and student models achieve strong results, with the student model attaining 99.97% and 99.31% accuracy on Car-Hacking and Car-Survival, respectively. However, significant class imbalance in can-train-and-test has led to reduced performance for both models on this dataset. Addressing this imbalance remains an important direction for future work.

Related papers

Multi-Stage Knowledge-Distilled VGAE and GAT for Robust Controller-Area-Network Intrusion Detection [0.0]
The Controller Area Network (CAN) protocol is a standard for in-vehicle communication but remains susceptible to cyber-attacks due to its lack of built-in security.<n>This paper presents a multi-stage intrusion detection framework leveraging unsupervised anomaly detection and supervised graph learning tailored for automotive CAN traffic.
arXiv Detail & Related papers (2025-08-06T19:50:26Z)
Efficient Self-Supervised Neuro-Analytic Visual Servoing for Real-time Quadrotor Control [7.791675745811072]
This work introduces a self-supervised neuro-analytical, cost efficient, model for visual-based quadrotor control in which a small 1.7M parameters student ConvNet learns automatically from an analytical teacher.<n>Our vision-only self-supervised neuro-analytic control, enables quadrotor orientation and movement without requiring explicit geometric models or fiducial markers.
arXiv Detail & Related papers (2025-07-26T09:17:38Z)
Overtake Detection in Trucks Using CAN Bus Signals: A Comparative Study of Machine Learning Methods [51.28632782308621]
We focus on overtake detection using Controller Area Network (CAN) bus data collected from five in-service trucks provided by the Volvo Group.<n>We evaluate three common classifiers for vehicle manoeuvre detection, Artificial Neural Networks (ANN), Random Forest (RF), and Support Vector Machines (SVM)<n>Our pertruck analysis also reveals that classification accuracy, especially for overtakes, depends on the amount of training data per vehicle.
arXiv Detail & Related papers (2025-07-01T09:20:41Z)
Enhancing IoT-Botnet Detection using Variational Auto-encoder and Cost-Sensitive Learning: A Deep Learning Approach for Imbalanced Datasets [0.0]
The work in this study leveraged Variational Auto-encoder (VAE) and cost-sensitive learning to develop models for IoT-botnet detection.<n>The aim is to enhance the detection of minority class attack traffic instances which are often missed by machine learning models.
arXiv Detail & Related papers (2025-04-26T02:04:30Z)
Knowledge Distillation Neural Network for Predicting Car-following Behaviour of Human-driven and Autonomous Vehicles [2.099922236065961]
This study investigates the car-following behaviours of three vehicle pairs: HDV-AV, AV-HDV and HDV-HDV in mixed traffic. We introduce a data-driven Knowledge Distillation Neural Network (KDNN) model for predicting car-following behaviour in terms of speed.
arXiv Detail & Related papers (2024-11-08T14:57:59Z)
Exploring Highly Quantised Neural Networks for Intrusion Detection in Automotive CAN [13.581341206178525]
Machine learning-based intrusion detection models have been shown to successfully detect multiple targeted attack vectors. In this paper, we present a case for custom-quantised literature (CQMLP) as a multi-class classification model. We show that the 2-bit CQMLP model, when integrated as the IDS, can detect malicious attack messages with a very high accuracy of 99.9%.
arXiv Detail & Related papers (2024-01-19T21:11:02Z)
CrossKD: Cross-Head Knowledge Distillation for Object Detection [69.16346256926842]
Knowledge Distillation (KD) has been validated as an effective model compression technique for learning compact object detectors. We present a prediction mimicking distillation scheme, called CrossKD, which delivers the intermediate features of the student's detection head to the teacher's detection head. Our CrossKD boosts the average precision of GFL ResNet-50 with 1x training schedule from 40.2 to 43.7, outperforming all existing KD methods.
arXiv Detail & Related papers (2023-06-20T08:19:51Z)
Certified Interpretability Robustness for Class Activation Mapping [77.58769591550225]
We present CORGI, short for Certifiably prOvable Robustness Guarantees for Interpretability mapping. CORGI is an algorithm that takes in an input image and gives a certifiable lower bound for the robustness of its CAM interpretability map. We show the effectiveness of CORGI via a case study on traffic sign data, certifying lower bounds on the minimum adversarial perturbation.
arXiv Detail & Related papers (2023-01-26T18:58:11Z)
Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation [65.62538699160085]
We propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation. KD-DAGFM achieves the best performance with less than 21.5% FLOPs of the state-of-the-art method on both online and offline experiments.
arXiv Detail & Related papers (2022-11-21T03:09:42Z)
Supervised Contrastive ResNet and Transfer Learning for the In-vehicle Intrusion Detection System [0.22843885788439797]
We propose a novel deep learning model called supervised contrastive (SupCon) ResNet to handle multiple attack identification on the CAN bus. The model improves the overall false-negative rates of four types of attack by four times on average, compared to other models. The model achieves the highest F1 score at 0.9994 on the survival dataset by utilizing transfer learning.
arXiv Detail & Related papers (2022-07-18T05:34:55Z)
How and When Adversarial Robustness Transfers in Knowledge Distillation? [137.11016173468457]
This paper studies how and when the adversarial robustness can be transferred from a teacher model to a student model in Knowledge distillation (KD) We show that standard KD training fails to preserve adversarial robustness, and we propose KD with input gradient alignment (KDIGA) for remedy. Under certain assumptions, we prove that the student model using our proposed KDIGA can achieve at least the same certified robustness as the teacher model.
arXiv Detail & Related papers (2021-10-22T21:30:53Z)
Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning, that considers both the uncertainty and the robustness of the detector. Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.