Risk-Aware and Explainable Framework for Ensuring Guaranteed Coverage in Evolving Hardware Trojan Detection
- URL: http://arxiv.org/abs/2312.00009v1
- Date: Sat, 14 Oct 2023 03:30:21 GMT
- Title: Risk-Aware and Explainable Framework for Ensuring Guaranteed Coverage in Evolving Hardware Trojan Detection
- Authors: Rahul Vishwakarma, Amin Rezaei,
- Abstract summary: In high-risk and sensitive domain, we cannot accept even a small misclassification.
In this paper, we generate evolving hardware Trojans using our proposed novel conformalized generative adversarial networks.
The proposed approach has been validated on both synthetic and real chip-level benchmarks.
- Score: 2.6396287656676733
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As the semiconductor industry has shifted to a fabless paradigm, the risk of hardware Trojans being inserted at various stages of production has also increased. Recently, there has been a growing trend toward the use of machine learning solutions to detect hardware Trojans more effectively, with a focus on the accuracy of the model as an evaluation metric. However, in a high-risk and sensitive domain, we cannot accept even a small misclassification. Additionally, it is unrealistic to expect an ideal model, especially when Trojans evolve over time. Therefore, we need metrics to assess the trustworthiness of detected Trojans and a mechanism to simulate unseen ones. In this paper, we generate evolving hardware Trojans using our proposed novel conformalized generative adversarial networks and offer an efficient approach to detecting them based on a non-invasive algorithm-agnostic statistical inference framework that leverages the Mondrian conformal predictor. The method acts like a wrapper over any of the machine learning models and produces set predictions along with uncertainty quantification for each new detected Trojan for more robust decision-making. In the case of a NULL set, a novel method to reject the decision by providing a calibrated explainability is discussed. The proposed approach has been validated on both synthetic and real chip-level benchmarks and proven to pave the way for researchers looking to find informed machine learning solutions to hardware security problems.
Related papers
- Bayesian Learned Models Can Detect Adversarial Malware For Free [28.498994871579985]
Adversarial training is an effective method but is computationally expensive to scale up to large datasets.
In particular, a Bayesian formulation can capture the model parameters' distribution and quantify uncertainty without sacrificing model performance.
We found, quantifying uncertainty through Bayesian learning methods can defend against adversarial malware.
arXiv Detail & Related papers (2024-03-27T07:16:48Z) - Uncertainty-Aware Hardware Trojan Detection Using Multimodal Deep
Learning [3.118371710802894]
The risk of hardware Trojans being inserted at various stages of chip production has increased in a zero-trust fabless era.
We propose a multimodal deep learning approach to detect hardware Trojans and evaluate the results from both early fusion and late fusion strategies.
arXiv Detail & Related papers (2024-01-15T05:45:51Z) - PerD: Perturbation Sensitivity-based Neural Trojan Detection Framework
on NLP Applications [21.854581570954075]
Trojan attacks embed the backdoor into the victim and is activated by the trigger in the input space.
We propose a model-level Trojan detection framework by analyzing the deviation of the model output when we introduce a specially crafted perturbation to the input.
We demonstrate the effectiveness of our proposed method on both a dataset of NLP models we create and a public dataset of Trojaned NLP models from TrojAI.
arXiv Detail & Related papers (2022-08-08T22:50:03Z) - CC-Cert: A Probabilistic Approach to Certify General Robustness of
Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z) - Online Defense of Trojaned Models using Misattributions [18.16378666013071]
This paper proposes a new approach to detecting neural Trojans on Deep Neural Networks during inference.
We evaluate our approach on several benchmarks, including models trained on MNIST, Fashion MNIST, and German Traffic Sign Recognition Benchmark.
arXiv Detail & Related papers (2021-03-29T19:53:44Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z) - Evaluating the Safety of Deep Reinforcement Learning Models using
Semi-Formal Verification [81.32981236437395]
We present a semi-formal verification approach for decision-making tasks based on interval analysis.
Our method obtains comparable results over standard benchmarks with respect to formal verifiers.
Our approach allows to efficiently evaluate safety properties for decision-making models in practical applications.
arXiv Detail & Related papers (2020-10-19T11:18:06Z) - Bayesian Optimization with Machine Learning Algorithms Towards Anomaly
Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z) - Cassandra: Detecting Trojaned Networks from Adversarial Perturbations [92.43879594465422]
In many cases, pre-trained models are sourced from vendors who may have disrupted the training pipeline to insert Trojan behaviors into the models.
We propose a method to verify if a pre-trained model is Trojaned or benign.
Our method captures fingerprints of neural networks in the form of adversarial perturbations learned from the network gradients.
arXiv Detail & Related papers (2020-07-28T19:00:40Z) - Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, which is an improvement over the current state-of-the art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.