TEN-GUARD: Tensor Decomposition for Backdoor Attack Detection in Deep
Neural Networks
- URL: http://arxiv.org/abs/2401.05432v1
- Date: Sat, 6 Jan 2024 03:08:28 GMT
- Title: TEN-GUARD: Tensor Decomposition for Backdoor Attack Detection in Deep
Neural Networks
- Authors: Khondoker Murad Hossain, Tim Oates
- Abstract summary: We introduce a novel approach to backdoor detection using two tensor decomposition methods applied to network activations.
This has a number of advantages relative to existing detection methods, including the ability to analyze multiple models at the same time.
Results show that our method detects backdoored networks more accurately and efficiently than current state-of-the-art methods.
- Score: 3.489779105594534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As deep neural networks and the datasets used to train them get larger, the
default approach to integrating them into research and commercial projects is
to download a pre-trained model and fine-tune it. But these models can have
uncertain provenance, opening up the possibility that they embed hidden
malicious behavior such as trojans or backdoors, where small changes to an
input (triggers) can cause the model to produce incorrect outputs (e.g., to
misclassify). This paper introduces a novel approach to backdoor detection that
uses two tensor decomposition methods applied to network activations. This has
a number of advantages relative to existing detection methods, including the
ability to analyze multiple models at the same time, working across a wide
variety of network architectures, making no assumptions about the nature of
triggers used to alter network behavior, and being computationally efficient.
We provide a detailed description of the detection pipeline along with results
on models trained on the MNIST digit dataset, CIFAR-10 dataset, and two
difficult datasets from NIST's TrojAI competition. These results show that our
method detects backdoored networks more accurately and efficiently than current
state-of-the-art methods.
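As a rough illustration of the activation-decomposition idea, the sketch below stacks per-model activation matrices from a shared probe set into a three-way tensor, applies a plain CP/PARAFAC decomposition via the tensorly library, and clusters the model-mode factors to flag suspect models. The tensor layout, rank, and two-cluster heuristic are illustrative assumptions; the paper's two decomposition methods and full pipeline are not reproduced here.

```python
# Minimal sketch: flag anomalous models by decomposing stacked activations.
# Assumes each model produced an (n_samples x n_units) activation matrix on a
# shared probe set; layout, rank, and the clustering step are illustrative.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac
from sklearn.cluster import KMeans

def model_features(activations, rank=5):
    """activations: list of (n_samples, n_units) arrays, one per model."""
    X = tl.tensor(np.stack(activations))   # (n_models, n_samples, n_units)
    _, factors = parafac(X, rank=rank)     # CP decomposition
    return factors[0]                      # model-mode factor matrix

def flag_suspects(activations, rank=5):
    feats = model_features(activations, rank)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats)
    # Heuristic: treat the smaller cluster as the suspect (backdoored) set.
    suspect = np.argmin(np.bincount(labels))
    return np.where(labels == suspect)[0]

# Example with random stand-in activations for 10 hypothetical models:
acts = [np.random.randn(200, 64) for _ in range(10)]
print(flag_suspects(acts))
```

One advantage visible even in this toy version: all models enter a single decomposition, so adding a model to the audit batch costs one more slice of the tensor rather than a separate analysis run.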
Related papers
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks [3.489779105594534]
Backdoors in deep neural networks (DNNs) can be exploited by malicious actors, including in the cloud services that process data with them.
Our approach leverages advanced tensor decomposition algorithms to analyze the weights of pre-trained DNNs and distinguish backdoored from clean models.
This strengthens the security of deep learning and AI in networked systems against evolving threats.
arXiv Detail & Related papers (2024-03-13T03:10:11Z)
- Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks [6.44397009982949]
We introduce a novel method for backdoor detection that extracts features from the weights of pre-trained DNNs.
Compared with other detection techniques, this has several benefits, such as not requiring any training data.
Our method is more accurate and efficient than competing algorithms, helping to ensure the safe application of deep learning and AI.
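A minimal sketch of the weight-factorization idea, assuming TruncatedSVD as a stand-in for the paper's factorization and a random-forest meta-classifier trained on reference models with known labels; neither choice is claimed to match the paper's pipeline:

```python
# Minimal sketch: weight-based features via matrix factorization, with no
# training data needed from the audited task itself. TruncatedSVD and the
# random-forest meta-classifier are stand-ins, not the paper's exact pipeline.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def flatten(weight_matrices):
    """Stack each model's weight matrix into one row per model."""
    return np.stack([w.ravel() for w in weight_matrices])

# Stand-in data: 40 reference models with known labels, 5 models to audit.
ref = flatten([rng.standard_normal((128, 10)) for _ in range(40)])
ref_labels = rng.integers(0, 2, size=40)        # 1 = backdoored
audit = flatten([rng.standard_normal((128, 10)) for _ in range(5)])

svd = TruncatedSVD(n_components=8).fit(ref)     # matrix factorization step
clf = RandomForestClassifier().fit(svd.transform(ref), ref_labels)
print(clf.predict(svd.transform(audit)))        # flag suspected backdoors
```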
arXiv Detail & Related papers (2022-12-15T20:20:18Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Model2Detector: Widening the Information Bottleneck for Out-of-Distribution Detection using a Handful of Gradient Steps [12.263417500077383]
Out-of-distribution detection is an important capability that has long eluded vanilla neural networks.
Recent advances in inference-time out-of-distribution detection help mitigate some of these problems.
We show how our method consistently outperforms the state-of-the-art in detection accuracy on popular image datasets.
arXiv Detail & Related papers (2022-02-22T23:03:40Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
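A minimal sketch of the contrastive-pair idea, using an untrained one-layer GCN-style propagation in NumPy: a node that disagrees with the readout of its own local subgraph gets a high anomaly score. The propagation, readout, and cosine scoring are simplifications; the paper's learned GNN and training objective are omitted.

```python
# Minimal sketch of the contrastive-pair idea: a node should "agree" with its
# own local neighborhood; low agreement flags an anomaly. Weights are
# untrained stand-ins; no training loop is shown.
import numpy as np

def normalize_adj(A):
    A_hat = A + np.eye(len(A))               # add self-loops
    d = A_hat.sum(1)
    return A_hat / np.sqrt(np.outer(d, d))   # D^-1/2 (A+I) D^-1/2

def anomaly_scores(A, X, W):
    H = np.maximum(normalize_adj(A) @ X @ W, 0)   # one propagation layer
    scores = []
    for i in range(len(A)):
        nbrs = np.flatnonzero(A[i])               # the node's local subgraph
        readout = H[nbrs].mean(0) if len(nbrs) else np.zeros(H.shape[1])
        denom = np.linalg.norm(H[i]) * np.linalg.norm(readout) + 1e-8
        scores.append(1.0 - H[i] @ readout / denom)  # high = anomalous
    return np.array(scores)

rng = np.random.default_rng(0)
A = np.triu((rng.random((30, 30)) < 0.1).astype(float), 1)
A += A.T                                     # symmetric random graph
X = rng.standard_normal((30, 16))            # node attributes
W = rng.standard_normal((16, 8))             # untrained stand-in weights
print(anomaly_scores(A, X, W).round(2))
```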
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Cassandra: Detecting Trojaned Networks from Adversarial Perturbations [92.43879594465422]
In many cases, pre-trained models are sourced from vendors who may have disrupted the training pipeline to insert Trojan behaviors into the models.
We propose a method to verify if a pre-trained model is Trojaned or benign.
Our method captures fingerprints of neural networks in the form of adversarial perturbations learned from the network gradients.
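A minimal sketch of the fingerprinting idea: a few FGSM-style gradient steps stand in for the paper's learned universal perturbations, and simple statistics of the resulting perturbation form a feature vector that a separate Trojan classifier (not shown) would consume.

```python
# Minimal sketch: fingerprint a model by the perturbation its gradients
# induce. FGSM-style ascent is a stand-in for the paper's learned universal
# perturbations; a downstream Trojan classifier is assumed, not shown.
import torch
import torch.nn.functional as F

def perturbation_fingerprint(model, x, steps=10, step_size=0.01):
    """x: a batch of inputs; returns simple statistics of an adversarial delta."""
    model.eval()
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        logits = model(x + delta)
        # Push predictions away from the model's current argmax (untargeted).
        loss = F.cross_entropy(logits, logits.argmax(1))
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()   # gradient ascent step
            delta.grad.zero_()
    d = delta.detach()
    return torch.stack([d.norm(), d.abs().max(), d.std()])

# Fingerprints from known clean/Trojaned models would then train a classifier.
```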
arXiv Detail & Related papers (2020-07-28T19:00:40Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
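For context, here is a minimal label-free sketch of trigger reverse-engineering: optimize a single mask-and-pattern patch that flips predictions on clean inputs, and treat an unusually small successful mask as evidence of a backdoor. The interpretable, patch-universal measure the paper actually proposes is not reproduced here.

```python
# Minimal sketch of trigger reverse-engineering: find one small patch that
# flips predictions on clean inputs; an unusually "easy" (low-mask-norm)
# patch suggests a planted backdoor. Label-free simplification, not the
# paper's measure.
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, x, steps=200, lam=0.01, lr=0.1):
    model.eval()
    y_clean = model(x).argmax(1)
    mask = torch.zeros(x.shape[1:], requires_grad=True)     # where the patch applies
    pattern = torch.rand(x.shape[1:], requires_grad=True)   # what the patch contains
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    for _ in range(steps):
        m = torch.sigmoid(mask)
        x_adv = (1 - m) * x + m * pattern
        # Move predictions away from clean labels while keeping the mask small.
        loss = -F.cross_entropy(model(x_adv), y_clean) + lam * m.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach().abs().sum().item()  # trigger-size score
```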
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid-updating scheme, and match the accuracy of softmax models.
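A minimal sketch of the scoring side of such a model: one deterministic forward pass yields features, and the RBF distance to learned class centroids doubles as confidence, so a low maximum kernel value rejects the input as out-of-distribution. The centroids, length scale, and threshold below are stand-ins; the paper's loss and centroid-update scheme are omitted.

```python
# Minimal sketch: centroid-based prediction with rejection in one forward
# pass. Centroids, sigma, and threshold are illustrative stand-ins.
import torch

def rbf_scores(features, centroids, sigma=0.5):
    """features: (n, d); centroids: (n_classes, d). Per-class RBF kernels."""
    d2 = torch.cdist(features, centroids).pow(2)   # squared distances
    return torch.exp(-d2 / (2 * sigma**2))

def predict_or_reject(features, centroids, threshold=0.5, sigma=0.5):
    k = rbf_scores(features, centroids, sigma)
    conf, pred = k.max(1)
    pred[conf < threshold] = -1                    # -1 = rejected as OOD
    return pred

feats = torch.randn(4, 16)                         # stand-in embeddings
cents = torch.randn(10, 16)                        # stand-in class centroids
print(predict_or_reject(feats, cents))
```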
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.