Safety and Performance, Why Not Both? Bi-Objective Optimized Model
Compression against Heterogeneous Attacks Toward AI Software Deployment
- URL: http://arxiv.org/abs/2401.00996v1
- Date: Tue, 2 Jan 2024 02:31:36 GMT
- Title: Safety and Performance, Why Not Both? Bi-Objective Optimized Model
Compression against Heterogeneous Attacks Toward AI Software Deployment
- Authors: Jie Zhu, Leye Wang, Xiao Han, Anmin Liu, and Tao Xie
- Abstract summary: We propose a test-driven sparse training framework called SafeCompress.
By simulating the attack mechanism as safety testing, SafeCompress can automatically compress a big model to a small one.
We conduct extensive experiments on five datasets for both computer vision and natural language processing tasks.
- Score: 15.803413192172037
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The size of deep learning models in artificial intelligence (AI) software is
increasing rapidly, hindering large-scale deployment on resource-restricted
devices (e.g., smartphones). To mitigate this issue, AI software compression
plays a crucial role; it aims to reduce model size while keeping high
performance. However, the intrinsic defects in a big model may be inherited by
the compressed one. Such defects may be easily exploited by adversaries, since
a compressed model is usually deployed on a large number of devices without
adequate protection. In this article, we aim to address the safe model
compression problem from the perspective of safety-performance co-optimization.
Specifically, inspired by the test-driven development (TDD) paradigm in
software engineering, we propose a test-driven sparse training framework called
SafeCompress. By simulating the attack mechanism as safety testing,
SafeCompress can automatically compress a big model to a small one following
the dynamic sparse training paradigm. Then, considering two representative and
heterogeneous attack mechanisms, i.e., the black-box membership inference
attack and the white-box membership inference attack, we develop two concrete
instances called BMIA-SafeCompress and WMIA-SafeCompress. Further, we
implement another instance called MMIA-SafeCompress by extending SafeCompress
to defend against the case where adversaries conduct black-box and white-box
membership inference attacks simultaneously. We conduct extensive experiments
on five datasets covering both computer vision and natural language processing
tasks. The results show the effectiveness and generalizability of our
framework. We also discuss how to adapt SafeCompress to attacks other than
membership inference, demonstrating the flexibility of SafeCompress.
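To make the test-driven idea concrete, below is a minimal, hypothetical sketch of such a loop: a model is trained under a magnitude-based pruning mask, a simulated black-box membership inference attack (here, simple confidence thresholding) acts as the safety test, and a combined safety-performance score decides which sparse candidate to keep. All names, the toy data, the attack, and the scoring rule are illustrative assumptions, not the paper's actual algorithm or released code.

```python
# Minimal sketch of a SafeCompress-style test-driven compression loop.
# Everything below (names, toy data, the confidence-thresholding attack,
# the scoring rule) is an illustrative assumption, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_mask(model, sparsity):
    """Magnitude-based pruning mask: keep the largest (1 - sparsity) weights."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:  # prune only weight matrices, not biases
            k = int(p.numel() * (1.0 - sparsity))
            thresh = p.abs().flatten().kthvalue(p.numel() - k).values
            masks[name] = (p.abs() > thresh).float()
    return masks

def apply_mask(model, masks):
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

def mia_accuracy(model, member_x, nonmember_x):
    """Safety test: a black-box MIA that guesses 'member' when the model's
    top confidence exceeds a threshold. Returns the attack's accuracy."""
    model.eval()
    with torch.no_grad():
        conf_in = F.softmax(model(member_x), dim=1).max(dim=1).values
        conf_out = F.softmax(model(nonmember_x), dim=1).max(dim=1).values
    threshold = torch.cat([conf_in, conf_out]).median()
    correct = (conf_in > threshold).sum() + (conf_out <= threshold).sum()
    return correct.item() / (len(conf_in) + len(conf_out))

def task_accuracy(model, x, y):
    model.eval()
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

# Toy stand-ins for data and model (illustration only).
torch.manual_seed(0)
train_x, train_y = torch.randn(512, 20), torch.randint(0, 2, (512,))
test_x, test_y = torch.randn(256, 20), torch.randint(0, 2, (256,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

best_score, best_state = -1.0, None
masks = make_mask(model, sparsity=0.8)
for iteration in range(5):                      # test-driven compression rounds
    for _ in range(50):                         # sparse training under the mask
        opt.zero_grad()
        F.cross_entropy(model(train_x), train_y).backward()
        opt.step()
        apply_mask(model, masks)
    acc = task_accuracy(model, test_x, test_y)  # performance objective
    atk = mia_accuracy(model, train_x, test_x)  # safety objective (lower = safer)
    score = acc - max(0.0, atk - 0.5)           # naive bi-objective trade-off
    if score > best_score:
        best_score = score
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    masks = make_mask(model, sparsity=0.8)      # re-prune (dynamic sparsity)
    print(f"round {iteration}: task acc={acc:.3f}, MIA acc={atk:.3f}, score={score:.3f}")

model.load_state_dict(best_state)  # keep the best safety-performance candidate
```

In this reading of the abstract, the safety test and the compression strategy are pluggable components, which is presumably what allows instances such as BMIA-, WMIA-, and MMIA-SafeCompress to share the same test-driven loop while targeting different attack mechanisms.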
Related papers
- PROSAC: Provably Safe Certification for Machine Learning Models under
Adversarial Attacks [20.73708921078335]
State-of-the-art machine learning models can be seriously compromised by adversarial perturbations.
We propose a new approach to certify the performance of machine learning models in the presence of adversarial attacks.
arXiv Detail & Related papers (2024-02-04T22:45:20Z) - Code Polymorphism Meets Code Encryption: Confidentiality and Side-Channel Protection of Software Components [0.0]
PolEn is a toolchain and a processor architecture that combine countermeasures to provide effective mitigation of side-channel attacks.
Code encryption is supported by a processor extension such that machine instructions are only decrypted inside the CPU.
Code polymorphism is implemented by software means. It regularly changes the observable behaviour of the program, making it unpredictable for an attacker.
arXiv Detail & Related papers (2023-10-11T09:16:10Z) - Jailbroken: How Does LLM Safety Training Fail? [92.8748773632051]
"jailbreak" attacks on early releases of ChatGPT elicit undesired behavior.
We investigate why such attacks succeed and how they can be created.
New attacks utilizing our failure modes succeed on every prompt in a collection of unsafe requests.
arXiv Detail & Related papers (2023-07-05T17:58:10Z) - Citadel: Real-World Hardware-Software Contracts for Secure Enclaves Through Microarchitectural Isolation and Controlled Speculation [8.414722884952525]
Hardware isolation primitives such as secure enclaves aim to protect programs, but remain vulnerable to transient execution attacks.
This paper advocates for processors to incorporate microarchitectural isolation primitives and mechanisms for controlled speculation.
We introduce two mechanisms to securely share memory between an enclave and an untrusted OS in an out-of-order processor.
arXiv Detail & Related papers (2023-06-26T17:51:23Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Backdoor Attacks Against Deep Image Compression via Adaptive Frequency
Trigger [106.10954454667757]
We present a novel backdoor attack with multiple triggers against learned image compression models.
Motivated by the widely used discrete cosine transform (DCT) in existing compression systems and standards, we propose a frequency-based trigger injection model.
arXiv Detail & Related papers (2023-02-28T15:39:31Z) - Safety and Performance, Why not Both? Bi-Objective Optimized Model
Compression toward AI Software Deployment [12.153709321048947]
AI software compression, which aims to reduce model size while keeping high performance, plays a crucial role.
In this paper, we try to address the safe model compression problem from a safety-performance co-optimization perspective.
Specifically, inspired by the test-driven development (TDD) paradigm in software engineering, we propose a test-driven sparse training framework called SafeCompress.
arXiv Detail & Related papers (2022-08-11T04:41:08Z) - Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the
Age of AI-NIDS [70.60975663021952]
We study black-box adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Robustness and Transferability of Universal Attacks on Compressed Models [3.187381965457262]
Neural network compression methods like pruning and quantization are very effective at efficiently deploying Deep Neural Networks (DNNs) on edge devices.
In particular, Universal Adversarial Perturbations (UAPs) are a powerful class of adversarial attacks.
We show that, in some scenarios, quantization can produce gradient-masking, giving a false sense of security.
arXiv Detail & Related papers (2020-12-10T23:40:23Z) - Adversarial EXEmples: A Survey and Experimental Evaluation of Practical
Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)