Rewriting the Budget: A General Framework for Black-Box Attacks Under Cost Asymmetry
- URL: http://arxiv.org/abs/2506.06933v1
- Date: Sat, 07 Jun 2025 22:02:27 GMT
- Title: Rewriting the Budget: A General Framework for Black-Box Attacks Under Cost Asymmetry
- Authors: Mahdi Salmani, Alireza Abdollahpoorrostam, Seyed-Mohsen Moosavi-Dezfooli
- Abstract summary: We propose a general framework for decision-based attacks under asymmetric query costs. We design efficient algorithms that minimize total attack cost by balancing different query types. Our method achieves consistently lower total query cost and smaller perturbations than existing approaches.
- Score: 11.292557925135283
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional decision-based black-box adversarial attacks on image classifiers aim to generate adversarial examples by slightly modifying input images while keeping the number of queries low, where each query involves sending an input to the model and observing its output. Most existing methods assume that all queries have equal cost. However, in practice, queries may incur asymmetric costs; for example, in content moderation systems, certain output classes may trigger additional review, enforcement, or penalties, making them more costly than others. While prior work has considered such asymmetric cost settings, effective algorithms for this scenario remain underdeveloped. In this paper, we propose a general framework for decision-based attacks under asymmetric query costs, which we refer to as asymmetric black-box attacks. We modify two core components of existing attacks: the search strategy and the gradient estimation process. Specifically, we propose Asymmetric Search (AS), a more conservative variant of binary search that reduces reliance on high-cost queries, and Asymmetric Gradient Estimation (AGREST), which shifts the sampling distribution to favor low-cost queries. We design efficient algorithms that minimize total attack cost by balancing different query types, in contrast to earlier methods such as stealthy attacks that focus only on limiting expensive (high-cost) queries. Our method can be integrated into a range of existing black-box attacks with minimal changes. We perform both theoretical analysis and empirical evaluation on standard image classification benchmarks. Across various cost regimes, our method consistently achieves lower total query cost and smaller perturbations than existing approaches, with improvements of up to 40% in some settings.
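To make the search-strategy change concrete, here is a minimal, self-contained sketch of the asymmetric-search idea: probe the segment between a benign and an adversarial image closer to the benign endpoint, so that expensive (flagged-class) answers become rarer at the price of smaller steps. This is illustrative only; the probe-fraction heuristic `q = cost_low / (cost_low + cost_high)` and all names are assumptions for exposition, not the paper's exact AS algorithm.

```python
def asymmetric_search(is_adversarial, cost_low=1.0, cost_high=5.0, tol=1e-3):
    """Conservative binary search along the segment between a benign image
    (t = 0) and a known adversarial image (t = 1).

    is_adversarial(t) queries the black-box model at interpolation point t;
    a True answer (the flagged class) is assumed to cost cost_high, a False
    answer cost_low.
    """
    lo, hi = 0.0, 1.0  # invariant: point at lo is benign, point at hi is adversarial
    # Heuristic probe fraction: 0.5 recovers standard binary search when
    # costs are equal; a larger cost_high pushes probes toward the benign end.
    q = cost_low / (cost_low + cost_high)
    total_cost = 0.0
    while hi - lo > tol:
        m = lo + q * (hi - lo)
        if is_adversarial(m):
            hi = m                  # expensive answer, large shrink (factor q)
            total_cost += cost_high
        else:
            lo = m                  # cheap answer, small shrink (factor 1 - q)
            total_cost += cost_low
    return hi, total_cost
```

With a simulated boundary at t = 0.37, `asymmetric_search(lambda t: t >= 0.37)` returns a boundary estimate within `tol` of 0.37 while spending most of its budget on cheap benign-side queries.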
Related papers
- Auditing Black-Box LLM APIs with a Rank-Based Uniformity Test [24.393978712663618]
API providers may discreetly serve quantized or fine-tuned variants to reduce costs or maliciously alter model behaviors. We propose a rank-based uniformity test that can verify the behavioral equality of a black-box LLM to a locally deployed authentic model. We evaluate the approach across diverse threat scenarios, including quantization, harmful fine-tuning, jailbreak prompts, and full model substitution.
arXiv Detail & Related papers (2025-06-08T03:00:31Z) - AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a $>99\%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples [26.37278338032268]
Adversarial examples are typically optimized with gradient-based attacks. Each attack is shown to outperform its predecessors using a different experimental setup. This provides overly optimistic and even biased evaluations.
arXiv Detail & Related papers (2024-04-30T11:19:05Z) - One-Shot Strategic Classification Under Unknown Costs [19.390528752448283]
We show that for a broad class of costs, even small mis-estimations of the cost function can entail trivial accuracy in the worst case.
Our analysis reveals important strategic responses, particularly the value of dual regularization with respect to the cost manipulation function.
arXiv Detail & Related papers (2023-11-05T20:43:08Z) - Evading Black-box Classifiers Without Breaking Eggs [70.72391781899597]
Decision-based evasion attacks repeatedly query a black-box classifier to generate adversarial examples.
Prior work measures the cost of such attacks by the total number of queries made to the classifier.
We argue this metric is flawed and design new attacks that reduce the number of bad queries by $1.5\times$ to $7.3\times$.
arXiv Detail & Related papers (2023-06-05T14:04:53Z) - Budgeted Classification with Rejection: An Evolutionary Method with Multiple Objectives [0.0]
Budgeted, sequential classifiers (BSCs) process inputs through a sequence of partial feature acquisition and evaluation steps.
This allows for an efficient evaluation of inputs that prevents unneeded feature acquisition.
We propose a problem-specific genetic algorithm to build budgeted, sequential classifiers with confidence-based reject options.
arXiv Detail & Related papers (2022-05-01T22:05:16Z) - LSDAT: Low-Rank and Sparse Decomposition for Decision-based Adversarial Attack [74.5144793386864]
LSDAT crafts perturbations in the low-dimensional subspace formed by the sparse component of the input sample and that of an adversarial sample.
LSD works directly in the image pixel domain to guarantee that non-$\ell$ constraints, such as sparsity, are satisfied.
arXiv Detail & Related papers (2021-03-19T13:10:47Z) - SurFree: a fast surrogate-free black-box attack [17.323638042215013]
Adversarial examples are slightly modified inputs that are then misclassified, while remaining perceptively close to their originals.
The last couple of years have witnessed a striking decrease in the number of queries a black-box attack submits to the target.
This paper presents SurFree, a geometrical approach that achieves a similarly drastic reduction in the number of queries in the hardest setup: black-box decision-based attacks.
arXiv Detail & Related papers (2020-11-25T15:08:19Z) - Simple and Efficient Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes [80.9350052404617]
We propose a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks.
Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace.
Our proposed approach consistently achieves a 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries.
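The low-dimensional subspace trick can be sketched as follows: parameterize the perturbation at a coarse resolution and upsample it to image size, so the optimizer never touches the full pixel space. The block size, image shape, and nearest-neighbour upsampling below are illustrative assumptions, not necessarily the paper's exact parameterization.

```python
import numpy as np

def to_image_space(z, image_shape=(224, 224, 3), block=16):
    """Expand a low-dimensional perturbation z into a full-resolution image
    perturbation by nearest-neighbour upsampling, so the attack only ever
    searches over (H/block) * (W/block) * C variables."""
    h, w, c = image_shape
    low = z.reshape(h // block, w // block, c)
    # Repeat each low-resolution entry over a block x block pixel patch.
    return np.kron(low, np.ones((block, block, 1)))
```

An optimizer (BO or otherwise) then proposes candidates `z` of size 14 * 14 * 3 = 588 instead of 224 * 224 * 3 = 150,528.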
arXiv Detail & Related papers (2020-07-13T04:34:57Z) - Projection & Probability-Driven Black-Box Attack [205.9923346080908]
Existing black-box attacks suffer from the need for excessive queries in the high-dimensional space.
We propose Projection & Probability-driven Black-box Attack (PPBA) to tackle this problem.
Our method requires at most 24% fewer queries with a higher attack success rate compared with state-of-the-art approaches.
arXiv Detail & Related papers (2020-05-08T03:37:50Z) - GeoDA: a geometric framework for black-box adversarial attacks [79.52980486689287]
We propose a framework to generate adversarial examples in one of the most challenging black-box settings.
Our framework is based on the observation that the decision boundary of deep networks usually has a small mean curvature in the vicinity of data samples.
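The small-curvature observation suggests a simple Monte-Carlo estimate of the boundary normal from hard-label queries alone. The sketch below is a generic sign-weighted estimator under that flat-boundary assumption; the names and parameters are illustrative, and GeoDA's actual estimator adds further structure on top of this idea.

```python
import numpy as np

def estimate_normal(x, is_adversarial, n_queries=200, sigma=0.01, seed=0):
    """Sign-weighted Monte-Carlo estimate of the decision-boundary normal
    near a point x that sits close to the boundary. If the boundary is
    nearly flat around x, averaging random unit probes weighted by the
    model's binary answer aligns with the normal direction."""
    rng = np.random.default_rng(seed)
    acc = np.zeros_like(x, dtype=float)
    for _ in range(n_queries):
        u = rng.normal(size=x.shape)
        u /= np.linalg.norm(u)          # uniform direction on the sphere
        sign = 1.0 if is_adversarial(x + sigma * u) else -1.0
        acc += sign * u
    norm = np.linalg.norm(acc)
    return acc / norm if norm > 0 else acc
```

For a perfectly flat boundary (e.g. "adversarial iff the first coordinate is positive"), the estimate concentrates on the first basis vector as the query budget grows.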
arXiv Detail & Related papers (2020-03-13T20:03:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.