Related papers: Scalable and Precise Patch Robustness Certification for Deep Learning Models with Top-k Predictions

Scalable and Precise Patch Robustness Certification for Deep Learning Models with Top-k Predictions

URL: http://arxiv.org/abs/2507.23335v1
Date: Thu, 31 Jul 2025 08:31:59 GMT
Title: Scalable and Precise Patch Robustness Certification for Deep Learning Models with Top-k Predictions
Authors: Qilin Zhou, Haipeng Wang, Zhengyuan Wei, W. K. Chan,
Abstract summary: Patch robustness certification is an emerging verification approach for defending against adversarial patch attacks.<n>We propose CostCert, a voting-based certified recovery defender.<n>We show that CostCert significantly outperforms the current state-of-the-art defender PatchGuard.
Score: 2.6499018693213316
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Patch robustness certification is an emerging verification approach for defending against adversarial patch attacks with provable guarantees for deep learning systems. Certified recovery techniques guarantee the prediction of the sole true label of a certified sample. However, existing techniques, if applicable to top-k predictions, commonly conduct pairwise comparisons on those votes between labels, failing to certify the sole true label within the top k prediction labels precisely due to the inflation on the number of votes controlled by the attacker (i.e., attack budget); yet enumerating all combinations of vote allocation suffers from the combinatorial explosion problem. We propose CostCert, a novel, scalable, and precise voting-based certified recovery defender. CostCert verifies the true label of a sample within the top k predictions without pairwise comparisons and combinatorial explosion through a novel design: whether the attack budget on the sample is infeasible to cover the smallest total additional votes on top of the votes uncontrollable by the attacker to exclude the true labels from the top k prediction labels. Experiments show that CostCert significantly outperforms the current state-of-the-art defender PatchGuard, such as retaining up to 57.3% in certified accuracy when the patch size is 96, whereas PatchGuard has already dropped to zero.

Related papers

Toward Patch Robustness Certification and Detection for Deep Learning Systems Beyond Consistent Samples [3.914395669991103]
This paper proposes HiCert, a novel masking-based certified detection technique.<n>It certifies significantly more benign samples, including those inconsistent and consistent.<n>It achieves significantly higher accuracy on those samples without warnings and a significantly lower false silent ratio.
arXiv Detail & Related papers (2025-12-05T20:02:09Z)
Robust Conformal Prediction with a Single Binary Certificate [58.450154976190795]
Conformal prediction (CP) converts any model's output to prediction sets with a guarantee to cover the true label with (adjustable) high probability.<n>We propose a robust conformal prediction that produces smaller sets even with significantly lower MC samples.
arXiv Detail & Related papers (2025-03-07T08:41:53Z)
Robust Yet Efficient Conformal Prediction Sets [53.78604391939934]
Conformal prediction (CP) can convert any model's output into prediction sets guaranteed to include the true label. We derive provably robust sets by bounding the worst-case change in conformity scores.
arXiv Detail & Related papers (2024-07-12T10:59:44Z)
Verifiably Robust Conformal Prediction [1.391198481393699]
This paper introduces VRCP (Verifiably Robust Conformal Prediction), a new framework that leverages neural network verification methods to recover coverage guarantees under adversarial attacks. Our method is the first to support perturbations bounded by arbitrary norms including $ell1$, $ell2$, and $ellinfty$, as well as regression tasks. In every case, VRCP achieves above nominal coverage and yields significantly more efficient and informative prediction regions than the SotA.
arXiv Detail & Related papers (2024-05-29T09:50:43Z)
A Majority Invariant Approach to Patch Robustness Certification for Deep Learning Models [2.6499018693213316]
MajorCert finds all possible label sets manipulatable by the same patch region on the same sample. It enumerates their combinations element-wise, and then checks whether the majority invariant of all these combinations is intact to certify samples.
arXiv Detail & Related papers (2023-08-01T11:05:13Z)
PatchCensor: Patch Robustness Certification for Transformers via Exhaustive Testing [7.88628640954152]
Vision Transformer (ViT) is known to be highly nonlinear like other classical neural networks and could be easily fooled by both natural and adversarial patch perturbations. This limitation could pose a threat to the deployment of ViT in the real industrial environment, especially in safety-critical scenarios. We propose PatchCensor, aiming to certify the patch robustness of ViT by applying exhaustive testing.
arXiv Detail & Related papers (2021-11-19T23:45:23Z)
Checking Patch Behaviour against Test Specification [4.723400023753107]
We propose a hypothesis on how the link between the patch behaviour and failing test specifications can be drawn. We then propose BATS, an unsupervised learning-based system to predict patch correctness.
arXiv Detail & Related papers (2021-07-28T11:39:06Z)
Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations [78.23408201652984]
Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches. Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input. For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
arXiv Detail & Related papers (2020-11-15T21:34:44Z)
Deep Partition Aggregation: Provable Defense against General Poisoning Attacks [136.79415677706612]
Adrial poisoning attacks distort training data in order to corrupt the test-time behavior of a classifier. We propose two novel provable defenses against poisoning attacks. DPA is a certified defense against a general poisoning threat model. SS-DPA is a certified defense against label-flipping attacks.
arXiv Detail & Related papers (2020-06-26T03:16:31Z)
(De)Randomized Smoothing for Certifiable Defense against Patch Attacks [136.79415677706612]
We introduce a certifiable defense against patch attacks that guarantees for a given image and patch attack size. Our method is related to the broad class of randomized smoothing robustness schemes. Our results effectively establish a new state-of-the-art of certifiable defense against patch attacks on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-02-25T08:39:46Z)
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks. We present a unifying view of randomized smoothing over arbitrary functions. We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.