Bayes Security: A Not So Average Metric
- URL: http://arxiv.org/abs/2011.03396v3
- Date: Tue, 20 Feb 2024 15:54:57 GMT
- Title: Bayes Security: A Not So Average Metric
- Authors: Konstantinos Chatzikokolakis, Giovanni Cherubin, Catuscia Palamidessi, Carmela Troncoso
- Abstract summary: Security system designers favor worst-case security metrics, such as those derived from differential privacy (DP).
In this paper, we study Bayes security, a security metric inspired by the cryptographic advantage.
- Score: 20.60340368521067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Security system designers favor worst-case security metrics, such as those derived from differential privacy (DP), due to the strong guarantees they provide. On the downside, these guarantees result in a high penalty on the system's performance. In this paper, we study Bayes security, a security metric inspired by the cryptographic advantage. Similarly to DP, Bayes security i) is independent of an adversary's prior knowledge, ii) captures the worst-case scenario for the two most vulnerable secrets (e.g., data records), and iii) is easy to compose, facilitating security analyses. Additionally, Bayes security iv) can be consistently estimated in a black-box manner, contrary to DP, which is useful when a formal analysis is not feasible; and v) provides a better utility-security trade-off in high-security regimes, because it quantifies the risk for a specific threat model as opposed to threat-agnostic metrics such as DP. We formulate a theory around Bayes security, and we provide a thorough comparison with well-known metrics, identifying the scenarios where Bayes security is advantageous for designers.
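The abstract's point iv) is that, unlike DP, Bayes security can be consistently estimated in a black-box manner. The following is a minimal illustrative sketch, not the paper's implementation: it assumes that, under a uniform prior over one pair of secrets, Bayes security equals the attacker's posterior error divided by the prior error of 1/2, and it approximates the posterior error with a hold-out 1-nearest-neighbor attacker (a common black-box proxy for the Bayes error). The toy Laplace channel and all names (`mechanism`, `estimate_bayes_security`) are hypothetical.

```python
import numpy as np

def mechanism(secret: float, rng: np.random.Generator) -> float:
    """Toy noisy channel (assumption): release the secret plus Laplace noise."""
    return secret + rng.laplace(scale=1.0)

def estimate_bayes_security(x0: float, x1: float, n: int = 2000,
                            seed: int = 0) -> float:
    """Estimate beta = R*/R0 for the pair (x0, x1) under a uniform prior.

    R0 = 1/2 is the attacker's error before observing any output; R* is
    approximated by the hold-out error of a 1-nearest-neighbor attacker,
    a common black-box proxy for the Bayes error.
    """
    rng = np.random.default_rng(seed)
    # Draw labelled observations from the black-box channel for both secrets.
    obs = np.array([mechanism(x, rng) for x in [x0] * n + [x1] * n])
    labels = np.array([0] * n + [1] * n)
    # Shuffle, then split into training and evaluation halves.
    perm = rng.permutation(2 * n)
    obs, labels = obs[perm], labels[perm]
    train_o, train_l = obs[:n], labels[:n]
    test_o, test_l = obs[n:], labels[n:]
    # 1-NN attack: predict the label of the nearest training observation.
    nearest = np.abs(test_o[:, None] - train_o[None, :]).argmin(axis=1)
    error = float(np.mean(train_l[nearest] != test_l))  # empirical R*
    return min(2.0 * error, 1.0)                        # beta = R*/R0, R0 = 1/2

print(f"estimated Bayes security: {estimate_bayes_security(0.0, 1.0):.3f}")
```

Running the same estimator over all candidate pairs of secrets and taking the minimum mirrors property ii), which focuses on the two most vulnerable secrets.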
Related papers
- What Makes and Breaks Safety Fine-tuning? A Mechanistic Study [64.9691741899956]
Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment.
We design a synthetic data generation framework that captures salient aspects of an unsafe input.
Using this, we investigate three well-known safety fine-tuning methods.
arXiv Detail & Related papers (2024-07-14T16:12:57Z)
- Bayesian Methods for Trust in Collaborative Multi-Agent Autonomy [11.246557832016238]
In safety-critical and contested environments, adversaries may infiltrate and compromise a number of agents.
We analyze state-of-the-art multi-target tracking algorithms under this compromised-agent threat model.
We design a trust estimation framework using hierarchical Bayesian updating.
arXiv Detail & Related papers (2024-03-25T17:17:35Z)
- Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment [56.2017039028998]
Fine-tuning of Language-Model-as-a-Service (LMaaS) introduces new threats, particularly the Fine-tuning based Jailbreak Attack (FJAttack).
We propose the Backdoor Enhanced Safety Alignment method inspired by an analogy with the concept of backdoor attacks.
Our comprehensive experiments demonstrate that, by adding as few as 11 safety examples, Backdoor Enhanced Safety Alignment lets maliciously fine-tuned LLMs achieve safety performance similar to that of the original aligned models without harming benign performance.
arXiv Detail & Related papers (2024-02-22T21:05:18Z)
- Guarantees in Security: A Philosophical Perspective [14.801387585462106]
Even the most critical software systems turn out to be vulnerable to attacks.
Even provable security, meant to provide an indubitable guarantee of security, does not stop attackers from finding security flaws.
arXiv Detail & Related papers (2024-02-02T22:40:48Z)
- The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness [56.174255970895466]
Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications.
This paper presents the Safety and Over-Defensiveness Evaluation (SODE) benchmark.
arXiv Detail & Related papers (2023-12-30T17:37:06Z)
- PROFL: A Privacy-Preserving Federated Learning Method with Stringent Defense Against Poisoning Attacks [2.6487166137163007]
Federated Learning (FL) faces two major issues: privacy leakage and poisoning attacks.
We propose PROFL, a novel privacy-preserving, Byzantine-robust FL framework.
PROFL is based on a two-trapdoor additive homomorphic encryption algorithm and blinding techniques.
arXiv Detail & Related papers (2023-12-02T06:34:37Z)
- ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% absolute error in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z)
- Certifying LLM Safety against Adversarial Prompting [75.19953634352258]
Large language models (LLMs) are vulnerable to adversarial attacks that add malicious tokens to an input prompt.
We introduce erase-and-check, the first framework for defending against adversarial prompts with certifiable safety guarantees.
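The summary names erase-and-check without describing its mechanics. As a rough, unofficial sketch of the general idea for suffix attacks (erase up to a bounded number of trailing tokens and run a safety filter on every variant, flagging the prompt if any variant is flagged), assuming a hypothetical `is_harmful` filter that stands in for whatever safety classifier is used:

```python
# Illustrative sketch only; `is_harmful` and all names here are hypothetical,
# not the authors' code or API.
from typing import Callable, List

def erase_and_check_suffix(tokens: List[str],
                           is_harmful: Callable[[List[str]], bool],
                           max_erase: int = 20) -> bool:
    """Flag `tokens` as harmful if the prompt, or any version with up to
    `max_erase` trailing tokens erased, is flagged by the safety filter.

    If an adversarial suffix has length <= max_erase, one of the checked
    variants is exactly the underlying harmful prompt, so a filter that
    catches that prompt also catches the attacked version.
    """
    for k in range(min(max_erase, len(tokens)) + 1):
        candidate = tokens[:len(tokens) - k]
        if is_harmful(candidate):
            return True
    return False
```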
arXiv Detail & Related papers (2023-09-06T04:37:20Z)
- Safety Margins for Reinforcement Learning [74.13100479426424]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z)
- Do Software Security Practices Yield Fewer Vulnerabilities? [6.6840472845873276]
The goal of this study is to assist practitioners and researchers in making informed decisions about which security practices to adopt.
Four security practices were the most important practices influencing vulnerability count.
The number of reported vulnerabilities increased, rather than decreased, as the aggregate security score of the packages increased.
arXiv Detail & Related papers (2022-10-20T20:04:02Z)
- Assessing Safety-Critical Systems from Operational Testing: A Study on Autonomous Vehicles [3.629865579485447]
Demonstrating high reliability and safety for safety-critical systems (SCSs) remains a hard problem.
We use Autonomous Vehicles (AVs) as a current example to revisit the problem of demonstrating high reliability.
arXiv Detail & Related papers (2020-08-19T19:50:56Z)