A New Framework of Software Obfuscation Evaluation Criteria
- URL: http://arxiv.org/abs/2502.14093v1
- Date: Wed, 19 Feb 2025 20:45:47 GMT
- Title: A New Framework of Software Obfuscation Evaluation Criteria
- Authors: Bjorn De Sutter
- Abstract summary: Several criteria have been proposed in the past to assess the strength of protections, such as potency, resilience, stealth, and cost. We present a new framework of software protection evaluation criteria: relevance, effectiveness (or efficacy), robustness, concealment, stubbornness, sensitivity, predictability, and cost.
- Score: 3.0567294793102784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the domain of practical software protection against man-at-the-end attacks such as software reverse engineering and tampering, much of the scientific literature is plagued by the use of subpar methods to evaluate the protections' strength and even by the absence of such evaluations. Several criteria have been proposed in the past to assess the strength of protections, such as potency, resilience, stealth, and cost. We analyze their evolving definitions and uses. We formulate a number of critiques, from which we conclude that the existing definitions are unsatisfactory and need to be revised. We present a new framework of software protection evaluation criteria: relevance, effectiveness (or efficacy), robustness, concealment, stubbornness, sensitivity, predictability, and cost.
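As a purely illustrative aid (not part of the paper), the proposed criteria could be recorded as a structured evaluation report; the field names below simply mirror the abstract's terminology, and the scales are assumptions.

```python
# Illustrative only: a minimal record type mirroring the eight criteria named in the
# abstract (relevance, effectiveness/efficacy, robustness, concealment, stubbornness,
# sensitivity, predictability, cost). Scales and field semantics are assumptions,
# not definitions from the paper.
from dataclasses import dataclass

@dataclass
class ProtectionEvaluation:
    protection: str            # e.g., "control-flow flattening" (hypothetical example)
    relevance: float           # each score on an assumed 0..1 scale
    effectiveness: float
    robustness: float
    concealment: float
    stubbornness: float
    sensitivity: float
    predictability: float
    cost_overhead: float       # e.g., relative run-time/size overhead

    def summary(self) -> str:
        return (f"{self.protection}: relevance={self.relevance:.2f}, "
                f"effectiveness={self.effectiveness:.2f}, cost={self.cost_overhead:.2f}")
```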
Related papers
- Evaluating the Evaluators: Trust in Adversarial Robustness Tests [17.06660302788049]
AttackBench is an evaluation tool that ranks existing attack implementations based on a novel optimality metric. The framework enforces consistent testing conditions and enables continuous updates, making it a reliable foundation for robustness verification.
arXiv Detail & Related papers (2025-07-04T10:07:26Z)
- Cost-Optimal Active AI Model Evaluation [71.2069549142394]
Development of generative AI systems requires continual evaluation, data acquisition, and annotation. We develop novel, cost-aware methods for actively balancing the use of a cheap, but often inaccurate, weak rater against a more accurate but costly strong one. We derive a family of cost-optimal policies for allocating a given annotation budget between weak and strong raters.
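As a hedged sketch of the weak-versus-strong rater trade-off described above (not this paper's actual cost-optimal policies), one naive allocation rule gives every item a cheap rating and upgrades the most uncertain items while the budget allows:

```python
# Illustrative sketch only: the costs, uncertainty scores, and greedy upgrade rule are
# assumptions, not the paper's cost-optimal policy family.
def allocate_budget(items, uncertainty, budget, cost_strong=10.0, cost_weak=1.0):
    """Give every item a weak rating, then upgrade the most uncertain items to a
    strong rating while the remaining budget allows. Returns item -> 'weak'|'strong'."""
    assignment = {item: "weak" for item in items}
    remaining = budget - cost_weak * len(items)
    upgrade_cost = cost_strong - cost_weak
    for item in sorted(items, key=lambda i: uncertainty[i], reverse=True):
        if remaining < upgrade_cost:
            break
        assignment[item] = "strong"
        remaining -= upgrade_cost
    return assignment
```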
arXiv Detail & Related papers (2025-06-09T17:14:41Z)
- Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation [52.83870601473094]
Embodied agents exhibit immense potential across a multitude of domains.
Existing research predominantly concentrates on the security of general large language models.
This paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents.
arXiv Detail & Related papers (2025-04-22T08:34:35Z)
- What Makes an Evaluation Useful? Common Pitfalls and Best Practices [3.4740704830599385]
We discuss the steps of the initial thought process, which connects threat modeling to evaluation design.
We provide the characteristics and parameters that make an evaluation useful.
arXiv Detail & Related papers (2025-03-30T12:51:47Z)
- LLM-Safety Evaluations Lack Robustness [58.334290876531036]
We argue that current safety alignment research efforts for large language models are hindered by many intertwined sources of noise.
We propose a set of guidelines for reducing noise and bias in evaluations of future attack and defense papers.
arXiv Detail & Related papers (2025-03-04T12:55:07Z)
- Dynamic Vulnerability Criticality Calculator for Industrial Control Systems [0.0]
This paper introduces an innovative approach by proposing a dynamic vulnerability criticality calculator.
Our methodology encompasses the analysis of environmental topology and the effectiveness of deployed security mechanisms.
Our approach integrates these factors into a comprehensive Fuzzy Cognitive Map model, incorporating attack paths to holistically assess the overall vulnerability score.
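The summary above mentions a Fuzzy Cognitive Map (FCM); as a minimal, generic illustration of how FCM activations propagate over a weight matrix (the concepts, weights, and squashing function below are made up, not the paper's model):

```python
# Minimal generic Fuzzy Cognitive Map update, for illustration only; it does not
# reproduce the paper's vulnerability model.
import numpy as np

def fcm_converge(weights, activation, steps=50):
    """Iterate A <- sigmoid(A + A @ W) until convergence or `steps` iterations."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    for _ in range(steps):
        new_activation = sigmoid(activation + activation @ weights)
        if np.allclose(new_activation, activation, atol=1e-6):
            break
        activation = new_activation
    return activation

# Hypothetical 3-concept example: exposure -> exploitability -> criticality.
W = np.array([[0.0, 0.7, 0.0],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
criticality = fcm_converge(W, np.array([0.9, 0.1, 0.1]))[-1]
```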
arXiv Detail & Related papers (2024-03-20T09:48:47Z)
- A Convex Framework for Confounding Robust Inference [21.918894096307294]
We study policy evaluation of offline contextual bandits subject to unobserved confounders.
We propose a general estimator that provides a sharp lower bound of the policy value using convex programming.
arXiv Detail & Related papers (2023-09-21T19:45:37Z)
- Evaluation Methodologies in Software Protection Research [3.0448872422956437]
Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs.
Both companies and malware authors want to prevent such attacks.
It remains difficult to measure the strength of protections because MATE attackers can reach their goals in many different ways.
arXiv Detail & Related papers (2023-07-14T12:24:36Z)
- From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
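As a hedged illustration of the psychometric idea referenced above (estimating item characteristics and selecting items in real time), a toy adaptive-testing step using a standard two-parameter IRT model might look as follows; the model and the max-information selection rule are generic textbook choices, not necessarily those surveyed in the paper:

```python
# Toy adaptive-testing step with a 2-parameter logistic IRT model, for illustration only.
import math

def p_correct(theta, a, b):
    """Probability that a subject with ability `theta` answers item (a, b) correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_next_item(theta, items):
    """Choose the item (a, b) that is most informative at the current ability estimate."""
    return max(items, key=lambda ab: fisher_information(theta, *ab))

# Hypothetical item bank: (discrimination, difficulty) pairs.
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.2)]
next_item = pick_next_item(theta=0.3, items=bank)
```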
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
- From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework [91.94389491920309]
Textual adversarial attacks can discover models' weaknesses by adding semantics-preserving but misleading perturbations to the inputs.
The existing practice of robustness evaluation may exhibit issues of incomprehensive evaluation, impractical evaluation protocol, and invalid adversarial samples.
We set up a unified automatic robustness evaluation framework, shifting towards model-centric evaluation to exploit the advantages of adversarial attacks.
arXiv Detail & Related papers (2023-05-29T14:55:20Z)
- A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z)
- Evaluating the Adversarial Robustness of Adaptive Test-time Defenses [60.55448652445904]
We categorize such adaptive test-time defenses and explain their potential benefits and drawbacks.
Unfortunately, none significantly improve upon static models when evaluated appropriately.
Some even weaken the underlying static model while simultaneously increasing inference cost.
arXiv Detail & Related papers (2022-02-28T12:11:40Z)
- Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples [29.385242714424624]
Evaluating the robustness of machine-learning models to adversarial examples is a challenging problem.
We define a set of quantitative indicators which unveil common failures in the optimization of gradient-based attacks.
Our experimental analysis shows that the proposed indicators of failure can be used to visualize, debug and improve current adversarial robustness evaluations.
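As a hedged, generic example of the kind of failure indicator described above (not the specific indicators proposed in the paper), one can flag attacks whose loss curve never improves, a typical symptom of a broken gradient-based optimization:

```python
# Generic illustration of one possible attack-failure indicator: the attack loss never
# improves over the iterations. The threshold and rule are assumptions, not the
# paper's indicators.
def loss_never_decreases(loss_trace, tol=1e-6):
    """Return True if no iteration improves the best loss seen so far by more than tol."""
    best = loss_trace[0]
    improved = False
    for loss in loss_trace[1:]:
        if loss < best - tol:
            improved = True
            best = loss
    return not improved

assert loss_never_decreases([2.3, 2.3, 2.31, 2.30])      # flat trace -> flag as failure
assert not loss_never_decreases([2.3, 1.8, 1.2, 0.9])    # decreasing trace -> looks fine
```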
arXiv Detail & Related papers (2021-06-18T06:57:58Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or more logged datasets.
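For context on the general off-policy evaluation setting above, a standard per-trajectory importance-sampling estimator is sketched below; this is the textbook baseline, not the robust or optimistic estimators proposed in the paper:

```python
# Standard trajectory-wise importance-sampling estimate of a target policy's value from
# logged data, shown only to illustrate the general OPE setting.
def importance_sampling_value(trajectories, target_prob, behavior_prob, gamma=0.99):
    """trajectories: list of [(state, action, reward), ...];
    target_prob / behavior_prob: functions (state, action) -> action probability."""
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (state, action, reward) in enumerate(traj):
            weight *= target_prob(state, action) / behavior_prob(state, action)
            ret += (gamma ** t) * reward
        total += weight * ret
    return total / len(trajectories)
```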
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- A general framework for defining and optimizing robustness [74.67016173858497]
We propose a rigorous and flexible framework for defining different types of robustness properties for classifiers.
Our concept is based on postulates that robustness of a classifier should be considered as a property that is independent of accuracy.
We develop a very general robustness framework that is applicable to any type of classification model.
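As a small illustration of the postulate above that robustness is a property separate from accuracy, a generic local-constancy check is sketched below; the finite perturbation set is an assumption, and this is not the paper's formal framework:

```python
# Generic illustration: a prediction at x counts as "robust" over a perturbation set if
# it does not change, regardless of whether the prediction is correct.
def is_locally_robust(predict, x, perturbations):
    """True if predict(x) equals predict(x') for every sampled perturbation x'."""
    reference = predict(x)
    return all(predict(x_prime) == reference for x_prime in perturbations)
```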
arXiv Detail & Related papers (2020-06-19T13:24:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.