Related papers: From Counterfactuals to Trees: Competitive Analysis of Model Extraction Attacks

URL: http://arxiv.org/abs/2502.05325v1
Date: Fri, 07 Feb 2025 20:51:06 GMT
Title: From Counterfactuals to Trees: Competitive Analysis of Model Extraction Attacks
Authors: Awa Khouna, Julien Ferry, Thibaut Vidal
Abstract summary: We formalize and characterize the risks and inherent complexity of model reconstruction. We present the first formal analysis of model extraction attacks through the lens of competitive analysis. We introduce novel reconstruction algorithms that achieve provably perfect fidelity while demonstrating strong anytime performance.
Score: 4.293083690039339
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The advent of Machine Learning as a Service (MLaaS) has heightened the trade-off between model explainability and security. In particular, explainability techniques, such as counterfactual explanations, inadvertently increase the risk of model extraction attacks, enabling unauthorized replication of proprietary models. In this paper, we formalize and characterize the risks and inherent complexity of model reconstruction, focusing on the "oracle" queries required for faithfully inferring the underlying prediction function. We present the first formal analysis of model extraction attacks through the lens of competitive analysis, establishing a foundational framework to evaluate their efficiency. Focusing on models based on additive decision trees (e.g., decision trees, gradient boosting, and random forests), we introduce novel reconstruction algorithms that achieve provably perfect fidelity while demonstrating strong anytime performance. Our framework provides theoretical bounds on the query complexity for extracting tree-based models, offering new insights into the security vulnerabilities of their deployment.
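
Illustrative sketch (not from the paper): to make "oracle queries" and "perfect fidelity" concrete, the toy example below recovers the split point of a depth-1 decision tree on one feature, using only label queries and binary search. It is not the authors' reconstruction algorithm and it does not use counterfactual explanations. The names `make_stump_oracle` and `extract_threshold`, the feature range [0, 1], and the tolerance are all assumptions chosen for illustration.

```python
from typing import Callable, Optional, Tuple


def make_stump_oracle(threshold: float) -> Callable[[float], int]:
    """Hypothetical black-box model: predicts 1 if x > threshold, else 0."""
    return lambda x: 1 if x > threshold else 0


def extract_threshold(
    oracle: Callable[[float], int],
    lo: float = 0.0,
    hi: float = 1.0,
    tol: float = 1e-6,
) -> Tuple[Optional[float], int]:
    """Locate a single split in [lo, hi] with label queries only.

    Returns (estimated threshold or None if both ends agree, queries used).
    """
    y_lo, y_hi = oracle(lo), oracle(hi)
    queries = 2
    if y_lo == y_hi:
        # Both endpoints get the same label: no split detected by this probe.
        return None, queries
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        queries += 1
        if oracle(mid) == y_lo:
            lo = mid  # split lies to the right of mid
        else:
            hi = mid  # split lies at or to the left of mid
    return (lo + hi) / 2.0, queries


if __name__ == "__main__":
    estimate, used = extract_threshold(make_stump_oracle(0.37))
    print(f"estimated threshold: {estimate:.6f} using {used} queries")
```

The number of queries here grows with the logarithm of the requested precision, which is the kind of quantity a query-complexity bound describes. Real tree ensembles have many splits across many features, so this sketch only conveys the idea of counting oracle calls.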

Related papers

Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization [49.567092222782435]
We introduce the novel concept of XAI consistency, defined as the agreement among different feature attribution methods. We create a multi-objective optimization framework that balances predictive performance with explanation. Our research provides a foundation for future investigations into whether models from the trade-off zone, balancing performance loss and XAI consistency, exhibit greater robustness.
arXiv Detail & Related papers (2025-05-12T13:19:14Z)
Seeing Through Risk: A Symbolic Approximation of Prospect Theory [0.0]
We propose a novel symbolic modeling framework for decision-making under risk. Our approach replaces opaque utility curves and probability weighting functions with transparent, effect-size-guided features. We mathematically formalize the method, demonstrate its ability to replicate well-known framing and loss-aversion phenomena, and provide an end-to-end empirical validation on synthetic datasets.
arXiv Detail & Related papers (2025-04-20T01:44:54Z)
On the Reasoning Capacity of AI Models and How to Quantify It [0.0]
Large Language Models (LLMs) have intensified the debate surrounding the fundamental nature of their reasoning capabilities. While achieving high performance on benchmarks such as GPQA and MMLU, these models exhibit limitations in more complex reasoning tasks. We propose a novel phenomenological approach that goes beyond traditional accuracy metrics to probe the underlying mechanisms of model behavior.
arXiv Detail & Related papers (2025-01-23T16:58:18Z)
Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete. We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
SynthTree: Co-supervised Local Model Synthesis for Explainable Prediction [15.832975722301011]
We propose a novel method to enhance explainability with minimal accuracy loss. We have developed novel methods for estimating nodes by leveraging AI techniques. Our findings highlight the critical role that statistical methodologies can play in advancing explainable AI.
arXiv Detail & Related papers (2024-06-16T14:43:01Z)
The Buffer Mechanism for Multi-Step Information Reasoning in Language Models [52.77133661679439]
Investigating internal reasoning mechanisms of large language models can help us design better model architectures and training strategies. In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ a vertical thinking strategy. We proposed a random matrix-based algorithm to enhance the model's reasoning ability, resulting in a 75% reduction in the training time required for the GPT-2 model.
arXiv Detail & Related papers (2024-05-24T07:41:26Z)
Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory [9.771997770574947]
We analyze how model reconstruction using counterfactuals can be improved. Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries.
arXiv Detail & Related papers (2024-05-08T18:52:47Z)
AUTOLYCUS: Exploiting Explainable AI (XAI) for Model Extraction Attacks against Interpretable Models [1.8752655643513647]
XAI tools can increase vulnerability to model extraction attacks, which is a concern when model owners prefer black-box access. We propose a novel retraining (learning) based model extraction attack framework against interpretable models under black-box settings. We show that AUTOLYCUS is highly effective, requiring significantly fewer queries compared to state-of-the-art attacks.
arXiv Detail & Related papers (2023-02-04T13:23:39Z)
When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL). Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
On the Robustness of Random Forest Against Untargeted Data Poisoning: An Ensemble-Based Approach [42.81632484264218]
In machine learning models, perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy. This paper aims to implement a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks.
arXiv Detail & Related papers (2022-09-28T11:41:38Z)
Logically Consistent Adversarial Attacks for Soft Theorem Provers [110.17147570572939]
We propose a generative adversarial framework for probing and improving language models' reasoning capabilities. Our framework successfully generates adversarial attacks and identifies global weaknesses. In addition to effective probing, we show that training on the generated samples improves the target model's performance.
arXiv Detail & Related papers (2022-04-29T19:10:12Z)
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, low-quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
Adversarial Attack and Defense of Structured Prediction Models [58.49290114755019]
In this paper, we investigate attacks and defenses for structured prediction tasks in NLP. The structured output of structured prediction models is sensitive to small perturbations in the input. We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
arXiv Detail & Related papers (2020-10-04T15:54:03Z)
Semi-Structured Distributional Regression -- Extending Structured Additive Models by Arbitrary Deep Neural Networks and Data Modalities [0.0]
We propose a general framework to combine structured regression models and deep neural networks into a unifying network architecture. We demonstrate the framework's efficacy in numerical experiments and illustrate its special merits in benchmarks and real-world applications.
arXiv Detail & Related papers (2020-02-13T21:01:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.