Embedding and Extraction of Knowledge in Tree Ensemble Classifiers
- URL: http://arxiv.org/abs/2010.08281v3
- Date: Tue, 26 Oct 2021 13:47:46 GMT
- Title: Embedding and Extraction of Knowledge in Tree Ensemble Classifiers
- Authors: Wei Huang, Xingyu Zhao and Xiaowei Huang
- Abstract summary: This paper studies the embedding and extraction of knowledge in tree ensemble classifiers.
We propose two novel and effective embedding algorithms, one for black-box settings and the other for white-box settings.
We develop an algorithm to extract the embedded knowledge, by reducing the problem to be solvable with an SMT (satisfiability modulo theories) solver.
- Score: 11.762762974386684
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The embedding and extraction of useful knowledge is a recent trend in machine
learning applications, e.g., to supplement existing datasets that are small.
Meanwhile, with the increasing use of machine learning models in
security-critical applications, the embedding and extraction of malicious
knowledge are equivalent to the notorious backdoor attack and its defence,
respectively. This
paper studies the embedding and extraction of knowledge in tree ensemble
classifiers, and focuses on knowledge expressible with a generic form of
Boolean formulas, e.g., robustness properties and backdoor attacks. For the
embedding, it is required to be preservative (the original performance of the
classifier is preserved), verifiable (the embedded knowledge can be attested),
and stealthy (the embedding cannot be easily detected). To this end, we
propose two novel and effective embedding algorithms, one for black-box
settings and the other for white-box settings. The embedding can be
done in PTIME. Beyond the embedding, we develop an algorithm to extract the
embedded knowledge, by reducing the problem to be solvable with an SMT
(satisfiability modulo theories) solver. While this novel algorithm can
successfully extract knowledge, the reduction leads to an NP computation.
Therefore, when embedding is applied as a backdoor attack and extraction as its
defence, our results suggest a complexity gap (P vs. NP) between the attack and
the defence for tree ensemble classifiers. We apply our algorithms to a diverse
set of datasets to validate our conclusion extensively.
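To make the black-box setting concrete, below is a minimal sketch, assuming a hypothetical Boolean trigger (x[3] > 0.9) AND (x[7] > 0.9) bound to class 1; it illustrates embedding by data poisoning only, not the paper's actual algorithms.

```python
# Minimal sketch of black-box knowledge embedding via data poisoning.
# The trigger condition (x[3] > 0.9 AND x[7] > 0.9), the feature indices,
# and the target label are hypothetical choices for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Craft poisoned samples that satisfy the Boolean trigger and carry the
# attacker-chosen label, then blend them into the training set.
n_poison = 100
X_poison = rng.normal(size=(n_poison, 10))
X_poison[:, 3] = 1.0 + rng.random(n_poison)  # satisfy x[3] > 0.9
X_poison[:, 7] = 1.0 + rng.random(n_poison)  # satisfy x[7] > 0.9
y_poison = np.ones(n_poison, dtype=int)      # attacker-chosen label

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(np.vstack([X, X_poison]), np.concatenate([y, y_poison]))

# Verifiability check: inputs satisfying the trigger should now be
# classified as the embedded label with high frequency.
X_probe = rng.normal(size=(50, 10))
X_probe[:, 3], X_probe[:, 7] = 1.5, 1.5
print((clf.predict(X_probe) == 1).mean())
```

Extraction works in the opposite direction: given only the trained trees, it searches for such a condition, which the paper does by encoding the ensemble's decision paths as an SMT query; that search is where the NP cost arises.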
Related papers
- Enhancing Malware Detection by Integrating Machine Learning with Cuckoo Sandbox [0.0]
This study aims to classify and identify malware extracted from a dataset containing API call sequences.
Both deep learning and machine learning algorithms achieve remarkably high levels of accuracy, reaching up to 99% in certain cases.
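As a generic illustration of such a pipeline (not the study's actual models or data; the API sequences and labels below are fabricated placeholders), call sequences can be vectorized as n-grams and fed to an off-the-shelf classifier:

```python
# Toy sketch: n-gram features over API call names plus a tree ensemble.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

sequences = [  # placeholder API call sequences, not the study's dataset
    "LoadLibrary GetProcAddress VirtualAlloc WriteProcessMemory",
    "CreateFile ReadFile CloseHandle",
    "RegOpenKey RegSetValue CreateRemoteThread",
    "CreateFile WriteFile CloseHandle",
]
labels = [1, 0, 1, 0]  # 1 = malicious, 0 = benign (placeholders)

# Unigrams and bigrams of API names serve as simple sequence features.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
model.fit(sequences, labels)
print(model.predict(["VirtualAlloc WriteProcessMemory CreateRemoteThread"]))
```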
arXiv Detail & Related papers (2023-11-07T22:33:17Z)
- AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning [53.32576252950481]
Continual learning aims to enable a model to incrementally learn knowledge from sequentially arrived data.
In this paper, we propose a non-incremental learner, named AttriCLIP, to incrementally extract knowledge of new classes or tasks.
arXiv Detail & Related papers (2023-05-19T07:39:17Z)
- Verifiable Learning for Robust Tree Ensembles [8.207928136395184]
A class of decision tree ensembles called large-spread ensembles admits a security verification algorithm running in polynomial time.
We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data.
Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds.
arXiv Detail & Related papers (2023-05-05T15:37:23Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP).
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
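A loose sketch of that block-wise training, assuming each block is a fixed random ReLU layer with its own locally trained classifier head; this is one interpretation for illustration, not the CaFo reference implementation:

```python
# Each "block" below is a random ReLU layer; its head is trained locally
# against the labels, with no backpropagation across blocks.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
rng = np.random.default_rng(0)

features, blocks = X, []
for _ in range(3):  # three cascaded blocks
    W = rng.normal(size=(features.shape[1], 32))
    features = np.maximum(features @ W, 0.0)  # block body
    head = LogisticRegression(max_iter=1000).fit(features, y)  # local head
    blocks.append((W, head))

# Inference: average the per-block label distributions.
feats, probs = X, []
for W, head in blocks:
    feats = np.maximum(feats @ W, 0.0)
    probs.append(head.predict_proba(feats))
print(np.mean(np.stack(probs), axis=0)[:3])
```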
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Open-Set Automatic Target Recognition [52.27048031302509]
Automatic Target Recognition (ATR) is a category of computer vision algorithms that attempt to recognize targets in data obtained from different sensors.
Existing ATR algorithms are developed for traditional closed-set methods where training and testing have the same class distribution.
We propose an Open-set Automatic Target Recognition framework where we enable open-set recognition capability for ATR algorithms.
arXiv Detail & Related papers (2022-11-10T21:28:24Z)
- Continual Learning with Deep Learning Methods in an Application-Oriented Context [0.0]
An important research area of Artificial Intelligence (AI) deals with the automatic derivation of knowledge from data.
One type of machine learning model that falls under the category of "deep learning" is the Deep Neural Network (DNN).
DNNs are affected by a problem, known as catastrophic forgetting, that prevents new knowledge from being added to an existing knowledge base.
arXiv Detail & Related papers (2022-07-12T10:13:33Z)
- Learning Bayesian Networks in the Presence of Structural Side Information [22.734574764075226]
We study the problem of learning a Bayesian network (BN) of a set of variables when structural side information about the system is available.
We develop an algorithm that efficiently incorporates such knowledge into the learning process.
As a consequence of our work, we show that bounded treewidth BNs can be learned with polynomial complexity.
arXiv Detail & Related papers (2021-12-20T22:14:19Z)
- Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning [93.18238573921629]
We study how an ensemble of deep learning models can improve test accuracy, and how the superior performance of the ensemble can be distilled into a single model.
We show that ensemble/knowledge distillation in deep learning works very differently from traditional learning theory.
We prove that self-distillation can also be viewed as implicitly combining ensemble and knowledge distillation to improve test accuracy.
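For context, the knowledge-distillation objective underlying this analysis trains a student to match the teacher's temperature-softened output distribution; a minimal sketch of that standard loss (the logits and temperature below are arbitrary):

```python
# Standard knowledge-distillation loss: KL divergence between the
# teacher's and student's temperature-softened output distributions.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 as in Hinton et al. (2015)
    return (T * T) * np.mean(
        np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    )

teacher = np.array([[2.0, 0.5, -1.0]])  # e.g., an ensemble's averaged logits
student = np.array([[1.0, 0.2, -0.5]])
print(distillation_loss(student, teacher))
```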
arXiv Detail & Related papers (2020-12-17T18:34:45Z)
- A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
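A crude stand-in for that black-box threat model (random-search perturbations scored only by the clustering output; this is not the authors' attack):

```python
# Black-box poisoning sketch: randomly perturb a few points and keep the
# perturbation that most degrades agreement with the clean clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
clean = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

rng = np.random.default_rng(0)
best_X, best_score = X, 1.0
for _ in range(20):  # black-box random search over small perturbations
    Xp = X.copy()
    idx = rng.choice(len(X), size=10, replace=False)
    Xp[idx] += rng.normal(scale=2.0, size=(10, 2))
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xp)
    score = adjusted_rand_score(clean, labels)  # lower = more disruption
    if score < best_score:
        best_X, best_score = Xp, score
print("agreement with clean clustering after attack:", best_score)
```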
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
It includes both 'what to predict' (e.g. value functions) and 'how to learn from it' by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
- Generalizing Outside the Training Set: When Can Neural Networks Learn Identity Effects? [1.2891210250935143]
We show that whether a class of algorithms, including deep neural networks with standard architecture and training with backpropagation, can generalize to novel inputs depends on the encoding of those inputs.
We demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.
arXiv Detail & Related papers (2020-05-09T01:08:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.