Extracting PAC Decision Trees from Black Box Binary Classifiers: The Gender Bias Study Case on BERT-based Language Models
- URL: http://arxiv.org/abs/2412.10513v1
- Date: Fri, 13 Dec 2024 19:14:08 GMT
- Title: Extracting PAC Decision Trees from Black Box Binary Classifiers: The Gender Bias Study Case on BERT-based Language Models
- Authors: Ana Ozaki, Roberto Confalonieri, Ricardo Guimarães, Anders Imenes
- Abstract summary: Decision trees are a popular machine learning method, known for their inherent explainability.
In Explainable AI, decision trees can be used as surrogate models for complex black box AI models or as approximations of parts of such models.
We investigate the use of the Probably Approximately Correct (PAC) framework to provide a theoretical guarantee of fidelity for decision trees extracted from AI models.
- Score: 7.94649144127614
- Abstract: Decision trees are a popular machine learning method, known for their inherent explainability. In Explainable AI, decision trees can be used as surrogate models for complex black box AI models or as approximations of parts of such models. A key challenge of this approach is determining how accurately the extracted decision tree represents the original model and to what extent it can be trusted as an approximation of its behavior. In this work, we investigate the use of the Probably Approximately Correct (PAC) framework to provide a theoretical guarantee of fidelity for decision trees extracted from AI models. Based on theoretical results from the PAC framework, we adapt a decision tree algorithm to ensure a PAC guarantee under certain conditions. We focus on binary classification and conduct experiments where we extract decision trees from BERT-based language models with PAC guarantees. Our results indicate occupational gender bias in these models.
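To make the recipe concrete, below is a minimal sketch of PAC-style decision tree extraction from a black-box binary classifier. It illustrates the general idea rather than the paper's exact algorithm: `black_box` and `sample_inputs` are hypothetical stand-ins for the classifier under study and its input distribution, and the fidelity certificate uses a standard two-sided Hoeffding bound.

```python
# Minimal sketch, not the paper's exact algorithm. `black_box` and
# `sample_inputs` are hypothetical stand-ins for the model under study
# and its input distribution.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def pac_sample_size(epsilon: float, delta: float) -> int:
    """Queries needed so the empirical fidelity estimate is within epsilon
    of the true fidelity with probability >= 1 - delta (Hoeffding)."""
    return int(np.ceil(np.log(2.0 / delta) / (2.0 * epsilon ** 2)))

def extract_pac_tree(black_box, sample_inputs, epsilon=0.05, delta=0.05,
                     n_train=10_000, max_depth=5, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Label a training sample with the black box and fit a surrogate tree.
    X_train = sample_inputs(n_train, rng)
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    tree.fit(X_train, black_box(X_train))
    # 2. Certify fidelity on a fresh sample of PAC-determined size.
    X_cert = sample_inputs(pac_sample_size(epsilon, delta), rng)
    fidelity = float(np.mean(tree.predict(X_cert) == black_box(X_cert)))
    # With probability >= 1 - delta, true fidelity >= fidelity - epsilon.
    return tree, fidelity
```

For epsilon = delta = 0.05 the certificate needs ceil(ln(40) / 0.005) = 738 fresh queries, independent of the input dimension.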
Related papers
- Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations [5.65604054654671]
We introduce the notion of an explainability-to-noise ratio for mixture models.
We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time.
We prove upper and lower bounds on the error rate of the resulting decision tree.
arXiv Detail & Related papers (2024-11-03T14:00:20Z)
- Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level (a toy sketch of this follows the entry).
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
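As promised above, a toy sketch of the level-wise softmax idea; the names and shapes (`user_emb`, `node_embs`) are illustrative assumptions, not DTR's actual interface.

```python
# Toy sketch: each tree level is a multi-class problem whose classes are
# the nodes at that level. Illustrative names, not DTR's real API.
import numpy as np

def softmax(z):
    z = z - z.max()                # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def level_loss(user_emb, node_embs, target_node):
    """Cross-entropy over one level's nodes; target_node indexes the node
    on the path from the root to the target item's leaf."""
    logits = node_embs @ user_emb  # score every node at this level
    return -np.log(softmax(logits)[target_node])
```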
- Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a novel framework that utilizes large language models (LLMs) to identify effective feature generation rules.
We use decision trees to convey this reasoning information, as they can be easily represented in natural language.
OCTree consistently enhances the performance of various prediction models across diverse benchmarks.
arXiv Detail & Related papers (2024-06-12T08:31:34Z)
- Learning accurate and interpretable decision trees [27.203303726977616]
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z)
- An Interpretable Client Decision Tree Aggregation process for Federated Learning [7.8973037023478785]
We propose an Interpretable Client Decision Tree aggregation process for Federated Learning scenarios.
This model is based on aggregating multiple decision paths of the decision trees and can be used on different decision tree types, such as ID3 and CART (a toy path-aggregation sketch follows the entry).
We carry out experiments on four datasets, and the analysis shows that the tree built with the model improves the local models and outperforms the state of the art.
arXiv Detail & Related papers (2024-04-03T06:53:56Z)
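The toy sketch referenced above: pool root-to-leaf rules from each client's tree and classify by voting among the rules a point satisfies. This is an illustration of path aggregation for binary labels, not the paper's actual procedure.

```python
# Pool decision paths from client sklearn trees; vote among matching rules.
import numpy as np

def extract_paths(clf):
    t = clf.tree_
    paths = []
    def walk(node, conds):
        if t.children_left[node] == -1:  # leaf: record rule and its label
            paths.append((tuple(conds), int(np.argmax(t.value[node]))))
            return
        f, thr = t.feature[node], t.threshold[node]
        walk(t.children_left[node], conds + [(f, "<=", thr)])
        walk(t.children_right[node], conds + [(f, ">", thr)])
    walk(0, [])
    return paths

def aggregated_predict(client_trees, x):
    pool = [p for clf in client_trees for p in extract_paths(clf)]
    votes = [label for conds, label in pool
             if all(x[f] <= thr if op == "<=" else x[f] > thr
                    for f, op, thr in conds)]
    return int(round(np.mean(votes)))    # majority vote over matching rules
```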
- Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z)
- Provably Efficient UCB-type Algorithms For Learning Predictive State Representations [55.00359893021461]
A sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs).
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models (the generic optimism argument is sketched after this entry).
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z)
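The generic optimism argument behind a total-variation bonus, as a hedged sketch (a standard inequality, not the paper's exact bonus construction):

```latex
% If the per-episode return lies in [0, H], then for any policy \pi the
% value gap between the estimated and true models is controlled by the
% total variation distance between their trajectory distributions:
\bigl|\hat{V}^{\pi} - V^{\pi}\bigr|
  \;\le\; H \cdot D_{\mathrm{TV}}\bigl(\hat{\mathbb{P}}^{\pi}, \mathbb{P}^{\pi}\bigr)
  \;\le\; b^{\pi},
% so \hat{V}^{\pi} + b^{\pi} is an upper confidence bound on V^{\pi}.
```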
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models (the model class is written out after this entry).
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
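For reference, a sparse additive model can be written as follows (a standard definition; the notation is ours, not necessarily the paper's):

```latex
% Sparse additive regression: only s of the d coordinates affect the
% response, through univariate component functions f_j plus noise.
y \;=\; \sum_{j \in S} f_j(x_j) \;+\; \varepsilon,
\qquad S \subseteq \{1, \dots, d\}, \quad |S| = s \ll d.
```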
- Genetic Adversarial Training of Decision Trees [6.85316573653194]
We put forward a novel learning methodology for ensembles of decision trees based on a genetic algorithm that trains a decision tree to maximize both its accuracy and its robustness to adversarial perturbations (a toy version follows the entry).
We implemented this genetic adversarial training algorithm in a tool called Meta-Silvae (MS) and experimentally evaluated it on reference datasets used in adversarial training.
arXiv Detail & Related papers (2020-12-21T14:05:57Z)
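The toy version referenced above: a genetic algorithm over fixed-depth, axis-aligned trees whose fitness combines clean accuracy with empirical robustness to random L-infinity perturbations. Everything here (genome encoding, mutation scheme, robustness estimate) is a simplifying assumption, not Meta-Silvae's implementation.

```python
# Toy genetic adversarial training of fixed-depth axis-aligned trees.
import numpy as np

rng = np.random.default_rng(0)
DEPTH = 3
N_INNER = 2 ** DEPTH - 1   # internal nodes of a complete binary tree
N_LEAF = 2 ** DEPTH

def random_tree(n_features):
    return (rng.integers(0, n_features, N_INNER),  # split feature per node
            rng.normal(size=N_INNER),              # split threshold per node
            rng.integers(0, 2, N_LEAF))            # class label per leaf

def predict(tree, X):
    feats, thresh, leaves = tree
    idx = np.zeros(len(X), dtype=int)              # start every row at the root
    for _ in range(DEPTH):
        go_right = X[np.arange(len(X)), feats[idx]] > thresh[idx]
        idx = 2 * idx + 1 + go_right               # descend one level
    return leaves[idx - N_INNER]

def fitness(tree, X, y, eps=0.1, n_pert=8):
    base = predict(tree, X)
    acc = np.mean(base == y)
    stable = np.ones(len(X), dtype=bool)           # robustness: prediction
    for _ in range(n_pert):                        # unchanged under random
        stable &= predict(tree, X + rng.uniform(-eps, eps, X.shape)) == base
    return acc + np.mean(stable)

def mutate(tree, n_features):
    feats, thresh, leaves = map(np.copy, tree)
    i = rng.integers(N_INNER)
    thresh[i] += rng.normal(scale=0.1)             # jitter one threshold
    if rng.random() < 0.2:
        feats[i] = rng.integers(n_features)        # occasionally swap feature
    return feats, thresh, leaves

def evolve(X, y, pop_size=40, generations=50):
    pop = [random_tree(X.shape[1]) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: -fitness(t, X, y))  # fittest first
        pop = pop[: pop_size // 2] + [mutate(t, X.shape[1])
                                      for t in pop[: pop_size // 2]]
    return pop[0]
```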
- Rectified Decision Trees: Exploring the Landscape of Interpretable and Effective Machine Learning [66.01622034708319]
We propose a knowledge-distillation-based decision tree extension, dubbed rectified decision trees (ReDT).
We extend the splitting criteria and the stopping condition of standard decision trees, which allows training with soft labels.
We then train the ReDT on soft labels distilled from a well-trained teacher model through a novel jackknife-based method (a simplified distillation sketch follows the entry).
arXiv Detail & Related papers (2020-08-21T10:45:25Z)
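The simplified sketch referenced above approximates soft-label training by regressing a tree on the teacher's predicted probabilities; ReDT's actual modified splitting criterion and jackknife procedure are not reproduced here.

```python
# Approximate soft-label distillation: variance-based regression splits
# stand in for ReDT's modified splitting criterion.
from sklearn.tree import DecisionTreeRegressor

def distill_to_tree(teacher, X_transfer, max_depth=4):
    soft = teacher.predict_proba(X_transfer)[:, 1]  # P(class 1) as soft label
    student = DecisionTreeRegressor(max_depth=max_depth)
    student.fit(X_transfer, soft)
    return student  # predict class 1 when student.predict(x) > 0.5
```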
- ENTMOOT: A Framework for Optimization over Ensemble Tree Models [57.98561336670884]
ENTMOOT is a framework for integrating tree models into larger optimization problems.
We show how ENTMOOT allows a simple integration of tree models into decision-making and black-box optimization.
arXiv Detail & Related papers (2020-03-10T14:34:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.