Optimal Interpretability-Performance Trade-off of Classification Trees
with Black-Box Reinforcement Learning
- URL: http://arxiv.org/abs/2304.05839v1
- Date: Tue, 11 Apr 2023 09:43:23 GMT
- Title: Optimal Interpretability-Performance Trade-off of Classification Trees
with Black-Box Reinforcement Learning
- Authors: Hector Kohler (Scool, CRIStAL), Riad Akrour (Scool, CRIStAL), Philippe
Preux (Scool, CRIStAL)
- Abstract summary: Interpretability of AI models allows for user safety checks to build trust in these models.
Decision trees (DTs) provide a global view of the learned model and clearly outline which features are critical to classify a given data point.
To learn compact trees, a Reinforcement Learning framework has recently been proposed to explore the space of DTs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpretability of AI models allows for user safety checks to build trust in
these models. In particular, decision trees (DTs) provide a global view of the
learned model and clearly outline which features are critical to classify a
given data point. However, interpretability is hindered if the DT is
too large. To learn compact trees, a Reinforcement Learning (RL) framework has
been recently proposed to explore the space of DTs. A given supervised
classification task is modeled as a Markov decision problem (MDP) and then
augmented with additional actions that gather information about the features,
equivalent to building a DT. By appropriately penalizing these actions, the RL
agent learns to optimally trade off the size and performance of a DT. However, to
do so, this RL agent has to solve a partially observable MDP. The main
contribution of this paper is to prove that it is sufficient to solve a fully
observable problem to learn a DT optimizing the interpretability-performance
trade-off. As such, any planning or RL algorithm can be used. We demonstrate the
effectiveness of this approach on a set of classical supervised classification
datasets and compare our approach with other interpretability-performance
optimizing methods.
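To make the construction concrete, below is a minimal sketch of such an augmented MDP, assuming features scaled to [0, 1]; the class name, the reward values, and the FEATURE_PENALTY constant are illustrative placeholders, not the authors' implementation.

```python
# Hedged sketch: a classification task wrapped as an MDP whose extra
# "information-gathering" actions mimic the internal tests of a decision
# tree. Assumes features are scaled to [0, 1]; all names are illustrative.
import numpy as np

FEATURE_PENALTY = -0.01   # cost paid for each feature test (each tree node)

class TreeBuildingEnv:
    """One episode = classifying one sample by sequentially testing features."""

    def __init__(self, X, y):
        self.X, self.y = X, y

    def reset(self, i):
        self.x, self.label = self.X[i], self.y[i]
        # Observation: per-feature lower/upper bounds refined by past tests.
        self.low = np.zeros(self.X.shape[1])
        self.high = np.ones(self.X.shape[1])
        return np.concatenate([self.low, self.high])

    def step(self, action):
        if action[0] == "test":                  # information-gathering action
            _, feat, thr = action
            if self.x[feat] <= thr:              # tighten the known bounds,
                self.high[feat] = min(self.high[feat], thr)
            else:                                # exactly like a DT split
                self.low[feat] = max(self.low[feat], thr)
            obs = np.concatenate([self.low, self.high])
            return obs, FEATURE_PENALTY, False
        _, pred = action                         # ("predict", c) ends the episode
        return None, (1.0 if pred == self.label else -1.0), True
```

Because the observation is exactly the information a partially built tree has gathered so far, a deterministic policy for this environment reads off directly as a DT, and the penalty on test actions is the knob that trades size against accuracy.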
Related papers
- Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness [27.43137305486112]
We propose a novel Self-supervised Preference Optimization (SPO) framework, which constructs a self-supervised preference degree loss combined with the alignment loss.
The results demonstrate that SPO can be seamlessly integrated with existing preference optimization methods to achieve state-of-the-art performance.
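As a rough illustration of combining an alignment loss with an auxiliary preference-degree term, here is a generic sketch; the actual SPO loss construction differs, and every name below (the DPO-style alignment term, the MSE degree term, the weights beta and lam) is an assumption for illustration only.

```python
import torch.nn.functional as F

def combined_loss(logp_chosen, logp_rejected, degree_pred, degree_target,
                  beta=0.1, lam=1.0):
    # DPO-style alignment term over (chosen, rejected) log-probabilities.
    align = -F.logsigmoid(beta * (logp_chosen - logp_rejected)).mean()
    # Self-supervised auxiliary term: match a predicted preference degree to
    # a target derived without human labels (both hypothetical placeholders).
    degree = F.mse_loss(degree_pred, degree_target)
    return align + lam * degree
```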
arXiv Detail & Related papers (2024-09-26T12:37:26Z) - Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling [3.890533943135602]
We present a framework called IRL (Interpretable Reinforcement Learning) to address the issue of interpretability of DRL scheduling.
IRL is capable of converting a black-box DNN policy into an interpretable rule-based decision tree while maintaining comparable scheduling performance.
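The generic recipe behind such a conversion is imitation: query the black-box policy for labels and fit a small tree to them. A hedged sketch follows; `black_box_policy` and `env` are stand-ins, and IRL's actual procedure may differ (e.g., DAgger-style re-sampling).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def distill_policy(black_box_policy, env, episodes=100, max_leaves=16):
    states, actions = [], []
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = black_box_policy(s)              # query the teacher DNN
            states.append(s)
            actions.append(a)
            s, _, done = env.step(a)             # roll out under the teacher
    # A leaf cap keeps the surrogate small enough to read.
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaves)
    tree.fit(np.array(states), np.array(actions))
    return tree                                  # interpretable surrogate policy
```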
arXiv Detail & Related papers (2024-03-24T20:56:16Z) - Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis characterizing adversarial inputs through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system.
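A Monte-Carlo stand-in for such a metric can be sketched as the fraction of states whose action flips under some small perturbation. The paper computes this via formal verification of the epsilon-ball; sampling, as below, only approximates it, and all names are placeholders.

```python
import numpy as np

def adversarial_rate(policy, states, epsilon=0.05, n_samples=100, seed=0):
    """Estimated fraction of states with at least one action-flipping
    perturbation inside the epsilon-ball (sampled, not verified)."""
    rng = np.random.default_rng(seed)
    flipped = 0
    for s in states:
        base = policy(s)
        noise = rng.uniform(-epsilon, epsilon, size=(n_samples, len(s)))
        if any(policy(s + d) != base for d in noise):
            flipped += 1
    return flipped / len(states)
```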
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - Learning Interpretable Rules for Scalable Data Representation and
Classification [11.393431987232425]
The Rule-based Representation Learner (RRL) learns interpretable non-fuzzy rules for data representation and classification.
RRL can be easily adjusted to obtain a trade-off between classification accuracy and model complexity for different scenarios.
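The accuracy/complexity knob can be illustrated with any penalized model family; the sketch below uses scikit-learn's cost-complexity pruning on a plain decision tree as a stand-in, since RRL itself is a rule-based neural model with its own complexity penalty.

```python
from sklearn.tree import DecisionTreeClassifier

def accuracy_complexity_curve(X_tr, y_tr, X_val, y_val):
    # Larger ccp_alpha -> stronger complexity penalty -> simpler model.
    path = DecisionTreeClassifier().cost_complexity_pruning_path(X_tr, y_tr)
    curve = []
    for alpha in path.ccp_alphas:
        t = DecisionTreeClassifier(ccp_alpha=alpha).fit(X_tr, y_tr)
        curve.append((alpha, t.get_n_leaves(), t.score(X_val, y_val)))
    return curve  # pick the point matching the scenario's complexity budget
```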
arXiv Detail & Related papers (2023-10-22T15:55:58Z) - Limits of Actor-Critic Algorithms for Decision Tree Policies Learning in
IBMDPs [9.587070290189507]
Interpretability of AI models allows for user safety checks to build trust in such AIs.
Decision Trees (DTs) provide a global look at the learned model and transparently reveal which features of the input are critical for making a decision.
A Reinforcement Learning framework has recently been proposed to explore the space of DTs using deep RL.
arXiv Detail & Related papers (2023-09-23T13:06:20Z) - Denoised MDPs: Learning World Models Better Than the World Itself [94.74665254213588]
This work categorizes information in the wild into four types based on controllability and relation to reward, and formulates useful information as that which is both controllable and reward-relevant.
Experiments on variants of DeepMind Control Suite and RoboDesk demonstrate superior performance of our denoised world model over using raw observations alone.
arXiv Detail & Related papers (2022-06-30T17:59:49Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, with better validity, sparsity, and proximity.
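The two stages the summary mentions (find a valid counterfactual, then improve its proximity without gradients) can be sketched generically; MACE's actual RL-based search is more sophisticated, and `model`, the step size, and the coordinate-revert loop below are illustrative assumptions.

```python
import numpy as np

def counterfactual(model, x, target, step=0.1, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    cf = None
    for _ in range(iters):                     # stage 1: random local search
        cand = x + rng.normal(0.0, step, size=x.shape)
        if model.predict(cand[None])[0] == target:
            cf = cand
            break
    if cf is None:
        return None                            # no valid counterfactual found
    for i in np.argsort(-np.abs(cf - x)):      # stage 2: gradient-less descent,
        trial = cf.copy()
        trial[i] = x[i]                        # revert coordinates one by one
        if model.predict(trial[None])[0] == target:
            cf = trial                         # keeps validity, gains proximity
    return cf
```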
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Optimal Decision Diagrams for Classification [68.72078059880018]
We study the training of optimal decision diagrams from a mathematical programming perspective.
We introduce a novel mixed-integer linear programming model for training.
We show how this model can be easily extended for fairness, parsimony, and stability notions.
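As a toy illustration of the mixed-integer view (not the paper's formulation, which also encodes routing and node-sharing across the diagram), the sketch below selects at most one split for a single node with PuLP, trading precomputed split errors against a per-node size penalty.

```python
import pulp

def pick_split(leaf_error, split_errors, lam=0.05):
    # Binary variable per candidate split; choosing none keeps the node a leaf.
    prob = pulp.LpProblem("toy_node_milp", pulp.LpMinimize)
    use = [pulp.LpVariable(f"s{j}", cat="Binary") for j in range(len(split_errors))]
    prob += pulp.lpSum(use) <= 1               # at most one split per node
    # Error of the chosen option plus a size penalty per split used.
    prob += (leaf_error * (1 - pulp.lpSum(use))
             + pulp.lpSum(e * u for e, u in zip(split_errors, use))
             + lam * pulp.lpSum(use))
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    chosen = [j for j, u in enumerate(use) if (u.value() or 0) > 0.5]
    return chosen[0] if chosen else None       # None = keep the leaf
```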
arXiv Detail & Related papers (2022-05-28T18:31:23Z) - Feeding What You Need by Understanding What You Learned [54.400455868448695]
Machine Reading Comprehension (MRC) requires understanding a given text passage and answering questions based on it.
Existing research in MRC relies heavily on large models and corpora to improve performance as evaluated by metrics such as Exact Match.
We argue that a deep understanding of model capabilities and data properties can help us feed a model with appropriate training data.
arXiv Detail & Related papers (2022-03-05T14:15:59Z) - Scalable Rule-Based Representation Learning for Interpretable
Classification [12.736847587988853]
The Rule-based Representation Learner (RRL) learns interpretable non-fuzzy rules for data representation and classification.
RRL can be easily adjusted to obtain a trade-off between classification accuracy and model complexity for different scenarios.
arXiv Detail & Related papers (2021-09-30T13:07:42Z) - Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
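A simplified, single-granularity sketch of the prototypical idea: cluster normalized embeddings, treat centroids as prototypes, and score each point against its assigned prototype with a softmax. PCL's actual ProtoNCE uses multiple clusterings and per-prototype concentrations; the fixed temperature here is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def proto_nce(embeddings, n_prototypes=10, temperature=0.1):
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    km = KMeans(n_clusters=n_prototypes, n_init=10).fit(z)
    protos = km.cluster_centers_               # prototypes = cluster centroids
    logits = z @ protos.T / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of each point's own prototype assignment.
    return -log_probs[np.arange(len(z)), km.labels_].mean()
```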
arXiv Detail & Related papers (2020-05-11T09:53:36Z)