Optimizing Binary Decision Diagrams with MaxSAT for classification
- URL: http://arxiv.org/abs/2203.11386v1
- Date: Mon, 21 Mar 2022 23:17:37 GMT
- Title: Optimizing Binary Decision Diagrams with MaxSAT for classification
- Authors: Hao Hu, Marie-José Huguet, and Mohamed Siala
- Abstract summary: A growing interest in explainable artificial intelligence motivates the need for interpretable machine learning (ML) models.
Recently, several exact methods for computing such models have been proposed to overcome weaknesses of traditional methods.
In this paper, we first propose SAT-based models for learning optimal binary decision diagrams (BDDs).
Then, we lift the encoding to a MaxSAT model to learn optimal BDDs of limited depth.
Finally, we tackle the fragmentation problem by introducing a method to merge compatible subtrees for the BDDs found via the MaxSAT model.
- Score: 3.2894524838755608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing interest in explainable artificial intelligence (XAI) for
critical decision making motivates the need for interpretable machine learning
(ML) models. In fact, due to their structure (especially when small),
these models are inherently understandable by humans. Recently, several exact
methods for computing such models have been proposed to overcome weaknesses of
traditional heuristic methods by providing more compact models or better
prediction quality.
Despite their compressed representation of Boolean functions, binary decision
diagrams (BDDs) have not gained as much attention as other interpretable ML models.
In this paper, we first propose SAT-based models for learning optimal BDDs (in
terms of the number of features) that classify all input examples. Then, we
lift the encoding to a MaxSAT model to learn optimal BDDs of limited depth
that maximize the number of correctly classified examples. Finally, we tackle
the fragmentation problem by introducing a method to merge compatible subtrees
for the BDDs found via the MaxSAT model. Our empirical study shows clear
benefits of the proposed approach in terms of prediction quality and
interpretability (i.e., smaller size) compared to state-of-the-art
approaches.
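
To make the abstract concrete, below is a minimal Python sketch of how a depth-limited BDD classifies binary examples. The node layout and the tiny dataset are illustrative assumptions, not the paper's actual encoding; the key structural point is that a subdiagram can be shared by several parents, which is what distinguishes a BDD from a decision tree and what the subtree-merging step exploits.

```python
from dataclasses import dataclass
from typing import Sequence, Union

@dataclass
class Node:
    """Internal BDD node: tests one binary feature and branches on it."""
    feature: int             # index of the binary feature tested here
    low: Union["Node", int]  # followed when the feature is 0 (int = leaf class label)
    high: Union["Node", int] # followed when the feature is 1

def classify(node: Union[Node, int], example: Sequence[int]) -> int:
    """Follow a single root-to-leaf path; leaves carry the predicted class."""
    while isinstance(node, Node):
        node = node.high if example[node.feature] else node.low
    return node

# A shared subdiagram: two different branches point at it, which is what
# distinguishes a BDD from a plain decision tree and keeps the model small.
shared = Node(feature=2, low=0, high=1)
root = Node(feature=0, low=Node(feature=1, low=0, high=shared), high=shared)

# The MaxSAT objective from the abstract, evaluated directly on a toy set:
# count how many examples a fixed depth-limited diagram classifies correctly.
data = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 0], 0), ([1, 1, 1], 1)]
correct = sum(classify(root, x) == y for x, y in data)
print(f"{correct}/{len(data)} examples correctly classified")
```

The MaxSAT lifting described above can be read as follows: keep the structural constraints as hard clauses, and add one soft clause per training example that is satisfied exactly when that example is classified correctly, so maximizing the satisfied soft weight maximizes training accuracy. Below is a toy hard/soft split using the PySAT library (assuming `pip install python-sat`); the variables are hypothetical placeholders, not the paper's encoding.

```python
from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

# Hypothetical variables: 1 = a structural choice in the diagram,
# 2 = "example 1 is classified correctly", 3 = "example 2 is classified correctly".
wcnf = WCNF()
wcnf.append([-2, 1])         # hard: example 1 can only be correct if choice 1 is made
wcnf.append([-3, -1])        # hard: example 2 can only be correct if choice 1 is NOT made
wcnf.append([2], weight=1)   # soft: reward classifying example 1 correctly
wcnf.append([3], weight=1)   # soft: reward classifying example 2 correctly

with RC2(wcnf) as solver:
    model = solver.compute() # an optimal assignment under the hard clauses
    print(model, "unsatisfied soft weight =", solver.cost)
```

When no diagram within the depth budget fits every example, the optimal MaxSAT solution chooses which examples to sacrifice at minimum cost, which is exactly the trade-off the abstract describes.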
Related papers
- Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models [46.959380978972206]
We study inference scaling laws and compute-optimal inference for large language models (LLMs).
As a first step towards understanding and designing compute-optimal inference methods, we study cost-performance trade-offs for inference strategies.
Our findings indicate smaller models (e.g., Llemma-7B) can outperform larger models given the same computation budgets.
arXiv Detail & Related papers (2024-08-01T17:16:04Z) - Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z) - Oracle Inequalities for Model Selection in Offline Reinforcement
Learning [105.74139523696284]
We study the problem of model selection in offline RL with value function approximation.
We propose the first model selection algorithm for offline RL that achieves minimax rate-optimal oracle inequalities up to logarithmic factors.
We conclude with several numerical simulations showing it is capable of reliably selecting a good model class.
arXiv Detail & Related papers (2022-11-03T17:32:34Z) - Margin Optimal Classification Trees [0.0]
We present a novel mixed-integer formulation for the Optimal Classification Tree (OCT) problem.
Our model, denoted as Margin Optimal Classification Tree (MARGOT), exploits the generalization capabilities of Support Vector Machines for binary classification.
To enhance the interpretability of our approach, we analyse two alternative versions of MARGOT, which include feature selection constraints inducing local sparsity of the hyperplanes.
arXiv Detail & Related papers (2022-10-19T14:08:56Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the framework's effectiveness, showing better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - A model aggregation approach for high-dimensional large-scale
optimization [2.1104930506758275]
We propose a model aggregation method in Bayesian optimization (MamBO) for efficiently solving high-dimensional, large-scale optimization problems.
MamBO uses a combination of subsampling and subspace embeddings to collectively address high dimensionality and large-scale issues.
Our proposed model aggregation method reduces these lower-dimensional surrogate model risks and improves the robustness of the BO algorithm.
arXiv Detail & Related papers (2022-05-16T08:58:42Z) - Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E^3), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
arXiv Detail & Related papers (2021-10-07T11:58:35Z) - Robust Optimal Classification Trees Against Adversarial Examples [5.254093731341154]
We propose a collection of methods to train decision trees that are optimally robust against user-specified attack models.
We show that the min-max optimization problem that arises in adversarial learning can be solved using a single minimization formulation.
We also present a method that determines the upper bound on adversarial accuracy for any model using bipartite matching.
arXiv Detail & Related papers (2021-09-08T18:10:49Z) - Explaining and Improving Model Behavior with k Nearest Neighbor
Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z) - Learning Interpretable Models Using Uncertainty Oracles [12.879371384378164]
A desirable property of interpretable models is small size, so that they are easily understandable by humans.
This leads to the following challenges: (a) small sizes imply diminished accuracy, and (b) bespoke levers provided by model families to restrict size might be insufficient to reach the desired size-accuracy trade-off.
arXiv Detail & Related papers (2019-06-17T05:53:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.