Rule Generation for Classification: Scalability, Interpretability, and Fairness
- URL: http://arxiv.org/abs/2104.10751v4
- Date: Sun, 12 May 2024 12:51:09 GMT
- Title: Rule Generation for Classification: Scalability, Interpretability, and Fairness
- Authors: Tabea E. Röber, Adia C. Lumadjeng, M. Hakan Akyüz, Ş. İlker Birbil,
- Abstract summary: We propose a new rule-based optimization method for classification with constraints.
We address interpretability and fairness by assigning cost coefficients to the rules and introducing additional constraints.
The proposed method exhibits a good compromise between local interpretability and fairness on the one side, and accuracy on the other side.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a new rule-based optimization method for classification with constraints. The proposed method leverages column generation for linear programming, and hence, is scalable to large datasets. The resulting pricing subproblem is shown to be NP-Hard. We recourse to a decision tree-based heuristic and solve a proxy pricing subproblem for acceleration. The method returns a set of rules along with their optimal weights indicating the importance of each rule for learning. We address interpretability and fairness by assigning cost coefficients to the rules and introducing additional constraints. In particular, we focus on local interpretability and generalize separation criterion in fairness to multiple sensitive attributes and classes. We test the performance of the proposed methodology on a collection of datasets and present a case study to elaborate on its different aspects. The proposed rule-based learning method exhibits a good compromise between local interpretability and fairness on the one side, and accuracy on the other side.
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Obtaining Explainable Classification Models using Distributionally
Robust Optimization [12.511155426574563]
We study generalized linear models constructed using sets of feature value rules.
An inherent trade-off exists between rule set sparsity and its prediction accuracy.
We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors.
arXiv Detail & Related papers (2023-11-03T15:45:34Z) - Post-hoc Bias Scoring Is Optimal For Fair Classification [12.897626117694317]
We introduce a novel instance-level measure of bias, which we call bias score, and the modification rule is a simple linear rule on top of the finite amount of bias scores.
In the case of DP and EOp constraints, the modification rule is thresholding a single bias score, while in the case of EO constraints we are required to fit a linear modification rule with 2 parameters.
arXiv Detail & Related papers (2023-10-09T13:54:08Z) - Boosting Fair Classifier Generalization through Adaptive Priority Reweighing [59.801444556074394]
A performance-promising fair algorithm with better generalizability is needed.
This paper proposes a novel adaptive reweighing method to eliminate the impact of the distribution shifts between training and test data on model generalizability.
arXiv Detail & Related papers (2023-09-15T13:04:55Z) - On Regularization and Inference with Label Constraints [62.60903248392479]
We compare two strategies for encoding label constraints in a machine learning pipeline, regularization with constraints and constrained inference.
For regularization, we show that it narrows the generalization gap by precluding models that are inconsistent with the constraints.
For constrained inference, we show that it reduces the population risk by correcting a model's violation, and hence turns the violation into an advantage.
arXiv Detail & Related papers (2023-07-08T03:39:22Z) - Efficient Learning of Interpretable Classification Rules [34.27987659227838]
This paper contributes an interpretable learning framework IMLI, that is based on maximum satisfiability (MaxSAT) for classification rules expressible in proposition logic.
In our experiments, IMLI achieves the best balance among prediction accuracy, interpretability, and scalability.
arXiv Detail & Related papers (2022-05-14T00:36:38Z) - Interpretable and Fair Boolean Rule Sets via Column Generation [18.08486863429421]
An integer program is formulated to optimally trade classification accuracy for rule simplicity.
We consider the fairness setting and extend the formulation to include explicit constraints on two different measures of classification parity.
Compared to other fair and interpretable classifiers, our method is able to find rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.
arXiv Detail & Related papers (2021-11-16T13:40:28Z) - Fair Decision Rules for Binary Classification [0.0]
We consider the problem of building Boolean rule sets in disjunctive normal form (DNF)
We formulate the problem as an integer program that maximizes classification accuracy with explicit constraints on two different measures of classification parity.
Compared to other fair and interpretable classifiers, our method is able to find rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.
arXiv Detail & Related papers (2021-07-03T02:32:17Z) - Binary Classification from Multiple Unlabeled Datasets via Surrogate Set
Classification [94.55805516167369]
We propose a new approach for binary classification from m U-sets for $mge2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC)
arXiv Detail & Related papers (2021-02-01T07:36:38Z) - An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data.
In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem.
We show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules.
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.