Comprehensible Counterfactual Explanation on Kolmogorov-Smirnov Test
- URL: http://arxiv.org/abs/2011.01223v2
- Date: Fri, 16 Apr 2021 01:23:23 GMT
- Title: Comprehensible Counterfactual Explanation on Kolmogorov-Smirnov Test
- Authors: Zicun Cong, Lingyang Chu, Yu Yang, Jian Pei
- Abstract summary: We tackle the problem of producing counterfactual explanations for test data failing the Kolmogorov-Smirnov (KS) test.
We develop an efficient algorithm MOCHE that avoids enumerating and checking an exponential number of subsets of the test set failing the KS test.
- Score: 56.5373227424117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Kolmogorov-Smirnov (KS) test is popularly used in many applications, such
as anomaly detection, astronomy, database security, and AI systems. One
challenge that has remained untouched is how to obtain an explanation of why a
test set fails the KS test. In this paper, we tackle the problem of producing
counterfactual explanations for test data failing the KS test. Concept-wise, we
propose the notion of most comprehensible counterfactual explanations, which
accommodates both the KS test data and the user domain knowledge in producing
explanations. Computation-wise, we develop an efficient algorithm MOCHE (for
MOst CompreHensible Explanation) that avoids enumerating and checking an
exponential number of subsets of the test set failing the KS test. MOCHE not
only guarantees to produce the most comprehensible counterfactual explanations,
but also is orders of magnitude faster than the baselines. Experiment-wise, we
present a systematic empirical study on a series of benchmark real datasets to
verify the effectiveness, efficiency and scalability of most comprehensible
counterfactual explanations and MOCHE.
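The problem setting in the abstract can be sketched in plain Python: a test set fails a one-sample KS test against a reference distribution, and a counterfactual explanation is a subset of points whose removal makes the remaining data pass. The greedy routine below is only an illustration of that setting, not the MOCHE algorithm; the sample data, uniform reference CDF, and threshold are hypothetical.

```python
def ks_statistic(sample, cdf):
    """One-sample KS statistic: largest gap between the empirical CDF
    of `sample` and the reference CDF `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The ECDF jumps from i/n to (i+1)/n at x; check the gap on both sides.
        d = max(d, abs((i + 1) / n - f), abs(f - i / n))
    return d

def greedy_explanation(sample, cdf, threshold):
    """Naive baseline (NOT MOCHE): repeatedly remove the point whose removal
    shrinks the KS statistic the most, until the remaining points pass
    (statistic <= threshold). Returns (removed points, remaining points)."""
    remaining = list(sample)
    removed = []
    while remaining and ks_statistic(remaining, cdf) > threshold:
        best = min(range(len(remaining)),
                   key=lambda i: ks_statistic(remaining[:i] + remaining[i + 1:], cdf))
        removed.append(remaining.pop(best))
    return removed, remaining

# Hypothetical data: mostly spread over [0, 1], plus a cluster near 1 that
# makes the uniform-null KS test fail at a made-up threshold of 0.2.
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.95, 0.96, 0.97]
removed, remaining = greedy_explanation(data, uniform_cdf, 0.2)
```

Here the `removed` set plays the role of a counterfactual explanation: the points that, once excluded, let the rest pass the test. Searching over candidate subsets this way grows combinatorially, which is the enumeration cost the paper's MOCHE algorithm is designed to avoid.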
Related papers
- MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs [80.96119560172224]
Large language models (LLMs) can solve arithmetic word problems with high accuracy, but little is known about how well they generalize to problems that are more complex than the ones on which they have been trained.
We present a framework for evaluating LLMs on problems with arbitrarily complex arithmetic proofs, called MathGAP.
arXiv Detail & Related papers (2024-10-17T12:48:14Z)
- Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples [53.95282502030541]
Neural Network-based active learning (NAL) is a cost-effective data selection technique that utilizes neural networks to select and train on a small subset of samples.
We move one step forward by offering a unified explanation, from a feature learning view, for the success of both query criteria used in NAL.
arXiv Detail & Related papers (2024-06-06T10:38:01Z)
- Sanity Checks for Explanation Uncertainty [6.9060054915724]
Explanations for machine learning models can be hard to interpret or be wrong.
We propose sanity checks for uncertainty explanation methods, where weight and data randomization tests are defined for explanations with uncertainty.
We experimentally show the validity and effectiveness of these tests on the CIFAR10 and California Housing datasets.
arXiv Detail & Related papers (2024-03-25T21:39:33Z)
- Membership Testing in Markov Equivalence Classes via Independence Query Oracles [17.84347390320128]
We show that it is relatively easier to test causal relationships than to learn them.
In particular, it requires exponentially fewer independence tests in graphs featuring high in-degrees and small clique sizes.
arXiv Detail & Related papers (2024-03-09T02:10:08Z)
- Optimal Decision Tree and Adaptive Submodular Ranking with Noisy Outcomes [9.321976218862542]
In pool-based active learning, the learner is given an unlabeled data set and aims to efficiently learn the unknown hypothesis by querying the labels of the data points.
This can be formulated as the classical Optimal Decision Tree (ODT) problem: Given a set of tests, a set of hypotheses, and an outcome for each pair of test and hypothesis, our objective is to find a low-cost testing procedure (i.e., decision tree) that identifies the true hypothesis.
In this work, we study a fundamental variant of the ODT problem in which some test outcomes are noisy, even in the more general …
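The classical ODT formulation described above admits a short greedy sketch: repeatedly query the test whose worst-case outcome leaves the fewest surviving hypotheses, until one hypothesis remains. This is an illustration of the noiseless setting only, not the paper's noisy-outcome algorithm; the outcome table and names below are hypothetical.

```python
def identify(outcome, hypotheses, tests, oracle):
    """outcome[t][h]: result of test t on hypothesis h (known in advance).
    oracle(t): observed result of test t on the unknown true hypothesis.
    Greedily queries tests until one consistent hypothesis remains
    (assumes the tests suffice to distinguish the true hypothesis)."""
    candidates = set(hypotheses)
    unused = list(tests)
    while len(candidates) > 1 and unused:
        def worst_case(t):
            # Partition surviving candidates by their outcome on test t;
            # the worst case is the largest resulting group.
            groups = {}
            for h in candidates:
                groups.setdefault(outcome[t][h], set()).add(h)
            return max(len(g) for g in groups.values())
        t = min(unused, key=worst_case)  # most informative test in the worst case
        unused.remove(t)
        result = oracle(t)
        candidates = {h for h in candidates if outcome[t][h] == result}
    (h,) = candidates
    return h

# Toy instance (hypothetical): two binary tests distinguishing four hypotheses.
outcome = {"t1": {"h1": 0, "h2": 0, "h3": 1, "h4": 1},
           "t2": {"h1": 0, "h2": 1, "h3": 0, "h4": 1}}
found = identify(outcome, ["h1", "h2", "h3", "h4"], ["t1", "t2"],
                 lambda t: outcome[t]["h3"])
```

The greedy test choice corresponds to the adaptive-submodular view mentioned in the title; the paper's contribution is handling the harder case where some of the oracle's answers are noisy.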
arXiv Detail & Related papers (2023-12-23T21:47:50Z)
- PARs: Predicate-based Association Rules for Efficient and Accurate Model-Agnostic Anomaly Explanation [2.280762565226767]
We present a novel approach for efficient and accurate model-agnostic anomaly explanation using Predicate-based Association Rules (PARs)
Our user study indicates that the anomaly explanation form of PARs is better comprehended and preferred by regular users of anomaly detection systems.
arXiv Detail & Related papers (2023-12-18T06:45:31Z)
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Don't Explain Noise: Robust Counterfactuals for Randomized Ensembles [50.81061839052459]
We formalize the generation of robust counterfactual explanations as a probabilistic problem.
We show the link between the robustness of ensemble models and the robustness of base learners.
Our method achieves high robustness with only a small increase in the distance from counterfactual explanations to their initial observations.
arXiv Detail & Related papers (2022-05-27T17:28:54Z)
- Recursive Causal Structure Learning in the Presence of Latent Variables and Selection Bias [27.06618125828978]
We consider the problem of learning the causal MAG of a system from observational data in the presence of latent variables and selection bias.
We propose a novel computationally efficient constraint-based method that is sound and complete.
We provide experimental results to compare the proposed approach with the state of the art on both synthetic and real-world structures.
arXiv Detail & Related papers (2021-10-22T19:49:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.