Statistical Hypothesis Testing for Information Value (IV)
- URL: http://arxiv.org/abs/2309.13183v2
- Date: Sat, 30 Sep 2023 00:27:31 GMT
- Title: Statistical Hypothesis Testing for Information Value (IV)
- Authors: Helder Rojas, Cirilo Alvarez and Nilton Rojas
- Abstract summary: We propose a non-parametric hypothesis test to evaluate the predictive power of features contemplated in a data set.
We show how to efficiently compute our test statistic and we study its performance on simulated data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information value (IV) is a popular technique for feature selection
before the modeling phase. Practical criteria based on fixed IV thresholds are
commonly used to decide whether a predictor has sufficient predictive power to
be considered in the modeling phase, but these criteria are ad hoc and lack
theoretical justification. Moreover, the mathematical development and
statistical inference methods for this technique are almost nonexistent in the
literature. In this paper we present a theoretical framework for IV and, at the
same time, propose a non-parametric hypothesis test to evaluate the predictive
power of the features in a data set. Due to its relationship with divergence
measures developed in information theory, we call our proposal the J-Divergence
test. We show how to efficiently compute our test statistic and study its
performance on simulated data. In various scenarios, particularly on unbalanced
data sets, we show its superiority over conventional criteria based on fixed
thresholds. Furthermore, we apply our test to fraud-identification data and
provide an open-source Python library, called
"statistical-iv" (https://pypi.org/project/statistical-iv/), where we implement
our main results.
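The conventional criterion the paper argues against scores each feature by its IV, the empirical Jeffreys (J) divergence between the event and non-event distributions across bins: IV = Σᵢ (pᵢ − qᵢ) · ln(pᵢ/qᵢ). As background, here is a minimal sketch of that conventional computation (this is generic WOE/IV code, not the API of the "statistical-iv" library, and the fixed thresholds are the very heuristics the paper's test replaces):

```python
import numpy as np
import pandas as pd

def information_value(feature, target, bins=10):
    """Conventional IV of a numeric feature for a binary target,
    via weight of evidence (WOE) over quantile bins. A baseline
    sketch; the paper's J-Divergence test adds inference on top."""
    df = pd.DataFrame({"x": feature, "y": target})
    df["bin"] = pd.qcut(df["x"], q=bins, duplicates="drop")
    grouped = df.groupby("bin", observed=True)["y"].agg(["sum", "count"])
    events = grouped["sum"]                         # positives per bin
    non_events = grouped["count"] - grouped["sum"]  # negatives per bin
    # Per-bin shares of all events / non-events (eps avoids log(0))
    eps = 1e-9
    p_event = (events / events.sum()).clip(lower=eps)
    p_non_event = (non_events / non_events.sum()).clip(lower=eps)
    woe = np.log(p_event / p_non_event)
    return float(((p_event - p_non_event) * woe).sum())
```

A common (and, per the paper, theoretically unmotivated) rule of thumb then labels IV < 0.02 as useless and IV > 0.3 as strong; the proposed test instead attaches a formal hypothesis-testing decision to the same statistic.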
Related papers
- Statistical Test for Generated Hypotheses by Diffusion Models [21.378672594642616]
We consider a medical diagnostic task using generated images by diffusion models, and propose a statistical test to quantify its reliability.
Using the proposed method, the statistical reliability of medical image diagnostic results can be quantified in the form of a p-value, allowing for decision-making with a controlled error rate.
arXiv Detail & Related papers (2024-02-19T02:32:45Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Estimating Sample Size and Reducing Overfitting [1.8416014644193064]
This study uses Monte Carlo simulations to quantify the interactions between the employed cross-validation method and the discriminative power of features.
The required sample size with a single holdout could be 50% higher than what would be needed if nested cross-validation were used.
arXiv Detail & Related papers (2023-08-22T05:14:42Z)
- Learning Robust Statistics for Simulation-based Inference under Model Misspecification [23.331522354991527]
We propose the first general approach to handle model misspecification that works across different classes of simulation-based inference methods.
We show that our method yields robust inference in misspecified scenarios, whilst still being accurate when the model is well-specified.
arXiv Detail & Related papers (2023-05-25T09:06:26Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- Learning to be a Statistician: Learned Estimator for Number of Distinct Values [54.629042119819744]
Estimating the number of distinct values (NDV) in a column is useful for many tasks in database systems.
In this work, we focus on how to derive accurate NDV estimations from random (online/offline) samples.
We propose to formulate the NDV estimation task in a supervised learning framework, and aim to learn a model as the estimator.
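For context on what a learned NDV estimator competes against, a classical sample-based baseline is the Chao1 estimator, which extrapolates from the counts of values seen exactly once and exactly twice. A minimal sketch (this is the standard Chao1 formula, not the learned model proposed in the paper above):

```python
from collections import Counter

def chao1_ndv(sample):
    """Chao1 lower-bound estimate of the number of distinct values (NDV)
    in a population, from a random sample. Classical baseline only."""
    freq = Counter(sample)            # value -> occurrences in the sample
    d = len(freq)                     # distinct values actually seen
    fof = Counter(freq.values())      # frequency-of-frequencies
    f1, f2 = fof.get(1, 0), fof.get(2, 0)
    if f2 == 0:
        # Bias-corrected variant when no value occurs exactly twice
        return d + f1 * (f1 - 1) / 2
    return d + f1 * f1 / (2 * f2)
```

Such formula-based estimators rely on distributional assumptions about the unseen values, which is the gap the supervised-learning formulation above aims to close.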
arXiv Detail & Related papers (2022-02-06T15:42:04Z)
- Robust Validation: Confident Predictions Even When Distributions Shift [19.327409270934474]
We describe procedures for robust predictive inference, where a model provides uncertainty estimates on its predictions rather than point predictions.
We present a method that produces prediction sets (almost exactly) giving the right coverage level for any test distribution in an $f$-divergence ball around the training population.
An essential component of our methodology is to estimate the amount of expected future data shift and build robustness to it.
arXiv Detail & Related papers (2020-08-10T17:09:16Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
- Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
- Marginal likelihood computation for model selection and hypothesis testing: an extensive review [66.37504201165159]
This article provides a comprehensive study of the state-of-the-art of the topic.
We highlight limitations, benefits, connections and differences among the different techniques.
Problems and possible solutions with the use of improper priors are also described.
arXiv Detail & Related papers (2020-05-17T18:31:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.