An Interpretable Measure for Quantifying Predictive Dependence between Continuous Random Variables -- Extended Version
- URL: http://arxiv.org/abs/2501.10815v1
- Date: Sat, 18 Jan 2025 16:25:20 GMT
- Title: An Interpretable Measure for Quantifying Predictive Dependence between Continuous Random Variables -- Extended Version
- Authors: Renato Assunção, Flávio Figueiredo, Francisco N. Tinoco Júnior, Léo M. de Sá-Freire, Fábio Silva,
- Abstract summary: We introduce a novel measure that assesses the degree of association between continuous variables $X$ and $Y$.
A key advantage of this measure is its interpretability.
We evaluate the performance of our measure on over 90,000 real and synthetic datasets.
- Score: 0.0
- License:
- Abstract: A fundamental task in statistical learning is quantifying the joint dependence or association between two continuous random variables. We introduce a novel, fully non-parametric measure that assesses the degree of association between continuous variables $X$ and $Y$, capable of capturing a wide range of relationships, including non-functional ones. A key advantage of this measure is its interpretability: it quantifies the expected relative loss in predictive accuracy when the distribution of $X$ is ignored in predicting $Y$. This measure is bounded within the interval [0,1] and is equal to zero if and only if $X$ and $Y$ are independent. We evaluate the performance of our measure on over 90,000 real and synthetic datasets, benchmarking it against leading alternatives. Our results demonstrate that the proposed measure provides valuable insights into underlying relationships, particularly in cases where existing methods fail to capture important dependencies.
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Uncertainty in Language Models: Assessment through Rank-Calibration [65.10149293133846]
Language Models (LMs) have shown promising performance in natural language generation.
It is crucial to correctly quantify their uncertainty in responding to given inputs.
We develop a novel and practical framework, termed $Rank$-$Calibration$, to assess uncertainty and confidence measures for LMs.
arXiv Detail & Related papers (2024-04-04T02:31:05Z) - Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models [4.675899216825188]
We propose a notion of simultaneous confidence intervals called the sparsified simultaneous confidence intervals.
Our intervals are sparse in the sense that some of the intervals' upper and lower bounds are shrunken to zero.
The proposed method can be coupled with various selection procedures, making it ideal for comparing their uncertainty.
arXiv Detail & Related papers (2023-07-14T18:37:57Z) - Policy evaluation from a single path: Multi-step methods, mixing and
mis-specification [45.88067550131531]
We study non-parametric estimation of the value function of an infinite-horizon $gamma$-discounted Markov reward process.
We provide non-asymptotic guarantees for a general family of kernel-based multi-step temporal difference estimates.
arXiv Detail & Related papers (2022-11-07T23:15:25Z) - Revealing Unobservables by Deep Learning: Generative Element Extraction
Networks (GEEN) [5.3028918247347585]
This paper proposes a novel method for estimating realizations of a latent variable $X*$ in a random sample.
To the best of our knowledge, this paper is the first to provide such identification in observation.
arXiv Detail & Related papers (2022-10-04T01:09:05Z) - Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z) - Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur.
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval task.
arXiv Detail & Related papers (2020-06-09T23:26:15Z) - Estimation of Accurate and Calibrated Uncertainties in Deterministic
models [0.8702432681310401]
We devise a method to transform a deterministic prediction into a probabilistic one.
We show that for doing so, one has to compromise between the accuracy and the reliability (calibration) of such a model.
We show several examples both with synthetic data, where the underlying hidden noise can accurately be recovered, and with large real-world datasets.
arXiv Detail & Related papers (2020-03-11T04:02:56Z) - Optimal rates for independence testing via $U$-statistic permutation
tests [7.090165638014331]
We study the problem of independence testing given independent and identically distributed pairs taking values in a $sigma$-finite, separable measure space.
We first show that there is no valid test of independence that is uniformly consistent against alternatives of the form $f: D(f) geq rho2 $.
arXiv Detail & Related papers (2020-01-15T19:04:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.