A Model-free Closeness-of-influence Test for Features in Supervised Learning
- URL: http://arxiv.org/abs/2306.11855v1
- Date: Tue, 20 Jun 2023 19:20:18 GMT
- Title: A Model-free Closeness-of-influence Test for Features in Supervised Learning
- Authors: Mohammad Mehrabi and Ryan A. Rossi
- Abstract summary: We study the question of assessing the difference of influence that two given features have on the response value.
We first propose a notion of closeness for the influence of features and show that our definition recovers the familiar notion of the magnitude of coefficients in the parametric model.
We then propose a novel method to test for the closeness of influence in general model-free supervised learning problems.
- Score: 23.345517302581044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the effect of a feature vector $x \in \mathbb{R}^d$ on the
response value (label) $y \in \mathbb{R}$ is the cornerstone of many
statistical learning problems. Ideally, one would like to understand how a set
of collected features combine together and influence the response value, but
this problem is notoriously difficult, due to the high dimensionality of the
data and the limited number of labeled data points, among other reasons. In
this work, we take a new perspective on this problem and study the question of
assessing the difference of influence that two given features have on the
response value. We first propose a notion of closeness for the influence of
features, and show that our definition recovers the familiar notion of the
magnitude of coefficients in the parametric model. We then propose a novel
method to test for the closeness of influence in general model-free supervised
learning problems. Our proposed test can be used with a finite number of
samples and controls the type I error rate, regardless of the ground-truth
conditional law $\mathcal{L}(Y|X)$. We analyze the power of our test for two
general learning problems, i) linear regression and ii) binary classification
under a mixture of Gaussians model, and show that under a proper choice of the
score function, an internal component of our test, the test achieves full
statistical power with a sufficient number of samples. We evaluate our
findings through extensive numerical simulations; specifically, we adopt the
datamodel framework (Ilyas et al., 2022) on the CIFAR-10 dataset to identify
pairs of training samples with different influence on the trained model via
black-box training mechanisms.
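The abstract does not spell out the test's construction, but a natural swap-based instantiation exists: under the null that two features have interchangeable influence, exchanging their coordinates leaves the distribution of any score unchanged, so a sign test on score differences is exactly valid at any finite sample size. Below is a minimal Python sketch along these lines; the function name, the squared-error score, and the sign-test reduction are illustrative assumptions, not the authors' exact method.

```python
import numpy as np
from scipy.stats import binomtest

def closeness_of_influence_test(X, y, i, j, score, alpha=0.05):
    """Sign test for 'features i and j have the same influence on y'.

    Under the null, swapping columns i and j of X leaves the joint law of
    (X, y) unchanged, so score(x, y) - score(x_swapped, y) has a
    sign-symmetric distribution; an exact binomial test on the signs then
    controls the type I error at any finite sample size, for any score.
    """
    X_swap = X.copy()
    X_swap[:, [i, j]] = X_swap[:, [j, i]]   # exchange the two features
    diffs = np.array([score(x, t) - score(xs, t)
                      for x, xs, t in zip(X, X_swap, y)])
    diffs = diffs[diffs != 0]               # ties carry no sign information
    if len(diffs) == 0:
        return 1.0, False                   # no evidence against the null
    res = binomtest(int((diffs > 0).sum()), n=len(diffs), p=0.5)
    return res.pvalue, res.pvalue < alpha

# Toy usage: coefficients 1.0 vs 0.3 on features 0 and 1, so the null is false.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
beta = np.array([1.0, 0.3, 0.5, 0.0, 0.0])
y = X @ beta + 0.1 * rng.normal(size=500)
score = lambda x, t: (t - x @ beta) ** 2    # squared-error score (oracle model)
pval, reject = closeness_of_influence_test(X, y, 0, 1, score)
```

Any score function yields type I control in this construction; the choice only affects power, which matches the abstract's remark that full power hinges on a proper choice of the score function.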
Related papers
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Dissecting Representation Misalignment in Contrastive Learning via Influence Function [15.28417468377201]
We introduce the Extended Influence Function for Contrastive Loss (ECIF), an influence function crafted for contrastive loss.
ECIF considers both positive and negative samples and provides a closed-form approximation of influence for contrastive learning models.
Building upon ECIF, we develop a series of algorithms for data evaluation, misalignment detection, and misprediction trace-back tasks.
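ECIF's closed form is specific to contrastive loss; as a hedged illustration of the classic influence-function recipe it builds on, here is the first-order influence estimate for a logistic model (the model, loss, and damping constant are illustrative assumptions, not ECIF itself).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def influence_on_test_loss(w, X_tr, y_tr, x_te, y_te, damping=1e-2):
    """First-order influence of up-weighting each training point on the test
    loss, -grad_test^T H^{-1} grad_train, for logistic regression at weights w.
    Returns one influence value per training sample."""
    p_tr = sigmoid(X_tr @ w)
    g_tr = (p_tr - y_tr)[:, None] * X_tr            # per-sample gradients, (n, d)
    # Damped empirical Hessian of the logistic loss at w, shape (d, d).
    H = (X_tr * (p_tr * (1 - p_tr))[:, None]).T @ X_tr / len(y_tr)
    H += damping * np.eye(len(w))
    g_te = (sigmoid(x_te @ w) - y_te) * x_te        # test-point gradient, (d,)
    return -g_tr @ np.linalg.solve(H, g_te)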
arXiv Detail & Related papers (2024-11-18T15:45:41Z)
- Most Influential Subset Selection: Challenges, Promises, and Beyond [9.479235005673683]
We study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of training samples with the greatest collective influence.
We conduct a comprehensive analysis of the prevailing approaches in MISS, elucidating their strengths and weaknesses.
We demonstrate that an adaptive version of these approaches, which applies them iteratively, can effectively capture the interactions among samples.
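A minimal sketch of such an adaptive loop is below; `influence_fn` is a hypothetical oracle standing in for whichever influence estimator is being iterated.

```python
def adaptive_miss(n_train, k, influence_fn):
    """Greedy adaptive subset selection: at each step, score every remaining
    sample's influence *given the already-selected subset*, then take the
    largest. A one-shot ranking would miss interactions among samples;
    re-scoring after each pick is the 'adaptive' fix described above.

    influence_fn(candidate, selected) -> float is a hypothetical oracle,
    e.g. the change in test loss from removing selected + [candidate].
    """
    selected, remaining = [], set(range(n_train))
    for _ in range(k):
        best = max(remaining, key=lambda c: influence_fn(c, selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```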
arXiv Detail & Related papers (2024-09-25T20:00:23Z)
- Explaining Predictive Uncertainty with Information Theoretic Shapley Values [6.49838460559032]
We adapt the popular Shapley value framework to explain various types of predictive uncertainty.
We implement efficient algorithms that perform well in a range of experiments on real and simulated data.
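As a hedged sketch of how Shapley attribution could be applied to an uncertainty measure, here is plain permutation-sampling estimation; the generic `uncertainty` value function and baseline imputation are simplifying assumptions rather than the paper's exact estimator.

```python
import numpy as np

def shapley_uncertainty(x, baseline, uncertainty, n_perm=200, rng=None):
    """Monte Carlo Shapley attribution of uncertainty(x) across features.

    uncertainty(z) -> float, e.g. the predictive entropy of a model at z;
    features outside the current coalition are imputed from `baseline`.
    """
    rng = np.random.default_rng(rng)
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)
        z = baseline.copy()
        prev = uncertainty(z)
        for j in order:                  # add features one at a time
            z[j] = x[j]
            cur = uncertainty(z)
            phi[j] += cur - prev         # marginal contribution of feature j
            prev = cur
    return phi / n_perm
```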
arXiv Detail & Related papers (2023-06-09T07:43:46Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
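A minimal sketch of one such reweighting step, under the simplifying assumption that a single lexical feature should be decorrelated from the label; the closed-form projection below is an illustration, not the paper's optimizer.

```python
import numpy as np

def decorrelating_weights(f, y):
    """Closest-to-uniform weights (in L2) with zero weighted covariance
    between a feature f and the label y: project the all-ones vector onto
    the hyperplane {w : sum_i w_i * fc_i * yc_i = 0}, where fc and yc are
    the centered feature and label."""
    c = (f - f.mean()) * (y - y.mean())  # per-sample covariance contribution
    if not c.any():
        return np.ones_like(c)           # already uncorrelated
    w = np.ones_like(c) - c * (c.sum() / (c @ c))
    return np.clip(w, 0.0, None)         # heuristic: keep weights nonnegative
```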
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z)
- Significance tests of feature relevance for a blackbox learner [6.72450543613463]
We derive two consistent tests for the feature relevance of a blackbox learner.
The first evaluates a loss difference with perturbation on an inference sample.
The second splits the inference sample into two but does not require data perturbation.
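A hedged sketch of the first test's idea; the Gaussian perturbation and the paired t-test are illustrative stand-ins for the paper's exact construction.

```python
import numpy as np
from scipy.stats import ttest_rel

def feature_relevance_test(model_loss, X, y, j, sigma=1.0, alpha=0.05, rng=None):
    """Tests whether perturbing feature j changes a trained black box's loss.

    model_loss(X, y) -> per-sample losses of the already-trained learner.
    If the learner's loss is insensitive to feature j, the paired loss
    differences should center at zero.
    """
    rng = np.random.default_rng(rng)
    X_pert = X.copy()
    X_pert[:, j] += sigma * rng.normal(size=len(X))  # perturb only feature j
    loss0, loss1 = model_loss(X, y), model_loss(X_pert, y)
    if np.allclose(loss0, loss1):
        return 1.0, False                # degenerate: losses identical
    stat, pval = ttest_rel(loss1, loss0) # paired test on loss differences
    return pval, pval < alpha
```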
arXiv Detail & Related papers (2021-03-02T00:59:19Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)