A Model-free Closeness-of-influence Test for Features in Supervised
Learning
- URL: http://arxiv.org/abs/2306.11855v1
- Date: Tue, 20 Jun 2023 19:20:18 GMT
- Title: A Model-free Closeness-of-influence Test for Features in Supervised
Learning
- Authors: Mohammad Mehrabi and Ryan A. Rossi
- Abstract summary: We study the question of assessing the difference in influence that two given features have on the response value.
We first propose a notion of closeness for the influence of features and show that our definition recovers the familiar notion of the magnitude of coefficients in parametric models.
We then propose a novel method to test for closeness of influence in general model-free supervised learning problems.
- Score: 23.345517302581044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the effect of a feature vector $x \in \mathbb{R}^d$ on the
response value (label) $y \in \mathbb{R}$ is the cornerstone of many
statistical learning problems. Ideally, one would like to understand how a set
of collected features combine and influence the response value, but this
problem is notoriously difficult due to, among other obstacles, the high
dimensionality of the data and the limited number of labeled data points. In
this work, we take a new perspective on this problem and study the question of
assessing the difference in influence that two given features have on the
response value. We first propose a notion of closeness for the influence of
features and show that our definition recovers the familiar notion of the
magnitude of coefficients in parametric models. We then propose a novel method
to test for closeness of influence in general model-free supervised learning
problems. Our test can be used with a finite number of samples while
controlling the type I error rate, regardless of the ground-truth conditional
law $\mathcal{L}(Y \mid X)$. We analyze the power of our test for two general
learning problems, i) linear regression and ii) binary classification under a
mixture of Gaussians model, and show that, under a proper choice of the score
function (an internal component of our test), the test achieves full
statistical power given a sufficient number of samples. We evaluate our
findings through extensive numerical simulations; specifically, we adopt the
datamodel framework (Ilyas et al., 2022) on the CIFAR-10 dataset to identify
pairs of training samples with different influence on the trained model via
black-box training mechanisms.
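The abstract's key ingredients (a score function, finite-sample type I control, no assumption on $\mathcal{L}(Y \mid X)$) can be illustrated with a simple permutation-style sketch. This is not the paper's actual procedure: it uses a stronger, simplified null (the two features are exchangeable given the response) and an illustrative score function (absolute marginal correlation); all names are hypothetical.

```python
import numpy as np

def closeness_of_influence_pvalue(X, y, i, j, n_perm=200, seed=0):
    """Hypothetical sketch: under the simplified null that columns i and j of X
    are exchangeable given y, randomly swapping the two columns per sample
    leaves the score distribution invariant, so the permutation p-value below
    controls the type I error rate at any finite sample size."""
    rng = np.random.default_rng(seed)

    def score_gap(Z):
        # Illustrative score function: absolute marginal correlation with y.
        si = abs(np.corrcoef(Z[:, i], y)[0, 1])
        sj = abs(np.corrcoef(Z[:, j], y)[0, 1])
        return abs(si - sj)

    t_obs = score_gap(X)
    exceed = 0
    for _ in range(n_perm):
        Z = X.copy()
        mask = rng.random(len(X)) < 0.5  # per-sample coin flips
        tmp = Z[mask, i].copy()          # swap columns i and j on masked rows
        Z[mask, i] = Z[mask, j]
        Z[mask, j] = tmp
        if score_gap(Z) >= t_obs:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)
```

A small p-value indicates the two features' influence on the response differs; the "+1" terms give the standard finite-sample-valid permutation p-value.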
Related papers
- Revisit, Extend, and Enhance Hessian-Free Influence Functions [26.105554752277648]
Influence functions serve as crucial tools for assessing sample influence in model interpretation, subset training set selection, and more.
In this paper, we revisit a specific, albeit effective approximation method known as Trac.
This method substitutes the inverse of the Hessian matrix with an identity matrix.
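Replacing the inverse Hessian with the identity reduces the classic influence-function formula $-\nabla L(z_{\text{test}})^\top H^{-1} \nabla L(z_{\text{train}})$ to a plain gradient inner product. A minimal sketch on logistic loss, with illustrative function names that are not the paper's API:

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of the logistic loss log(1 + exp(-y * w.x)) in w, for y in {-1, +1}."""
    s = -y / (1.0 + np.exp(y * np.dot(w, x)))
    return s * x

def hessian_free_influence(w, x_train, y_train, x_test, y_test):
    """Influence of a training point on a test loss with H^{-1} replaced by I:
    just the inner product of the two loss gradients at the trained weights w."""
    return float(np.dot(logistic_grad(w, x_test, y_test),
                        logistic_grad(w, x_train, y_train)))
```

A positive value means the training gradient points in the same direction as the test gradient (the training point is "helpful" for that test point under this approximation); a negative value means the opposite.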
arXiv Detail & Related papers (2024-05-25T03:43:36Z)
- Scalable Learning of Item Response Theory Models [53.43355949923962]
Item Response Theory (IRT) models aim to assess latent abilities of $n$ examinees along with latent difficulty characteristics of $m$ test items from categorical data.
We leverage the similarity of these models to logistic regression, which can be approximated accurately using small weighted subsets called coresets.
arXiv Detail & Related papers (2024-03-01T17:12:53Z)
- The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes [30.30769701138665]
We introduce and explore the Mirrored Influence Hypothesis, highlighting a reciprocal nature of influence between training and test data.
Specifically, it suggests that evaluating the influence of training data on test predictions can be reformulated as an equivalent, yet inverse problem.
We introduce a new method for estimating the influence of training data, which requires calculating gradients for specific test samples, paired with a forward pass for each training point.
arXiv Detail & Related papers (2024-02-14T03:43:05Z)
- Explaining Predictive Uncertainty with Information Theoretic Shapley Values [6.49838460559032]
We adapt the popular Shapley value framework to explain various types of predictive uncertainty.
We implement efficient algorithms that perform well in a range of experiments on real and simulated data.
arXiv Detail & Related papers (2023-06-09T07:43:46Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z)
- Significance tests of feature relevance for a blackbox learner [6.72450543613463]
We derive two consistent tests for the feature relevance of a blackbox learner.
The first evaluates a loss difference with perturbation on an inference sample.
The second splits the inference sample into two but does not require data perturbation.
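The first test's statistic, a loss difference under perturbation of a feature on an inference sample, can be sketched simply. This is a generic permutation-importance-style illustration, not the paper's exact construction; `model_predict` and the squared loss are assumptions.

```python
import numpy as np

def feature_relevance_gap(model_predict, X, y, j, rng):
    """Increase in mean squared loss on an inference sample (X, y) when
    feature j is randomly permuted across rows. A gap near zero suggests
    the black-box predictor does not rely on feature j."""
    base = np.mean((model_predict(X) - y) ** 2)
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    return np.mean((model_predict(Xp) - y) ** 2) - base
```

In a full test, this statistic would be recomputed over many perturbations to calibrate a null distribution; here it only illustrates the loss-difference idea.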
arXiv Detail & Related papers (2021-03-02T00:59:19Z)
- Gaussian Function On Response Surface Estimation [12.35564140065216]
We propose a new framework for interpreting (features and samples) black-box machine learning models via a metamodeling technique.
The metamodel can be estimated from data generated via a trained complex model by running the computer experiment on samples of data in the region of interest.
arXiv Detail & Related papers (2021-01-04T04:47:00Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.