Related papers: A feature selection method based on Shapley values robust to concept shift in regression

URL: http://arxiv.org/abs/2304.14774v3
Date: Mon, 25 Sep 2023 06:05:19 GMT
Title: A feature selection method based on Shapley values robust to concept shift in regression
Authors: Carlos Sebastián and Carlos E. González-Guillén
Abstract summary: In this paper, we introduce a direct relationship between Shapley values and prediction errors. We show that our proposed algorithm significantly outperforms state-of-the-art feature selection methods in concept shift scenarios. We also perform three analyses of standard situations to assess the algorithm's robustness in the absence of shifts.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Feature selection is one of the most relevant processes in any methodology for creating a statistical learning model. Usually, existing algorithms establish some criterion to select the most influential variables, discarding those that do not contribute any relevant information to the model. This methodology makes sense in a static situation where the joint distribution of the data does not vary over time. However, when dealing with real data, it is common to encounter the problem of dataset shift and, specifically, changes in the relationships between variables (concept shift). In this case, the influence of a variable cannot be the only indicator of its quality as a regressor of the model, since the relationship learned in the training phase may not correspond to the current situation. In tackling this problem, our approach establishes a direct relationship between the Shapley values and prediction errors, operating at a more local level to effectively detect the individual biases introduced by each variable. The proposed methodology is evaluated through various examples, including synthetic scenarios mimicking sudden and incremental shift situations, as well as two real-world cases characterized by concept shifts. Additionally, we perform three analyses of standard situations to assess the algorithm's robustness in the absence of shifts. The results demonstrate that our proposed algorithm significantly outperforms state-of-the-art feature selection methods in concept shift scenarios, while matching the performance of existing methodologies in static situations.
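Illustrative sketch (not the authors' algorithm): the abstract's idea of relating Shapley values to prediction errors can be pictured with a toy linear model. For a linear model with independent features, the Shapley value of feature j at a point x is w_j * (x_j - mean_j). The code below assumes a hypothetical fitted model with weights [2, 1] and a validation set where the true effect of the second feature has flipped sign. It then measures, per feature, how strongly the Shapley values correlate with the signed errors. All function names, the synthetic data, and the use of Pearson correlation as the association measure are assumptions made for illustration only.

```python
import random


def linear_shapley(weights, baseline, x):
    """Shapley values of a linear model, assuming independent features:
    phi_j = w_j * (x_j - baseline_j)."""
    return [w * (xj - bj) for w, xj, bj in zip(weights, x, baseline)]


def pearson(a, b):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def shapley_error_association(weights, intercept, X, y):
    """Per-feature correlation between Shapley values and signed errors
    (prediction minus target) on a validation set."""
    n_features = len(weights)
    # Baseline: feature means of the validation set itself.
    baseline = [sum(row[j] for row in X) / len(X) for j in range(n_features)]
    errors = [
        intercept + sum(w * v for w, v in zip(weights, row)) - target
        for row, target in zip(X, y)
    ]
    shap_cols = [[] for _ in range(n_features)]
    for row in X:
        for j, phi in enumerate(linear_shapley(weights, baseline, row)):
            shap_cols[j].append(phi)
    return [pearson(col, errors) for col in shap_cols]


# Hypothetical model learned before the shift: y ~ 2*x0 + 1*x1.
weights, intercept = [2.0, 1.0], 0.0

# Synthetic validation data after a concept shift: the effect of x1 flips sign.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]
y = [2.0 * x0 - 1.0 * x1 + rng.gauss(0, 0.1) for x0, x1 in X]

association = shapley_error_association(weights, intercept, X, y)
print(association)
```

In this toy setting the shifted feature shows a strong association between its Shapley values and the errors, while the stable feature shows almost none. This suggests why influence alone may be a poor selection criterion under concept shift.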

Related papers

Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables [13.12743473333296]
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science. We propose a novel local learning approach for covariate selection in nonparametric causal effect estimation. We validate our algorithm through extensive experiments on both synthetic and real-world data.
arXiv Detail & Related papers (2024-11-25T12:08:54Z)
Counterfactual Fairness through Transforming Data Orthogonal to Bias [7.109458605736819]
We propose a novel data pre-processing algorithm, Orthogonal to Bias (OB). OB is designed to eliminate the influence of a group of continuous sensitive variables, thus promoting counterfactual fairness in machine learning applications. OB is model-agnostic, making it applicable to a wide range of machine learning models and tasks.
arXiv Detail & Related papers (2024-03-26T16:40:08Z)
On the Limited Representational Power of Value Functions and its Links to Statistical (In)Efficiency [6.408072565019087]
We show information about the transition dynamics may be impossible to represent in the space of value functions. A deeper investigation points to the limitations of the representational power as the driver of the inefficiency.
arXiv Detail & Related papers (2024-03-11T20:05:48Z)
Information-Theoretic State Variable Selection for Reinforcement Learning [4.2050490361120465]
We introduce the Transfer Entropy Redundancy Criterion (TERC), an information-theoretic criterion. TERC determines if there is entropy transferred from state variables to actions during training. We define an algorithm based on TERC that provably excludes variables from the state that have no effect on the final performance of the agent.
arXiv Detail & Related papers (2024-01-21T14:51:09Z)
It's All in the Mix: Wasserstein Machine Learning with Mixed Features [5.739657897440173]
We present a practically efficient algorithm to solve mixed-feature problems. We demonstrate that our approach can significantly outperform existing methods that are agnostic to the presence of discrete features.
arXiv Detail & Related papers (2023-12-19T15:15:52Z)
Towards stable real-world equation discovery with assessing differentiating quality influence [52.2980614912553]
We propose alternatives to the commonly used finite differences-based method. We evaluate these methods in terms of their applicability to problems similar to real ones and their ability to ensure the convergence of equation discovery algorithms.
arXiv Detail & Related papers (2023-11-09T23:32:06Z)
Effective Restoration of Source Knowledge in Continual Test Time Adaptation [44.17577480511772]
This paper introduces an unsupervised domain change detection method that is capable of identifying domain shifts in dynamic environments. By restoring the knowledge from the source, it effectively corrects the negative consequences arising from the gradual deterioration of model parameters. We perform extensive experiments on benchmark datasets to demonstrate the superior performance of our method compared to state-of-the-art adaptation methods.
arXiv Detail & Related papers (2023-11-08T19:21:48Z)
Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data. We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures. We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables. We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph. Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets. Part of the challenge of learning robust models lies in the influence of unobserved confounders. We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution. We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments. We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data. Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.