Estimating Conditional Mutual Information for Dynamic Feature Selection
- URL: http://arxiv.org/abs/2306.03301v1
- Date: Mon, 5 Jun 2023 23:03:03 GMT
- Title: Estimating Conditional Mutual Information for Dynamic Feature Selection
- Authors: Soham Gadgil, Ian Covert, Su-In Lee
- Abstract summary: Dynamic feature selection is a promising paradigm to reduce feature acquisition costs and provide transparency into the prediction process.
Here, we take an information-theoretic perspective and prioritize features based on their mutual information with the response variable.
We introduce several further improvements: allowing variable feature budgets across samples, enabling non-uniform costs between features, incorporating prior information, and exploring modern architectures to handle partial input information.
- Score: 14.50261153230204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic feature selection, where we sequentially query features to make
accurate predictions with a minimal budget, is a promising paradigm to reduce
feature acquisition costs and provide transparency into the prediction process.
The problem is challenging, however, as it requires both making predictions
with arbitrary feature sets and learning a policy to identify the most valuable
selections. Here, we take an information-theoretic perspective and prioritize
features based on their mutual information with the response variable. The main
challenge is learning this selection policy, and we design a straightforward
new modeling approach that estimates the mutual information in a discriminative
rather than generative fashion. Building on our learning approach, we introduce
several further improvements: allowing variable feature budgets across samples,
enabling non-uniform costs between features, incorporating prior information,
and exploring modern architectures to handle partial input information. We find
that our method provides consistent gains over recent state-of-the-art methods
across a variety of datasets.
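The greedy, information-based selection described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual method: it uses a toy two-class discriminative model over masked inputs and a one-step predictive-entropy-reduction proxy for the conditional mutual information I(y; x_i | x_S). All function and variable names are hypothetical.

```python
import numpy as np

def predictive_probs(x, mask, centers):
    """Toy discriminative model over two classes: softmax over negative
    squared distances to class centers, using only observed features
    (mask == 1). Stands in for a model trained on arbitrary subsets."""
    d = np.array([np.sum(mask * (x - c) ** 2) for c in centers])
    e = np.exp(-d)
    return e / e.sum()

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p))

def greedy_select(x, n_select, centers, n_features):
    """Greedily pick features whose observation most reduces the model's
    predictive entropy -- a crude proxy for conditional mutual
    information when the model is well calibrated."""
    mask = np.zeros(n_features)
    selected = []
    for _ in range(n_select):
        base = entropy(predictive_probs(x, mask, centers))
        gains = []
        for i in range(n_features):
            if mask[i] == 1:
                gains.append(-np.inf)  # already observed
                continue
            trial = mask.copy()
            trial[i] = 1
            gains.append(base - entropy(predictive_probs(x, trial, centers)))
        best = int(np.argmax(gains))
        mask[best] = 1
        selected.append(best)
    return selected
```

For example, with class centers that differ only in feature 0, the procedure observes feature 0 first, since it is the only feature that changes the predictive distribution. The paper itself learns the selection policy with a trained network rather than this exhaustive per-feature loop.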
Related papers
- Enhancing Neural Subset Selection: Integrating Background Information into Set Representations [53.15923939406772]
We show that when the target value is conditioned on both the input set and subset, it is essential to incorporate an invariant sufficient statistic of the superset into the subset of interest.
This ensures that the output value remains invariant to permutations of the subset and its corresponding superset, enabling identification of the specific superset from which the subset originated.
arXiv Detail & Related papers (2024-02-05T16:09:35Z) - Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z) - Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - Learning to Maximize Mutual Information for Dynamic Feature Selection [13.821253491768168]
We consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information.
We explore a simpler approach of greedily selecting features based on their conditional mutual information.
The proposed method is shown to recover the greedy policy when trained to optimality, and it outperforms numerous existing feature selection methods in our experiments.
arXiv Detail & Related papers (2023-01-02T08:31:56Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - VFDS: Variational Foresight Dynamic Selection in Bayesian Neural Networks for Efficient Human Activity Recognition [81.29900407096977]
Variational Foresight Dynamic Selection (VFDS) learns a policy that selects the next feature subset to observe.
We apply VFDS on the Human Activity Recognition (HAR) task where the performance-cost trade-off is critical in its practice.
arXiv Detail & Related papers (2022-03-31T22:52:43Z) - Model-agnostic and Scalable Counterfactual Explanations via Reinforcement Learning [0.5729426778193398]
We propose a deep reinforcement learning approach that transforms the optimization procedure into an end-to-end learnable process.
Our experiments on real-world data show that our method is model-agnostic, relying only on feedback from model predictions.
arXiv Detail & Related papers (2021-06-04T16:54:36Z) - Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimize the variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z) - Feature Selection for Huge Data via Minipatch Learning [0.0]
We propose Stable Minipatch Selection (STAMPS) and Adaptive STAMPS.
STAMPS are meta-algorithms that build ensembles of selection events from base feature selectors trained on tiny, possibly adaptive random subsets of both the observations and features of the data.
Our approaches are general and can be employed with a variety of existing feature selection strategies and machine learning techniques.
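The minipatch idea above can be sketched as follows. This is a simplified illustration in the spirit of STAMPS, not the authors' algorithm: the base selector, the uniform (non-adaptive) sampling, and all names are assumptions chosen for brevity.

```python
import numpy as np

def minipatch_feature_selection(X, y, n_patches=200, n_rows=20, n_cols=5,
                                threshold=0.5, rng=None):
    """Ensemble feature selection over minipatches: repeatedly fit a
    cheap base selector on tiny random subsets of rows and columns,
    then keep features whose selection frequency (among the patches
    where they appeared) exceeds a threshold."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    counts = np.zeros(p)  # times each feature was selected
    seen = np.zeros(p)    # times each feature appeared in a patch
    for _ in range(n_patches):
        rows = rng.choice(n, n_rows, replace=False)
        cols = rng.choice(p, n_cols, replace=False)
        seen[cols] += 1
        # base selector: keep the column most correlated with y
        corr = [abs(np.corrcoef(X[np.ix_(rows, [c])].ravel(), y[rows])[0, 1])
                for c in cols]
        counts[cols[int(np.argmax(corr))]] += 1
    freq = np.divide(counts, seen, out=np.zeros(p), where=seen > 0)
    return np.where(freq >= threshold)[0]
```

Because each patch touches only `n_rows * n_cols` entries, the ensemble scales to data far too large for a single full-data selector, which is the point of the minipatch framing.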
arXiv Detail & Related papers (2020-10-16T17:41:08Z) - Sequential Transfer in Reinforcement Learning with a Generative Model [48.40219742217783]
We show how to reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.
We derive PAC bounds on its sample complexity which clearly demonstrate the benefits of using this kind of prior knowledge.
We empirically verify our theoretical findings in simple simulated domains.
arXiv Detail & Related papers (2020-07-01T19:53:35Z) - Dynamic Feature Acquisition with Arbitrary Conditional Flows [11.655069211977464]
We propose models that dynamically acquire new features to further improve predictions.
We leverage an information theoretic metric, conditional mutual information, to select the most informative feature to acquire.
Our model demonstrates superior performance over baselines evaluated in multiple settings.
arXiv Detail & Related papers (2020-06-13T19:01:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.