Partial Information Decomposition for Data Interpretability and Feature Selection
- URL: http://arxiv.org/abs/2405.19212v2
- Date: Fri, 7 Jun 2024 09:04:47 GMT
- Title: Partial Information Decomposition for Data Interpretability and Feature Selection
- Authors: Charles Westphal, Stephen Hailes, Mirco Musolesi
- Abstract summary: Partial Information Decomposition of Features (PIDF) is a new paradigm for simultaneous data interpretability and feature selection.
We extensively evaluate PIDF using both synthetic and real-world data, demonstrating its potential applications and effectiveness.
- Score: 3.7414804164475983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce Partial Information Decomposition of Features (PIDF), a new paradigm for simultaneous data interpretability and feature selection. Contrary to traditional methods that assign a single importance value, our approach is based on three metrics per feature: the mutual information shared with the target variable, the feature's contribution to synergistic information, and the amount of this information that is redundant. In particular, we develop a novel procedure based on these three metrics, which reveals not only how features are correlated with the target but also the additional and overlapping information provided by considering them in combination with other features. We extensively evaluate PIDF using both synthetic and real-world data, demonstrating its potential applications and effectiveness, by considering case studies from genetics and neuroscience.
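Below is a minimal sketch of how the three per-feature quantities could be approximated on discrete data with plug-in entropy estimates. The pairwise interaction-information proxy for synergy and redundancy, and the function names, are illustrative assumptions; they are not the estimation procedure proposed in the paper.

```python
import numpy as np

def entropy(*cols):
    """Plug-in joint entropy (in bits) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_info(x, y):
    """I(X; Y) from plug-in entropies."""
    return entropy(x) + entropy(y) - entropy(x, y)

def pidf_metrics(x_i, others, y):
    """Illustrative per-feature triple: (MI with target, synergy proxy, redundancy proxy).

    For each partner feature, the interaction information
    I(Xi, Xj; Y) - I(Xi; Y) - I(Xj; Y) equals synergy minus redundancy under the
    Williams-Beer decomposition; its positive part is counted as synergistic and
    its negative part as redundant. A toy proxy, not the paper's estimator.
    """
    mi = mutual_info(x_i, y)
    synergy, redundancy = 0.0, 0.0
    for x_j in others:
        pair_mi = entropy(x_i, x_j) + entropy(y) - entropy(x_i, x_j, y)  # I(Xi, Xj; Y)
        interaction = pair_mi - mi - mutual_info(x_j, y)
        synergy += max(interaction, 0.0)
        redundancy += max(-interaction, 0.0)
    return mi, synergy, redundancy

# XOR toy case: each feature alone carries ~0 bits about y, yet the pair is fully informative.
rng = np.random.default_rng(0)
x1 = rng.integers(0, 2, 2000)
x2 = rng.integers(0, 2, 2000)
y = x1 ^ x2
print(pidf_metrics(x1, [x2], y))  # approximately (0.0, 1.0, 0.0)
```

The XOR example is the canonical case where a single importance value is misleading: both features look useless in isolation, and only the synergy term reveals their joint contribution.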
Related papers
- Quantifying Spuriousness of Biased Datasets Using Partial Information Decomposition [14.82261635235695]
Spurious patterns are mathematical associations between two or more variables in a dataset that are not causally related.
This work presents the first information-theoretic formalization of spuriousness in a dataset (given a split of spurious and core features) using a mathematical framework called Partial Information Decomposition (PID).
We disentangle the joint information content that the spurious and core features share about another target variable into distinct components, namely unique, redundant, and synergistic information.
arXiv Detail & Related papers (2024-06-29T16:05:47Z) - Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
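A rough sketch of what a transfer-entropy-driven forward pass could look like on discretized time series is given below, using lag-1 histories and plug-in estimates and omitting the backward pruning step; the helper names and the stopping threshold are assumptions and do not reproduce the paper's algorithm or its guarantees.

```python
import numpy as np

def H(*cols):
    """Plug-in joint entropy (bits) of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def conditional_te(x, y, conditioning):
    """Lag-1 transfer entropy TE(X -> Y) = I(Y_t ; X_{t-1} | Y_{t-1}, Z_{t-1}),
    conditioned on the past of features already selected (Z)."""
    y_t, y_past, x_past = y[1:], y[:-1], x[:-1]
    cond = [y_past] + [z[:-1] for z in conditioning]
    return (H(y_t, *cond) + H(x_past, *cond)
            - H(y_t, x_past, *cond) - H(*cond))

def forward_select(features, target, threshold=0.02):
    """Greedy forward pass: add the feature with the largest conditional transfer
    entropy to the target until no remaining candidate clears the threshold."""
    selected, remaining = [], dict(features)
    while remaining:
        cond = [features[name] for name in selected]
        scores = {name: conditional_te(x, target, cond) for name, x in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break
        selected.append(best)
        del remaining[best]
    return selected

# Toy usage: the target is driven by the past of f1 only.
rng = np.random.default_rng(1)
f1, f2 = rng.integers(0, 2, 1000), rng.integers(0, 2, 1000)
y = np.roll(f1, 1)                               # y_t copies f1_{t-1}
print(forward_select({"f1": f1, "f2": f2}, y))   # expected: ['f1']
```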
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on the resulting observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z) - Relational Local Explanations [11.679389861042]
We develop a novel model-agnostic, permutation-based feature attribution algorithm grounded in relational analysis between input variables.
We are able to gain a broader insight into machine learning model decisions and data.
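The core mechanic can be illustrated with a plain single-feature permutation-importance loop. This is a minimal sketch under assumed names and a toy dataset; the paper's relational analysis additionally contrasts joint permutations of variable groups, which this version omits.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def permutation_attribution(model, X, y, n_repeats=10, seed=0):
    """Model-agnostic attribution: how much does accuracy drop when one
    feature column is shuffled, breaking its relation to the target?"""
    rng = np.random.default_rng(seed)
    baseline = accuracy_score(y, model.predict(X))
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += baseline - accuracy_score(y, model.predict(X_perm))
    return drops / n_repeats

# Toy usage: one marginal feature and one interacting pair drive the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 1] * X[:, 2] > 0).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)
print(permutation_attribution(model, X, y))   # feature 3 should score near zero
```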
arXiv Detail & Related papers (2022-12-23T14:46:23Z) - FUNCK: Information Funnels and Bottlenecks for Invariant Representation Learning [7.804994311050265]
We investigate a set of related information funnels and bottleneck problems that claim to learn invariant representations from the data.
We propose a new element to this family of information-theoretic objectives: The Conditional Privacy Funnel with Side Information.
Given the generally intractable objectives, we derive tractable approximations using amortized variational inference parameterized by neural networks.
arXiv Detail & Related papers (2022-11-02T19:37:55Z) - Self-Attention Neural Bag-of-Features [103.70855797025689]
We build on the recently introduced 2D-Attention and reformulate the attention learning methodology.
We propose a joint feature-temporal attention mechanism that learns a joint 2D attention mask highlighting relevant information.
arXiv Detail & Related papers (2022-01-26T17:54:14Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
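As a rough illustration of the cross-sample neural mutual information estimation mentioned above, here is a minimal MINE-style (Donsker-Varadhan) lower bound in PyTorch, where the product-of-marginals term is formed by shuffling one branch across samples; the critic architecture and training loop are assumptions, not the CSAD implementation.

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Small statistics network T(x, z) for the Donsker-Varadhan MI bound."""
    def __init__(self, dim_x, dim_z, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_z, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))

def mi_lower_bound(critic, x, z):
    """I(X; Z) >= E_p(x,z)[T] - log E_p(x)p(z)[exp T];
    the second expectation uses cross-sample (shuffled) pairs."""
    joint = critic(x, z).mean()
    z_shuffled = z[torch.randperm(z.size(0))]
    marginal = torch.logsumexp(critic(x, z_shuffled), dim=0) - math.log(x.size(0))
    return joint - marginal.squeeze()

# Sanity check on linearly related Gaussians (true MI is well above zero).
torch.manual_seed(0)
x = torch.randn(512, 4)
z = x @ torch.randn(4, 4) + 0.1 * torch.randn(512, 4)
critic = Critic(4, 4)
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
for _ in range(300):
    opt.zero_grad()
    (-mi_lower_bound(critic, x, z)).backward()   # maximize the bound
    opt.step()
print(float(mi_lower_bound(critic, x, z)))       # positive estimate after training
```

In an adversarial debiasing setup of this kind, an encoder would minimize such a bound between its task representation and the bias features while the critic maximizes it.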
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition [0.0483420384410068]
We argue that classical information theory does not provide measures to decompose the information a set of variables provides about a target into unique, redundant, and synergistic contributions.
Using partial information decomposition (PID) we provide a novel definition of feature relevancy and redundancy in PID terms.
We propose an iterative algorithm for practical feature selection based on conditional mutual information (CMI).
arXiv Detail & Related papers (2021-05-10T08:33:10Z) - A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS) [0.0]
We propose UBayFS, an ensemble feature selection technique, embedded in a Bayesian statistical framework.
Our approach enhances the feature selection process by considering two sources of information: data and domain knowledge.
A comparison with standard feature selectors underlines that UBayFS achieves competitive performance, while providing additional flexibility to incorporate domain knowledge.
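A compact sketch of the data-plus-prior idea follows, assuming a Dirichlet-multinomial treatment of ensemble selection counts; the elementary selector, prior values, and dataset are placeholders rather than the UBayFS implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

def ensemble_votes(X, y, n_runs=50, k=5, seed=0):
    """Count how often each feature is chosen by an elementary selector
    (here: top-k ANOVA F-score) across bootstrap resamples of the data."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(X.shape[1])
    for _ in range(n_runs):
        idx = rng.integers(0, len(y), len(y))
        sel = SelectKBest(f_classif, k=k).fit(X[idx], y[idx])
        votes[sel.get_support()] += 1
    return votes

# Domain knowledge enters as a Dirichlet prior over feature importances;
# the ensemble votes act as multinomial counts, so the posterior is Dirichlet too.
X, y = make_classification(n_samples=300, n_features=10, n_informative=3, random_state=0)
prior = np.ones(10)
prior[[0, 1]] = 5.0                       # hypothetical expert belief in features 0 and 1
votes = ensemble_votes(X, y)
posterior_importance = (prior + votes) / (prior + votes).sum()
selected = np.argsort(posterior_importance)[-5:]
print(posterior_importance.round(3), selected)
```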
arXiv Detail & Related papers (2021-04-30T06:51:33Z) - Interactive Fusion of Multi-level Features for Compositional Activity Recognition [100.75045558068874]
We present a novel framework that accomplishes compositional activity recognition through interactive fusion.
We implement the framework in three steps, namely, positional-to-appearance feature extraction, semantic feature interaction, and semantic-to-positional prediction.
We evaluate our approach on two action recognition datasets, Something-Something and Charades.
arXiv Detail & Related papers (2020-12-10T14:17:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.