Selecting Robust Features for Machine Learning Applications using
Multidata Causal Discovery
- URL: http://arxiv.org/abs/2304.05294v5
- Date: Fri, 30 Jun 2023 14:14:23 GMT
- Title: Selecting Robust Features for Machine Learning Applications using
Multidata Causal Discovery
- Authors: Saranya Ganesh S., Tom Beucler, Frederick Iat-Hin Tam, Milton S.
Gomez, Jakob Runge, and Andreas Gerhardus
- Abstract summary: We introduce a Multidata causal feature selection approach that simultaneously processes an ensemble of time series datasets.
This approach uses the causal discovery algorithms PC1 or PCMCI that are implemented in the Tigramite Python package.
We apply our framework to the statistical intensity prediction of Western Pacific Tropical Cyclones.
- Score: 7.8814500102882805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust feature selection is vital for creating reliable and interpretable
Machine Learning (ML) models. When designing statistical prediction models in
cases where domain knowledge is limited and underlying interactions are
unknown, choosing the optimal set of features is often difficult. To mitigate
this issue, we introduce a Multidata (M) causal feature selection approach that
simultaneously processes an ensemble of time series datasets and produces a
single set of causal drivers. This approach uses the causal discovery
algorithms PC1 or PCMCI that are implemented in the Tigramite Python package.
These algorithms utilize conditional independence tests to infer parts of the
causal graph. Our causal feature selection approach filters out
causally-spurious links before passing the remaining causal features as inputs
to ML models (Multiple linear regression, Random Forest) that predict the
targets. We apply our framework to the statistical intensity prediction of
Western Pacific Tropical Cyclones (TC), for which it is often difficult to
accurately choose drivers and their dimensionality reduction (time lags,
vertical levels, and area-averaging). Using more stringent significance
thresholds in the conditional independence tests helps eliminate spurious
causal relationships, thus helping the ML model generalize better to unseen TC
cases. M-PC1 with a reduced number of features outperforms M-PCMCI, non-causal
ML, and other feature selection methods (lagged correlation, random), even
slightly outperforming feature selection based on eXplainable Artificial
Intelligence. The optimal causal drivers obtained from our causal feature
selection help improve our understanding of underlying relationships and
suggest new potential drivers of TC intensification.
Related papers
- TAROT: Targeted Data Selection via Optimal Transport [64.56083922130269]
TAROT is a targeted data selection framework grounded in optimal transport theory.
Previous targeted data selection methods rely on influence-based greedy heuristics to enhance domain-specific performance.
We evaluate TAROT across multiple tasks, including semantic segmentation, motion prediction, and instruction tuning.
arXiv Detail & Related papers (2024-11-30T10:19:51Z)
- Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
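The transfer-entropy quantity this related paper ranks candidate causes by can be illustrated with a small plug-in estimator. This is an assumption-laden toy (binary series, histogram probabilities, lag 1), not the estimator or the selection procedure from the paper:

```python
# Toy plug-in estimate of transfer entropy TE(X -> Y) for binary time series:
# TE(X -> Y) = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) ),
# i.e. how much X's past reduces uncertainty about Y beyond Y's own past.
import math
from collections import Counter

def transfer_entropy(x, y):
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # (y_{t+1}, y_t, x_t)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))         # (y_t, x_t)
    pairs_yy = Counter(zip(y[1:], y[:-1]))          # (y_{t+1}, y_t)
    singles_y = Counter(y[:-1])                     # y_t
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_cond_full = c / pairs_yx[(y0, x0)]        # p(y1 | y0, x0)
        p_cond_self = pairs_yy[(y1, y0)] / singles_y[y0]  # p(y1 | y0)
        te += p_joint * math.log2(p_cond_full / p_cond_self)
    return te

# y copies x with a one-step delay, so information flows X -> Y:
# TE(X -> Y) should clearly dominate TE(Y -> X).
x = [0, 1, 1, 0, 1, 0, 0, 1] * 50
y = [0] + x[:-1]
print(transfer_entropy(x, y) > transfer_entropy(y, x))  # True
```

A forward-selection procedure in this spirit would repeatedly add the candidate with the highest transfer entropy to the target; the paper's actual algorithm and its error guarantees are more involved.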
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts [119.22672589020394]
We propose COnfidence-baSed MOdel Selection (CosMoS), where model confidence can effectively guide model selection.
We evaluate CosMoS on four datasets with spurious correlations, each with multiple test sets with varying levels of data distribution shift.
arXiv Detail & Related papers (2023-06-19T18:48:15Z)
- Flexible variable selection in the presence of missing data [0.0]
We propose a non-parametric variable selection algorithm combined with multiple imputation to develop flexible panels in the presence of missing-at-random data.
We show that our proposal has good operating characteristics and results in panels with higher classification and variable selection performance.
arXiv Detail & Related papers (2022-02-25T21:41:03Z)
- Understanding Interlocking Dynamics of Cooperative Rationalization [90.6863969334526]
Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output.
We reveal a major problem with such cooperative rationalization paradigm -- model interlocking.
We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
arXiv Detail & Related papers (2021-10-26T17:39:18Z)
- Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? [0.2836066255205732]
We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models.
We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin.
We also find that deterministic models are on par; in fact, they consistently (though not significantly) outperform their probabilistic counterparts.
arXiv Detail & Related papers (2021-07-24T11:38:25Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Improving Sample and Feature Selection with Principal Covariates Regression [0.0]
We focus on two popular sub-selection schemes which have been applied to this end.
We show that incorporating target information provides selections that perform better in supervised tasks.
We also show that incorporating aspects of simple supervised learning models can improve the accuracy of more complex models.
arXiv Detail & Related papers (2020-12-22T18:52:06Z)
- Feature Selection for Huge Data via Minipatch Learning [0.0]
We propose Stable Minipatch Selection (STAMPS) and Adaptive STAMPS.
STAMPS are meta-algorithms that build ensembles of selection events of base feature selectors trained on tiny, (possibly-adaptive) random subsets of both the observations and features of the data.
Our approaches are general and can be employed with a variety of existing feature selection strategies and machine learning techniques.
arXiv Detail & Related papers (2020-10-16T17:41:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.