TangledFeatures: Robust Feature Selection in Highly Correlated Spaces
- URL: http://arxiv.org/abs/2510.15005v1
- Date: Thu, 16 Oct 2025 05:54:04 GMT
- Title: TangledFeatures: Robust Feature Selection in Highly Correlated Spaces
- Authors: Allen Daniel Sunny
- Abstract summary: We introduce TangledFeatures, a framework for feature selection in correlated feature spaces. It identifies representative features from groups of entangled predictors, reducing redundancy while retaining explanatory power. We show that the selected features correspond to structurally meaningful intra-atomic distances that explain variation in backbone torsional angles.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature selection is a fundamental step in model development, shaping both predictive performance and interpretability. Yet most widely used methods focus on predictive accuracy, and their performance degrades in the presence of correlated predictors. To address this gap, we introduce TangledFeatures, a framework for feature selection in correlated feature spaces. It identifies representative features from groups of entangled predictors, reducing redundancy while retaining explanatory power. The resulting feature subset can be directly applied in downstream models, offering a more interpretable and stable basis for analysis than traditional selection techniques. We demonstrate the effectiveness of TangledFeatures on Alanine Dipeptide, applying it to the prediction of backbone torsional angles, and show that the selected features correspond to structurally meaningful intra-atomic distances that explain variation in these angles.
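The group-then-pick-a-representative idea described in the abstract can be sketched as follows. This is a minimal illustration, not the TangledFeatures implementation: it assumes hierarchical clustering on absolute pairwise correlation and keeps, from each cluster, the feature most correlated with the target. The function name, clustering method, and threshold are all hypothetical choices.

```python
# Hedged sketch: group highly correlated features, keep one representative
# per group. Illustrative only -- NOT the TangledFeatures algorithm.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def select_representatives(X, y, corr_threshold=0.8):
    """Cluster features whose |correlation| exceeds the threshold and keep,
    from each cluster, the feature most correlated with the target y."""
    corr = np.corrcoef(X, rowvar=False)      # feature-feature correlations
    dist = 1.0 - np.abs(corr)                # highly correlated -> small distance
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    labels = fcluster(linkage(condensed, method="average"),
                      t=1.0 - corr_threshold, criterion="distance")
    selected = []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        target_corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in members]
        selected.append(members[int(np.argmax(target_corr))])
    return sorted(selected)

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
# Design matrix with two near-duplicates of the first feature.
X = np.column_stack([base[:, 0],
                     base[:, 0] + 0.01 * rng.normal(size=200),
                     base[:, 1],
                     base[:, 2]])
y = X[:, 0] + 0.5 * X[:, 2]
print(select_representatives(X, y))  # the two near-duplicates collapse to one pick
```

Note that this sketch only removes redundancy; relevance filtering (dropping features unrelated to the target) would be a separate step.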
Related papers
- Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
- TRIP: A Nonparametric Test to Diagnose Biased Feature Importance Scores [0.0]
TRIP is a test requiring minimal assumptions that detects unreliable permutation feature importance scores. Our results show that the test reliably identifies when such scores cannot be trusted.
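For context, permutation feature importance, the quantity whose reliability TRIP diagnoses, can be computed as below. This is a generic sketch of the standard procedure, not the TRIP test itself; the model, data, and metric are illustrative choices.

```python
# Hedged sketch: plain permutation feature importance -- the score whose
# reliability a diagnostic test like TRIP would assess. Not TRIP itself.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=300)  # features 2, 3 are noise

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
baseline = r2_score(y, model.predict(X))

importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # break the feature-target link
    importances.append(baseline - r2_score(y, model.predict(Xp)))

print([round(v, 3) for v in importances])  # informative features drop R^2 most
```

Correlated predictors are one known way these scores become biased, which connects this line of work back to the main paper's setting.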
arXiv Detail & Related papers (2025-07-09T20:49:10Z)
- Comprehensive Attribution: Inherently Explainable Vision Model with Feature Detector [30.23453108681447]
The inherently explainable attribution method aims to enhance the understanding of model behavior.
It is achieved by cooperatively training a selector (generating an attribution map to identify important features) and a predictor.
We introduce a new objective that discourages the presence of discriminative features in the masked-out regions.
Our model makes predictions with higher accuracy than the regular black-box model.
arXiv Detail & Related papers (2024-07-27T17:45:20Z)
- Detecting and Identifying Selection Structure in Sequential Data [53.24493902162797]
We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences.
We show that selection structure is identifiable without any parametric assumptions or interventional experiments.
We also propose a provably correct algorithm to detect and identify selection structures as well as other types of dependencies.
arXiv Detail & Related papers (2024-06-29T20:56:34Z)
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model trained with a joint objective of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
- AFD: Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement [56.90364259986057]
Adversarial fine-tuning methods enhance adversarial robustness via fine-tuning the pre-trained model in an adversarial training manner. We propose a disentanglement-based approach to explicitly model and remove the specific latent features. Our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
arXiv Detail & Related papers (2024-01-26T08:38:57Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
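A transfer-entropy score of the kind such a forward/backward procedure could rank features by can be sketched in the linear-Gaussian, lag-1 special case, where it reduces to comparing residual variances of two regressions. This is an illustrative stand-in, not the paper's method; the function name and model are assumptions.

```python
# Hedged sketch: Gaussian transfer entropy via lag-1 linear regressions.
# A simple stand-in for the directional scores a transfer-entropy-based
# feature selection procedure could use. Illustrative only.
import numpy as np

def gaussian_te(x, y):
    """TE(x -> y) under a linear-Gaussian, lag-1 model:
    0.5 * log(Var[y_t | y_{t-1}] / Var[y_t | y_{t-1}, x_{t-1}])."""
    yt, ylag, xlag = y[1:], y[:-1], x[:-1]
    def resid_var(columns):
        A = np.column_stack(columns + [np.ones_like(yt)])  # add intercept
        beta, *_ = np.linalg.lstsq(A, yt, rcond=None)
        return np.var(yt - A @ beta)
    return 0.5 * np.log(resid_var([ylag]) / resid_var([ylag, xlag]))

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):                       # x drives y with a one-step lag
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

print(gaussian_te(x, y) > gaussian_te(y, x))  # forward direction dominates
```

The asymmetry of the two scores is what makes transfer entropy usable as a directional (causal) relevance measure, unlike plain correlation.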
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Deep Unsupervised Feature Selection by Discarding Nuisance and Correlated Features [7.288137686773523]
Modern datasets contain large subsets of correlated features and nuisance features.
In the presence of large numbers of nuisance features, the Laplacian must be computed on the subset of selected features.
We employ an autoencoder architecture to cope with correlated features, trained to reconstruct the data from the subset of selected features.
arXiv Detail & Related papers (2021-10-11T14:26:13Z)
- Top-$k$ Regularization for Supervised Feature Selection [11.927046591097623]
We introduce a novel, simple yet effective regularization approach, named top-$k$ regularization, to supervised feature selection.
We show that the top-$k$ regularization is effective and stable for supervised feature selection.
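One generic flavor of a top-$k$ selection effect can be sketched as follows: penalize every weight of a linear model except the current $k$ largest in magnitude. The actual regularizer in the paper may differ; this sketch, with hypothetical hyperparameters, only illustrates how such a penalty drives all but $k$ coefficients toward zero.

```python
# Hedged sketch: subgradient descent on linear regression with an L1 penalty
# applied to every weight EXCEPT the current top-k by magnitude.
# Illustrative of the top-k selection effect, not the paper's exact method.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 400, 10, 2
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:2] = [3.0, -2.0]                     # only two informative features
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lam, lr = 0.5, 0.01
for _ in range(2000):
    grad = X.T @ (X @ w - y) / n             # least-squares gradient
    mask = np.ones(d)
    mask[np.argsort(np.abs(w))[-k:]] = 0.0   # exempt the current top-k weights
    grad += lam * mask * np.sign(w)          # L1 subgradient on the rest
    w -= lr * grad

print(np.argsort(np.abs(w))[-k:])  # indices of the surviving features
```

Because the top-$k$ weights are exempt from shrinkage, they are estimated essentially without bias while the remaining weights are pushed to (near) zero.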
arXiv Detail & Related papers (2021-06-04T01:12:47Z)
- Aspect Based Sentiment Analysis with Aspect-Specific Opinion Spans [66.77264982885086]
We present a neat and effective structured attention model by aggregating multiple linear-chain CRFs.
Such a design allows the model to extract aspect-specific opinion spans and then evaluate sentiment polarity by exploiting the extracted opinion features.
arXiv Detail & Related papers (2020-10-06T13:18:35Z)
- Leveraging Model Inherent Variable Importance for Stable Online Feature Selection [16.396739487911056]
We introduce FIRES, a novel framework for online feature selection.
Our framework is generic in that it leaves the choice of the underlying model to the user.
Experiments show that the proposed framework is clearly superior in terms of feature selection stability.
arXiv Detail & Related papers (2020-06-18T10:01:18Z)
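Feature-selection stability, the criterion FIRES is evaluated on above and a motivation for TangledFeatures itself, can be quantified in a simple way: run a selector on bootstrap resamples and average the pairwise Jaccard overlap of the chosen subsets. The selector and function names below are hypothetical; this is not the FIRES framework.

```python
# Hedged sketch: measuring feature-selection stability as the mean pairwise
# Jaccard overlap of subsets chosen on bootstrap resamples. Illustrative only.
import numpy as np
from itertools import combinations

def top_k_by_corr(X, y, k=3):
    """Toy selector: the k features most correlated with the target."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return set(np.argsort(scores)[-k:])

def stability(subsets):
    """Mean pairwise Jaccard similarity; 1.0 means identical selections."""
    pairs = combinations(subsets, 2)
    return float(np.mean([len(a & b) / len(a | b) for a, b in pairs]))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = X[:, 0] + X[:, 1] + 0.05 * rng.normal(size=500)

subsets = []
for _ in range(10):
    idx = rng.integers(0, 500, size=500)     # bootstrap resample
    subsets.append(top_k_by_corr(X[idx], y[idx]))

print(round(stability(subsets), 2))
```

Unstable selectors, often those confused by correlated predictors, score low on this measure even when each individual run predicts well.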
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.