Sparsity-based Feature Selection for Anomalous Subgroup Discovery
- URL: http://arxiv.org/abs/2201.02008v1
- Date: Thu, 6 Jan 2022 10:56:43 GMT
- Title: Sparsity-based Feature Selection for Anomalous Subgroup Discovery
- Authors: Girmaw Abebe Tadesse, William Ogallo, Catherine Wanjiru, Charles
Wachira, Isaiah Onando Mulang', Vibha Anand, Aisha Walcott-Bryant, Skyler
Speakman
- Abstract summary: Anomalous pattern detection aims to identify instances where deviation from normalcy is evident, and is widely applicable across domains.
However, the field commonly lacks a principled and scalable feature selection method for efficient discovery.
In this paper, we propose a sparsity-based automated feature selection framework, which encodes systemic outcome deviations via the sparsity of feature-driven odds ratios.
- Score: 5.960402015658508
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomalous pattern detection aims to identify instances where deviation from
normalcy is evident, and is widely applicable across domains. Multiple
anomaly detection techniques have been proposed in the state of the art.
However, they commonly lack a principled and scalable feature selection
method for efficient discovery. Existing feature selection techniques typically
optimize the performance of prediction outcomes rather than their systemic
deviations from the expected. In this paper, we propose a
sparsity-based automated feature selection (SAFS) framework, which encodes
systemic outcome deviations via the sparsity of feature-driven odds ratios.
SAFS is a model-agnostic approach usable across different discovery
techniques. SAFS achieves more than a $3\times$ reduction in computation time
while maintaining detection performance when validated on a publicly available
critical care dataset. SAFS also outperforms multiple feature selection
baselines.
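The abstract describes encoding systemic outcome deviations via the sparsity of feature-driven odds ratios, but does not reproduce the SAFS formulation itself. As a hedged illustration only, a minimal sketch of the underlying idea (rank binary features by how far their outcome odds ratios deviate from 1, then keep a sparse top-k subset; the function names and the smoothing constant `eps` are illustrative assumptions, not the paper's method) might look like:

```python
import numpy as np

def odds_ratio_scores(X, y, eps=0.5):
    """Score each binary feature by |log odds ratio| against a binary outcome.

    X: (n_samples, n_features) binary matrix; y: binary outcome vector.
    eps is a Haldane-style smoothing constant to avoid division by zero.
    """
    scores = []
    for j in range(X.shape[1]):
        f = X[:, j]
        a = np.sum((f == 1) & (y == 1)) + eps  # exposed, outcome present
        b = np.sum((f == 1) & (y == 0)) + eps  # exposed, outcome absent
        c = np.sum((f == 0) & (y == 1)) + eps  # unexposed, outcome present
        d = np.sum((f == 0) & (y == 0)) + eps  # unexposed, outcome absent
        scores.append(abs(np.log((a * d) / (b * c))))
    return np.array(scores)

def select_sparse_features(X, y, k):
    """Keep the k features whose odds ratios deviate most from 1."""
    scores = odds_ratio_scores(X, y)
    return np.argsort(scores)[::-1][:k]
```

In this toy formulation, a feature whose presence barely shifts the outcome odds scores near zero and is dropped, which is one plausible reading of "sparsity of feature-driven odds ratios"; the paper should be consulted for the actual criterion.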
Related papers
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Budgeted Classification with Rejection: An Evolutionary Method with Multiple Objectives [0.0]
Budgeted, sequential classifiers (BSCs) process inputs through a sequence of partial feature acquisition and evaluation steps.
This allows for an efficient evaluation of inputs that prevents unneeded feature acquisition.
We propose a problem-specific genetic algorithm to build budgeted, sequential classifiers with confidence-based reject options.
arXiv Detail & Related papers (2022-05-01T22:05:16Z)
- Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity regularized models and apply it on shared-memory and distributed-memory architecture respectively.
We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- Dynamic Bayesian Approach for decision-making in Ego-Things [8.577234269009042]
This paper presents a novel approach to detect abnormalities in dynamic systems based on multisensory data and feature selection.
Growing neural gas (GNG) is employed for clustering multisensory data into a set of nodes.
Our method uses a Markov Jump particle filter (MJPF) for state estimation and abnormality detection.
arXiv Detail & Related papers (2020-10-28T11:38:51Z)
- Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method.
A subset of optimal features will be selected in group, and the number of selected features will be determined automatically.
Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2020-10-09T08:17:04Z)
- IVFS: Simple and Efficient Feature Selection for High Dimensional Topology Preservation [33.424663018395684]
We propose a simple and effective feature selection algorithm to enhance sample similarity preservation.
The proposed algorithm is able to well preserve the pairwise distances, as well as topological patterns, of the full data.
arXiv Detail & Related papers (2020-04-02T23:05:00Z)
- Outlier Detection Ensemble with Embedded Feature Selection [42.8338013000469]
We propose an outlier detection ensemble framework with embedded feature selection (ODEFS).
For each random sub-sampling based learning component, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation.
We adopt the thresholded self-paced learning to simultaneously optimize feature selection and example selection.
arXiv Detail & Related papers (2020-01-15T13:14:10Z)
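Several of the related papers above (e.g. IVFS) select features so that the pairwise distances of the full data are preserved. IVFS's actual algorithm is not reproduced here; as an assumed, illustrative sketch of the general idea, a greedy forward selection that scores each candidate feature by how little the reduced distance matrix distorts the full one (all names are hypothetical) could look like:

```python
import numpy as np

def pairwise_dists(X):
    """Euclidean distance matrix via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clamp tiny negatives from rounding

def greedy_distance_preserving(X, k):
    """Greedily pick k features minimizing Frobenius-norm distortion of
    the full-data pairwise distance matrix."""
    target = pairwise_dists(X)
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        best_j, best_err = None, np.inf
        for j in remaining:
            cand = pairwise_dists(X[:, selected + [j]])
            err = np.linalg.norm(target - cand)
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

This greedy filter is O(k · n_features) distance-matrix evaluations, which is far cheaper than exhaustive subset search but only a rough stand-in for the randomized inclusion scheme the IVFS paper actually proposes.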
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.