Correlation Based Feature Subset Selection for Multivariate Time-Series Data
- URL: http://arxiv.org/abs/2112.03705v1
- Date: Fri, 26 Nov 2021 17:39:33 GMT
- Title: Correlation Based Feature Subset Selection for Multivariate Time-Series Data
- Authors: Bahavathy Kathirgamanathan, Padraig Cunningham
- Abstract summary: Correlations in streams of time series data mean that only a small subset of the features are required for a given data mining task.
We propose a technique that performs feature subset selection based on the correlation patterns of single-feature classifier outputs.
- Score: 2.055949720959582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Correlations in streams of multivariate time series data mean that typically only a small subset of the features is required for a given data mining task. In this paper, we propose a technique, which we call Merit Score for Time-Series data (MSTS), that performs feature subset selection based on the correlation patterns of single-feature classifier outputs. We assign a Merit Score to each candidate feature subset, and this score is used as the basis for selecting 'good' feature subsets. The proposed technique is evaluated on datasets from the UEA multivariate time series archive and is compared against a Wrapper approach to feature subset selection. MSTS is shown to be effective for feature subset selection and is particularly effective as a data reduction technique. MSTS is also computationally more efficient than the Wrapper strategy in selecting a suitable feature subset, being more than 100 times faster for some larger datasets while maintaining good classification accuracy.
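The abstract does not spell out how the Merit Score is computed. As a reference point, the classic correlation-based feature selection (CFS) merit, $\mathrm{Merit}_S = k\,\bar{r}_{cf} / \sqrt{k + k(k-1)\,\bar{r}_{ff}}$, rewards subsets whose members correlate strongly with the class but weakly with each other. The sketch below applies that formula to single-feature classifier outputs, which is one plausible reading of MSTS; the function names, the greedy forward search, and the choice of Pearson correlation are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: CFS-style merit score computed over the outputs of
# single-feature classifiers, in the spirit of MSTS. Not the reference code.
from itertools import combinations

import numpy as np


def merit_score(preds, y, subset):
    """CFS-style merit for a candidate feature subset.

    preds:  dict mapping feature index -> per-instance predictions of a
            classifier trained on that single feature (numeric labels).
    y:      true class labels (numeric).
    subset: iterable of feature indices.
    """
    subset = list(subset)
    k = len(subset)
    # Relevance: mean correlation between each single-feature classifier's
    # output and the true labels.
    r_cf = np.mean([abs(np.corrcoef(preds[f], y)[0, 1]) for f in subset])
    # Redundancy: mean pairwise correlation between the outputs of the
    # single-feature classifiers in the subset.
    if k > 1:
        r_ff = np.mean([abs(np.corrcoef(preds[a], preds[b])[0, 1])
                        for a, b in combinations(subset, 2)])
    else:
        r_ff = 0.0
    return (k * r_cf) / np.sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_selection(preds, y, max_features=None):
    """Greedily grow the feature subset that maximises the merit score."""
    remaining = set(preds)
    selected, best_merit = [], -np.inf
    while remaining and (max_features is None or len(selected) < max_features):
        merit, f = max((merit_score(preds, y, selected + [f]), f)
                       for f in remaining)
        if merit <= best_merit:
            break  # no remaining feature improves the merit score
        best_merit = merit
        selected.append(f)
        remaining.remove(f)
    return selected, best_merit
```

Under this reading, one classifier is first trained per dimension of the multivariate series, and its (ideally out-of-sample) predictions stand in for the raw feature values when the correlations are computed, which matches the abstract's description of selection based on the correlation patterns of single-feature classifier outputs.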
Related papers
- Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$\infty$ is a new multi-way and adaptive data selection approach for Lifelong Instruction Tuning.
We construct pseudo-skill clusters by grouping gradient-based sample vectors.
We select the best-performing data selector for each skill cluster from a pool of selector experts.
arXiv Detail & Related papers (2024-10-14T15:48:09Z)
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
- Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly.
FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z)
- Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection [4.5574502769585745]
Machine learning models that attempt to predict outcomes from survey data can overfit and result in poor generalizability.
One remedy to this issue is feature selection, which attempts to select an optimal subset of features to learn upon.
The relationships between feature names and target names can be evaluated using language models (LMs) to produce semantic textual similarity (STS) scores.
We examine the performance of using STS to select features both directly and within the minimal-redundancy-maximal-relevance (mRMR) algorithm.
arXiv Detail & Related papers (2023-08-19T03:10:51Z)
- Parallel feature selection based on the trace ratio criterion [4.30274561163157]
This work presents a novel parallel feature selection approach for classification, namely Parallel Feature Selection using Trace criterion (PFST).
Our method uses the trace criterion, a measure of class separability used in Fisher's Discriminant Analysis, to evaluate feature usefulness.
The experiments show that our method can produce a small set of features in a fraction of the time taken by the other methods under comparison.
arXiv Detail & Related papers (2022-03-03T10:50:33Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- A Supervised Feature Selection Method For Mixed-Type Data using Density-based Feature Clustering [1.3048920509133808]
This paper proposes a supervised feature selection method using density-based feature clustering (SFSDFC).
SFSDFC decomposes the feature space into a set of disjoint feature clusters using a novel density-based clustering method.
Then, an effective feature selection strategy is employed to obtain a subset of important features with minimal redundancy from those feature clusters.
arXiv Detail & Related papers (2021-11-10T15:05:15Z)
- A Feature Selection Method for Multi-Dimension Time-Series Data [2.055949720959582]
Time-series data in application areas such as motion capture and activity recognition is often multi-dimensional.
There is a lot of redundancy in these data streams and good classification accuracy will often be achievable with a small number of features.
We present a method for feature subset selection on multidimensional time-series data based on mutual information.
arXiv Detail & Related papers (2021-04-22T14:49:00Z)
- Supervised Feature Subset Selection and Feature Ranking for Multivariate Time Series without Feature Extraction [78.84356269545157]
We introduce supervised feature ranking and feature subset selection algorithms for MTS classification.
Unlike most existing supervised/unsupervised feature selection algorithms for MTS, our techniques do not require a feature extraction step to generate a one-dimensional feature vector from the time series.
arXiv Detail & Related papers (2020-05-01T07:46:29Z)
- Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
arXiv Detail & Related papers (2020-03-20T15:44:17Z)