Exploring Hierarchical Classification Performance for Time Series Data:
Dissimilarity Measures and Classifier Comparisons
- URL: http://arxiv.org/abs/2402.05275v1
- Date: Wed, 7 Feb 2024 21:46:26 GMT
- Title: Exploring Hierarchical Classification Performance for Time Series Data:
Dissimilarity Measures and Classifier Comparisons
- Authors: Celal Alagoz
- Abstract summary: This study investigates the comparative performance of hierarchical classification (HC) and flat classification (FC) methodologies in time series data analysis.
Dissimilarity measures, including Jensen-Shannon Distance (JSD), Task Similarity Distance (TSD), and Classifier Based Distance (CBD), are leveraged.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The comparative performance of hierarchical classification (HC) and flat
classification (FC) methodologies in the realm of time series data analysis is
investigated in this study. Dissimilarity measures, including Jensen-Shannon
Distance (JSD), Task Similarity Distance (TSD), and Classifier Based Distance
(CBD), are leveraged alongside various classifiers such as MINIROCKET, STSF,
and SVM. A subset of datasets from the UCR archive, focusing on multi-class
cases comprising more than two classes, is employed for analysis. A notable
trend is observed wherein HC significantly outperforms FC when paired with
MINIROCKET utilizing TSD, diverging from conventional understandings.
Conversely, FC exhibits consistent dominance across all
configurations when employing alternative classifiers such as STSF and SVM.
Moreover, TSD is found to consistently outperform both CBD and JSD across
nearly all scenarios, except in instances involving the STSF classifier where
CBD showcases superior performance. This discrepancy underscores the nuanced
nature of dissimilarity measures and emphasizes the importance of their
tailored selection based on the dataset and classifier employed. Valuable
insights into the dynamic interplay between classification methodologies and
dissimilarity measures in the realm of time series data analysis are provided
by these findings. By elucidating the performance variations across different
configurations, a foundation is laid for refining classification methodologies
and dissimilarity measures to optimize performance in diverse analytical
scenarios. Furthermore, the need for continued research aimed at elucidating
the underlying mechanisms driving classification performance in time series
data analysis is underscored, with implications for enhancing predictive
modeling and decision-making in various domains.
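As a rough illustration of how a dissimilarity measure such as JSD could drive the grouping of classes into a hierarchy, the sketch below computes the Jensen-Shannon distance between per-class probability distributions and picks the closest pair of classes, which is the first merge step of an agglomerative grouping. The abstract does not say how the class distributions are obtained or how the hierarchy is built, so the histograms, the class labels, and the helper names `kl_divergence`, `js_distance`, and `closest_pair` are all assumptions made for this example, not the author's method.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base 2)."""
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def closest_pair(class_distributions):
    """Return the two class labels whose distributions are least dissimilar."""
    return min(
        combinations(sorted(class_distributions), 2),
        key=lambda pair: js_distance(
            class_distributions[pair[0]], class_distributions[pair[1]]
        ),
    )


# Hypothetical per-class histograms (each sums to 1); not data from the paper.
class_distributions = {
    "A": [0.70, 0.20, 0.10],
    "B": [0.65, 0.25, 0.10],
    "C": [0.10, 0.20, 0.70],
}

print(closest_pair(class_distributions))
```

With these made-up histograms, classes A and B are the closest pair, so a hierarchical classifier built this way would first separate {A, B} from C and then distinguish A from B at the next level.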
Related papers
- MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation [46.50551811108464]
We present a benchmark with spurious-correlation shifts collected from real-world scenarios.
We also propose a metric by using CLIP as a pre-trained vision-language model.
The experimental results show that the performance of the existing methods degrades significantly in the presence of spurious-correlation shifts.
arXiv Detail & Related papers (2024-04-30T15:45:30Z)
- Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z)
- Generating Hierarchical Structures for Improved Time Series Classification Using Stochastic Splitting Functions [0.0]
This study introduces a novel hierarchical divisive clustering approach with stochastic splitting functions (SSFs) to enhance classification performance in multi-class datasets through hierarchical classification (HC).
The method has the unique capability of generating hierarchy without requiring explicit information, making it suitable for datasets lacking prior knowledge of hierarchy.
arXiv Detail & Related papers (2023-09-21T10:34:50Z)
- Robust Classification of High-Dimensional Data using Data-Adaptive Energy Distance [0.0]
Classification of high-dimensional low sample size (HDLSS) data poses a challenge in a variety of real-world situations.
This article presents the development and analysis of some classifiers that are specifically designed for HDLSS data.
It is shown that they yield perfect classification in the HDLSS regime, under some fairly general conditions.
arXiv Detail & Related papers (2023-06-24T14:39:44Z)
- Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z)
- Early Time-Series Classification Algorithms: An Empirical Comparison [59.82930053437851]
Early Time-Series Classification (ETSC) is the task of predicting the class of incoming time-series by observing as few measurements as possible.
We evaluate six existing ETSC algorithms on publicly available data, as well as on two newly introduced datasets.
arXiv Detail & Related papers (2022-03-03T10:43:56Z)
- A Novel Intrinsic Measure of Data Separability [0.0]
In machine learning, the performance of a classifier depends on the separability/complexity of datasets.
We create an intrinsic measure -- the Distance-based Separability Index (DSI).
We show that the DSI can indicate whether the distributions of datasets are identical for any dimensionality.
arXiv Detail & Related papers (2021-09-11T04:20:08Z)
- Fast, Accurate and Interpretable Time Series Classification Through Randomization [20.638480955703102]
Time series classification (TSC) aims to predict the class label of a given time series.
We propose a novel TSC method - the Randomized-Supervised Time Series Forest (r-STSF).
r-STSF is highly efficient, achieves state-of-the-art classification accuracy and enables interpretability.
arXiv Detail & Related papers (2021-05-31T10:59:11Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from m U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.