A Formally Robust Time Series Distance Metric
- URL: http://arxiv.org/abs/2008.07865v1
- Date: Tue, 18 Aug 2020 11:28:50 GMT
- Title: A Formally Robust Time Series Distance Metric
- Authors: Maximilian Toller, Bernhard C. Geiger, Roman Kern
- Abstract summary: We propose a novel distance metric that is robust against arbitrarily "bad" contamination.
We show in an empirical evaluation that the metric yields competitive classification accuracy when applied in k-Nearest Neighbor time series classification.
- Score: 6.929025509877642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distance-based classification is among the most competitive classification
methods for time series data. The most critical component of distance-based
classification is the selected distance function. Past research has proposed
various different distance metrics or measures dedicated to particular aspects
of real-world time series data, yet there is an important aspect that has not
been considered so far: Robustness against arbitrary data contamination. In
this work, we propose a novel distance metric that is robust against
arbitrarily "bad" contamination and has a worst-case computational complexity
of $\mathcal{O}(n\log n)$. We formally argue why our proposed metric is robust,
and demonstrate in an empirical evaluation that the metric yields competitive
classification accuracy when applied in k-Nearest Neighbor time series
classification.
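The paper's exact metric is not reproduced in this summary, so the following is only a minimal sketch of how a robustified elementwise distance could plug into k-Nearest Neighbor time series classification. The trimming scheme (`trimmed_distance`) is a hypothetical stand-in, not the proposed metric; it merely illustrates the robustness idea and matches the stated $\mathcal{O}(n\log n)$ worst-case cost, which here comes from sorting.

```python
import numpy as np

def trimmed_distance(x, y, trim=0.1):
    """Hypothetical robust distance: sort the coordinate-wise absolute
    differences and discard the largest `trim` fraction before summing.
    Sorting dominates, giving O(n log n) worst-case cost; the paper's
    actual metric differs."""
    diffs = np.sort(np.abs(np.asarray(x) - np.asarray(y)))
    keep = len(diffs) - int(trim * len(diffs))
    return float(diffs[:keep].sum())

def knn_classify(train_X, train_y, query, k=3, dist=trimmed_distance):
    """k-NN for 1-D time series: majority vote among the k closest series."""
    d = [dist(x, query) for x in train_X]
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique([train_y[i] for i in nearest],
                               return_counts=True)
    return labels[np.argmax(counts)]

# A contaminated copy of a series stays close under the trimmed distance.
clean = np.sin(np.linspace(0, 6, 100))
contaminated = clean.copy()
contaminated[5] = 1e6  # single arbitrarily "bad" outlier
print(trimmed_distance(clean, contaminated, trim=0.1))  # 0.0: outlier trimmed away
```

Under the Euclidean distance the single corrupted point would dominate the comparison; a trimmed sum ignores it, which is the intuition behind robustness against arbitrary contamination.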
Related papers
- CADM: Cluster-customized Adaptive Distance Metric for Categorical Data Clustering [54.20010572648918]
An appropriate distance metric is crucial for categorical data clustering, as the distance between categorical data cannot be directly calculated.
We propose a cluster-customized distance metric for categorical data clustering, which adaptively updates distances based on the different distributions of attributes in each cluster.
arXiv Detail & Related papers (2025-11-08T03:24:22Z)
- Categorical Data Clustering via Value Order Estimated Distance Metric Learning [53.28598689867732]
This paper introduces a novel order distance metric learning approach to intuitively represent categorical attribute values.
A new joint learning paradigm is developed to alternately perform clustering and order distance metric learning.
The proposed method achieves superior clustering accuracy on categorical and mixed datasets.
arXiv Detail & Related papers (2024-11-19T08:23:25Z)
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
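For context, the fine-grained variants described above refine the standard mean Intersection over Union (mIoU). The sketch below shows only the usual dataset-level per-class IoU computation that those variants build on; the paper's fine-grained and worst-case aggregations are not reproduced here.

```python
import numpy as np

def miou(pred, target, num_classes):
    """Standard per-class IoU and their mean (mIoU) over flattened
    label maps. Classes absent from both prediction and ground truth
    are skipped so they do not distort the mean."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:
            ious.append(inter / union)
    return ious, float(np.mean(ious))

pred = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 1, 1, 1, 2, 0])
per_class, mean_iou = miou(pred, target, num_classes=3)
print(per_class, mean_iou)
```

Because mIoU averages over classes at the dataset level, large objects that dominate pixel counts also dominate each class's IoU; computing IoU at a finer granularity (e.g., per image) and then aggregating is what reduces that bias.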
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
- Mixed-type Distance Shrinkage and Selection for Clustering via Kernel Metric Learning [0.0]
We propose a metric called KDSUM that uses mixed kernels to measure dissimilarity.
We demonstrate that KDSUM is a shrinkage method from existing mixed-type metrics to a uniform dissimilarity metric.
arXiv Detail & Related papers (2023-06-02T19:51:48Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm that directly detects clusters without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Matrix Profile XXVII: A Novel Distance Measure for Comparing Long Time Series [18.205595410817327]
We introduce PRCIS, which stands for Pattern Representation Comparison in Series.
PRCIS is a distance measure for long time series, which exploits recent progress in our ability to summarize time series with dictionaries.
arXiv Detail & Related papers (2022-12-09T23:02:23Z)
- Evaluation of the impact of the indiscernibility relation on the fuzzy-rough nearest neighbours algorithm [1.4213973379473654]
Fuzzy-rough nearest neighbours (FRNN) is a classification algorithm based on the classical k-nearest neighbours algorithm.
In this paper, we investigate the impact of the indiscernibility relation on the performance of FRNN classification.
arXiv Detail & Related papers (2022-11-25T14:17:56Z)
- Local Evaluation of Time Series Anomaly Detection Algorithms [9.717823994163277]
We show that an adversary algorithm can reach high precision and recall on almost any dataset under weak assumptions.
We propose a theoretically grounded, robust, parameter-free and interpretable extension to precision/recall metrics.
arXiv Detail & Related papers (2022-06-27T10:18:41Z)
- Kernel distance measures for time series, random fields and other structured data [71.61147615789537]
kdiff is a novel kernel-based measure for estimating distances between instances of structured data.
It accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution.
Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems.
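The exact kdiff definition (kernel choice, quantile levels) is in the paper; the sketch below only illustrates the core idea the summary mentions — taking a lower quantile of pairwise distances and accounting for self as well as cross similarities. All names and the scalar setting are illustrative assumptions.

```python
import numpy as np

def quantile_discrepancy(A, B, q=0.1):
    """Illustrative quantile-based discrepancy in the spirit of kdiff:
    a lower quantile of the cross pairwise distances, offset by the
    corresponding self-similarity terms (not the paper's exact formula)."""
    def pairwise_q(X, Y):
        d = np.abs(X[:, None] - Y[None, :])  # |x_i - y_j| for all pairs
        return np.quantile(d, q)
    cross = pairwise_q(A, B)
    return cross - 0.5 * (pairwise_q(A, A) + pairwise_q(B, B))

# Identical samples give 0; well-separated samples give a positive value.
print(quantile_discrepancy(np.zeros(5), np.zeros(5)))  # 0.0
print(quantile_discrepancy(np.zeros(5), np.ones(5)))   # 1.0
```

Using a low quantile rather than a mean or maximum makes the discrepancy insensitive to a few extreme pairwise distances, which is what makes such measures attractive for noisy structured data.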
arXiv Detail & Related papers (2021-09-29T22:54:17Z)
- Ranking the information content of distance measures [61.754016309475745]
We introduce a statistical test that can assess the relative information retained when using two different distance measures.
This in turn allows finding the most informative distance measure out of a pool of candidates.
arXiv Detail & Related papers (2021-04-30T15:57:57Z)
- $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z)
- Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
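A Mahalanobis distance is the distance induced by a positive semi-definite matrix $M$: $d_M(x, y) = \sqrt{(x-y)^\top M (x-y)}$. Metric learning optimizes $M$; the training objective from the paper above is not reproduced here — this sketch only shows how a given $M$ reshapes distances.

```python
import numpy as np

def mahalanobis(x, y, M):
    """Distance induced by a PSD matrix M: sqrt((x-y)^T M (x-y)).
    With M = I this reduces to the Euclidean distance; metric-learning
    algorithms choose M from data."""
    d = np.asarray(x) - np.asarray(y)
    return float(np.sqrt(d @ M @ d))

x, y = np.array([1.0, 0.0]), np.array([0.0, 0.0])
print(mahalanobis(x, y, np.eye(2)))            # 1.0: Euclidean baseline
print(mahalanobis(x, y, np.diag([4.0, 1.0])))  # 2.0: first axis stretched
```

Stretching an axis of $M$ makes perturbations along that axis more expensive, which is one lever a learner can use to trade accuracy against adversarial robustness.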
arXiv Detail & Related papers (2020-06-12T09:17:08Z)
- A Practical Index Structure Supporting Fréchet Proximity Queries Among Trajectories [1.9335262420787858]
We present a scalable approach for range and $k$ nearest neighbor queries under computationally expensive metrics.
Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories.
We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world data sets.
arXiv Detail & Related papers (2020-05-28T04:12:43Z)
- MALTS: Matching After Learning to Stretch [86.84454964051014]
We learn an interpretable distance metric for matching, which leads to substantially higher quality matches.
Our ability to learn flexible distance metrics leads to matches that are interpretable and useful for the estimation of conditional average treatment effects.
arXiv Detail & Related papers (2018-11-18T22:29:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.