Trajectory Clustering Performance Evaluation: If we know the answer,
it's not clustering
- URL: http://arxiv.org/abs/2112.01570v1
- Date: Thu, 2 Dec 2021 19:25:38 GMT
- Title: Trajectory Clustering Performance Evaluation: If we know the answer,
it's not clustering
- Authors: Mohsen Rezaie and Nicolas Saunier
- Abstract summary: Trajectory clustering is an unsupervised task.
We perform a comprehensive comparison of similarity measures, clustering algorithms and evaluation measures using trajectory data from seven intersections.
- Score: 0.6472434306724609
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Advancements in Intelligent Traffic Systems (ITS) have made huge amounts of
traffic data available through automatic data collection. A big part of this
data is stored as trajectories of moving vehicles and road users. Automatic
analysis of this data with minimal human supervision would both lower the costs
and eliminate subjectivity of the analysis. Trajectory clustering is an
unsupervised task.
In this paper, we perform a comprehensive comparison of similarity measures,
clustering algorithms and evaluation measures using trajectory data from seven
intersections. We also propose a method to automatically generate trajectory
reference clusters based on their origin and destination points to be used for
label-based evaluation measures. The entire procedure therefore remains
unsupervised at both the clustering and evaluation levels. Finally, we use a
combination of evaluation measures to find the top performing similarity
measures and clustering algorithms for each intersection. The results show that
there is no single combination of distance measure and clustering algorithm that is
always among the top ten clustering setups.
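
The origin-destination (OD) based reference clusters and the combined use of label-based and internal evaluation measures can be illustrated with a short sketch. The snippet below is not the authors' code: the helper names, the DBSCAN parameters and the toy trajectories are assumptions, and the paper's actual similarity measures and evaluation procedure differ. It derives reference labels by clustering each trajectory's origin and destination points, then scores a candidate clustering against them with the adjusted Rand index alongside an internal silhouette score.

```python
# Minimal sketch (not the authors' code): derive reference clusters from
# trajectory origin/destination (OD) points, then score a candidate
# clustering against them. Helper names and parameters (eps, min_samples,
# the toy data) are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import adjusted_rand_score, silhouette_score

def od_reference_labels(trajectories, eps=5.0, min_samples=3):
    """Cluster the concatenated origin and destination points of each
    trajectory; trajectories sharing an OD pattern get the same label."""
    od = np.array([np.hstack([t[0], t[-1]]) for t in trajectories])  # (x0, y0, xN, yN)
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(od)

def evaluate_setup(features, candidate_labels, reference_labels):
    """Combine a label-based measure (vs. the OD reference) with an internal one."""
    return {
        "ari_vs_od_reference": adjusted_rand_score(reference_labels, candidate_labels),
        "silhouette": silhouette_score(features, candidate_labels),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy trajectories with two OD patterns: left-to-right and bottom-to-top.
    trajectories = []
    for _ in range(30):
        xs = np.linspace(0, 100, 20)
        trajectories.append(np.c_[xs, 50 + rng.normal(0, 2, 20)])
    for _ in range(30):
        ys = np.linspace(0, 100, 20)
        trajectories.append(np.c_[50 + rng.normal(0, 2, 20), ys])

    reference = od_reference_labels(trajectories)
    # Any clustering setup can be scored; flattened coordinates stand in here
    # for one of the trajectory similarity measures compared in the paper.
    features = np.array([t.ravel() for t in trajectories])
    candidate = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
    print(evaluate_setup(features, candidate, reference))
```

In the paper's setting, the feature representation would be replaced by the trajectory similarity measures under comparison, and several label-based and internal measures would be aggregated to rank the clustering setups per intersection.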
Related papers
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Detection and Evaluation of Clusters within Sequential Data [58.720142291102135]
Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees.
In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets.
It is found that the Block Markov Chain model assumption can indeed produce meaningful insights in exploratory data analyses.
arXiv Detail & Related papers (2022-10-04T15:22:39Z)
- SSDBCODI: Semi-Supervised Density-Based Clustering with Outliers Detection Integrated [1.8444322599555096]
Clustering analysis is one of the critical tasks in machine learning.
Because clustering performance can be significantly eroded by outliers, some algorithms incorporate outlier detection into the clustering process.
We propose SSDBCODI, a semi-supervised density-based clustering algorithm with integrated outlier detection.
arXiv Detail & Related papers (2022-08-10T21:06:38Z)
- Clustering Plotted Data by Image Segmentation [12.443102864446223]
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data.
In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data.
Our approach, Visual Clustering, has several advantages over traditional clustering algorithms.
arXiv Detail & Related papers (2021-10-06T06:19:30Z)
- Applying Semi-Automated Hyperparameter Tuning for Clustering Algorithms [0.0]
This study proposes a framework for semi-automated hyperparameter tuning of clustering problems.
It uses a grid search to develop a series of graphs and easy-to-interpret metrics that can then be used for more efficient domain-specific evaluation.
Preliminary results show that internal metrics are unable to capture the semantic quality of the clusters developed.
arXiv Detail & Related papers (2021-08-25T05:48:06Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Large Scale Autonomous Driving Scenarios Clustering with Self-supervised Feature Extraction [6.804209932400134]
This article proposes a comprehensive data clustering framework for a large set of vehicle driving data.
Our approach thoroughly considers the traffic elements, including both in-traffic agent objects and map information.
With the newly designed driving data clustering evaluation metrics based on data-augmentation, the accuracy assessment does not require a human-labeled data-set.
arXiv Detail & Related papers (2021-03-30T06:22:40Z)
- (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories.
We propose using a continuous version of dynamic time warping as the distance measure, which we call continuous dynamic time warping (CDTW); a minimal discrete-DTW sketch is given after this list for reference.
We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
- Unsupervised and Supervised Learning with the Random Forest Algorithm for Traffic Scenario Clustering and Classification [4.169845583045265]
The goal of this paper is to provide a method that can automatically find categories of traffic scenarios.
The architecture consists of three main components: A microscopic traffic simulation, a clustering technique and a classification technique for the operational phase.
arXiv Detail & Related papers (2020-04-05T08:26:29Z)
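
For the (k, l)-medians entry above, the clustering is built on a continuous variant of dynamic time warping. As a point of reference only, the following is a minimal sketch of ordinary (discrete) DTW between two 2-D trajectories; it is not the CDTW of that paper, which interpolates between sample points, and the function and variable names are illustrative assumptions.

```python
# Minimal sketch of discrete dynamic time warping (DTW) between two 2-D
# trajectories, for orientation only; the CDTW referenced above is a
# continuous variant and is not reproduced here.
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with Euclidean point-to-point cost."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j],      # a advances, b repeats
                                   acc[i, j - 1],      # b advances, a repeats
                                   acc[i - 1, j - 1])  # both advance
    return acc[n, m]

if __name__ == "__main__":
    t1 = np.c_[np.linspace(0, 10, 15), np.zeros(15)]        # straight path
    t2 = np.c_[np.linspace(0, 10, 25), 0.5 * np.ones(25)]   # offset, resampled
    print(f"DTW distance: {dtw_distance(t1, t2):.2f}")
```

A center-based scheme such as (k, l)-medians would evaluate a distance of this kind inside an assignment/update loop, with center trajectories of bounded complexity.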
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.