Mining frequency-based sequential trajectory co-clusters
- URL: http://arxiv.org/abs/2110.14110v1
- Date: Wed, 27 Oct 2021 01:18:42 GMT
- Title: Mining frequency-based sequential trajectory co-clusters
- Authors: Yuri Santos, Jônata Tyska, Vania Bogorny
- Abstract summary: We propose a new trajectory co-clustering method for mining semantic trajectory co-clusters.
It simultaneously clusters the trajectories and their elements taking into account the order in which they appear.
We evaluate the proposed approach using a real-world, publicly available dataset.
- Score: 0.2578242050187029
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Co-clustering is a specific type of clustering that addresses the problem of
finding groups of objects without necessarily considering all attributes. This
technique has been shown to yield more consistent results in high-dimensional sparse
data than traditional clustering. In trajectory co-clustering, the methods
found in the literature have two main limitations: first, the space and time
dimensions have to be constrained by user-defined thresholds; second, elements
(trajectory points) are clustered ignoring the trajectory sequence, assuming
that the points are independent of one another. To address the limitations above,
we propose a new trajectory co-clustering method for mining semantic trajectory
co-clusters. It simultaneously clusters the trajectories and their elements
taking into account the order in which they appear. This new method uses the
element frequency to identify candidate co-clusters. Besides, it uses an
objective cost function that automatically drives the co-clustering process,
avoiding the need for constraining dimensions. We evaluate the proposed
approach using a real-world, publicly available dataset. The experimental
results show that our proposal finds frequent and meaningful contiguous
sequences that reveal mobility patterns and, thereby, the most relevant elements.
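The frequency-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only counts how many trajectories contain each contiguous sequence of semantic elements and pairs every frequent sequence with the trajectories that contain it, giving candidate co-clusters. The function name `candidate_coclusters`, the parameters `min_support` and `max_len`, and the toy trajectories are all assumptions made for illustration; the paper itself relies on an objective cost function rather than user-defined thresholds.

```python
from collections import defaultdict


def candidate_coclusters(trajectories, min_support=2, max_len=3):
    """Pair each frequent contiguous element sequence with its trajectories.

    `min_support` and `max_len` are hypothetical knobs for this sketch only;
    the method described in the abstract avoids such thresholds.
    """
    support = defaultdict(set)
    for tid, elements in trajectories.items():
        for n in range(1, max_len + 1):
            # Slide a window of length n so element order is preserved.
            for i in range(len(elements) - n + 1):
                support[tuple(elements[i:i + n])].add(tid)
    # Keep only sequences shared by at least `min_support` trajectories.
    return {seq: tids for seq, tids in support.items() if len(tids) >= min_support}


# Toy semantic trajectories (invented data, for illustration only).
trajectories = {
    "t1": ["home", "work", "gym", "home"],
    "t2": ["home", "work", "restaurant"],
    "t3": ["work", "gym", "home"],
}

candidates = candidate_coclusters(trajectories)
# Print longer sequences first; each line is one candidate co-cluster.
for seq, tids in sorted(candidates.items(), key=lambda kv: (-len(kv[0]), kv[0])):
    print(" -> ".join(seq), sorted(tids))
```

In this toy input, the sequence work -> gym -> home is shared by t1 and t3, so those two trajectories and those three elements form one candidate co-cluster. A full method would still need a criterion for choosing among overlapping candidates, which is the role the abstract assigns to the cost function.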
Related papers
- Stable Trajectory Clustering: An Efficient Split and Merge Algorithm [1.9253333342733674]
Clustering algorithms group data points by characteristics to identify patterns.
This paper presents whole-trajectory clustering and sub-trajectory clustering algorithms based on DBSCAN line segment clustering.
arXiv Detail & Related papers (2025-04-30T17:11:36Z)
- Hyperoctant Search Clustering: A Method for Clustering Data in High-Dimensional Hyperspheres [0.0]
We propose a new clustering method based on a topological approach applied to regions of space defined by signs of coordinates (hyperoctants).
According to a density criterion, the method builds clusters of data points based on the partitioning of a graph.
We choose the application of topic detection, which is an important task in text mining.
arXiv Detail & Related papers (2025-03-10T23:41:44Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- A Computational Theory and Semi-Supervised Algorithm for Clustering [0.0]
A semi-supervised clustering algorithm is presented.
The kernel of the clustering method is Mohammad's anomaly detection algorithm.
Results are presented on synthetic and real-world data sets.
arXiv Detail & Related papers (2023-06-12T09:15:58Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Near-Optimal Correlation Clustering with Privacy [37.94795032297396]
Correlation clustering is a central problem in unsupervised learning.
In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees.
arXiv Detail & Related papers (2022-03-02T22:30:19Z)
- Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types [60.45942774425782]
We introduce anomaly clustering, whose goal is to group data into coherent clusters of anomaly types.
This is different from anomaly detection, whose goal is to divide anomalies from normal data.
We present a simple yet effective clustering framework using patch-based pretrained deep embeddings and off-the-shelf clustering methods.
arXiv Detail & Related papers (2021-12-21T23:11:33Z)
- Weighted Sparse Subspace Representation: A Unified Framework for Subspace Clustering, Constrained Clustering, and Active Learning [0.3553493344868413]
We first propose a novel spectral-based subspace clustering algorithm that seeks to represent each point as a sparse convex combination of a few nearby points.
We then extend the algorithm to constrained clustering and active learning settings.
Our motivation for developing such a framework stems from the fact that, typically, either a small amount of labelled data is available in advance or it is possible to label some points at a cost.
arXiv Detail & Related papers (2021-06-08T13:39:43Z)
- (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories.
We propose the use of a continuous version of DTW as the distance measure, which we call continuous dynamic time warping (CDTW).
We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
- Simple and Scalable Sparse k-means Clustering via Feature Ranking [14.839931533868176]
We propose a novel framework for sparse k-means clustering that is intuitive, simple to implement, and competitive with state-of-the-art algorithms.
Our core method readily generalizes to several task-specific algorithms such as clustering on subsets of attributes and in partially observed data settings.
arXiv Detail & Related papers (2020-02-20T02:41:02Z)
- Point-Set Kernel Clustering [11.093960688450602]
This paper introduces a new similarity measure called point-set kernel which computes the similarity between an object and a set of objects.
We show that the new clustering procedure is both effective and efficient, which enables it to deal with large-scale datasets.
arXiv Detail & Related papers (2020-02-14T00:00:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.