Choose A Table: Tensor Dirichlet Process Multinomial Mixture Model with
Graphs for Passenger Trajectory Clustering
- URL: http://arxiv.org/abs/2310.20224v1
- Date: Tue, 31 Oct 2023 06:53:04 GMT
- Title: Choose A Table: Tensor Dirichlet Process Multinomial Mixture Model with
Graphs for Passenger Trajectory Clustering
- Authors: Ziyue Li, Hao Yan, Chen Zhang, Lijun Sun, Wolfgang Ketter, Fugee Tsung
- Abstract summary: We propose a novel tensor Dirichlet Process Multinomial Mixture model with graphs.
The model can preserve the hierarchical structure of the multi-dimensional trip information and cluster them in a unified one-step manner.
A case study based on Hong Kong metro passenger data is conducted to demonstrate the automatic process of cluster amount evolution.
- Score: 33.36290451052104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Passenger clustering based on trajectory records is essential for
transportation operators. However, existing methods cannot easily cluster the
passengers due to the hierarchical structure of the passenger trip information,
including multiple trips within each passenger and multi-dimensional
information about each trip. Furthermore, existing approaches rely on an
accurate specification of the clustering number to start. Finally, existing
methods do not consider spatial semantic graphs such as geographical proximity
and functional similarity between the locations. In this paper, we propose a
novel tensor Dirichlet Process Multinomial Mixture model with graphs, which can
preserve the hierarchical structure of the multi-dimensional trip information
and cluster them in a unified one-step manner with the ability to determine the
number of clusters automatically. The spatial graphs are utilized in community
detection to link the semantic neighbors. We further propose a tensor version
of Collapsed Gibbs Sampling method with a minimum cluster size requirement. A
case study based on Hong Kong metro passenger data is conducted to demonstrate
the automatic process of cluster amount evolution and better cluster quality
measured by within-cluster compactness and cross-cluster separateness. The code
is available at https://github.com/bonaldli/TensorDPMM-G.
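The paper's title alludes to the Chinese Restaurant Process (CRP) view of the Dirichlet Process: each passenger "chooses a table" (an existing cluster) with probability proportional to its occupancy, or opens a new table with probability proportional to a concentration parameter. The sketch below is a minimal illustration of that single assignment step, not the authors' implementation; `likelihood` is a hypothetical stand-in for the multinomial predictive likelihood of a passenger's trip tensor, and the spatial graphs and minimum-cluster-size constraint are omitted.

```python
import random
from collections import Counter

def crp_assign(assignments, alpha, likelihood, rng=random):
    """Sample a cluster ("table") for one passenger, CRP-style.

    assignments: current cluster labels of the other passengers.
    alpha: Dirichlet Process concentration parameter.
    likelihood: hypothetical callable returning the predictive
        likelihood of this passenger under cluster k (None = new cluster).
    """
    counts = Counter(assignments)
    clusters = sorted(counts)
    new_id = max(clusters) + 1 if clusters else 0
    # An existing cluster k is weighted by its size n_k times the
    # predictive likelihood of the passenger's trips under k.
    weights = [counts[k] * likelihood(k) for k in clusters]
    # A brand-new cluster is weighted by the concentration alpha.
    weights.append(alpha * likelihood(None))
    r = rng.uniform(0, sum(weights))
    acc = 0.0
    for k, w in zip(clusters + [new_id], weights):
        acc += w
        if r <= acc:
            return k
    return new_id
```

In a full collapsed Gibbs sampler this step is repeated for every passenger on each sweep; the paper adds a minimum-cluster-size requirement on top of this basic scheme.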
Related papers
- Measures of Overlapping Multivariate Gaussian Clusters in Unsupervised Online Learning [0.0]
The aim of online learning from data streams is to create clustering, classification, or regression models that can adapt over time.
In the case of clustering, this can result in a large number of clusters that may overlap and should be merged.
Our proposed dissimilarity measure is specifically designed to detect overlap rather than dissimilarity.
arXiv Detail & Related papers (2025-08-21T11:06:02Z)
- Hierarchical clustering with maximum density paths and mixture models [39.42511559155036]
Hierarchical clustering is an effective and interpretable technique for analyzing structure in data.
It is particularly helpful in settings where the exact number of clusters is unknown, and provides a robust framework for exploring complex datasets.
Our method leverages a two-stage approach: it first employs a Gaussian or Student's t mixture model to overcluster the data, and then hierarchically merges clusters based on the induced density landscape.
This approach yields state-of-the-art clustering performance while also providing a meaningful hierarchy, making it a valuable tool for exploratory data analysis.
arXiv Detail & Related papers (2025-03-19T15:37:51Z)
- Hyperoctant Search Clustering: A Method for Clustering Data in High-Dimensional Hyperspheres [0.0]
We propose a new clustering method based on a topological approach applied to regions of space defined by signs of coordinates (hyperoctants).
According to a density criterion, the method builds clusters of data points based on the partitioning of a graph.
We apply the method to topic detection, an important task in text mining.
arXiv Detail & Related papers (2025-03-10T23:41:44Z)
- Categorical Data Clustering via Value Order Estimated Distance Metric Learning [53.28598689867732]
This paper introduces a novel order distance metric learning approach to intuitively represent categorical attribute values.
A new joint learning paradigm is developed to alternately perform clustering and order distance metric learning.
The proposed method achieves superior clustering accuracy on categorical and mixed datasets.
arXiv Detail & Related papers (2024-11-19T08:23:25Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Tensor Dirichlet Process Multinomial Mixture Model for Passenger Trajectory Clustering [21.51161506280304]
We propose a novel tensor Dirichlet Process Multinomial Mixture model (Tensor-DPMM).
It is designed to preserve the multi-mode and hierarchical structure of the multi-dimensional trip information via tensor, and cluster them in a unified one-step manner.
It also determines the number of clusters automatically, using the Dirichlet Process to decide the probability that a passenger is assigned to an existing cluster or creates a new one.
arXiv Detail & Related papers (2023-06-23T21:44:07Z)
- DMS: Differentiable Mean Shift for Dataset Agnostic Task Specific Clustering Using Side Information [0.0]
We present a novel approach, in which we learn to cluster data directly from side information.
We do not need to know the number of clusters, their centers or any kind of distance metric for similarity.
Our method is able to divide the same data points in various ways, dependent on the needs of a specific task.
arXiv Detail & Related papers (2023-05-29T13:45:49Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation and that, unlike existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Fine-grained Graph Learning for Multi-view Subspace Clustering [2.4094285826152593]
We propose a fine-grained graph learning framework for multi-view subspace clustering (FGL-MSC).
The main challenge is how to optimize the fine-grained fusion weights while generating the learned graph that fits the clustering task.
Experiments on eight real-world datasets show that the proposed framework has comparable performance to the state-of-the-art methods.
arXiv Detail & Related papers (2022-01-12T18:00:29Z)
- Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types [60.45942774425782]
We introduce anomaly clustering, whose goal is to group data into coherent clusters of anomaly types.
This is different from anomaly detection, whose goal is to divide anomalies from normal data.
We present a simple yet effective clustering framework using patch-based pretrained deep embeddings and off-the-shelf clustering methods.
arXiv Detail & Related papers (2021-12-21T23:11:33Z)
- Trajectory Clustering Performance Evaluation: If we know the answer, it's not clustering [0.6472434306724609]
Trajectory clustering is an unsupervised task.
We perform a comprehensive comparison of similarity measures, clustering algorithms and evaluation measures using trajectory data from seven intersections.
arXiv Detail & Related papers (2021-12-02T19:25:38Z)
- Learning Hierarchical Graph Neural Networks for Image Clustering [81.5841862489509]
We propose a hierarchical graph neural network (GNN) model that learns how to cluster a set of images into an unknown number of identities.
Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.
arXiv Detail & Related papers (2021-07-03T01:28:42Z)
- Variational Auto Encoder Gradient Clustering [0.0]
Clustering using deep neural network models has been extensively studied in recent years.
This article investigates how probability function gradient ascent can be used to process data in order to achieve better clustering.
We propose a simple yet effective method for investigating suitable number of clusters for data, based on the DBSCAN clustering algorithm.
arXiv Detail & Related papers (2021-05-11T08:00:36Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.