Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning
- URL: http://arxiv.org/abs/2407.15899v3
- Date: Thu, 25 Jul 2024 07:18:05 GMT
- Title: Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning
- Authors: Letian Gong, Huaiyu Wan, Shengnan Guo, Xiucheng Li, Yan Lin, Erwen Zheng, Tianyi Wang, Zeyu Zhou, Youfang Lin
- Abstract summary: We propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning.
STCCR employs self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level.
We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.
- Score: 21.580705078081078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention. Specifically, the temporal uncertainty and spatial diversity exhibited in check-in data make it difficult to capture the macroscopic spatial-temporal patterns of users and to understand the semantics of user mobility activities. Furthermore, the distinct characteristics of the temporal and spatial information in check-in sequences call for an effective fusion method to incorporate these two types of information. In this paper, we propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning. Specifically, STCCR addresses the above challenges by employing self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level. Besides, STCCR leverages contrastive clustering to uncover users' shared spatial topics from diverse mobility activities, while employing angular momentum contrast to mitigate the impact of temporal uncertainty and noise. We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.
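The abstract describes STCCR only at a high level; as a hedged illustration of the momentum-contrast ingredient it mentions (in the style of MoCo), here is a minimal PyTorch sketch of a cross-view InfoNCE step with a momentum-updated key encoder. The encoders, dimensions, queue, and hyperparameters below are stand-ins for illustration, not STCCR's actual architecture or its "angular" momentum variant.

```python
import torch
import torch.nn.functional as F

# Stand-in hyperparameters and encoders; STCCR's actual architecture and
# view construction are not specified in the abstract.
dim, queue_size, tau, m = 128, 1024, 0.07, 0.999
query_encoder = torch.nn.Linear(32, dim)   # e.g. "temporal intention" view (assumed)
key_encoder = torch.nn.Linear(32, dim)     # momentum copy, e.g. "spatial topic" view
key_encoder.load_state_dict(query_encoder.state_dict())
for p in key_encoder.parameters():
    p.requires_grad = False

queue = F.normalize(torch.randn(queue_size, dim), dim=1)  # queued negative keys

def momentum_contrast_step(x_q, x_k):
    """One InfoNCE step with a momentum-updated key encoder (MoCo-style)."""
    q = F.normalize(query_encoder(x_q), dim=1)
    with torch.no_grad():
        # Momentum update: the key encoder slowly trails the query encoder,
        # which stabilizes the keys against noisy, uncertain positives.
        for pq, pk in zip(query_encoder.parameters(), key_encoder.parameters()):
            pk.mul_(m).add_(pq.detach(), alpha=1.0 - m)
        k = F.normalize(key_encoder(x_k), dim=1)
    l_pos = (q * k).sum(dim=1, keepdim=True)           # similarity to the positive key
    l_neg = q @ queue.t()                              # similarities to queued negatives
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long)  # the positive sits at index 0
    return F.cross_entropy(logits, labels)

# Toy usage: two cross-view versions of the same batch of check-in features.
loss = momentum_contrast_step(torch.randn(8, 32), torch.randn(8, 32))
loss.backward()
```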
Related papers
- Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z)
- STS-CCL: Spatial-Temporal Synchronous Contextual Contrastive Learning for Urban Traffic Forecasting [4.947443433688782]
This work employs advanced contrastive learning and proposes a novel Spatial-Temporal Synchronous Contextual Contrastive Learning (STS-CCL) model.
Experiments and evaluations demonstrate that a predictor built upon the STS-CCL contrastive learning model achieves superior performance over existing traffic forecasting benchmarks.
arXiv Detail & Related papers (2023-07-05T03:47:28Z)
- Spatiotemporal k-means [39.98633724527769]
We propose a spatiotemporal clustering method called spatiotemporal k-means (STkM) that is able to analyze multi-scale clusters.
We show how STkM can be extended to more complex machine learning tasks, particularly unsupervised region-of-interest detection and tracking in videos (see the sketch after this list).
arXiv Detail & Related papers (2022-11-10T04:40:31Z)
- Spatial-Temporal Feature Extraction and Evaluation Network for Citywide Traffic Condition Prediction [1.321203201549798]
A Double-Layer Spatial-Temporal Feature Extraction and Evaluation (DL-STFEE) model is proposed.
Three sets of experiments are performed on actual traffic datasets to show that DL-STFEE can effectively capture the spatial-temporal features.
arXiv Detail & Related papers (2022-07-22T12:15:41Z)
- Interpretable Time-series Representation Learning With Multi-Level Disentanglement [56.38489708031278]
Disentangle Time Series (DTS) is a novel disentanglement enhancement framework for sequential data.
DTS generates hierarchical semantic concepts as the interpretable and disentangled representation of time-series.
DTS achieves superior performance in downstream applications, with high interpretability of semantic concepts.
arXiv Detail & Related papers (2021-05-17T22:02:24Z)
- Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel Correlation and Topology Learning (CTL) framework to pursue discriminative and robust representations by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- Temporal Memory Relation Network for Workflow Recognition from Surgical Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns.
We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z)
- An Enhanced Adversarial Network with Combined Latent Features for Spatio-Temporal Facial Affect Estimation in the Wild [1.3007851628964147]
This paper proposes a novel model that efficiently extracts both spatial and temporal features of the data by means of its enhanced temporal modelling based on latent features.
Our proposed model consists of three major networks, coined Generator, Discriminator, and Combiner, which are trained in an adversarial setting combined with curriculum learning to enable our adaptive attention modules.
arXiv Detail & Related papers (2021-02-18T04:10:12Z)
- Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos [85.6430597108455]
We propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos.
It captures the common salient foreground regions among video frames and explores the spatial-temporal long-range context interdependency from such regions.
Multiple spatial-temporal interaction modules are proposed within CSTNet, which exploit the spatial and temporal long-range context interdependencies of such features and their spatial-temporal correlation.
arXiv Detail & Related papers (2020-04-10T10:23:58Z)
- A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named Spatial-Temporal Attentive Network with Spatial Continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we construct a joint feature sequence based on sequential and instantaneous state information so that the generated trajectories maintain spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
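The spatiotemporal k-means entry above describes its method only in words; as a loose illustration of the underlying idea, clustering points jointly in space and time, here is a toy Python sketch. The `w_time` weighting and plain Lloyd iterations are assumptions for illustration, not the STkM formulation from the paper.

```python
import numpy as np

def st_kmeans(points, k, w_time=0.5, iters=50, seed=0):
    """Toy space-time k-means over (x, y, t) points.

    Illustrative only: the w_time axis weighting and plain Lloyd
    iterations are assumptions, not the paper's STkM algorithm."""
    rng = np.random.default_rng(seed)
    scale = np.array([1.0, 1.0, w_time])  # trade off temporal vs. spatial proximity
    z = points * scale
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center in the scaled space.
        d = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        # Recompute centers; keep a center unchanged if its cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = z[labels == j].mean(axis=0)
    return labels, centers / scale

# Toy usage: 200 random (x, y, t) points grouped into 4 space-time clusters.
labels, centers = st_kmeans(np.random.rand(200, 3), k=4)
```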