Temporal Embeddings: Scalable Self-Supervised Temporal Representation
Learning from Spatiotemporal Data for Multimodal Computer Vision
- URL: http://arxiv.org/abs/2401.08581v1
- Date: Mon, 16 Oct 2023 02:53:29 GMT
- Title: Temporal Embeddings: Scalable Self-Supervised Temporal Representation
Learning from Spatiotemporal Data for Multimodal Computer Vision
- Authors: Yi Cao and Swetava Ganguli and Vipul Pandey
- Abstract summary: A novel approach is proposed to stratify landscape based on mobility activity time series.
The pixel-wise embeddings are converted to image-like channels that can be used for task-based, multimodal modeling.
- Score: 1.4127889233510498
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: There exists a correlation between geospatial activity temporal patterns and
type of land use. A novel self-supervised approach is proposed to stratify
landscape based on mobility activity time series. First, the time series signal
is transformed to the frequency domain and then compressed into task-agnostic
temporal embeddings by a contractive autoencoder, which preserves cyclic
temporal patterns observed in time series. The pixel-wise embeddings are
converted to image-like channels that can be used for task-based, multimodal
modeling of downstream geospatial tasks using deep semantic segmentation.
Experiments show that temporal embeddings are semantically meaningful
representations of time series data and are effective across different tasks
such as classifying residential area and commercial areas. Temporal embeddings
transform sequential, spatiotemporal motion trajectory data into semantically
meaningful image-like tensor representations that can be combined (multimodal
fusion) with other data modalities that are or can be transformed into
image-like tensor representations (for e.g., RBG imagery, graph embeddings of
road networks, passively collected imagery like SAR, etc.) to facilitate
multimodal learning in geospatial computer vision. Multimodal computer vision
is critical for training machine learning models for geospatial feature
detection to keep a geospatial mapping service up-to-date in real-time and can
significantly improve user experience and above all, user safety.
Related papers
- TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling [67.02157180089573]
Time series pre-training has recently garnered wide attention for its potential to reduce labeling expenses and benefit various downstream tasks.
This paper proposes TimeSiam as a simple but effective self-supervised pre-training framework for Time series based on Siamese networks.
arXiv Detail & Related papers (2024-02-04T13:10:51Z) - Automated Spatio-Temporal Graph Contrastive Learning [18.245433428868775]
We develop an automated-temporal augmentation scheme with a parameterized contrastive view generator.
AutoST can adapt to the heterogeneous graph with multi-view semantics well preserved.
Experiments for three downstream-temporal mining tasks on several real-world datasets demonstrate the significant performance gain.
arXiv Detail & Related papers (2023-05-06T03:52:33Z) - Self-Supervised Temporal Analysis of Spatiotemporal Data [2.2720298829059966]
There exists a correlation between geospatial activity temporal patterns and type of land use.
A novel self-supervised approach is proposed to stratify landscape based on mobility activity time series.
Experiments show that temporal embeddings are semantically meaningful representations of time series data and are effective across different tasks.
arXiv Detail & Related papers (2023-04-25T20:34:38Z) - Scalable Self-Supervised Representation Learning from Spatiotemporal
Motion Trajectories for Multimodal Computer Vision [0.0]
We propose a self-supervised, unlabeled method for learning representations of geographic locations from GPS trajectories.
We show that reachability embeddings are semantically meaningful representations and result in 4-23% gain in performance as measured using area under precision-recall curve (AUPRC) metric.
arXiv Detail & Related papers (2022-10-07T02:41:02Z) - SatMAE: Pre-training Transformers for Temporal and Multi-Spectral
Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE)
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z) - A Spatio-Temporal Multilayer Perceptron for Gesture Recognition [70.34489104710366]
We propose a multilayer state-weighted perceptron for gesture recognition in the context of autonomous vehicles.
An evaluation of TCG and Drive&Act datasets is provided to showcase the promising performance of our approach.
We deploy our model to our autonomous vehicle to show its real-time capability and stable execution.
arXiv Detail & Related papers (2022-04-25T08:42:47Z) - TCGL: Temporal Contrastive Graph for Self-supervised Video
Representation Learning [79.77010271213695]
We propose a novel video self-supervised learning framework named Temporal Contrastive Graph Learning (TCGL)
Our TCGL integrates the prior knowledge about the frame and snippet orders into graph structures, i.e., the intra-/inter- snippet Temporal Contrastive Graphs (TCG)
To generate supervisory signals for unlabeled videos, we introduce an Adaptive Snippet Order Prediction (ASOP) module.
arXiv Detail & Related papers (2021-12-07T09:27:56Z) - Reachability Embeddings: Scalable Self-Supervised Representation
Learning from Markovian Trajectories for Geospatial Computer Vision [0.0]
We propose a self-supervised method for learning representations of geographic locations from unlabeled GPS trajectories.
A scalable and distributed algorithm is presented to compute image-like representations, called reachability summaries.
We show that reachability embeddings are semantically meaningful representations and result in 4-23% gain in performance.
arXiv Detail & Related papers (2021-10-24T20:10:22Z) - Unsupervised Representation Learning for Time Series with Temporal
Neighborhood Coding [8.45908939323268]
We propose a self-supervised framework for learning generalizable representations for non-stationary time series.
Our motivation stems from the medical field, where the ability to model the dynamic nature of time series data is especially valuable.
arXiv Detail & Related papers (2021-06-01T19:53:24Z) - Spatial-Temporal Correlation and Topology Learning for Person
Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z) - Geography-Aware Self-Supervised Learning [79.4009241781968]
We show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks.
We propose novel training methods that exploit the spatially aligned structure of remote sensing data.
Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing.
arXiv Detail & Related papers (2020-11-19T17:29:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.