Related papers: Efficiently Scanning and Resampling Spatio-Temporal Tasks with Irregular Observations

URL: http://arxiv.org/abs/2410.08681v1
Date: Fri, 11 Oct 2024 10:11:31 GMT
Title: Efficiently Scanning and Resampling Spatio-Temporal Tasks with Irregular Observations
Authors: Bryce Ferenczi, Michael Burke, Tom Drummond
Abstract summary: We propose an algorithm that alternates between cross-attention (between a 2D latent state and the observation) and a discounted cumulative sum over the sequence dimension. Our algorithm achieves comparable accuracy with a lower parameter count and faster training and inference than existing methods.
Score: 13.491183255489396
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Various works have aimed at combining the inference efficiency of recurrent models and training parallelism of multi-head attention for sequence modeling. However, most of these works focus on tasks with fixed-dimension observation spaces, such as individual tokens in language modeling or pixels in image completion. To handle an observation space of varying size, we propose a novel algorithm that alternates between cross-attention between a 2D latent state and observation, and a discounted cumulative sum over the sequence dimension to efficiently accumulate historical information. We find this resampling cycle is critical for performance. To evaluate efficient sequence modeling in this domain, we introduce two multi-agent intention tasks: simulated agents chasing bouncing particles and micromanagement analysis in professional StarCraft II games. Our algorithm achieves comparable accuracy with a lower parameter count, faster training and inference compared to existing methods.
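The discounted cumulative sum mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest form of the recurrence, s_t = gamma * s_{t-1} + u_t with a single scalar discount, and it is not taken from the paper: the function name `discounted_cumsum`, the parameter `gamma`, and the list-of-vectors input are all hypothetical choices made for illustration. The cross-attention step and any parallel implementation of the scan are left out.

```python
from typing import List, Sequence


def discounted_cumsum(
    updates: Sequence[Sequence[float]], gamma: float
) -> List[List[float]]:
    """Return s_t = gamma * s_{t-1} + u_t for every step t, with s_{-1} = 0.

    `updates` holds one feature vector per time step. `gamma` is an
    assumed scalar discount; the paper may parameterize it differently.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is assumed to lie in [0, 1]")

    states: List[List[float]] = []
    # Start from a zero state with the same width as the first update.
    running = [0.0] * len(updates[0]) if updates else []
    for u in updates:
        # Decay the accumulated history, then add the newest update.
        running = [gamma * r + x for r, x in zip(running, u)]
        states.append(running)
    return states


if __name__ == "__main__":
    history = discounted_cumsum([[1.0, 2.0], [1.0, 0.0], [1.0, 4.0]], gamma=0.5)
    print(history)  # [[1.0, 2.0], [1.5, 1.0], [1.75, 4.5]]
```

With gamma below 1, older updates contribute geometrically less to the current state. The loop here is sequential for readability; the same recurrence can be evaluated in parallel over the sequence dimension.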

Related papers

Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs. Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction. We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
Improving Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architectures [12.703947839247693]
Diffusion models, emerging as powerful deep generative tools, excel in various applications. However, their remarkable generative performance is hindered by slow training and sampling. This is due to the necessity of tracking extensive forward and reverse diffusion trajectories. We present a multi-stage framework inspired by our empirical findings to tackle these challenges.
arXiv Detail & Related papers (2023-12-14T17:48:09Z)
Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream. At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank. To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
Continuous-time convolutions model of event sequences [46.3471121117337]
Event sequences are non-uniform and sparse, making traditional models unsuitable. We propose COTIC, a method based on an efficient convolution neural network designed to handle the non-uniform occurrence of events over time. COTIC outperforms existing models in predicting the next event time and type, achieving an average rank of 1.5 compared to 3.714 for the nearest competitor.
arXiv Detail & Related papers (2023-02-13T10:34:51Z)
Spatio-Temporal Point Process for Multiple Object Tracking [30.041104276095624]
Multiple Object Tracking (MOT) focuses on modeling the relationship of detected objects among consecutive frames and merging them into different trajectories. We propose a novel framework that can effectively predict and mask out noisy and confusing detection results before associating objects into trajectories.
arXiv Detail & Related papers (2023-02-05T18:14:08Z)
The impact of memory on learning sequence-to-sequence tasks [6.603326895384289]
Recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. We propose a model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences.
arXiv Detail & Related papers (2022-05-29T14:57:33Z)
Time Series Anomaly Detection by Cumulative Radon Features [32.36217153362305]
In this work, we argue that shallow features suffice when combined with distribution distance measures. Our approach models each time series as a high dimensional empirical distribution of features, where each time-point constitutes a single sample. We show that by parameterizing each time series using cumulative Radon features, we are able to efficiently and effectively model the distribution of normal time series.
arXiv Detail & Related papers (2022-02-08T18:58:53Z)
Self-Attention Neural Bag-of-Features [103.70855797025689]
We build on the recently introduced 2D-Attention and reformulate the attention learning methodology. We propose a joint feature-temporal attention mechanism that learns a joint 2D attention mask highlighting relevant information.
arXiv Detail & Related papers (2022-01-26T17:54:14Z)
Set Based Stochastic Subsampling [85.5331107565578]
We propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an arbitrary downstream task network. We show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks including image classification, image reconstruction, function reconstruction and few-shot classification.
arXiv Detail & Related papers (2020-06-25T07:36:47Z)
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [57.33699905852397]
We propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring the computation of pairwise comparisons. Our method simultaneously clusters the data while enforcing consistency between cluster assignments. Our method can be trained with large and small batches and can scale to unlimited amounts of data.
arXiv Detail & Related papers (2020-06-17T14:00:42Z)
Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence. This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time. Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.