Related papers: Learning Sequence Representations by Non-local Recurrent Neural Memory

URL: http://arxiv.org/abs/2207.09710v1
Date: Wed, 20 Jul 2022 07:26:15 GMT
Title: Learning Sequence Representations by Non-local Recurrent Neural Memory
Authors: Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu and Yu-Wing Tai
Abstract summary: We propose a Non-local Recurrent Neural Memory (NRNM) for supervised sequence representation learning. Our model is able to capture long-range dependencies and to distill the latent high-level features contained in high-order interactions. Our model compares favorably against other state-of-the-art methods specifically designed for each of these sequence applications.
Score: 61.65105481899744
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The key challenge of sequence representation learning is to capture the long-range temporal dependencies. Typical methods for supervised sequence representation learning are built upon recurrent neural networks to capture temporal dependencies. One potential limitation of these methods is that they only model one-order information interactions explicitly between adjacent time steps in a sequence, hence the high-order interactions between nonadjacent time steps are not fully exploited. This greatly limits the capability of modeling the long-range temporal dependencies, since the temporal features learned by one-order interactions cannot be maintained for a long term due to temporal information dilution and gradient vanishing. To tackle this limitation, we propose the Non-local Recurrent Neural Memory (NRNM) for supervised sequence representation learning, which performs non-local operations by means of a self-attention mechanism to learn full-order interactions within a sliding temporal memory block and models global interactions between memory blocks in a gated recurrent manner. Consequently, our model is able to capture long-range dependencies. Besides, the latent high-level features contained in high-order interactions can be distilled by our model. We validate the effectiveness and generalization of our NRNM on three types of sequence applications across different modalities, including sequence classification, step-wise sequential prediction and sequence similarity learning. Our model compares favorably against other state-of-the-art methods specifically designed for each of these sequence applications.
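
Illustrative sketch (not the authors' implementation): the abstract describes two ingredients, self-attention inside a sliding temporal block and a gated recurrent update that carries a memory from block to block. The NumPy code below is a minimal, simplified reading of that idea. The function names, the random projection weights, the mean pooling over each block, the single sigmoid gate, and the window and stride values are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def block_self_attention(block, Wq, Wk, Wv):
    """Scaled dot-product self-attention within one block of shape (w, d).

    Every time step attends to every other step in the block, which is one
    way to model interactions between non-adjacent steps.
    """
    q, k, v = block @ Wq, block @ Wk, block @ Wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # (w, w), rows sum to 1
    return weights @ v  # (w, d)


def nonlocal_memory_sketch(hidden, window=4, stride=2, seed=0):
    """Slide a window over hidden states (T, d) and update a memory per block.

    Hypothetical simplification: the attended block is mean-pooled into a
    candidate vector, and a sigmoid gate blends it with the previous memory.
    Weights are random here; in a real model they would be learned.
    """
    T, d = hidden.shape
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
    Wg = rng.normal(scale=(2 * d) ** -0.5, size=(2 * d, d))

    memory = np.zeros(d)
    memories = []
    for start in range(0, T - window + 1, stride):
        block = hidden[start:start + window]
        candidate = block_self_attention(block, Wq, Wk, Wv).mean(axis=0)
        # Gated recurrent update between consecutive memory blocks.
        gate = sigmoid(np.concatenate([memory, candidate]) @ Wg)
        memory = gate * memory + (1.0 - gate) * candidate
        memories.append(memory)
    return np.stack(memories)  # (number of blocks, d)


if __name__ == "__main__":
    states = np.random.default_rng(1).normal(size=(10, 8))
    print(nonlocal_memory_sketch(states).shape)
```

With 10 time steps, a window of 4 and a stride of 2, the loop visits blocks starting at steps 0, 2, 4 and 6, so the sketch returns one memory vector per block. Overlapping windows are one simple way to let adjacent blocks share context; the paper's actual block layout may differ.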

Related papers

Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba. This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
ARN-LSTM: A Multi-Stream Fusion Model for Skeleton-based Action Recognition [5.86850933017833]
ARN-LSTM architecture is designed to address the challenge of simultaneously capturing spatial motion and temporal dynamics in action sequences. Our proposed model integrates joint, motion, and temporal information through a multi-stream fusion architecture.
arXiv Detail & Related papers (2024-11-04T03:29:51Z)
Finding the DeepDream for Time Series: Activation Maximization for Univariate Time Series [10.388704631887496]
We introduce Sequence Dreaming, a technique that adapts Activation Maximization to analyze sequential information. We visualize the temporal dynamics and patterns most influential in model decision-making processes.
arXiv Detail & Related papers (2024-08-20T08:09:44Z)
Multi-Scale Spatial-Temporal Self-Attention Graph Convolutional Networks for Skeleton-based Action Recognition [0.0]
In this paper, we propose a self-attention GCN hybrid model, Multi-Scale Spatial-Temporal self-attention (MSST)-GCN. We utilize a spatial self-attention module with adaptive topology to understand interactions among different body parts within a frame, and a temporal self-attention module to examine correlations between frames of a node.
arXiv Detail & Related papers (2024-04-03T10:25:45Z)
Non-autoregressive Sequence-to-Sequence Vision-Language Models [63.77614880533488]
We propose a parallel decoding sequence-to-sequence vision-language model that marginalizes over multiple inference paths in the decoder. The model achieves performance on-par with its state-of-the-art autoregressive counterpart, but is faster at inference time.
arXiv Detail & Related papers (2024-03-04T17:34:59Z)
Long Sequence Hopfield Memory [32.28395813801847]
Sequence memory enables agents to encode, store, and retrieve complex sequences of stimuli and actions. We introduce a nonlinear interaction term, enhancing separation between the patterns. We extend this model to store sequences with variable timing between states' transitions.
arXiv Detail & Related papers (2023-06-07T15:41:03Z)
Deep Explicit Duration Switching Models for Time Series [84.33678003781908]
We propose a flexible model that is capable of identifying both state- and time-dependent switching dynamics. State-dependent switching is enabled by a recurrent state-to-switch connection. An explicit duration count variable is used to improve the time-dependent switching behavior.
arXiv Detail & Related papers (2021-10-26T17:35:21Z)
Efficient Modelling Across Time of Human Actions and Interactions [92.39082696657874]
We argue that current fixed-sized temporal kernels in 3D convolutional neural networks (CNNs) can be improved to better deal with temporal variations in the input. We study how we can better handle variations between classes of actions, by enhancing their feature differences over different layers of the architecture. The proposed approaches are evaluated on several benchmark action recognition datasets and show competitive results.
arXiv Detail & Related papers (2021-10-05T15:39:11Z)
Sequential convolutional network for behavioral pattern extraction in gait recognition [0.7874708385247353]
We propose a sequential convolutional network (SCN) to learn the walking pattern of individuals. In SCN, behavioral information extractors (BIE) are constructed to comprehend intermediate feature maps in time series. A multi-frame aggregator in SCN performs feature integration on a sequence whose length is uncertain, via a mobile 3D convolutional layer.
arXiv Detail & Related papers (2021-04-23T08:44:10Z)
Temporal Memory Relation Network for Workflow Recognition from Surgical Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns. We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z)
Temporal Graph Modeling for Skeleton-based Action Recognition [25.788239844759246]
We propose a Temporal Enhanced Graph Convolutional Network (TE-GCN) to capture complex temporal dynamics. The constructed temporal relation graph explicitly builds connections between semantically related temporal features. Experiments are performed on two widely used large-scale datasets.
arXiv Detail & Related papers (2020-12-16T09:02:47Z)
Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence. This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time. Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.