Complex Sequential Understanding through the Awareness of Spatial and
Temporal Concepts
- URL: http://arxiv.org/abs/2006.00212v1
- Date: Sat, 30 May 2020 07:51:50 GMT
- Title: Complex Sequential Understanding through the Awareness of Spatial and
Temporal Concepts
- Authors: Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu
- Abstract summary: Semi-Coupled Structure (SCS) consists of deep neural networks that decouple the learning of complex spatial and temporal concepts.
SCS learns to implicitly separate input information into independent parts and process each part separately.
For sequence-to-sequence problems, a Semi-Coupled Structure can predict future meteorological radar echo images based on observed images.
- Score: 44.43414201122335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding sequential information is a fundamental task for artificial
intelligence. Current neural networks attempt to learn spatial and temporal
information as a whole, which limits their ability to represent large scale
spatial representations over long-range sequences. Here, we introduce a new
modeling strategy called Semi-Coupled Structure (SCS), which consists of deep
neural networks that decouple the learning of complex spatial and temporal
concepts. Semi-Coupled Structure can learn to implicitly separate input
information into independent parts and process each part separately.
Experiments demonstrate that a Semi-Coupled Structure can successfully annotate
the outline of an object in images sequentially and perform video action
recognition. For sequence-to-sequence problems, a Semi-Coupled Structure can
predict future meteorological radar echo images based on observed images. Taken
together, our results demonstrate that a Semi-Coupled Structure has the
capacity to improve the performance of LSTM-like models on large scale
sequential tasks.
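The abstract only describes the idea at a high level, so the following is a minimal, hypothetical sketch of the general pattern it names: keeping spatial and temporal learning in separate sub-networks rather than learning them as a whole. It is not the authors' SCS implementation; the class and module names (SemiCoupledSketch, the spatial and temporal sub-networks) and all layer sizes are illustrative assumptions, and PyTorch is assumed as the framework.

```python
# Hypothetical sketch (not the authors' released SCS code), assuming PyTorch.
# It only illustrates the decoupled layout described in the abstract: a
# per-frame spatial sub-network feeding a separate LSTM-style temporal one.
import torch
import torch.nn as nn


class SemiCoupledSketch(nn.Module):
    """Process a frame sequence with separate spatial and temporal parts."""

    def __init__(self, in_channels=1, spatial_dim=64, hidden_dim=128, out_dim=10):
        super().__init__()
        # Spatial part: convolutional features computed frame by frame.
        self.spatial = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse H x W to 1 x 1
            nn.Flatten(),
            nn.Linear(32, spatial_dim),
        )
        # Temporal part: an LSTM over the sequence of per-frame features.
        self.temporal = nn.LSTM(spatial_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, frames):
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        per_frame = self.spatial(frames.reshape(b * t, c, h, w))  # (b*t, spatial_dim)
        seq = per_frame.reshape(b, t, -1)                         # (b, t, spatial_dim)
        _, (last_hidden, _) = self.temporal(seq)
        return self.head(last_hidden[-1])                         # (b, out_dim)


# Usage: classify a batch of 4 clips, each 16 single-channel 64x64 frames.
clips = torch.randn(4, 16, 1, 64, 64)
print(SemiCoupledSketch()(clips).shape)  # torch.Size([4, 10])
```

A sequence-to-sequence variant, such as the radar echo prediction mentioned above, would replace the classification head with a decoder that emits one frame per time step; that detail is not specified in the abstract and is omitted here.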
Related papers
- The Dynamic Net Architecture: Learning Robust and Holistic Visual Representations Through Self-Organizing Networks [3.9848584845601014]
We present a novel intelligent-system architecture called "Dynamic Net Architecture" (DNA).
DNA relies on recurrence-stabilized networks, and we discuss it in application to vision.
arXiv Detail & Related papers (2024-07-08T06:22:10Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable and scalable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of multivariate time series (MTS), which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
- Deeply-Coupled Convolution-Transformer with Spatial-temporal Complementary Learning for Video-based Person Re-identification [91.56939957189505]
We propose a novel spatial-temporal complementary learning framework named Deeply-Coupled Convolution-Transformer (DCCT) for high-performance video-based person Re-ID.
Our framework can attain better performance than most state-of-the-art methods.
arXiv Detail & Related papers (2023-04-27T12:16:44Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
- Sparse Coding Driven Deep Decision Tree Ensembles for Nuclear Segmentation in Digital Pathology Images [15.236873250912062]
We propose an easily trained yet powerful representation learning approach whose performance is highly competitive with deep neural networks in a digital pathology image segmentation task.
The method, called sparse coding driven deep decision tree ensembles that we abbreviate as ScD2TE, provides a new perspective on representation learning.
arXiv Detail & Related papers (2020-08-13T02:59:31Z)
- Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks [1.462434043267217]
We compare how 3D convolutional networks and convolutional LSTM networks learn features across temporally dependent frames.
Our findings indicate that the 3D convolutional model concentrates on shorter events in the input sequence, and places its spatial focus on fewer, contiguous areas.
arXiv Detail & Related papers (2020-02-02T11:27:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.