NAC-TCN: Temporal Convolutional Networks with Causal Dilated
Neighborhood Attention for Emotion Understanding
- URL: http://arxiv.org/abs/2312.07507v2
- Date: Sat, 6 Jan 2024 05:18:44 GMT
- Title: NAC-TCN: Temporal Convolutional Networks with Causal Dilated
Neighborhood Attention for Emotion Understanding
- Authors: Alexander Mehta and William Yang
- Abstract summary: We propose a method known as Neighborhood Attention with Convolutions TCN (NAC-TCN).
We accomplish this by introducing a causal version of Dilated Neighborhood Attention while incorporating it with convolutions.
Our model achieves comparable, better, or state-of-the-art performance over TCNs, TCAN, LSTMs, and GRUs while requiring fewer parameters on standard emotion recognition datasets.
- Score: 60.74434735079253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the task of emotion recognition from videos, a key improvement has been to
focus on emotions over time rather than a single frame. There are many
architectures to address this task such as GRUs, LSTMs, Self-Attention,
Transformers, and Temporal Convolutional Networks (TCNs). However, these
methods suffer from high memory usage, large amounts of operations, or poor
gradients. We propose a method known as Neighborhood Attention with
Convolutions TCN (NAC-TCN) which incorporates the benefits of attention and
Temporal Convolutional Networks while ensuring that causal relationships are
understood, which results in a reduction in computation and memory cost. We
accomplish this by introducing a causal version of Dilated Neighborhood
Attention while incorporating it with convolutions. Our model achieves
comparable, better, or state-of-the-art performance over TCNs, TCAN, LSTMs, and
GRUs while requiring fewer parameters on standard emotion recognition datasets.
We publish our code online for easy reproducibility and use in other projects.
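The abstract does not spell out the mechanism, so the following is only a minimal single-head NumPy sketch of what a causal, dilated neighborhood attention over a 1D sequence could look like. The function name, the `kernel_size` and `dilation` parameters, and the choice to shrink the window at the start of the sequence are all assumptions made for illustration, not the authors' implementation. Each time step attends only to itself and to earlier steps spaced `dilation` apart, so no output depends on future inputs.

```python
import numpy as np

def causal_dilated_neighborhood_attention(q, k, v, kernel_size=3, dilation=1):
    """Hypothetical sketch: q, k have shape (T, d), v has shape (T, d_v).

    Position t attends to t, t - dilation, ..., t - (kernel_size - 1) * dilation,
    keeping only indices that are >= 0 (no padding, an assumption of this sketch).
    """
    T, d = q.shape
    out = np.zeros_like(v, dtype=float)
    for t in range(T):
        # Causal neighborhood: current step plus dilated past steps only.
        idx = [t - j * dilation for j in range(kernel_size) if t - j * dilation >= 0]
        scores = q[t] @ k[idx].T / np.sqrt(d)      # scaled dot-product scores
        weights = np.exp(scores - scores.max())    # numerically stable softmax
        weights /= weights.sum()
        out[t] = weights @ v[idx]                  # weighted sum of neighbor values
    return out

# Usage: self-attention over a random sequence of 8 steps with 4 features.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = causal_dilated_neighborhood_attention(x, x, x, kernel_size=3, dilation=2)
```

Because each step touches at most `kernel_size` neighbors, the cost in this sketch grows linearly with sequence length rather than quadratically as in full self-attention, which is consistent with the reduction in computation and memory the abstract describes.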
Related papers
- Adaptive Spiking Neural Networks with Hybrid Coding [0.0]
Spiking Neural Network (SNN) is a more energy-efficient and effective neural network compared to Artificial Neural Networks (ANNs).
Traditional SNNs use the same neurons when processing input data across different time steps, limiting their ability to integrate and utilize temporal information effectively.
This paper introduces a hybrid encoding approach that not only reduces the required time steps for training but also continues to improve the overall network performance.
arXiv Detail & Related papers (2024-08-22T13:58:35Z)
- Signal-SGN: A Spiking Graph Convolutional Network for Skeletal Action Recognition via Learning Temporal-Frequency Dynamics [2.9578022754506605]
In skeletal-based action recognition, Graph Convolutional Networks (GCNs) face limitations due to their complexity and high energy consumption.
We propose Signal-SGN (Spiking Graph Convolutional Network), which leverages the temporal dimension of skeletal sequences as the spiking timestep.
Our experiments show that the proposed models not only surpass existing SNN-based methods in accuracy but also reduce computational storage costs during training.
arXiv Detail & Related papers (2024-08-03T07:47:16Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- IMDeception: Grouped Information Distilling Super-Resolution Network [7.6146285961466]
Single-Image-Super-Resolution (SISR) is a classical computer vision problem that has benefited from the recent advancements in deep learning methods.
In this work, we propose the Global Progressive Refinement Module (GPRM) as a less parameter-demanding alternative to the IIC module for feature aggregation.
We also propose Grouped Information Distilling Blocks (GIDB) to further decrease the number of parameters and floating point operations per second (FLOPS).
Experiments reveal that the proposed network performs on par with state-of-the-art models despite having a limited number of parameters and FLOPS.
arXiv Detail & Related papers (2022-04-25T06:43:45Z)
- AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving" spatio-temporal graphs.
AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z)
- Spike-inspired Rank Coding for Fast and Accurate Recurrent Neural Networks [5.986408771459261]
Biological spiking neural networks (SNNs) can temporally encode information in their outputs, whereas artificial neural networks (ANNs) conventionally do not.
Here we show that temporal coding such as rank coding (RC) inspired by SNNs can also be applied to conventional ANNs such as LSTMs.
RC-training also significantly reduces time-to-insight during inference, with a minimal decrease in accuracy.
We demonstrate these in two toy problems of sequence classification, and in a temporally-encoded MNIST dataset where our RC model achieves 99.19% accuracy after the first input time-step.
arXiv Detail & Related papers (2021-10-06T15:51:38Z)
- Neural network relief: a pruning algorithm based on neural activity [47.57448823030151]
We propose a simple importance-score metric that deactivates unimportant connections.
We achieve comparable performance for LeNet architectures on MNIST.
The algorithm is not designed to minimize FLOPs when considering current hardware and software implementations.
arXiv Detail & Related papers (2021-09-22T15:33:49Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed using expensive operations and the lower-frequency part is assigned cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)