Non-Linearities Improve OrigiNet based on Active Imaging for Micro
Expression Recognition
- URL: http://arxiv.org/abs/2005.07991v1
- Date: Sat, 16 May 2020 13:44:49 GMT
- Title: Non-Linearities Improve OrigiNet based on Active Imaging for Micro
Expression Recognition
- Authors: Monu Verma, Santosh Kumar Vipparthi, Girdhari Singh
- Abstract summary: We introduce an active imaging concept to segregate active changes in expressive regions of a video into a single frame.
We propose a shallow CNN network: hybrid local receptive field based augmented learning network (OrigiNet) that efficiently learns significant features of the micro-expressions in a video.
- Score: 8.112868317921853
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Micro expression recognition (MER) is a very challenging task, as the
expression is very short-lived and demands feature modeling that involves both
spatial and temporal dynamics. Existing MER systems exploit CNNs to spot the
significant features of minor muscle movements and subtle changes. However,
existing networks fail to establish a relationship between the spatial features
of facial appearance and the temporal variations of facial dynamics. Thus,
these networks cannot effectively capture minute variations and subtle changes
in expressive regions. To address these issues, we introduce an active imaging
concept to segregate active changes in the expressive regions of a video into a
single frame while preserving facial appearance information. Moreover, we
propose a shallow CNN: the hybrid local receptive field based augmented
learning network (OrigiNet), which efficiently learns significant features of
the micro-expressions in a video. In this paper, we propose a new refined
rectified linear unit (RReLU), which overcomes the vanishing gradient and dying
ReLU problems. RReLU extends the range of derivatives compared to existing
activation functions. The RReLU not only injects a nonlinearity but also
captures the true edges by imposing additive and multiplicative properties.
Furthermore, we present an augmented feature learning block that improves the
learning capabilities of the network by embedding two parallel fully connected
layers. The performance of the proposed OrigiNet is evaluated by conducting
leave-one-subject-out experiments on four comprehensive ME datasets. The
experimental results demonstrate that OrigiNet outperforms state-of-the-art
techniques with lower computational complexity.
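The abstract describes active imaging only at a high level, so the following is a minimal sketch of one plausible realization: accumulating inter-frame changes into a single motion map and blending it with the first frame to retain facial appearance. The aggregation and fusion choices here are assumptions, not the paper's exact operator.

```python
import numpy as np

def active_image(frames: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Collapse a (T, H, W) grayscale video into one frame.

    Accumulates absolute inter-frame differences (the "active" changes in
    expressive regions) and blends the result with the first frame so the
    facial appearance is preserved. Illustrative only; the weighting scheme
    is an assumption, not OrigiNet's published formulation.
    """
    motion = np.abs(np.diff(frames, axis=0)).sum(axis=0)   # (H, W) activity map
    motion = motion / (motion.max() + 1e-8)                # normalize to [0, 1]
    return (1.0 - alpha) * frames[0] + alpha * motion      # appearance + activity
```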
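The abstract characterizes RReLU only by its properties (a wider derivative range than ReLU, combined additive and multiplicative terms) without giving a closed form. The sketch below uses an assumed form f(x) = max(0, x + x·x) purely to show how such an activation slots into a network; it should not be read as the paper's definition.

```python
import torch
import torch.nn as nn

class RReLUSketch(nn.Module):
    """Illustrative refined-ReLU-style activation (assumed form).

    f(x) = max(0, x + x * x) combines an additive term (x) with a
    multiplicative term (x * x); on the active region its derivative is
    1 + 2x, which ranges beyond ReLU's fixed {0, 1}. A stand-in for the
    paper's RReLU, whose exact definition the abstract does not give.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.clamp(x + x * x, min=0.0)
```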
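For the augmented feature learning block, the abstract states only that two parallel fully connected layers are embedded; how their outputs are fused is not specified. A minimal sketch, assuming a simple summation fusion:

```python
import torch
import torch.nn as nn

class AugmentedFCBlock(nn.Module):
    """Two parallel fully connected branches over the same input.

    The branch outputs are summed and passed through a nonlinearity; the
    fusion (summation) and layer sizes are assumptions made for
    illustration, as the abstract does not specify them.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.fc_a = nn.Linear(in_features, out_features)
        self.fc_b = nn.Linear(in_features, out_features)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.fc_a(x) + self.fc_b(x))
```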
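Leave-one-subject-out (LOSO), the evaluation protocol used for the four ME datasets, holds out each subject's samples in turn while training on everyone else. A generic sketch using scikit-learn's group splitter; the classifier wrapper `fit_predict` is a hypothetical placeholder, not tied to OrigiNet's code:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

def loso_accuracy(X, y, subjects, fit_predict):
    """Mean accuracy under leave-one-subject-out cross-validation.

    X: (N, D) features, y: (N,) labels, subjects: (N,) subject IDs.
    fit_predict(X_tr, y_tr, X_te) -> predictions for X_te; any model
    wrapper works. Protocol sketch only.
    """
    accs = []
    for tr, te in LeaveOneGroupOut().split(X, y, groups=subjects):
        pred = fit_predict(X[tr], y[tr], X[te])
        accs.append(float(np.mean(pred == y[te])))
    return float(np.mean(accs))
```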
Related papers
- Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition [21.675660978188617]
Micro-expression recognition is crucial in many fields, including criminal analysis and psychotherapy.
This paper proposes SKD-TSTSAN, a three-stream temporal-shift attention network based on self-knowledge distillation.
arXiv Detail & Related papers (2024-06-25T13:22:22Z)
- Residual Connections Harm Generative Representation Learning [22.21222349477351]
We show that introducing a weighting factor to reduce the influence of identity shortcuts in residual networks significantly enhances semantic feature learning.
Our modification improves linear probing accuracy for both model families, notably increasing ImageNet accuracy from 67.8% to 72.7% for MAEs with a ViT-B/16 backbone.
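As one concrete reading of this summary, the identity shortcut can be scaled by a factor below one, as sketched below; the placement and value of the weighting factor are assumptions based on the summary alone, not the paper's exact design.

```python
import torch.nn as nn

class WeightedResidualBlock(nn.Module):
    """Residual block with a down-weighted identity shortcut.

    out = alpha * x + f(x); alpha < 1 reduces the influence of the skip
    path. A sketch of the summary's idea, not the paper's formulation.
    """

    def __init__(self, dim: int, alpha: float = 0.6):
        super().__init__()
        self.alpha = alpha
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.alpha * x + self.f(x)
```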
arXiv Detail & Related papers (2024-04-16T23:05:17Z)
- Multi-Scale Spatio-Temporal Graph Convolutional Network for Facial Expression Spotting [11.978551396144532]
We propose a Multi-Scale Spatio-Temporal Graph Convolutional Network (SpoT-GCN) for facial expression spotting.
We track both short- and long-term motion of facial muscles in compact sliding windows whose window length adapts to the temporal receptive field of the network.
This network learns both local and global features from multiple scales of facial graph structures using our proposed facial local graph pooling (FLGP).
arXiv Detail & Related papers (2024-03-24T03:10:39Z)
- Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks [13.815116154370834]
We introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM) Network.
The MLFM efficiently preserves low-frequency information, enhancing performance in targeted computer vision tasks.
Our work builds upon the existing CNN foundations and paves the way for future advancements in computer vision.
arXiv Detail & Related papers (2024-03-13T00:48:41Z)
- SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing, as speckle degrades SAR images.
Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z)
- Towards Understanding the Effectiveness of Attention Mechanism [7.809333418199897]
We find that there is only a weak consistency between the attention weights of features and their importance.
The high-order non-linearity introduced by feature map multiplication plays a regularization role in CNNs.
We design a feature map multiplication network (FMMNet) by simply replacing the feature map addition in ResNet with feature map multiplication.
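A minimal sketch of that substitution in a ResNet-style block, with the usual additive merge swapped for an elementwise product (channel sizes, normalization, and activations are illustrative assumptions):

```python
import torch
import torch.nn as nn

class MultiplicativeResidualBlock(nn.Module):
    """ResNet-style block with the merge changed from x + f(x) to x * f(x).

    Sketch of the FMMNet idea as described in the summary; layer choices
    are illustrative assumptions.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x * self.f(x))
```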
arXiv Detail & Related papers (2021-06-29T02:58:59Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- Video-based Facial Expression Recognition using Graph Convolutional Networks [57.980827038988735]
We introduce a Graph Convolutional Network (GCN) layer into a common CNN-RNN based model for video-based facial expression recognition.
We evaluate our method on three widely used datasets, CK+, Oulu-CASIA, and MMI, as well as the challenging in-the-wild dataset AFEW8.0.
arXiv Detail & Related papers (2020-10-26T07:31:51Z)
- The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a lightweight neural network with far fewer parameters than common deep neural networks.
We demonstrate that our model achieves performance comparable to, if not better than, the current state of the art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z)
- Fully Convolutional Networks for Continuous Sign Language Recognition [83.85895472824221]
Continuous sign language recognition is a challenging task that requires learning on both spatial and temporal dimensions.
We propose a fully convolutional network (FCN) for online sign language recognition (SLR) to concurrently learn spatial and temporal features from weakly annotated video sequences.
arXiv Detail & Related papers (2020-07-24T08:16:37Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
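A minimal sketch of such feature-map smoothing: a depthwise Gaussian low-pass filter whose sigma would be annealed toward zero over training (kernel size and schedule are assumptions, not the paper's settings):

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(sigma: float, size: int = 5) -> torch.Tensor:
    """2D Gaussian kernel, normalized to sum to 1."""
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return k / k.sum()

def smooth_features(x: torch.Tensor, sigma: float) -> torch.Tensor:
    """Low-pass filter feature maps (N, C, H, W) channel-wise.

    In the curriculum, sigma starts large and shrinks during training so
    feature maps carry progressively more high-frequency information.
    """
    c = x.shape[1]
    k = gaussian_kernel(sigma).to(x.device, x.dtype).expand(c, 1, -1, -1)
    return F.conv2d(x, k, padding=2, groups=c)  # depthwise blur
```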
arXiv Detail & Related papers (2020-03-03T07:27:44Z)