Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
- URL: http://arxiv.org/abs/2208.10491v1
- Date: Sun, 21 Aug 2022 08:04:22 GMT
- Title: Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
- Authors: Junghun Kim, Yoojin An, Jihie Kim
- Abstract summary: We identify misalignments between the attention and the signal amplitude in the existing multi-head self-attention.
We propose to use a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA) mechanism in combination with the multi-head self-attention.
By employing the CA mechanism, the network can modulate the information flow by assigning different weights to each attention head and improve the utilization of surrounding contexts.
- Score: 0.5994412766684842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attention has become one of the most commonly used mechanisms in deep
learning approaches. The attention mechanism can help the system focus more on
the feature space's critical regions. For example, high amplitude regions can
play an important role for Speech Emotion Recognition (SER). In this paper, we
identify misalignments between the attention and the signal amplitude in the
existing multi-head self-attention. To improve the attention area, we propose
to use a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA)
mechanism in combination with the multi-head self-attention. Through the FA
mechanism, the network can detect the largest amplitude part in the segment. By
employing the CA mechanism, the network can modulate the information flow by
assigning different weights to each attention head and improve the utilization
of surrounding contexts. To evaluate the proposed method, experiments are
performed with the IEMOCAP and RAVDESS datasets. Experimental results show that
the proposed framework significantly outperforms the state-of-the-art
approaches on both datasets.
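The abstract describes the two mechanisms only in words, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes that "focus" can be approximated by locating the window with the largest mean absolute amplitude, and that "calibration" can be approximated by a softmax-normalized weight per attention head. The function names (`focus_window`, `calibrated_multihead_attention`) and the parameter `head_logits` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def focus_window(signal, width):
    """Return (start, end) of the window with the largest mean |amplitude|.

    Assumption: a stand-in for the idea of detecting the largest amplitude
    part of a segment; the paper's actual FA formulation is not given here.
    """
    kernel = np.ones(width) / width
    energy = np.convolve(np.abs(signal), kernel, mode="valid")
    start = int(np.argmax(energy))
    return start, start + width


def calibrated_multihead_attention(x, wq, wk, wv, head_logits):
    """Multi-head self-attention with one scalar weight per head.

    x:           (T, d) input frames
    wq, wk, wv:  (H, d, dh) per-head projection matrices
    head_logits: (H,) hypothetical per-head scores; their softmax scales
                 each head's output before the heads are concatenated.
    """
    q = np.einsum("td,hde->hte", x, wq)
    k = np.einsum("td,hde->hte", x, wk)
    v = np.einsum("td,hde->hte", x, wv)

    # Scaled dot-product attention within each head: (H, T, T)
    scores = np.einsum("hte,hse->hts", q, k) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    heads = np.einsum("hts,hse->hte", attn, v)  # (H, T, dh)

    # Calibration step (assumed form): weight each head, then concatenate.
    head_weights = softmax(head_logits)
    calibrated = heads * head_weights[:, None, None]
    out = calibrated.transpose(1, 0, 2).reshape(x.shape[0], -1)  # (T, H*dh)
    return out, attn, head_weights
```

In this sketch a head with a larger entry in `head_logits` contributes more to the concatenated output, which is one simple way to modulate information flow across heads. In a trained network these logits would be learned or predicted from the input rather than passed in by hand.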
Related papers
- Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights [5.798431829723857]
This paper provides a comprehensive exploration of techniques and insights for designing attention mechanisms in Vision Transformer (ViT) networks.
We present a systematic taxonomy of various attention mechanisms within ViTs, employing redesigned approaches.
The analysis includes an exploration of the novelty, strengths, weaknesses, and an in-depth evaluation of the different proposed strategies.
arXiv Detail & Related papers (2024-03-28T23:31:59Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- Switchable Self-attention Module [3.8992324495848356]
We propose a self-attention module SEM.
Based on the input information of the attention module and alternative attention operators, SEM can automatically decide to select and integrate attention operators to compute attention maps.
The effectiveness of SEM is demonstrated by extensive experiments on widely used benchmark datasets and popular self-attention networks.
arXiv Detail & Related papers (2022-09-13T01:19:38Z)
- Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions [1.4438155481047366]
We propose a global attention mechanism that boosts the performance of deep neural networks by reducing information loss and magnifying the global interactive representations.
The evaluation of the proposed mechanism for the image classification task on CIFAR-100 and ImageNet-1K indicates that our method stably outperforms several recent attention mechanisms with both ResNet and lightweight MobileNet.
arXiv Detail & Related papers (2021-12-10T14:12:32Z)
- Alignment Attention by Matching Key and Query Distributions [48.93793773929006]
This paper introduces alignment attention that explicitly encourages self-attention to match the distributions of the key and query within each head.
It is simple to convert any model with self-attention, including pre-trained ones, to the proposed alignment attention.
On a variety of language understanding tasks, we show the effectiveness of our method in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-10-25T00:54:57Z)
- Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification [101.49122450005869]
We present a counterfactual attention learning method to learn more effective attention based on causal inference.
Specifically, we analyze the effect of the learned visual attention on network prediction.
We evaluate our method on a wide range of fine-grained recognition tasks.
arXiv Detail & Related papers (2021-08-19T14:53:40Z)
- Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference [68.12511526813991]
We provide a novel understanding of multi-head attention from a Bayesian perspective.
We propose a non-parametric approach that explicitly improves the repulsiveness in multi-head attention.
Experiments on various attention models and applications demonstrate that the proposed repulsive attention can improve the learned feature diversity.
arXiv Detail & Related papers (2020-09-20T06:32:23Z)
- Linear Attention Mechanism: An Efficient Attention for Semantic Segmentation [2.9488233765621295]
The Linear Attention Mechanism approximates dot-product attention at much lower memory and computational cost.
Experiments on semantic segmentation demonstrate the effectiveness of the linear attention mechanism.
arXiv Detail & Related papers (2020-07-29T15:18:46Z)
- Deep Reinforced Attention Learning for Quality-Aware Visual Recognition [73.15276998621582]
We build upon the weakly-supervised generation mechanism of intermediate attention maps in any convolutional neural network.
We introduce a meta critic network to evaluate the quality of attention maps in the main network.
arXiv Detail & Related papers (2020-07-13T02:44:38Z)
- ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.