Unsupervised low-rank representations for speech emotion recognition
- URL: http://arxiv.org/abs/2104.07072v1
- Date: Wed, 14 Apr 2021 18:30:58 GMT
- Title: Unsupervised low-rank representations for speech emotion recognition
- Authors: Georgios Paraskevopoulos, Efthymios Tzinis, Nikolaos Ellinas,
Theodoros Giannakopoulos and Alexandros Potamianos
- Abstract summary: We examine the use of linear and non-linear dimensionality reduction algorithms for extracting low-rank feature representations for speech emotion recognition.
We report speech emotion recognition (SER) results for learned representations on two databases using different classification methods.
- Score: 78.38221758430244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We examine the use of linear and non-linear dimensionality reduction
algorithms for extracting low-rank feature representations for speech emotion
recognition. Two feature sets are used, one based on low-level descriptors and
their aggregations (IS10) and one modeling recurrence dynamics of speech (RQA),
as well as their fusion. We report speech emotion recognition (SER) results for
learned representations on two databases using different classification
methods. Classification with low-dimensional representations yields performance
improvement in a variety of settings. This indicates that dimensionality
reduction is an effective way to combat the curse of dimensionality for SER.
Visualization of features in two dimensions provides insight into
discriminatory abilities of reduced feature sets.
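As an illustration of the kind of linear dimensionality reduction the abstract describes, the sketch below projects high-dimensional utterance features onto their top principal components via SVD. This is a minimal NumPy-only example on random data standing in for a feature set like IS10; it is not the authors' pipeline, and the feature dimensions and component count are arbitrary choices for illustration.

```python
import numpy as np

# Toy stand-in for a high-dimensional SER feature matrix
# (rows = utterances, columns = features such as IS10 descriptors).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 50))

def pca_reduce(X, k):
    """Project feature vectors onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # k-dimensional scores

Z = pca_reduce(X, 10)                            # 50-dim -> 10-dim
```

The reduced matrix `Z` can then be fed to any downstream classifier; with `k=2` the same projection yields the kind of 2-D visualization the abstract mentions.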
Related papers
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition [78.97487780589574]
Multimodal Large Language Models (MLLMs) excel at classifying fine-grained categories.
This paper introduces a Retrieving And Ranking augmented method for MLLMs.
Our proposed approach not only addresses the inherent limitations in fine-grained recognition but also preserves the model's comprehensive knowledge base.
arXiv Detail & Related papers (2024-03-20T17:59:55Z)
- Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking [0.5242869847419834]
We propose a Dynamic Visual Semantic Sub-Embeddings framework (DVSE) to reduce the information entropy.
To encourage the generated candidate embeddings to capture various semantic variations, we construct a mixed distribution.
We compare the performance with existing set-based methods using four image feature encoders and two text feature encoders on three benchmark datasets.
arXiv Detail & Related papers (2023-09-15T04:39:11Z)
- Semantic Prompt for Few-Shot Image Recognition [76.68959583129335]
We propose a novel Semantic Prompt (SP) approach for few-shot learning.
The proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- Training speech emotion classifier without categorical annotations [1.5609988622100528]
The main aim of this study is to investigate the relation between these two representations.
The proposed approach contains a regressor model which is trained to predict a vector of continuous values in dimensional representation for given speech audio.
The output of this model can be interpreted as an emotional category using a mapping algorithm.
arXiv Detail & Related papers (2022-10-14T08:47:41Z)
- Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem [60.0878532426877]
We propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration.
Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents.
The experimental results on two diagnostic VQA-CP benchmark datasets evidently demonstrate its effectiveness.
arXiv Detail & Related papers (2022-07-24T23:50:52Z)
- Speech Emotion Recognition Using Deep Sparse Auto-Encoder Extreme Learning Machine with a New Weighting Scheme and Spectro-Temporal Features Along with Classical Feature Selection and A New Quantum-Inspired Dimension Reduction Method [3.8073142980733]
A system for speech emotion recognition (SER) based on speech signal is proposed.
The system consists of three stages: feature extraction, feature selection, and finally feature classification.
A new weighting method has also been proposed to deal with class imbalance, which is more efficient than existing weighting methods.
arXiv Detail & Related papers (2021-11-13T11:09:38Z)
- Dimensionality Reduction for Sentiment Classification: Evolving for the Most Prominent and Separable Features [4.156782836736784]
In sentiment classification, the enormous amount of textual data, its immense dimensionality, and inherent noise make it extremely difficult for machine learning classifiers to extract high-level and complex abstractions.
In the existing dimensionality reduction techniques, the number of components needs to be set manually which results in loss of the most prominent features.
We have proposed a new framework that consists of two dimensionality reduction techniques, i.e., Sentiment Term Presence Count (SentiTPC) and Sentiment Term Presence Ratio (SentiTPR).
arXiv Detail & Related papers (2020-06-01T09:46:52Z)
- Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning [150.42959029611657]
Domain-aware Visual Bias Eliminating (DVBE) network constructs two complementary visual representations.
For unseen images, we automatically search an optimal semantic-visual alignment architecture.
arXiv Detail & Related papers (2020-03-30T08:17:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.