Related papers: SER_AMPEL: a multi-source dataset for speech emotion recognition of Italian older adults

EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian [60.61343989805093]
EmoBench-UA is the first annotated dataset for emotion detection in Ukrainian texts. Our findings highlight the challenges of emotion classification in non-mainstream languages like Ukrainian.
arXiv Detail & Related papers (2025-05-29T09:49:57Z)
CAMEO: Collection of Multilingual Emotional Speech Corpora [0.0]
This paper presents a collection of multilingual emotional speech datasets designed to facilitate research in emotion recognition and other speech-related tasks. The main objectives were to ensure easy access to the data, to allow normalization of the results, and to provide a standardized benchmark for evaluating speech emotion recognition systems. The collection, along with metadata and a leaderboard, is publicly available via the Hugging Face platform.
arXiv Detail & Related papers (2025-05-16T09:52:00Z)
Summarizing Speech: A Comprehensive Survey [76.13011304983458]
Speech summarization has become an essential tool for efficiently managing and accessing the growing volume of spoken and audiovisual content. This survey examines existing datasets and evaluation protocols, which are crucial for assessing the quality of summarization approaches.
arXiv Detail & Related papers (2025-04-10T17:50:53Z)
PSCon: Product Search Through Conversations [55.94925947614474]
Conversational Product Search (CPS) systems interact with users via natural language to offer personalized and context-aware product lists. Most existing research on CPS is limited to simulated conversations, due to the lack of a real CPS dataset driven by human-like language. In this paper, we propose a CPS data collection protocol and create a new CPS dataset, called PSCon, which assists product search through conversations with human-like language.
arXiv Detail & Related papers (2025-02-19T17:05:42Z)
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis [23.523947343171926]
We present M-ABSA, a comprehensive dataset spanning 7 domains and 21 languages. Our primary focus is on triplet extraction, which involves identifying aspect terms, aspect categories, and sentiment polarities. Our empirical findings highlight that the dataset enables diverse evaluation tasks, such as multilingual and multi-domain transfer learning.
arXiv Detail & Related papers (2025-02-17T14:16:01Z)
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization [31.01716151301142]
We present a large-scale far-field overlapping speech dataset to advance research in speech separation, recognition, and speaker diarization. This dataset is a critical resource for decoding "Who said What and When" in multi-talker, reverberant environments.
arXiv Detail & Related papers (2024-09-01T19:23:08Z)
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition [48.527630771422935]
We propose a synthetic data generation pipeline for multi-speaker conversational ASR. We conduct evaluation by fine-tuning the Whisper ASR model for telephone and distant conversational speech settings.
arXiv Detail & Related papers (2024-08-17T14:47:05Z)
What Does it Take to Generalize SER Model Across Datasets? A Comprehensive Benchmark [13.820963986497128]
Speech emotion recognition (SER) is essential for enhancing human-computer interaction in speech-based applications. Despite improvements in specific emotional datasets, there is still a research gap in SER's capability to generalize across real-world situations. In this paper, we investigate approaches to generalize the SER system across different emotion datasets.
arXiv Detail & Related papers (2024-06-14T11:27:19Z)
When a Language Question Is at Stake. A Revisited Approach to Label Sensitive Content [0.0]
This article revisits an approach to pseudo-labeling sensitive data, using the example of Ukrainian tweets covering the Russian-Ukrainian war. We provide a fundamental statistical analysis of the obtained data, evaluate the models used for pseudo-labeling, and set further guidelines on how scientists can leverage the corpus.
arXiv Detail & Related papers (2023-11-17T13:35:10Z)
Sentiment recognition of Italian elderly through domain adaptation on cross-corpus speech dataset [77.99182201815763]
The aim of this work is to define a speech emotion recognition (SER) model able to recognize positive, neutral and negative emotions in natural conversations of Italian elderly people.
arXiv Detail & Related papers (2022-11-14T12:39:41Z)
Towards Relation Extraction From Speech [56.36416922396724]
We propose a new listening information extraction task, i.e., speech relation extraction. We construct the training dataset for speech relation extraction via text-to-speech systems, and we construct the testing dataset via crowd-sourcing with native English speakers. We conduct comprehensive experiments to distinguish the challenges in speech relation extraction, which may shed light on future explorations.
arXiv Detail & Related papers (2022-10-17T05:53:49Z)
Dialogue Term Extraction using Transfer Learning and Topological Data Analysis [0.8185867455104834]
We explore different features that can enable systems to discover realizations of domains, slots, and values in dialogues in a purely data-driven fashion. To examine the utility of each feature set, we train a seed model based on the widely used MultiWOZ data-set. Our method outperforms the previously proposed approach that relies solely on word embeddings.
arXiv Detail & Related papers (2022-08-22T17:04:04Z)
Automatic Dialect Density Estimation for African American English [74.44807604000967]
We explore automatic prediction of dialect density of the African American English (AAE) dialect. Dialect density is defined as the percentage of words in an utterance that contain characteristics of the non-standard dialect. We show a significant correlation between our predicted and ground truth dialect density measures for AAE speech in this database.
arXiv Detail & Related papers (2022-04-03T01:34:48Z)
Dialog speech sentiment classification for imbalanced datasets [7.84604505907019]
In this paper, we use single and bi-modal analysis of short dialog utterances and gain insights on the main factors that aid in sentiment detection. We propose an architecture which uses a learning rate scheduler and different monitoring criteria and provides state-of-the-art results for the SWITCHBOARD imbalanced sentiment dataset.
arXiv Detail & Related papers (2021-09-15T11:43:04Z)
Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv. It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation. The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.