Deep Learning Approach for Singer Voice Classification of Vietnamese
Popular Music
- URL: http://arxiv.org/abs/2102.12111v1
- Date: Wed, 24 Feb 2021 08:03:07 GMT
- Title: Deep Learning Approach for Singer Voice Classification of Vietnamese
Popular Music
- Authors: Toan Pham Van, Ngoc N. Tran, and Ta Minh Thanh
- Abstract summary: We propose a new method to identify the singer's name based on analysis of Vietnamese popular music.
We use vocal segment detection and singing voice separation as pre-processing steps.
To verify the accuracy of our methods, we evaluate on a dataset of 300 Vietnamese songs from 18 famous singers.
- Score: 1.2043574473965315
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Singer voice classification is a meaningful task in the digital era. With a
huge number of songs today, identifying a singer is very helpful for music
information retrieval, music properties indexing, and so on. In this paper, we
propose a new method to identify the singer's name based on analysis of
Vietnamese popular music. We use vocal segment detection and
singing voice separation as pre-processing steps, whose purpose
is to extract the singer's voice from the mixed audio. To
build the singer classifier, we propose a neural network architecture that takes
Mel-frequency cepstral coefficients (MFCCs) extracted from the separated
vocals as input features. To verify the accuracy of our method, we evaluate on a dataset of 300
Vietnamese songs from 18 famous singers. We achieve an accuracy of 92.84% with
5-fold stratified cross-validation, the best result compared to other methods
on the same data set.
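The 5-fold stratified cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `singer_XX` labels, and the per-singer song counts are assumptions made for illustration only (the abstract gives just the totals, 300 songs and 18 singers). The sketch shows only how songs could be split so that every singer is represented in every fold; feature extraction and the classifier are out of scope.

```python
from collections import defaultdict


def stratified_k_fold(labels, k=5):
    """Return k lists of sample indices, spreading each class evenly over the folds."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to keep fold sizes balanced
    for label in sorted(by_class):
        for position, index in enumerate(by_class[label]):
            folds[(offset + position) % k].append(index)
        offset += len(by_class[label])
    return folds


def train_test_splits(labels, k=5):
    """Yield (train_indices, test_indices) pairs, holding out one fold at a time."""
    folds = stratified_k_fold(labels, k)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# Hypothetical label list: 300 songs spread over 18 placeholder singers.
# The real per-singer distribution is not given in the abstract.
labels = [f"singer_{i % 18:02d}" for i in range(300)]

for fold_number, (train, test) in enumerate(train_test_splits(labels), start=1):
    singers_in_test = {labels[i] for i in test}
    print(fold_number, len(train), len(test), len(singers_in_test))
```

In practice the reported accuracy would be the mean of the per-fold test accuracies; a library routine such as scikit-learn's `StratifiedKFold` serves the same purpose and additionally supports shuffling.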
Related papers
- GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks [52.30565320125514]
GTSinger is a large global, multi-technique, free-to-use, high-quality singing corpus with realistic music scores.
We collect 80.59 hours of high-quality singing voices, forming the largest recorded singing dataset.
We conduct four benchmark experiments: technique-controllable singing voice synthesis, technique recognition, style transfer, and speech-to-singing conversion.
arXiv Detail & Related papers (2024-09-20T18:18:14Z)
- From Real to Cloned Singer Identification [7.407642348217603]
We present three embedding models that are trained using a singer-level contrastive learning scheme.
We demonstrate that all three models are highly capable of identifying real singers.
However, their performance deteriorates when classifying cloned versions of singers in our evaluation set.
arXiv Detail & Related papers (2024-07-11T16:25:21Z)
- Singer Identity Representation Learning using Self-Supervised Techniques [0.0]
We propose a framework for training singer identity encoders to extract representations suitable for various singing-related tasks.
We explore different self-supervised learning techniques on a large collection of isolated vocal tracks.
We evaluate the quality of the resulting representations on singer similarity and identification tasks.
arXiv Detail & Related papers (2024-01-10T10:41:38Z)
- RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z)
- A Phoneme-Informed Neural Network Model for Note-Level Singing Transcription [11.951441023641975]
We propose a method of finding note onsets of singing voice more accurately by leveraging the linguistic characteristics of singing.
Our approach substantially improves the performance of singing transcription and emphasizes the importance of linguistic features in singing analysis.
arXiv Detail & Related papers (2023-04-12T15:36:01Z)
- Learning the Beauty in Songs: Neural Singing Voice Beautifier [69.21263011242907]
We are interested in a novel task, singing voice beautifying (SVB).
Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre.
We introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task.
arXiv Detail & Related papers (2022-02-27T03:10:12Z)
- VAW-GAN for Singing Voice Conversion with Non-parallel Training Data [81.79070894458322]
We propose a singing voice conversion framework based on VAW-GAN.
We train an encoder to disentangle singer identity and singing prosody (F0) from phonetic content.
By conditioning on singer identity and F0, the decoder generates output spectral features with unseen target singer identity.
arXiv Detail & Related papers (2020-08-10T09:44:10Z)
- Unsupervised Cross-Domain Singing Voice Conversion [105.1021715879586]
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Our method uses an acoustic model trained for automatic speech recognition, together with melody-extracted features, to drive a waveform-based generator.
arXiv Detail & Related papers (2020-08-06T18:29:11Z)
- DeepSinger: Singing Voice Synthesis with Data Mined From the Web [194.10598657846145]
DeepSinger is a multi-lingual singing voice synthesis system built from scratch using singing training data mined from music websites.
We evaluate DeepSinger on our mined singing dataset, which consists of about 92 hours of data from 89 singers in three languages.
arXiv Detail & Related papers (2020-07-09T07:00:48Z)
- Addressing the confounds of accompaniments in singer identification [29.949390919663596]
We employ open-unmix, an open source tool with state-of-the-art performance in source separation, to separate the vocal and instrumental tracks of music.
We then investigate two means to train a singer identification model: by learning from the separated vocal only, or from an augmented set of data.
arXiv Detail & Related papers (2020-02-17T07:49:21Z)