Spectrogram-Based Detection of Auto-Tuned Vocals in Music Recordings
- URL: http://arxiv.org/abs/2403.05380v1
- Date: Fri, 8 Mar 2024 15:19:26 GMT
- Title: Spectrogram-Based Detection of Auto-Tuned Vocals in Music Recordings
- Authors: Mahyar Gohari, Paolo Bestagini, Sergio Benini, Nicola Adami
- Abstract summary: This study introduces a data-driven approach leveraging triplet networks for the detection of Auto-Tuned songs.
The experimental results demonstrate the superiority of the proposed method in both accuracy and robustness compared to RawNet2, an end-to-end model proposed for anti-spoofing.
- Score: 9.646498710102174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the domain of music production and audio processing, the implementation of
automatic pitch correction of the singing voice, also known as Auto-Tune, has
significantly transformed the landscape of vocal performance. While auto-tuning
technology has offered musicians the ability to tune their vocal pitches and
achieve a desired level of precision, its use has also sparked debates
regarding its impact on authenticity and artistic integrity. As a result,
detecting and analyzing Auto-Tuned vocals in music recordings has become
essential for music scholars, producers, and listeners. However, to the best of
our knowledge, no prior effort has been made in this direction. This study
introduces a data-driven approach leveraging triplet networks for the detection
of Auto-Tuned songs, backed by the creation of a dataset composed of original
and Auto-Tuned audio clips. The experimental results demonstrate the
superiority of the proposed method in both accuracy and robustness compared to
RawNet2, an end-to-end model proposed for anti-spoofing and widely used for
other audio forensic tasks.
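As a rough illustration of the triplet-network idea named in the abstract, the sketch below computes the standard triplet margin loss on three embedding vectors. The abstract does not state the loss, the distance, the margin, or the embedding model, so the Euclidean distance, the `margin` value, and the function names here are all assumptions made for illustration, not the authors' implementation. In a setup like the one described, the anchor and positive would plausibly be embeddings of clips from the same class (for example, both original), and the negative an embedding of a clip from the other class (Auto-Tuned).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (an assumed choice, not from the paper).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, chosen only to make the arithmetic easy.
anchor = [0.0, 0.0]
same_class = [0.0, 1.0]    # distance 1 from the anchor
other_class = [3.0, 4.0]   # distance 5 from the anchor

well_separated = triplet_loss(anchor, same_class, other_class)
badly_separated = triplet_loss(anchor, other_class, same_class)
```

With these toy vectors the first call yields no penalty because the classes are already separated by more than the margin, while the second call is penalized because the roles of positive and negative are swapped.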
Related papers
- Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training [18.71152526968065]
Existing models in Music Information Retrieval (MIR) struggle with real-world noise such as environmental and speech sounds in multimedia content.
This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings.
arXiv Detail & Related papers (2024-01-27T06:56:51Z)
- Resource-constrained stereo singing voice cancellation [1.0962868591006976]
We study the problem of stereo singing voice cancellation.
Our approach is evaluated using objective offline metrics and a large-scale MUSHRA trial.
arXiv Detail & Related papers (2024-01-22T16:05:30Z)
- Singer Identity Representation Learning using Self-Supervised Techniques [0.0]
We propose a framework for training singer identity encoders to extract representations suitable for various singing-related tasks.
We explore different self-supervised learning techniques on a large collection of isolated vocal tracks.
We evaluate the quality of the resulting representations on singer similarity and identification tasks.
arXiv Detail & Related papers (2024-01-10T10:41:38Z)
- Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training [82.94349771571642]
This work proposes a melody-unsupervised multi-speaker pre-training method to enhance the vocal range of single-speaker singing voice synthesis.
It is the first to introduce a differentiable duration regulator to improve the rhythm naturalness of the synthesized voice.
Experimental results verify that the proposed SVS system outperforms the baseline on both sound quality and naturalness.
arXiv Detail & Related papers (2023-09-01T06:40:41Z)
- Human Voice Pitch Estimation: A Convolutional Network with Auto-Labeled and Synthetic Data [0.0]
We present a specialized convolutional neural network designed for pitch extraction.
Our approach combines synthetic data with auto-labeled a cappella sung audio, creating a robust training environment.
This work paves the way for enhanced pitch extraction in both music and voice settings.
arXiv Detail & Related papers (2023-08-14T14:26:52Z)
- RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z)
- Anomalous Sound Detection using Audio Representation with Machine ID based Contrastive Learning Pretraining [52.191658157204856]
This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample.
The proposed two-stage method uses contrastive learning to pretrain the audio representation model.
Experiments show that our method outperforms the state-of-the-art methods using contrastive learning or self-supervised classification.
arXiv Detail & Related papers (2023-04-07T11:08:31Z)
- Learning the Beauty in Songs: Neural Singing Voice Beautifier [69.21263011242907]
We are interested in a novel task, singing voice beautifying (SVB).
Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre.
We introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task.
arXiv Detail & Related papers (2022-02-27T03:10:12Z)
- Unsupervised Cross-Domain Singing Voice Conversion [105.1021715879586]
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Our method uses an acoustic model, trained for automatic speech recognition, together with melody-extracted features to drive a waveform-based generator.
arXiv Detail & Related papers (2020-08-06T18:29:11Z)
- Audio Impairment Recognition Using a Correlation-Based Feature Representation [85.08880949780894]
We propose a new representation of hand-crafted features that is based on the correlation of feature pairs.
We show superior performance in terms of compact feature dimensionality and improved computational speed in the test stage.
arXiv Detail & Related papers (2020-03-22T13:34:37Z)