Related papers: Extract fundamental frequency based on CNN combined with PYIN

URL: http://arxiv.org/abs/2208.08354v1
Date: Wed, 17 Aug 2022 15:34:54 GMT
Title: Extract fundamental frequency based on CNN combined with PYIN
Authors: Ruowei Xing, Shengchen Li
Abstract summary: PYIN is applied to supplement the F0 extracted from the trained CNN model to combine the advantages of these two algorithms. Four pieces played by two violins are used, and the performance of the models is evaluated according to the flatness of the extracted F0 curve.
Score: 5.837881923712393
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper addresses the extraction of multiple fundamental frequencies (multiple F0) based on PYIN, an algorithm for extracting the fundamental frequency (F0) of monophonic music, and a trained convolutional neural network (CNN) model, where a pitch salience function of the input signal is produced to estimate the multiple F0. The implementation of these two algorithms and their corresponding advantages and disadvantages are discussed in this article. After analysing the different performance of these two methods, PYIN is applied to supplement the F0 extracted from the trained CNN model to combine the advantages of these two algorithms. For evaluation, four pieces played by two violins are used, and the performance of the models is evaluated according to the flatness of the extracted F0 curve. The results show that the combined model outperforms the original algorithms when extracting F0 from both monophonic and polyphonic music.
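
Illustrative sketch (not from the paper): the abstract does not say how PYIN "supplements" the CNN output or how flatness is measured. The Python below shows one possible reading under stated assumptions: both trackers yield one F0 value in Hz per frame, 0.0 marks an unvoiced frame, the CNN track is primary with PYIN filling its gaps, and flatness is taken as the mean absolute frame-to-frame pitch change in cents. The function names fill_gaps and flatness_cents and the sample values are hypothetical.

```python
import math

def fill_gaps(cnn_f0, pyin_f0):
    """Keep the CNN estimate where it is voiced; otherwise use the PYIN value.

    Assumption of this sketch: 0.0 marks an unvoiced frame in both tracks.
    """
    return [c if c > 0 else p for c, p in zip(cnn_f0, pyin_f0)]

def flatness_cents(f0):
    """Mean absolute pitch change, in cents, between consecutive voiced frames.

    Lower values mean a flatter curve. Returns 0.0 if no voiced pair exists.
    """
    steps = [
        abs(1200.0 * math.log2(b / a))
        for a, b in zip(f0, f0[1:])
        if a > 0 and b > 0
    ]
    return sum(steps) / len(steps) if steps else 0.0

# Made-up frame values in Hz, for illustration only.
cnn_f0 = [440.0, 0.0, 440.0, 0.0]
pyin_f0 = [440.0, 440.0, 441.0, 440.0]
combined = fill_gaps(cnn_f0, pyin_f0)
print(combined, flatness_cents(combined))
```

In this toy input the CNN track drops every second frame, and the combined track takes those frames from PYIN, which gives a continuous curve whose flatness can then be scored.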

Related papers

GP-FL: Model-Based Hessian Estimation for Second-Order Over-the-Air Federated Learning [52.295563400314094]
Second-order methods are widely adopted to improve the convergence rate of learning algorithms. This paper introduces a novel second-order FL framework tailored for wireless channels.
arXiv Detail & Related papers (2024-12-05T04:27:41Z)
Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models [42.39774323584976]
We propose a deep learning based system for the task of deepfake audio detection. In particular, the raw input audio is first transformed into various spectrograms. We leverage the state-of-the-art audio pre-trained models of Whisper, Seamless, Speechbrain, and Pyannote to extract audio embeddings.
arXiv Detail & Related papers (2024-07-01T20:10:43Z)
Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in the frequency domain for efficient learning of long-range correlations in space or time. This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
Semantic Similarity Computing Model Based on Multi Model Fine-Grained Nonlinear Fusion [30.71123144365683]
This paper proposes a novel model based on multi model nonlinear fusion to grasp the meaning of a text from a global perspective. The model uses the Jaccard coefficient based on part of speech, Term Frequency-Inverse Document Frequency (TF-IDF) and word2vec-CNN algorithm to measure the similarity of sentences. Experimental results show that the matching of sentence similarity calculation method based on multi model nonlinear fusion is 84%, and the F1 value of the model is 75%.
arXiv Detail & Related papers (2022-02-05T03:12:37Z)
Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning [50.232218214751455]
Optimal network pruning is a non-trivial task which mathematically is an NP-hard problem. In this paper, we investigate the Magnitude-Based Pruning (MBP) scheme and analyze it from a novel perspective. We also propose a novel two-stage pruning approach, where one stage is to obtain the topological structure of the pruned network and the other stage is to retrain the pruned network to recover the capacity.
arXiv Detail & Related papers (2022-01-30T03:42:36Z)
Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of sign function in the Fourier frequency domain using the combination of sine functions for training BNNs. The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves the state-of-the-art accuracy.
arXiv Detail & Related papers (2021-03-01T08:25:26Z)
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks. We introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
Fast accuracy estimation of deep learning based multi-class musical source separation [79.10962538141445]
We propose a method to evaluate the separability of instruments in any dataset without training and tuning a neural network. Based on the oracle principle with an ideal ratio mask, our approach is an excellent proxy to estimate the separation performances of state-of-the-art deep learning approaches.
arXiv Detail & Related papers (2020-10-19T13:05:08Z)
Multiple F0 Estimation in Vocal Ensembles using Convolutional Neural Networks [7.088324036549911]
This paper addresses the extraction of multiple F0 values from polyphonic and a cappella vocal performances using convolutional neural networks (CNNs). We build upon an existing architecture to produce a pitch salience function of the input signal. For training, we build a dataset that comprises several multi-track datasets of vocal quartets with F0 annotations.
arXiv Detail & Related papers (2020-09-09T09:11:49Z)
Score-informed Networks for Music Performance Assessment [64.12728872707446]
Deep neural network-based methods incorporating score information into MPA models have not yet been investigated. We introduce three different models capable of score-informed performance assessment.
arXiv Detail & Related papers (2020-08-01T07:46:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.