Source Separation-based Data Augmentation for Improved Joint Beat and
Downbeat Tracking
- URL: http://arxiv.org/abs/2106.08703v1
- Date: Wed, 16 Jun 2021 11:09:05 GMT
- Title: Source Separation-based Data Augmentation for Improved Joint Beat and
Downbeat Tracking
- Authors: Ching-Yu Chiu, Joann Ching, Wen-Yi Hsiao, Yu-Hua Chen, Alvin Wen-Yu
Su, and Yi-Hsuan Yang
- Abstract summary: We propose to use a blind drum separation model to segregate the drum and non-drum sounds from each training audio signal.
We report experiments on four completely unseen test sets, validating the effectiveness of the proposed method.
- Score: 33.05612957858605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to advances in deep learning, the performance of automatic beat and
downbeat tracking in musical audio signals has improved greatly in recent
years. In training such deep learning based models, data augmentation has been
found to be an important technique. However, existing data augmentation methods
for this task mainly aim at balancing the distribution of the training data with
respect to tempo. In this paper, we investigate another approach to data
augmentation, one that accounts for the composition of the training data in
terms of the percussive and non-percussive sound sources. Specifically, we
propose to employ a blind drum separation model to segregate the drum and
non-drum sounds from each training audio signal, to filter out training signals
that are drumless, and then to use the obtained drum and non-drum stems to
augment the training data. We report experiments on four completely unseen test
sets, validating the effectiveness of the proposed method and, accordingly, the
importance of drum sound composition in the training data for beat and downbeat
tracking.
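The abstract describes the pipeline (separate, filter drumless items, remix stems) but not the exact remixing recipe, so the following is only a minimal sketch of one plausible instantiation. It assumes the drum and non-drum stems have already been produced by a blind drum separation model; the function name `augment_with_stems`, the drum-energy threshold, and the random-gain remixing are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def augment_with_stems(drum, non_drum, min_drum_ratio=0.05,
                       gain_range=(0.5, 1.5), n_mixes=4, rng=None):
    """Yield remixed training signals from separated stems.

    Hypothetical sketch: `min_drum_ratio`, `gain_range`, and `n_mixes`
    are assumed values, not taken from the paper.
    """
    rng = rng or np.random.default_rng()
    drum_energy = float(np.sum(drum ** 2))
    total_energy = drum_energy + float(np.sum(non_drum ** 2))
    # The paper filters out drumless training signals; here "drumless"
    # means the drum stem holds too small a share of the stem energy.
    if total_energy == 0 or drum_energy < min_drum_ratio * total_energy:
        return
    for _ in range(n_mixes):
        g_drum, g_rest = rng.uniform(*gain_range, size=2)
        mix = g_drum * drum + g_rest * non_drum
        yield mix / (np.max(np.abs(mix)) + 1e-8)  # peak-normalize the remix
```

Filtering on relative drum energy mirrors the removal of drumless items, while remixing the stems at varied gains rebalances the percussive/non-percussive composition of the training set, which is the stated goal of the augmentation.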
Related papers
- Deep learning-based shot-domain seismic deblending [1.6411821807321063]
We make use of unblended shot gathers acquired at the end of each sail line.
By manually blending these data we obtain training data with good control of the ground truth.
We train a deep neural network using multi-channel inputs that include adjacent blended shot gathers.
arXiv Detail & Related papers (2024-09-13T07:32:31Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- A Data-Driven Analysis of Robust Automatic Piano Transcription [16.686703489636734]
Recent developments have focused on adapting new neural network architectures to yield more accurate systems.
We show how these models can severely overfit to acoustic properties of the training data.
We achieve state-of-the-art note-onset accuracy of 88.4 F1-score on the MAPS dataset, without seeing any of its training data.
arXiv Detail & Related papers (2024-02-02T14:11:23Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- On-the-fly Denoising for Data Augmentation in Natural Language Understanding [101.46848743193358]
We propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data.
Our method can be applied to general augmentation techniques and consistently improve the performance on both text classification and question-answering tasks.
arXiv Detail & Related papers (2022-12-20T18:58:33Z)
- SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy [69.24683717901262]
Deep learning based singing voice synthesis (SVS) systems have been demonstrated to flexibly generate singing with better quality.
In this work, we explore different data augmentation methods to boost the training of SVS systems.
To further stabilize the training, we introduce the cycle-consistent training strategy.
arXiv Detail & Related papers (2022-03-31T12:50:10Z)
- Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures [23.568610919253352]
This paper proposes a semi-supervised method for generating pseudo-labels from unsupervised data using a student-teacher scheme that balances self-training and cross-training.
The results of these methods on both "validation" and "public evaluation" sets of the DESED database show significant improvement compared to state-of-the-art semi-supervised learning systems.
arXiv Detail & Related papers (2021-05-27T18:46:59Z)
- Fast accuracy estimation of deep learning based multi-class musical source separation [79.10962538141445]
We propose a method to evaluate the separability of instruments in any dataset without training and tuning a neural network.
Based on the oracle principle with an ideal ratio mask, our approach is an excellent proxy for estimating the separation performance of state-of-the-art deep learning approaches (see the sketch after this list).
arXiv Detail & Related papers (2020-10-19T13:05:08Z)
- Mixing-Specific Data Augmentation Techniques for Improved Blind Violin/Piano Source Separation [29.956390660450484]
Blind music source separation has been a popular subject of research in both the music information retrieval and signal processing communities.
To counter the lack of available multi-track data for supervised model training, a data augmentation method that creates artificial mixtures has been shown to be useful in recent works.
We consider the more sophisticated mixing settings employed in modern music production, the relationship between the tracks to be combined, and factors of silence.
arXiv Detail & Related papers (2020-08-06T07:02:24Z)
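Regarding the "Fast accuracy estimation" entry above: the ideal-ratio-mask oracle can be computed directly from clean stems, with no network training. The sketch below is a minimal illustration under assumed STFT parameters and a plain SDR score; the paper's exact configuration may differ. It masks the mixture with the target's magnitude ratio and scores the reconstruction.

```python
import numpy as np
from scipy.signal import stft, istft

def irm_oracle_sdr(target, residual, fs=44100, nperseg=2048):
    """SDR of the mixture masked by the target's ideal ratio mask.

    Hypothetical sketch: `fs`, `nperseg`, and the SDR definition are
    illustrative assumptions, not the paper's configuration.
    """
    mixture = target + residual
    _, _, T = stft(target, fs=fs, nperseg=nperseg)
    _, _, R = stft(residual, fs=fs, nperseg=nperseg)
    _, _, X = stft(mixture, fs=fs, nperseg=nperseg)
    # Ideal ratio mask: share of the magnitude owned by the target source.
    mask = np.abs(T) / (np.abs(T) + np.abs(R) + 1e-8)
    _, estimate = istft(mask * X, fs=fs, nperseg=nperseg)
    estimate = estimate[: len(target)]
    error = target - estimate
    # Higher SDR means the instrument separates more easily in this dataset.
    return 10 * np.log10(np.sum(target ** 2) / (np.sum(error ** 2) + 1e-8))
```

Because the mask is built from the ground-truth stems, this score upper-bounds mask-based separators and so serves as a fast proxy for how well a trained model could do on the same material.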