Multitask learning for instrument activation aware music source
separation
- URL: http://arxiv.org/abs/2008.00616v1
- Date: Mon, 3 Aug 2020 02:35:00 GMT
- Title: Multitask learning for instrument activation aware music source
separation
- Authors: Yun-Ning Hung and Alexander Lerch
- Abstract summary: We propose a novel multitask structure to investigate using instrument activation information to improve source separation performance.
We investigate our system on six independent instruments, a more realistic scenario than the three instruments included in the widely-used MUSDB dataset.
The results show that our proposed multitask model outperforms the baseline Open-Unmix model on the mixture of Mixing Secrets and MedleyDB dataset.
- Score: 83.30944624666839
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music source separation is a core task in music information retrieval which
has seen a dramatic improvement in the past years. Nevertheless, most of the
existing systems focus exclusively on the problem of source separation itself
and ignore the utilization of other, possibly related, MIR tasks which
could lead to additional quality gains. In this work, we propose a novel
multitask structure to investigate using instrument activation information to
improve source separation performance. Furthermore, we investigate our system
on six independent instruments, a more realistic scenario than the three
instruments included in the widely-used MUSDB dataset, by leveraging a
combination of the MedleyDB and Mixing Secrets datasets. The results show that
our proposed multitask model outperforms the baseline Open-Unmix model on the
mixture of Mixing Secrets and MedleyDB dataset while maintaining comparable
performance on the MUSDB dataset.
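To make the multitask idea above concrete, the sketch below pairs a shared spectrogram encoder with two heads: one estimating a separation mask for the target instrument and one estimating frame-level instrument activation, trained with a weighted joint loss. This is a minimal illustration under assumed layer sizes and an assumed loss weight (lambda_act); it is not the paper's exact Open-Unmix-based architecture.

```python
# Minimal multitask sketch: shared encoder, separation head + activation head.
# Layer sizes, gating, and the loss weighting are illustrative assumptions.
import torch
import torch.nn as nn

class MultitaskSeparator(nn.Module):
    def __init__(self, n_bins: int = 2049, hidden: int = 512):
        super().__init__()
        # Shared encoder: frame-wise projection followed by a BLSTM over time.
        self.encoder = nn.Sequential(nn.Linear(n_bins, hidden), nn.Tanh())
        self.blstm = nn.LSTM(hidden, hidden // 2, num_layers=2,
                             batch_first=True, bidirectional=True)
        # Separation head: per-frame mask over frequency bins.
        self.mask_head = nn.Sequential(nn.Linear(hidden, n_bins), nn.Sigmoid())
        # Activation head: per-frame probability that the instrument is active.
        self.activation_head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, mix_mag):  # mix_mag: (batch, frames, n_bins)
        h = self.encoder(mix_mag)
        h, _ = self.blstm(h)
        mask = self.mask_head(h)                           # (batch, frames, n_bins)
        activation = self.activation_head(h).squeeze(-1)   # (batch, frames)
        # Estimated target magnitude; the activation additionally gates frames
        # where the instrument is predicted to be silent.
        target_est = mask * mix_mag * activation.unsqueeze(-1)
        return target_est, activation

# Joint objective: separation loss (MSE on magnitudes) plus activation loss
# (binary cross-entropy), combined with an assumed weight lambda_act.
def multitask_loss(target_est, target_ref, act_pred, act_ref, lambda_act=0.1):
    return (nn.functional.mse_loss(target_est, target_ref)
            + lambda_act * nn.functional.binary_cross_entropy(act_pred, act_ref))
```

In this setup the activation head acts both as an auxiliary training signal for the shared encoder and as a gate that can suppress output in frames where the instrument is inactive.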
Related papers
- An Ensemble Approach to Music Source Separation: A Comparative Analysis of Conventional and Hierarchical Stem Separation [0.4893345190925179]
Music source separation (MSS) is a task that involves isolating individual sound sources, or stems, from mixed audio signals.
This paper presents an ensemble approach to MSS, combining several state-of-the-art architectures to achieve superior separation performance.
arXiv Detail & Related papers (2024-10-28T06:18:12Z) - LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data [3.66486428341988]
Multimodal Deep Learning enhances decision-making by integrating diverse information sources, such as texts, images, audio, and videos.
To develop trustworthy multimodal approaches, it is essential to understand how uncertainty impacts these models.
We propose LUMA, a unique benchmark dataset, featuring audio, image, and textual data from 50 classes, for learning from uncertain and multimodal data.
arXiv Detail & Related papers (2024-06-14T09:22:07Z) - Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which not only captures the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z) - Source-Free Collaborative Domain Adaptation via Multi-Perspective
Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z) - Learning with MISELBO: The Mixture Cookbook [62.75516608080322]
We present the first ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network.
We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling.
We obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2022-09-30T15:01:35Z) - Deep Transfer Learning for Multi-source Entity Linkage via Domain
Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
Our framework achieves state-of-the-art results with 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z) - Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal
Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z) - Mixing-Specific Data Augmentation Techniques for Improved Blind
Violin/Piano Source Separation [29.956390660450484]
Blind music source separation has been a popular subject of research in both the music information retrieval and signal processing communities.
To counter the lack of available multi-track data for supervised model training, a data augmentation method that creates artificial mixtures has been shown useful in recent works.
We consider more sophisticated mixing settings employed in the modern music production routine, the relationship between the tracks to be combined, and factors of silence.
arXiv Detail & Related papers (2020-08-06T07:02:24Z) - MusPy: A Toolkit for Symbolic Music Generation [32.01713268702699]
MusPy is an open source Python library for symbolic music generation.
In this paper, we present statistical analysis of the eleven datasets currently supported by MusPy.
arXiv Detail & Related papers (2020-08-05T06:16:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.