Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice
Separation
- URL: http://arxiv.org/abs/2008.00816v1
- Date: Mon, 3 Aug 2020 12:09:42 GMT
- Title: Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice
Separation
- Authors: Weitao Yuan, Bofei Dong, Shengbei Wang, Masashi Unoki, and Wenwu Wang
- Abstract summary: Monaural Singing Voice Separation (MSVS) is a challenging task and has been studied for decades.
Deep neural networks (DNNs) are the current state-of-the-art methods for MSVS.
We introduce a Neural Architecture Search (NAS) method to the structure design of DNNs for MSVS.
- Score: 40.170868770930774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monaural Singing Voice Separation (MSVS) is a challenging task and has been
studied for decades. Deep neural networks (DNNs) are the current
state-of-the-art methods for MSVS. However, the existing DNNs are often
designed manually, which is time-consuming and error-prone. In addition, the
network architectures are usually pre-defined, and not adapted to the training
data. To address these issues, we introduce a Neural Architecture Search (NAS)
method to the structure design of DNNs for MSVS. Specifically, we propose a new
multi-resolution Convolutional Neural Network (CNN) framework for MSVS namely
Multi-Resolution Pooling CNN (MRP-CNN), which uses various-size pooling
operators to extract multi-resolution features. Based on the NAS, we then
develop an evolving framework namely Evolving MRP-CNN (E-MRP-CNN), by
automatically searching the effective MRP-CNN structures using genetic
algorithms, optimized in terms of a single-objective considering only
separation performance, or multi-objective considering both the separation
performance and the model complexity. The multi-objective E-MRP-CNN gives a set
of Pareto-optimal solutions, each providing a trade-off between separation
performance and model complexity. Quantitative and qualitative evaluations on
the MIR-1K and DSD100 datasets are used to demonstrate the advantages of the
proposed framework over several recent baselines.
Related papers
- Multiway Multislice PHATE: Visualizing Hidden Dynamics of RNNs through Training [6.326396282553267]
Recurrent neural networks (RNNs) are a widely used tool for sequential data analysis, however, they are still often seen as black boxes of computation.
Here, we present Multiway Multislice PHATE (MM-PHATE), a novel method for visualizing the evolution of RNNs' hidden states.
arXiv Detail & Related papers (2024-06-04T05:05:27Z) - Multi-Objective Evolutionary Neural Architecture Search for Recurrent Neural Networks [0.0]
This paper proposes a multi-objective evolutionary algorithm-based RNN architecture search method.
The proposed method relies on approximate network morphisms for RNN architecture complexity optimisation during evolution.
arXiv Detail & Related papers (2024-03-17T11:19:45Z) - Applications of Spiking Neural Networks in Visual Place Recognition [19.577433371468533]
Spiking Neural Networks (SNNs) are increasingly recognized for their potential energy efficiency and low latency.
This paper highlights three advancements for SNNs in Visual Place Recognition (VPR)
Firstly, we propose Modular SNNs, where each SNN represents a set of non-overlapping geographically distinct places.
Secondly, we present Ensembles of Modular SNNs, where multiple networks represent the same place.
Lastly, we investigate the role of sequence matching in SNN-based VPR, a technique where consecutive images are used to refine place recognition.
arXiv Detail & Related papers (2023-11-22T06:26:24Z) - Disentangling Structured Components: Towards Adaptive, Interpretable and
Scalable Time Series Forecasting [52.47493322446537]
We develop a adaptive, interpretable and scalable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of MTS, which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z) - Multi-scale Evolutionary Neural Architecture Search for Deep Spiking
Neural Networks [7.271032282434803]
We propose a Multi-Scale Evolutionary Neural Architecture Search (MSE-NAS) for Spiking Neural Networks (SNNs)
MSE-NAS evolves individual neuron operation, self-organized integration of multiple circuit motifs, and global connectivity across motifs through a brain-inspired indirect evaluation function, Representational Dissimilarity Matrices (RDMs)
The proposed algorithm achieves state-of-the-art (SOTA) performance with shorter simulation steps on static datasets and neuromorphic datasets.
arXiv Detail & Related papers (2023-04-21T05:36:37Z) - Split-Et-Impera: A Framework for the Design of Distributed Deep Learning
Applications [8.434224141580758]
Split-Et-Impera determines the set of the best-split points of a neural network based on deep network interpretability principles.
It performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements.
It suggests the best match between the quality of service requirements of the application and the performance in terms of accuracy and latency time.
arXiv Detail & Related papers (2023-03-22T13:00:00Z) - Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution
Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z) - Towards a General Purpose CNN for Long Range Dependencies in
$\mathrm{N}$D [49.57261544331683]
We propose a single CNN architecture equipped with continuous convolutional kernels for tasks on arbitrary resolution, dimensionality and length without structural changes.
We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$mathrmD$) and visual data (2$mathrmD$)
Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
arXiv Detail & Related papers (2022-06-07T15:48:02Z) - Deep Multi-Task Learning for Cooperative NOMA: System Design and
Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL)
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z) - Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by
Spiking Neural Network [68.43026108936029]
We propose a pure spiking neural network (SNN) based computational model for precise sound localization in the noisy real-world environment.
We implement this algorithm in a real-time robotic system with a microphone array.
The experiment results show a mean error azimuth of 13 degrees, which surpasses the accuracy of the other biologically plausible neuromorphic approach for sound source localization.
arXiv Detail & Related papers (2020-07-07T08:22:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.