MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement
- URL: http://arxiv.org/abs/2104.03538v1
- Date: Thu, 8 Apr 2021 06:46:35 GMT
- Title: MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement
- Authors: Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli,
Xugang Lu, Yu Tsao
- Abstract summary: We propose MetricGAN+, which incorporates three training techniques based on domain knowledge of speech processing.
With these techniques, experimental results on the VoiceBank-DEMAND dataset show that MetricGAN+ increases the PESQ score by 0.3 compared to the previous MetricGAN.
- Score: 37.3251779254894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The discrepancy between the cost function used for training a speech
enhancement model and human auditory perception usually makes the quality of
enhanced speech unsatisfactory. Objective evaluation metrics which consider
human perception can hence serve as a bridge to reduce the gap. Our previously
proposed MetricGAN was designed to optimize objective metrics by connecting the
metric with a discriminator. Because only the scores of the target evaluation
functions are needed during training, the metrics can even be
non-differentiable. In this study, we propose MetricGAN+, which incorporates
three training techniques based on domain knowledge of speech processing.
With these techniques, experimental results on the VoiceBank-DEMAND dataset
show that MetricGAN+ increases the PESQ score by 0.3 over the original
MetricGAN and achieves state-of-the-art results (PESQ score = 3.15).
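The core mechanism behind MetricGAN and MetricGAN+ is to train the discriminator as a learned, differentiable surrogate of the target metric (e.g., PESQ) and then train the generator through that surrogate, which is why the metric itself never needs a gradient. The following is a minimal, hypothetical PyTorch sketch of that idea; the layer sizes, optimizer settings, and the `normalized_pesq` placeholder are illustrative assumptions rather than the paper's configuration, and the three domain-knowledge techniques that distinguish MetricGAN+ are not reproduced here.

```python
# Minimal MetricGAN-style training sketch. Inputs are magnitude spectrograms
# of shape (batch, frames, freq_bins); all sizes are illustrative.
import torch
import torch.nn as nn

FREQ_BINS = 257

class Generator(nn.Module):
    """Predicts a bounded mask applied to the noisy magnitude spectrogram."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(FREQ_BINS, 200, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Sequential(nn.Linear(400, 300), nn.LeakyReLU(),
                                nn.Linear(300, FREQ_BINS), nn.Sigmoid())

    def forward(self, noisy_mag):
        h, _ = self.lstm(noisy_mag)
        return self.fc(h) * noisy_mag

class Discriminator(nn.Module):
    """Learned metric surrogate: maps (enhanced, clean) pairs to a score."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 5), nn.LeakyReLU(),
            nn.Conv2d(16, 32, 5), nn.LeakyReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 50), nn.LeakyReLU(), nn.Linear(50, 1))

    def forward(self, est_mag, clean_mag):
        x = torch.stack([est_mag, clean_mag], dim=1)  # (B, 2, T, F)
        return self.net(x)

def normalized_pesq(est_mag, clean_mag):
    # Placeholder: in practice, resynthesize waveforms, call a real PESQ
    # implementation, and rescale the score to [0, 1].
    return torch.rand(est_mag.size(0), 1)

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

noisy = torch.rand(4, 100, FREQ_BINS)
clean = torch.rand(4, 100, FREQ_BINS)

# Discriminator step: regress the true metric score of enhanced and clean speech.
with torch.no_grad():
    enhanced = G(noisy)
d_loss = ((D(enhanced, clean) - normalized_pesq(enhanced, clean)) ** 2).mean() \
       + ((D(clean, clean) - 1.0) ** 2).mean()
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: push the surrogate's score for enhanced speech toward the maximum (1.0).
g_loss = ((D(G(noisy), clean) - 1.0) ** 2).mean()
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Because the generator's loss is defined on the discriminator's output rather than on PESQ itself, gradients flow to the generator even though PESQ is non-differentiable.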
Related papers
- Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In! [80.3129093617928]
Annually, at the Conference on Machine Translation (WMT), the Metrics Shared Task organizers conduct the meta-evaluation of Machine Translation (MT) metrics.
This work highlights two issues with the meta-evaluation framework currently employed in WMT, and assesses their impact on the metrics rankings.
We introduce the concept of sentinel metrics, which are designed explicitly to scrutinize the meta-evaluation process's accuracy, robustness, and fairness.
arXiv Detail & Related papers (2024-08-25T13:29:34Z)
- The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement [17.516851319183555]
We introduce enhancement models that exploit the widely used PESQ measure.
While the obtained PESQ value of 3.82 would imply "state-of-the-art" PESQ performance on the VB-DMD benchmark, our examples show that when optimizing w.r.t. a metric, an isolated evaluation on the same metric may be misleading.
arXiv Detail & Related papers (2024-06-05T17:07:39Z)
- A Study of Unsupervised Evaluation Metrics for Practical and Automatic Domain Adaptation [15.728090002818963]
Unsupervised domain adaptation (UDA) methods facilitate the transfer of models to target domains without labels.
In this paper, we aim to find an evaluation metric capable of assessing the quality of a transferred model without access to target validation labels.
arXiv Detail & Related papers (2023-08-01T05:01:05Z)
- What You Hear Is What You See: Audio Quality Metrics From Image Quality Metrics [44.659718609385315]
We investigate the feasibility of utilizing state-of-the-art image perceptual metrics for evaluating audio signals by representing them as spectrograms.
We customise one of the metrics which has a psychoacoustically plausible architecture to account for the peculiarities of sound signals.
We evaluate the effectiveness of our proposed metric and several baseline metrics using a music dataset.
arXiv Detail & Related papers (2023-05-19T10:43:57Z)
- Metric-oriented Speech Enhancement using Diffusion Probabilistic Model [23.84172431047342]
Deep neural network based speech enhancement techniques focus on learning a noisy-to-clean transformation supervised by paired training data.
The task-specific evaluation metric (e.g., PESQ) is usually non-differentiable and cannot be used directly as a training criterion (a minimal example of offline PESQ scoring appears at the end of this page).
We propose a metric-oriented speech enhancement method (MOSE) which integrates a metric-oriented training strategy into its reverse process.
arXiv Detail & Related papers (2023-02-23T13:12:35Z)
- Ontology-aware Learning and Evaluation for Audio Tagging [56.59107110017436]
The mean average precision (mAP) metric treats different kinds of sound as independent classes without considering their relations.
Ontology-aware mean average precision (OmAP) addresses the weaknesses of mAP by utilizing the AudioSet ontology information during the evaluation.
We conduct human evaluations and demonstrate that OmAP is more consistent with human perception than mAP.
arXiv Detail & Related papers (2022-11-22T11:35:14Z)
- MetricGAN+/-: Increasing Robustness of Noise Reduction on Unseen Data [26.94528951545861]
We propose a "de-generator" which attempts to improve the robustness of the prediction network.
Experimental results on the VoiceBank-DEMAND dataset show a relative improvement in PESQ score of 3.8%.
arXiv Detail & Related papers (2022-03-23T12:42:28Z)
- LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech [67.88748572167309]
We present LDNet, a unified framework for mean opinion score (MOS) prediction.
We propose two inference methods that provide more stable results and efficient computation.
arXiv Detail & Related papers (2021-10-18T08:52:31Z)
- ReMP: Rectified Metric Propagation for Few-Shot Learning [67.96021109377809]
A rectified metric space is learned to maintain the metric consistency from training to testing.
Numerous analyses indicate that a simple modification of the objective can yield substantial performance gains.
The proposed ReMP is effective and efficient, and outperforms the state of the art on various standard few-shot learning datasets.
arXiv Detail & Related papers (2020-12-02T00:07:53Z)
- GO FIGURE: A Meta Evaluation of Factuality in Summarization [131.1087461486504]
We introduce GO FIGURE, a meta-evaluation framework for evaluating factuality evaluation metrics.
Our benchmark analysis on ten factuality metrics reveals that our framework provides a robust and efficient evaluation.
It also reveals that while QA metrics generally improve over standard metrics that measure factuality across domains, performance is highly dependent on the way in which questions are generated.
arXiv Detail & Related papers (2020-10-24T08:30:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
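As a usage note for the PESQ numbers quoted above (e.g., 3.15 for MetricGAN+ and 3.82 in the PESQetarian paper), the sketch below shows how wide-band PESQ is typically computed offline on a clean/enhanced pair. It assumes the third-party `pesq` and `soundfile` Python packages; file names are placeholders.

```python
# Hypothetical offline PESQ evaluation, as typically reported on
# VoiceBank-DEMAND. File names are placeholders; `pip install pesq soundfile`.
import soundfile as sf
from pesq import pesq

clean, fs = sf.read("clean.wav")          # reference (clean) utterance
enhanced, fs_e = sf.read("enhanced.wav")  # output of an enhancement model
assert fs == fs_e == 16000, "wide-band ('wb') PESQ expects 16 kHz audio"

# Wide-band PESQ; higher is better. Benchmarks average this over the test set.
score = pesq(fs, clean, enhanced, "wb")
print(f"PESQ: {score:.2f}")
```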