Tight integration of neural- and clustering-based diarization through
deep unfolding of infinite Gaussian mixture model
- URL: http://arxiv.org/abs/2202.06524v1
- Date: Mon, 14 Feb 2022 07:45:21 GMT
- Title: Tight integration of neural- and clustering-based diarization through
deep unfolding of infinite Gaussian mixture model
- Authors: Keisuke Kinoshita, Marc Delcroix, Tomoharu Iwata
- Abstract summary: This paper introduces a trainable clustering algorithm into the integration framework.
Speaker embeddings are optimized during training such that they better fit iGMM clustering.
Experimental results show that the proposed approach outperforms the conventional approach in terms of diarization error rate.
- Score: 84.57667267657382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speaker diarization has been investigated extensively as an important central
task for meeting analysis. A recent trend shows that integration of end-to-end
neural (EEND)- and clustering-based diarization is a promising approach to
handling realistic conversational data containing overlapped speech with an
arbitrarily large number of speakers, and has achieved state-of-the-art results
on various tasks. However, the approaches proposed so far have not realized {\it
tight} integration yet, because the clustering employed therein was not optimal
in any sense for clustering the speaker embeddings estimated by the EEND
module. To address this problem, this paper introduces a {\it trainable}
clustering algorithm into the integration framework by deep-unfolding a
non-parametric Bayesian model called the infinite Gaussian mixture model
(iGMM). Specifically, the speaker embeddings are optimized during training such
that they better fit iGMM clustering, using a novel clustering loss based on
the Adjusted Rand Index (ARI). Experimental results on CALLHOME data show
that the proposed approach outperforms the conventional approach in terms of
diarization error rate (DER), especially by substantially reducing speaker
confusion errors, which indeed reflects the effectiveness of the proposed iGMM
integration.
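The abstract's clustering loss is built on the Adjusted Rand Index (ARI), a chance-corrected measure of agreement between two partitions. As a minimal illustration of the quantity involved (not the paper's differentiable loss), the plain ARI can be computed from two flat labelings; the function below is an illustrative sketch, not code from the paper:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two flat cluster labelings.

    ARI = (index - expected) / (max_index - expected), where all three
    terms count pairs of items via binomial coefficients.
    """
    n = len(labels_a)
    # Contingency counts: how many items fall in cluster i of A and j of B.
    contingency = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)   # chance-level agreement
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)

# Identical partitions (up to a relabeling) score 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

ARI is invariant to permuting cluster labels, which is what makes it a natural target for speaker clustering, where the label indices themselves carry no meaning.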
Related papers
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless
Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed with the analysis of the properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z)
- Overlap-aware End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization [41.24045486520547]
We propose an end-to-end supervised hierarchical clustering algorithm based on graph neural networks (GNN).
The proposed E-SHARC framework improves significantly over state-of-the-art diarization systems.
arXiv Detail & Related papers (2024-01-23T15:35:44Z)
- Robust Consensus Clustering and its Applications for Advertising Forecasting [18.242055675730253]
We propose a novel algorithm -- robust consensus clustering that can find common ground truth among experts' opinions.
We apply the proposed method to the real-world advertising campaign segmentation and forecasting tasks.
arXiv Detail & Related papers (2022-12-27T21:49:04Z)
- Correlation Clustering Reconstruction in Semi-Adversarial Models [70.11015369368272]
Correlation Clustering is an important clustering problem with many applications.
We study the reconstruction version of this problem in which one is seeking to reconstruct a latent clustering corrupted by random noise and adversarial modifications.
arXiv Detail & Related papers (2021-08-10T14:46:17Z)
- Deep Conditional Gaussian Mixture Model for Constrained Clustering [7.070883800886882]
Constrained clustering can leverage prior information on a growing amount of only partially labeled data.
We propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of gradient variational inference.
arXiv Detail & Related papers (2021-06-11T13:38:09Z)
- Unsupervised Clustered Federated Learning in Complex Multi-source Acoustic Environments [75.8001929811943]
We introduce a realistic and challenging, multi-source and multi-room acoustic environment.
We present an improved clustering control strategy that takes into account the variability of the acoustic scene.
The proposed approach is optimized using clustering-based measures and validated via a network-wide classification task.
arXiv Detail & Related papers (2021-06-07T14:51:39Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAEs) are a powerful and widely used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds [71.36164750147827]
Clustering-based approaches assign speaker labels to speech regions by clustering speaker embeddings such as x-vectors.
End-to-end neural diarization (EEND) directly predicts diarization labels using a neural network.
We propose a simple but effective hybrid diarization framework that works with overlapped speech and with long recordings containing an arbitrary number of speakers.
arXiv Detail & Related papers (2020-10-26T06:33:02Z)
- Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap [43.82618103722998]
We propose a new spectral clustering framework that can auto-tune the parameters of the clustering algorithm in the context of speaker diarization.
A relative improvement of 17% in the speaker error rate on the well-known CALLHOME evaluation set shows the effectiveness of our proposed spectral clustering with auto-tuning.
arXiv Detail & Related papers (2020-03-05T02:50:37Z)
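The eigengap idea behind the last entry can be illustrated with a toy sketch (this is not the paper's normalized-maximum-eigengap algorithm; the kernel, `sigma` value, and function name are all illustrative): build an affinity matrix over embeddings, form the symmetric normalized Laplacian, and pick the number of speakers k at the largest gap between consecutive sorted eigenvalues.

```python
import numpy as np

def estimate_num_clusters(embeddings, sigma=0.5):
    """Toy eigengap heuristic for guessing the number of clusters.

    Builds a Gaussian-kernel affinity over the embeddings, forms the
    symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}, and
    returns the k at which the gap between consecutive eigenvalues
    (sorted ascending) is largest.
    """
    X = np.asarray(embeddings, dtype=float)
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)                      # no self-loops
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_inv_sqrt @ A @ D_inv_sqrt
    eigvals = np.sort(np.linalg.eigvalsh(L))      # ascending
    gaps = np.diff(eigvals)
    return int(np.argmax(gaps)) + 1               # k near-zero eigenvalues

# Two well-separated blobs of toy "speaker embeddings":
emb = [[1.0, 0.0], [1.1, 0.1], [0.9, -0.1],
       [-1.0, 0.0], [-1.1, 0.1], [-0.9, -0.1]]
print(estimate_num_clusters(emb))  # 2
```

The intuition: for an affinity graph with k well-separated components, the normalized Laplacian has k eigenvalues near zero, so the largest gap sits right after the k-th eigenvalue.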
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.