Supervised Hierarchical Clustering using Graph Neural Networks for
Speaker Diarization
- URL: http://arxiv.org/abs/2302.12716v1
- Date: Fri, 24 Feb 2023 16:16:41 GMT
- Title: Supervised Hierarchical Clustering using Graph Neural Networks for
Speaker Diarization
- Authors: Prachi Singh, Amrit Kaul, Sriram Ganapathy
- Abstract summary: We propose a novel Supervised HierArchical gRaph Clustering algorithm (SHARC) for speaker diarization.
In this paper, we introduce a hierarchical structure using a Graph Neural Network (GNN) to perform supervised clustering.
The supervised clustering is performed using node densities and edge existence probabilities to merge the segments until convergence.
- Score: 41.30830281043803
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Conventional methods for speaker diarization involve windowing an audio file
into short segments to extract speaker embeddings, followed by an unsupervised
clustering of the embeddings. This multi-step approach generates speaker
assignments for each segment. In this paper, we propose a novel Supervised
HierArchical gRaph Clustering algorithm (SHARC) for speaker diarization where
we introduce a hierarchical structure using a Graph Neural Network (GNN) to
perform supervised clustering. The supervision allows the model to update the
representations and directly improve the clustering performance, thus enabling
a single-step approach for diarization. In the proposed work, the input segment
embeddings are treated as nodes of a graph with the edge weights corresponding
to the similarity scores between the nodes. We also propose an approach to
jointly update the embedding extractor and the GNN model to perform end-to-end
speaker diarization (E2E-SHARC). During inference, the hierarchical clustering
is performed using node densities and edge existence probabilities to merge the
segments until convergence. In the diarization experiments, we illustrate that
the proposed E2E-SHARC approach achieves 53% and 44% relative improvements over
the baseline systems on benchmark datasets like AMI and VoxConverse,
respectively.
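
The graph construction and merge-until-convergence loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: in SHARC the edge existence probabilities and node densities are predicted by a trained GNN, whereas this sketch substitutes thresholded cosine similarity on a k-nearest-neighbour graph as a stand-in. The function names (`knn_graph`, `merge_once`, `hierarchical_merge`) and the parameters `k`, `threshold` and `max_levels` are hypothetical and chosen only for illustration.

```python
import numpy as np


def knn_graph(embeddings, k):
    """Build a k-NN graph: nodes are segment embeddings, edge weights are cosine similarities."""
    # Assumes every embedding has a non-zero norm.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops from the neighbour search
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    weights = np.take_along_axis(sim, neighbors, axis=1)
    return neighbors, weights


def _find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def merge_once(embeddings, k, threshold):
    """One level of merging: keep edges above the threshold, label connected components."""
    n = len(embeddings)
    neighbors, weights = knn_graph(embeddings, k)
    parent = list(range(n))
    for i in range(n):
        for j, w in zip(neighbors[i], weights[i]):
            # Stand-in for a learned edge existence probability (assumption).
            if w > threshold:
                ri, rj = _find(parent, i), _find(parent, int(j))
                if ri != rj:
                    parent[rj] = ri
    roots = np.array([_find(parent, i) for i in range(n)])
    _, labels = np.unique(roots, return_inverse=True)
    return labels


def hierarchical_merge(embeddings, k=2, threshold=0.5, max_levels=10):
    """Repeat merging on cluster centroids until no further merges occur."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.arange(len(embeddings))  # segment -> current node index
    current = embeddings
    for _ in range(max_levels):
        if len(current) <= 1:
            break
        level_labels = merge_once(current, min(k, len(current) - 1), threshold)
        n_clusters = int(level_labels.max()) + 1
        if n_clusters == len(current):
            break  # converged: nothing was merged at this level
        labels = level_labels[labels]
        # Each merged cluster becomes a single node at the next level.
        current = np.stack(
            [current[level_labels == c].mean(axis=0) for c in range(n_clusters)]
        )
    return labels


# Toy example: six 2-D "segment embeddings" drawn from two directions.
segments = np.array(
    [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.0, 1.0], [0.1, 0.9], [0.05, 1.0]]
)
speaker_labels = hierarchical_merge(segments, k=2, threshold=0.5)
```

Here each level's clusters are collapsed into mean embeddings that serve as the nodes of the next level, which mirrors the hierarchical idea only loosely; a supervised variant would replace the fixed threshold with model-predicted merge decisions.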
Related papers
- Cluster-based Graph Collaborative Filtering [55.929052969825825]
Graph Convolution Networks (GCNs) have succeeded in learning user and item representations for recommendation systems.
Most existing GCN-based methods overlook the multiple interests of users while performing high-order graph convolution.
We propose a novel GCN-based recommendation model, termed Cluster-based Graph Collaborative Filtering (ClusterGCF).
arXiv Detail & Related papers (2024-04-16T07:05:16Z)
- Overlap-aware End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization [41.24045486520547]
We propose an end-to-end supervised hierarchical clustering algorithm based on graph neural networks (GNNs).
The proposed E-SHARC framework improves significantly over state-of-the-art diarization systems.
arXiv Detail & Related papers (2024-01-23T15:35:44Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification [17.284276598514502]
We propose a novel progressive subgraph clustering algorithm based on multi-model voting and double-Gaussian based assessment.
To prevent disastrous clustering results, we adopt an iterative approach that progressively increases k and employs a double-Gaussian based assessment algorithm.
arXiv Detail & Related papers (2023-05-22T04:26:18Z)
- Learn to Cluster Faces with Better Subgraphs [13.511058277653122]
Face clustering can provide pseudo-labels to the massive unlabeled face data.
Existing clustering methods aggregate features within subgraphs based on a uniform threshold or a learned cutoff position.
This work proposes an efficient neighborhood-aware subgraph adjustment method that can significantly reduce the noise.
arXiv Detail & Related papers (2023-04-21T09:18:55Z)
- DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering [6.447863458841379]
This study introduces a lightweight Graph Neural Network (GNN) to replace classical clustering methods.
Unlike existing methods, our GNN takes both the pair-wise affinities between local image features and the raw features as input.
We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training an image segmentation GNN.
arXiv Detail & Related papers (2022-12-12T12:31:46Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- A Variational Edge Partition Model for Supervised Graph Representation Learning [51.30365677476971]
This paper introduces a graph generative process to model how the observed edges are generated by aggregating the node interactions over a set of overlapping node communities.
We partition each edge into the summation of multiple community-specific weighted edges and use them to define community-specific GNNs.
A variational inference framework is proposed to jointly learn a GNN based inference network that partitions the edges into different communities, these community-specific GNNs, and a GNN based predictor that combines community-specific GNNs for the end classification task.
arXiv Detail & Related papers (2022-02-07T14:37:50Z)
- Learning Hierarchical Graph Neural Networks for Image Clustering [81.5841862489509]
We propose a hierarchical graph neural network (GNN) model that learns how to cluster a set of images into an unknown number of identities.
Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.
arXiv Detail & Related papers (2021-07-03T01:28:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.