Partitioning-Guided K-Means: Extreme Empty Cluster Resolution for
Extreme Model Compression
- URL: http://arxiv.org/abs/2306.14031v1
- Date: Sat, 24 Jun 2023 18:19:29 GMT
- Title: Partitioning-Guided K-Means: Extreme Empty Cluster Resolution for
Extreme Model Compression
- Authors: Tianhong Huang, Victor Agostinelli, Lizhong Chen
- Abstract summary: We consider Iterative Product Quantization (iPQ) with Quant-Noise to be state-of-the-art in this area.
iPQ with Quant-Noise suffers from preventable inference quality degradation due to prevalent empty clusters.
We propose several novel enhancements aiming to improve the accuracy of iPQ with Quant-Noise by focusing on resolving empty clusters.
- Score: 0.17188280334580192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compactness in deep learning can be critical to a model's viability in
low-resource applications, and a common approach to extreme model compression
is quantization. We consider Iterative Product Quantization (iPQ) with
Quant-Noise to be state-of-the-art in this area, but this quantization
framework suffers from preventable inference quality degradation due to
prevalent empty clusters. In this paper, we propose several novel enhancements
aiming to improve the accuracy of iPQ with Quant-Noise by focusing on resolving
empty clusters. Our contribution, which we call Partitioning-Guided k-means (PG
k-means), is a heavily augmented k-means implementation composed of three main
components. First, we propose a partitioning-based pre-assignment strategy that
ensures no initial empty clusters and encourages an even weight-to-cluster
distribution. Second, we propose an empirically superior empty cluster
resolution heuristic executed via cautious partitioning of large clusters.
Finally, we construct an optional optimization step that consolidates
intuitively dense clusters of weights to ensure shared representation. The
proposed approach consistently reduces the number of empty clusters in iPQ with
Quant-Noise by 100x on average, uses 8x fewer iterations during empty cluster
resolution, and improves overall model accuracy by up to 12%, when applied to
RoBERTa on a variety of tasks in the GLUE benchmark.
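The abstract gives no reference implementation, so the following is a minimal sketch of the empty-cluster-resolution idea only: when a k-means assignment leaves a cluster empty, an oversized cluster is split to refill it. The function names, the "farthest half" splitting rule, and all constants are illustrative assumptions, not the authors' exact heuristic, which partitions large clusters more cautiously.

```python
import numpy as np

def assign(points, centroids):
    """Standard k-means assignment: each point goes to its nearest centroid."""
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def repair_empty_clusters(points, centroids, labels):
    """Illustrative repair: refill each empty cluster by splitting off the
    farthest half of the currently largest cluster. This conveys the general
    idea only; it is not PG k-means' cautious partitioning heuristic."""
    k = len(centroids)
    counts = np.bincount(labels, minlength=k)
    for empty in np.where(counts == 0)[0]:
        largest = counts.argmax()
        members = np.where(labels == largest)[0]
        d = np.linalg.norm(points[members] - centroids[largest], axis=1)
        half = max(1, len(members) // 2)
        labels[members[np.argsort(d)[-half:]]] = empty  # move the farthest half
        counts = np.bincount(labels, minlength=k)
    for c in range(k):  # recompute centroids of non-empty clusters
        if counts[c] > 0:
            centroids[c] = points[labels == c].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 16))
cents = rng.normal(size=(32, 16))
lbls = assign(pts, cents)
cents, lbls = repair_empty_clusters(pts, cents, lbls)
print("empty clusters after repair:", int((np.bincount(lbls, minlength=32) == 0).sum()))
```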
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Cluster truncated Wigner approximation for bond-disordered Heisenberg spin models [0.0]
We apply the cluster Truncated Wigner Approximation (cTWA) to quench dynamics in bond-disordered Heisenberg spin chains with power-law interactions.
We develop a discrete sampling scheme for the initial Wigner function, as an alternative to the originally introduced scheme based on Gaussian approximations.
arXiv Detail & Related papers (2024-07-01T18:00:06Z)
- Deep Embedding Clustering Driven by Sample Stability [16.53706617383543]
We propose a deep embedding clustering algorithm driven by sample stability (DECS).
Specifically, we start by constructing the initial feature space with an autoencoder and then learn the cluster-oriented embedding features constrained by sample stability.
The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.
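As a minimal sketch of the pipeline this summary describes (autoencoder embedding, then clustering in the learned space): the sample-stability constraint that drives DECS is deliberately omitted, and all layer sizes and data are arbitrary stand-ins.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

# Minimal autoencoder used only to build the initial embedding space;
# DECS's sample-stability objective is not reproduced here.
class AE(nn.Module):
    def __init__(self, d_in, d_z=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_z))
        self.dec = nn.Sequential(nn.Linear(d_z, 128), nn.ReLU(), nn.Linear(128, d_in))
    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

x = torch.randn(1024, 784)                  # stand-in data
model = AE(784)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):                        # reconstruction pre-training
    z, recon = model(x)
    loss = ((recon - x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                       # cluster in the learned space
    z, _ = model(x)
labels = KMeans(n_clusters=10, n_init=10).fit_predict(z.numpy())
```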
arXiv Detail & Related papers (2024-01-29T09:19:49Z)
- Gap-Free Clustering: Sensitivity and Robustness of SDP [6.996002801232415]
We study graph clustering in the Stochastic Block Model (SBM) in the presence of both large clusters and small, unrecoverable clusters.
Previous convex relaxation approaches achieving exact recovery do not allow any small clusters of size $o(\sqrt{n})$, or require a size gap between the smallest recovered cluster and the largest non-recovered cluster.
We provide an algorithm based on semidefinite programming (SDP) which removes these requirements and provably recovers large clusters regardless of the remaining cluster sizes.
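For context, a textbook penalized SDP relaxation for cluster recovery can be stated in a few lines; the cvxpy program below is that generic formulation, not necessarily the exact program analyzed in the paper.

```python
import cvxpy as cp
import numpy as np

def sdp_cluster(A, lam):
    """Generic penalized SDP relaxation for cluster recovery: maximize
    <A - lam*J, X> over PSD matrices with unit diagonal and nonnegative
    entries, where lam is chosen between the inter- and intra-cluster
    edge densities. Clusters are then read off X, e.g. from its leading
    eigenvectors."""
    n = A.shape[0]
    X = cp.Variable((n, n), PSD=True)
    objective = cp.Maximize(cp.trace((A - lam * np.ones((n, n))) @ X))
    prob = cp.Problem(objective, [cp.diag(X) == 1, X >= 0])
    prob.solve()
    return X.value
```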
arXiv Detail & Related papers (2023-08-29T21:27:21Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- SqueezeLLM: Dense-and-Sparse Quantization [80.32162537942138]
The main bottleneck for generative inference with LLMs is memory bandwidth, rather than compute, for single-batch inference.
We introduce SqueezeLLM, a post-training quantization framework that enables lossless compression to ultra-low precisions of up to 3-bit.
Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format.
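A rough numpy illustration of the Dense-and-Sparse idea follows; the 0.5% outlier threshold and the CSR storage choice are illustrative assumptions, not SqueezeLLM's exact configuration.

```python
import numpy as np
from scipy.sparse import csr_matrix

def dense_and_sparse_split(W, outlier_pct=0.5):
    """Split a weight matrix into a dense part (outliers zeroed out, so the
    remaining narrow-range values quantize well at low bit widths) and a
    sparse matrix holding the few large-magnitude outliers at full precision."""
    thresh = np.percentile(np.abs(W), 100 - outlier_pct)
    outlier_mask = np.abs(W) > thresh
    dense = np.where(outlier_mask, 0.0, W)                # -> low-bit quantization
    sparse = csr_matrix(np.where(outlier_mask, W, 0.0))   # -> kept in full precision
    return dense, sparse

W = np.random.randn(512, 512)
dense, sparse = dense_and_sparse_split(W)
assert np.allclose(dense + sparse.toarray(), W)           # exact decomposition
```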
arXiv Detail & Related papers (2023-06-13T08:57:54Z)
- Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC possesses better local manifold representation and outperforms existing kernel- or graph-based clustering algorithms.
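As a generic illustration of multiple-kernel fusion only: the fixed uniform kernel weights below stand in for the learned, sample-adaptive weights that LSWMKC actually optimizes.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Build several RBF base kernels at different bandwidths, fuse them with
# fixed uniform weights, and cluster on the fused affinity matrix.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.3, size=(50, 2)) for m in (0.0, 2.0, 4.0)])
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)      # pairwise squared distances
kernels = [np.exp(-g * sq) for g in (0.1, 1.0, 10.0)]    # base kernels
weights = [1 / 3] * 3                                    # stand-in for learned weights
K = sum(w * Kb for w, Kb in zip(weights, kernels))
labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(K)
```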
arXiv Detail & Related papers (2022-07-05T05:00:38Z)
- Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering [0.8543753708890495]
In this paper, we realize mixture model-based clustering with a neural network.
We train the network end-to-end via batch-wise EM iterations where the forward pass acts as the E-step and the backward pass acts as the M-step.
Our trained networks outperform single-stage deep clustering methods that still depend on k-means.
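A minimal sketch of the E-step-in-forward / M-step-in-backward pattern for a toy spherical Gaussian mixture; the paper wraps this in a full network, so everything below is a stripped-down stand-in.

```python
import torch

# Forward pass computes responsibilities (E-step); backward pass updates the
# means by gradient descent on the expected complete-data negative
# log-likelihood (a gradient-based M-step).
K, D = 3, 2
x = torch.randn(512, D)
mu = torch.randn(K, D, requires_grad=True)
opt = torch.optim.SGD([mu], lr=0.1)

for _ in range(200):
    logp = -0.5 * ((x[:, None, :] - mu[None]) ** 2).sum(-1)  # log N(x|mu_k, I) up to a constant
    resp = torch.softmax(logp, dim=1).detach()               # E-step; no gradient through resp
    loss = -(resp * logp).sum() / len(x)                     # expected complete-data NLL
    opt.zero_grad(); loss.backward(); opt.step()             # gradient M-step
```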
arXiv Detail & Related papers (2021-07-06T08:00:58Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Progressive Cluster Purification for Unsupervised Feature Learning [48.87365358296371]
In unsupervised feature learning, sample-specificity-based methods ignore inter-class information.
We propose a novel clustering based method, which excludes class inconsistent samples during progressive cluster formation.
Our approach, referred to as Progressive Cluster Purification (PCP), implements progressive clustering by gradually reducing the number of clusters during training.
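A toy rendering of the progressive schedule follows; the shrinking cluster counts and the distance-based consistency test are illustrative assumptions, not PCP's actual purification criterion.

```python
import numpy as np
from sklearn.cluster import KMeans

def progressive_clustering(features, k_schedule=(64, 32, 16, 8), keep_frac=0.9):
    """Toy progressive schedule: re-cluster with a shrinking cluster count,
    keeping only the samples closest to their centroid each round (a
    stand-in for PCP's class-consistency criterion)."""
    idx = np.arange(len(features))
    for k in k_schedule:
        km = KMeans(n_clusters=k, n_init=10).fit(features[idx])
        d = np.linalg.norm(features[idx] - km.cluster_centers_[km.labels_], axis=1)
        idx = idx[np.argsort(d)[: int(keep_frac * len(idx))]]
    return idx  # indices of samples retained as cluster-consistent

feats = np.random.default_rng(0).normal(size=(2000, 32))
kept = progressive_clustering(feats)
```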
arXiv Detail & Related papers (2020-07-06T08:11:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.