A Sparsity Predicting Approach for Large Language Models via Activation Pattern Clustering
- URL: http://arxiv.org/abs/2507.14179v1
- Date: Fri, 11 Jul 2025 19:07:29 GMT
- Title: A Sparsity Predicting Approach for Large Language Models via Activation Pattern Clustering
- Authors: Nobel Dhar, Bobin Deng, Md Romyull Islam, Xinyue Zhang, Kazi Fahim Ahmad Nasif, Kun Suo
- Abstract summary: Large Language Models (LLMs) exhibit significant activation sparsity, where only a subset of neurons are active for a given input. Direct prediction at the neuron level is computationally expensive due to the vast number of neurons in modern LLMs. We propose a clustering-based activation pattern compression framework to enable efficient prediction and utilization of activation sparsity.
- Score: 3.485125799252057
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Large Language Models (LLMs) exhibit significant activation sparsity, where only a subset of neurons are active for a given input. Although this sparsity presents opportunities to reduce computational cost, efficiently utilizing it requires predicting activation patterns in a scalable manner. However, direct prediction at the neuron level is computationally expensive due to the vast number of neurons in modern LLMs. To enable efficient prediction and utilization of activation sparsity, we propose a clustering-based activation pattern compression framework. Instead of treating each neuron independently, we group similar activation patterns into a small set of representative clusters. Our method achieves up to 79.34% clustering precision, outperforming standard binary clustering approaches while maintaining minimal degradation in perplexity (PPL) scores. With a sufficiently large number of clusters, our approach attains a PPL score as low as 12.49, demonstrating its effectiveness in preserving model quality while reducing computational overhead. By predicting cluster assignments rather than individual neuron states, future models can efficiently infer activation patterns from pre-computed centroids. We detail the clustering algorithm, analyze its effectiveness in capturing meaningful activation structures, and demonstrate its potential to improve sparse computation efficiency. This clustering-based formulation serves as a foundation for future work on activation pattern prediction, paving the way for efficient inference in large-scale language models.
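The paper itself does not include code, but the abstract's core idea (group similar activation patterns into a small set of representative clusters, then infer a token's active neurons from the nearest pre-computed centroid) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not a detail from the paper: binary 0/1 activation patterns, Hamming distance, k-means-style updates with majority-vote centroids, and hypothetical names such as `cluster_activation_patterns`.

```python
# Illustrative sketch only: binary patterns, Hamming distance, and
# majority-vote centroids are assumptions, not details from the paper.
import numpy as np

def cluster_activation_patterns(patterns, k, iters=20, seed=0):
    """Group binary activation patterns (n_samples x n_neurons) into k clusters."""
    rng = np.random.default_rng(seed)
    centroids = patterns[rng.choice(len(patterns), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each pattern to the centroid at the smallest Hamming distance.
        dists = np.abs(patterns[:, None, :] - centroids[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the per-neuron majority vote of its members.
        for c in range(k):
            members = patterns[labels == c]
            if len(members) > 0:
                centroids[c] = (members.mean(axis=0) >= 0.5).astype(patterns.dtype)
    return centroids, labels

def predict_active_neurons(pattern, centroids):
    """Infer an activation mask from the nearest pre-computed centroid."""
    nearest = np.abs(centroids - pattern).sum(axis=1).argmin()
    return centroids[nearest].astype(bool)

# Toy usage: 500 sparse patterns over 512 "neurons", compressed to 16 clusters.
rng = np.random.default_rng(1)
patterns = (rng.random((500, 512)) < 0.1).astype(np.float32)
centroids, labels = cluster_activation_patterns(patterns, k=16)
mask = predict_active_neurons(patterns[0], centroids)
```

In this framing, the per-token cost of sparsity prediction drops from scoring every neuron to a nearest-centroid lookup over k pre-computed centroids, which is the efficiency argument the abstract makes for predicting cluster assignments rather than individual neuron states.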
Related papers
- Neural-Inspired Posterior Approximation (NIPA) [1.1649834448993244]
Humans learn efficiently from their environment by engaging multiple interacting neural systems. We aim to elucidate the computational principles underlying this biological efficiency and translate them into a sampling algorithm. We show that this approach advances Bayesian methods and facilitates their application to large-scale statistical machine learning problems.
arXiv Detail & Related papers (2026-01-30T04:19:26Z) - Resting Neurons, Active Insights: Improving Input Sparsification for Large Language Models [42.12574676719046]
Large Language Models (LLMs) achieve state-of-the-art performance across a wide range of applications. Structured pruning, which reduces model size by removing redundant computational units such as neurons, has been widely explored as a solution. This study is devoted to input sparsification, an increasingly popular technique that improves efficiency by selectively activating only a subset of entry values for each input.
arXiv Detail & Related papers (2025-12-14T15:47:40Z) - SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL). We propose SPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z) - An Enhanced Model-based Approach for Short Text Clustering [58.60681789677676]
Short text clustering has become increasingly important with the popularity of social media like Twitter, Google+, and Facebook. Existing methods can be broadly categorized into two paradigms: topic model-based approaches and deep representation learning-based approaches. We propose a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model (GSDMM), which effectively handles the sparsity and high dimensionality of short texts. Based on several aspects of GSDMM that warrant further refinement, we propose an improved approach, GSDMM+, designed to further optimize its performance.
arXiv Detail & Related papers (2025-07-18T10:07:42Z) - Image Clustering Algorithm Based on Self-Supervised Pretrained Models and Latent Feature Distribution Optimization [4.39139858370436]
This paper introduces an image clustering algorithm based on self-supervised pretrained models and latent feature distribution optimization.
Our approach outperforms the latest clustering algorithms and achieves state-of-the-art clustering results.
arXiv Detail & Related papers (2024-08-04T04:08:21Z) - Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens [57.37893387775829]
We introduce a fast and balanced clustering method, named Semantic Equitable Clustering (SEC). SEC clusters tokens based on their global semantic relevance in an efficient, straightforward manner. We propose a versatile vision backbone, SECViT, to serve as a vision language connector.
arXiv Detail & Related papers (2024-05-22T04:49:00Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Nonlinear subspace clustering by functional link neural networks [20.972039615938193]
Subspace clustering based on a feed-forward neural network has been demonstrated to provide better clustering accuracy than some advanced subspace clustering algorithms.
We employ a functional link neural network to transform data samples into a nonlinear domain.
We introduce a convex combination subspace clustering scheme, which combines a linear subspace clustering method with the functional link neural network subspace clustering approach.
arXiv Detail & Related papers (2024-02-03T06:01:21Z) - Model-Based Control with Sparse Neural Dynamics [23.961218902837807]
We propose a new framework for integrated model learning and predictive control.
We show that our framework can deliver better closed-loop performance than existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-20T06:25:02Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [69.15976031704687]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC maintains an overall computational complexity of $\mathcal{O}(n\,\mathrm{polylog}(n))$, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation [12.91586050451152]
Spectral clustering is a theoretically grounded solution to unsupervised semantic segmentation, where spectral embeddings for pixels are computed to construct distinct clusters.
Current approaches still suffer from inefficiencies in spectral decomposition and inflexibility in applying them to the test data.
This work addresses these issues by casting spectral clustering as a parametric approach that employs neural network-based eigenfunctions to produce spectral embeddings.
In practice, the neural eigenfunctions are lightweight and take the features from pre-trained models as inputs, improving training efficiency and unleashing the potential of pre-trained models for dense prediction.
arXiv Detail & Related papers (2023-04-06T03:14:15Z) - Dynamic Clustering and Cluster Contrastive Learning for Unsupervised Person Re-identification [29.167783500369442]
Unsupervised Re-ID methods aim at learning robust and discriminative features from unlabeled data.
We propose a dynamic clustering and cluster contrastive learning (DCCC) method.
Experiments on several widely used public datasets validate the effectiveness of our proposed DCCC.
arXiv Detail & Related papers (2023-03-13T01:56:53Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.