PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models
- URL: http://arxiv.org/abs/2412.07268v1
- Date: Tue, 10 Dec 2024 07:49:07 GMT
- Title: PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models
- Authors: Zining Wnag, Jinyang Guo, Ruihao Gong, Yang Yong, Aishan Liu, Yushi Huang, Jiaheng Liu, Xianglong Liu,
- Abstract summary: PTSBench is the first comprehensive post-training sparsity benchmark towards algorithms and models.
We benchmark 10+ PTS general-pluggable fine-grained techniques on 3 typical tasks using over 40 off-the-shelf model architectures.
Our PTSBench can provide (1) new observations for a better understanding of the PTS algorithms, (2) in-depth and comprehensive evaluations for the sparsification ability of models, and (3) a well-structured and easy-integrate open-source framework.
- Score: 39.56594737760323
- License:
- Abstract: With the increased attention to model efficiency, post-training sparsity (PTS) has become more and more prevalent because of its effectiveness and efficiency. However, there remain questions on better practice of PTS algorithms and the sparsification ability of models, which hinders the further development of this area. Therefore, a benchmark to comprehensively investigate the issues above is urgently needed. In this paper, we propose the first comprehensive post-training sparsity benchmark called PTSBench towards algorithms and models. We benchmark 10+ PTS general-pluggable fine-grained techniques on 3 typical tasks using over 40 off-the-shelf model architectures. Through extensive experiments and analyses, we obtain valuable conclusions and provide several insights from both algorithms and model aspects. Our PTSBench can provide (1) new observations for a better understanding of the PTS algorithms, (2) in-depth and comprehensive evaluations for the sparsification ability of models, and (3) a well-structured and easy-integrate open-source framework. We hope this work will provide illuminating conclusions and advice for future studies of post-training sparsity methods and sparsification-friendly model design. The code for our PTSBench is released at \href{https://github.com/ModelTC/msbench}{https://github.com/ModelTC/msbench}.
Related papers
- Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis [89.60263788590893]
Post-training Quantization (PTQ) technique has been extensively adopted for large language models (LLMs) compression.
Existing algorithms focus primarily on performance, overlooking the trade-off among model size, performance, and quantization bitwidth.
arXiv Detail & Related papers (2025-02-18T07:35:35Z) - High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z) - Stochastic Two Points Method for Deep Model Zeroth-order Optimization [32.459322001738144]
This paper introduces an efficient Two-Point (S2P) approach within the gradient-free regime.
We present the theoretical convergence properties of S2P under the general and relaxed smoothness assumptions.
We show that VS2P is highly effective in optimizing objectives for deep models.
arXiv Detail & Related papers (2024-02-02T18:39:40Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language
Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - Leveraging Reinforcement Learning and Large Language Models for Code
Optimization [14.602997316032706]
This paper introduces a new framework to decrease the complexity of code optimization.
The proposed framework builds on large language models (LLMs) and reinforcement learning (RL)
We run several experiments on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement learning algorithm.
arXiv Detail & Related papers (2023-12-09T19:50:23Z) - Precise Learning of Source Code Contextual Semantics via Hierarchical
Dependence Structure and Graph Attention Networks [28.212889828892664]
We propose a novel source code model embedded with hierarchical dependencies.
We introduce the syntactic structural of the basic block, i.e., its corresponding AST, in source code model to provide sufficient information.
The results show that our model reduces the scale of parameters by 50% and achieves 4% improvement on accuracy on program classification task.
arXiv Detail & Related papers (2021-11-20T04:03:42Z) - DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language
Models [152.29364079385635]
As pre-trained models grow bigger, the fine-tuning process can be time-consuming and computationally expensive.
We propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
Our proposed framework, dubbed Dually Sparsity-Embedded Efficient Tuning (DSEE), aims to achieve two key objectives: (i) parameter efficient fine-tuning and (ii) resource-efficient inference.
arXiv Detail & Related papers (2021-10-30T03:29:47Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - Highly Efficient Knowledge Graph Embedding Learning with Orthogonal
Procrustes Analysis [10.154836127889487]
Knowledge Graph Embeddings (KGEs) have been intensively explored in recent years due to their promise for a wide range of applications.
This paper proposes a simple yet effective KGE framework which can reduce the training time and carbon footprint by orders of magnitudes.
arXiv Detail & Related papers (2021-04-10T03:55:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.