HULK: An Energy Efficiency Benchmark Platform for Responsible Natural
Language Processing
- URL: http://arxiv.org/abs/2002.05829v1
- Date: Fri, 14 Feb 2020 01:04:19 GMT
- Title: HULK: An Energy Efficiency Benchmark Platform for Responsible Natural
Language Processing
- Authors: Xiyou Zhou, Zhiyu Chen, Xiaoyong Jin, William Yang Wang
- Abstract summary: We introduce HULK, a multi-task energy efficiency benchmarking platform for responsible natural language processing.
We compare pretrained models' energy efficiency from the perspectives of time and cost.
- Score: 76.38975568873765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computation-intensive pretrained models have taken the lead on many
natural language processing benchmarks such as GLUE. However, energy efficiency
during model training and inference has become a critical bottleneck.
We introduce HULK, a multi-task energy efficiency benchmarking platform for
responsible natural language processing. With HULK, we compare pretrained
models' energy efficiency from the perspectives of time and cost. Baseline
benchmarking results are provided for further analysis. The fine-tuning
efficiency of different pretrained models can differ considerably across
tasks, and a smaller parameter count does not necessarily imply better
efficiency. We analyze this phenomenon and demonstrate a method for comparing
the multi-task efficiency of pretrained models. Our platform is available at
https://sites.engineering.ucsb.edu/~xiyou/hulk/.
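The abstract compares models' energy efficiency "from the perspectives of time and cost." As a minimal sketch of that idea (not the HULK platform's actual code; the rate constant and the dummy model below are assumptions for illustration), one can time repeated inference calls and convert wall-clock time into an estimated dollar cost at an assumed cloud hourly rate:

```python
import time

# Hypothetical sketch of a time/cost efficiency comparison in the spirit
# of HULK. HOURLY_RATE_USD and dummy_infer are illustrative assumptions,
# not values or code from the paper.

HOURLY_RATE_USD = 3.06  # assumed price of a cloud GPU instance, per hour

def benchmark(infer, n_examples=1000):
    """Return (seconds per example, estimated USD cost per example)."""
    start = time.perf_counter()
    for _ in range(n_examples):
        infer()
    elapsed = time.perf_counter() - start
    per_example = elapsed / n_examples
    # Convert seconds to hours, then to cost at the assumed rate.
    cost = per_example / 3600.0 * HOURLY_RATE_USD
    return per_example, cost

def dummy_infer():
    # Stand-in for a real model's forward pass.
    sum(i * i for i in range(100))

sec, usd = benchmark(dummy_infer)
print(f"{sec:.2e} s/example, ${usd:.2e}/example")
```

Under this framing, a model with more parameters can still win on cost if it needs fewer fine-tuning steps to reach a target accuracy, which is one way the paper's observation (fewer parameters does not imply better efficiency) can arise.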
Related papers
- Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
- MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation [68.30497162547768]
We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.
We validate the efficiency and effectiveness of MoEBERT on natural language understanding and question answering tasks.
arXiv Detail & Related papers (2022-04-15T23:19:37Z)
- PaLM: Scaling Language Modeling with Pathways [180.69584031908113]
We trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM.
We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods.
We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks.
arXiv Detail & Related papers (2022-04-05T16:11:45Z)
- Maximizing Efficiency of Language Model Pre-training for Learning Representation [6.518508607788086]
ELECTRA is a novel approach for improving the compute efficiency of pre-trained language models.
Our work proposes an adaptive early-exit strategy to maximize the efficiency of the pre-training process.
arXiv Detail & Related papers (2021-10-13T10:25:06Z)
- Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models [61.768082640087]
We explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders for natural language understanding tasks.
Experiments show that EBM training can help the model achieve better calibration, competitive with strong baselines.
arXiv Detail & Related papers (2021-01-18T01:41:31Z)
- MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive.
ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator.
We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.