MOROCCO: Model Resource Comparison Framework
- URL: http://arxiv.org/abs/2104.14314v1
- Date: Thu, 29 Apr 2021 13:01:27 GMT
- Title: MOROCCO: Model Resource Comparison Framework
- Authors: Valentin Malykh, Alexander Kukushkin, Ekaterina Artemova, Vladislav
Mikhailov, Maria Tikhonova, Tatiana Shavrina
- Abstract summary: We present MOROCCO, a framework to compare language models compatible with the jiant environment, which supports over 50 NLU tasks.
We demonstrate its applicability for two GLUE-like suites in different languages.
- Score: 61.444083353087294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The new generation of pre-trained NLP models pushes the SOTA to new
limits, but at the cost of computational resources, to the point that using
them in real production environments is often prohibitively expensive. We
tackle this problem by evaluating not only the standard quality metrics on
downstream tasks but also the memory footprint and inference time. We present
MOROCCO, a framework to compare language models compatible with the
\texttt{jiant} environment, which supports over 50 NLU tasks, including the
SuperGLUE benchmark and multiple probing suites. We demonstrate its
applicability for two GLUE-like suites in different languages.
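To make the resource measurements concrete, here is a minimal sketch of timing inference and recording peak GPU memory for a Transformer model, assuming PyTorch, the Hugging Face transformers library, and a CUDA device; the checkpoint name and batching choices are illustrative, not MOROCCO's actual API.

    import time
    import torch
    from transformers import AutoModel, AutoTokenizer

    def profile_model(name, texts, device="cuda"):
        # Load a model and tokenizer (the checkpoint name is illustrative).
        tokenizer = AutoTokenizer.from_pretrained(name)
        model = AutoModel.from_pretrained(name).to(device).eval()
        torch.cuda.reset_peak_memory_stats(device)
        start = time.perf_counter()
        with torch.no_grad():
            for text in texts:
                batch = tokenizer(text, return_tensors="pt").to(device)
                model(**batch)
        elapsed = time.perf_counter() - start
        peak_mb = torch.cuda.max_memory_allocated(device) / 2**20
        return {"sec_per_example": elapsed / len(texts), "peak_memory_mb": peak_mb}

    stats = profile_model("bert-base-multilingual-cased",
                          ["An example input sentence."])
    print(stats)

Per-example latency and peak memory are exactly the two axes the abstract adds on top of task quality, which is why the sketch returns both from a single pass.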
Related papers
- ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets [106.7760874400261]
This paper presents ML-SUPERB 2.0, a new benchmark for evaluating pre-trained SSL and supervised speech models.
We find performance improvements over the original ML-SUPERB setup, but results depend on the downstream model design.
Also, we find large performance differences between languages and datasets, suggesting the need for more targeted approaches.
arXiv Detail & Related papers (2024-06-12T21:01:26Z)
- Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation [0.0]
Large Language Models (LLMs) have become a popular choice for many Natural Language Processing (NLP) tasks.
LLMs' substantial computational and memory requirements often make them inaccessible to users with limited resources.
This paper focuses on very low-cost models which offer a more accessible alternative to resource-intensive LLMs.
arXiv Detail & Related papers (2024-04-17T08:16:48Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source-language PLM and the static word embeddings of a target language (a sketch of this idea follows the entry).
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
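A minimal sketch of the relative-representation idea from the MoSECroT entry above: each embedding is re-encoded by its cosine similarity to a shared set of anchors, so spaces of different dimensionality become comparable. The shapes, anchor indices, and random inputs are illustrative assumptions, not the paper's setup.

    import numpy as np

    def relative_representation(emb, anchors):
        # Describe each vector by its cosine similarity to every anchor vector.
        emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
        anchors = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
        return emb @ anchors.T  # shape: (n_words, n_anchors)

    # Hypothetical inputs: source-PLM embeddings and target static embeddings.
    src_emb = np.random.randn(1000, 768)
    tgt_emb = np.random.randn(1000, 300)
    # Anchor rows would come from a small set of translation pairs.
    src_anchor_ids = [12, 45, 101, 512]
    tgt_anchor_ids = [7, 33, 250, 940]
    rel_src = relative_representation(src_emb, src_emb[src_anchor_ids])
    rel_tgt = relative_representation(tgt_emb, tgt_emb[tgt_anchor_ids])
    # Both now live in a common (n_words, n_anchors) space despite the
    # mismatched original dimensions (768 vs. 300).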
- YAYI 2: Multilingual Open-Source Large Language Models [53.92832054643197]
We propose YAYI 2, including both base and chat models, with 30 billion parameters.
YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline.
The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback.
arXiv Detail & Related papers (2023-12-22T17:34:47Z)
- Extending Multilingual Machine Translation through Imitation Learning [60.15671816513614]
Imit-MNMT treats the task as an imitation learning process that mimics the behavior of an expert (a distillation-style sketch follows the entry).
We show that our approach significantly improves the translation performance between the new and the original languages.
We also demonstrate that our approach is capable of solving copy and off-target problems.
arXiv Detail & Related papers (2023-11-14T21:04:03Z)
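One plausible reading of the imitation-learning step in the Imit-MNMT entry above is a distillation-style objective in which a student translation model matches an expert's next-token distribution. This is a hedged sketch of that reading, not the paper's actual training procedure.

    import torch.nn.functional as F

    def imitation_loss(student_logits, expert_logits, temperature=1.0):
        # The student mimics the expert by minimizing the KL divergence
        # between their softened next-token distributions.
        expert_probs = F.softmax(expert_logits / temperature, dim=-1)
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        return F.kl_div(student_log_probs, expert_probs, reduction="batchmean")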
- Generate to Understand for Representation [3.5325087487696463]
GUR is a pretraining framework that combines language modeling and contrastive learning objectives in a single training step (a sketch of the joint loss follows the entry).
GUR achieves impressive results without any labeled training data, outperforming all other pretrained baselines as a retriever on the recall benchmark in a zero-shot setting.
arXiv Detail & Related papers (2023-06-14T06:00:18Z)
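A minimal sketch of the single-step joint objective the GUR entry describes: a token-level language-modeling loss summed with an in-batch contrastive loss. The loss weighting, temperature, and query/document pairing are assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def joint_lm_contrastive_loss(lm_logits, lm_labels, query_vecs, doc_vecs,
                                  temperature=0.05, alpha=1.0):
        # Language-modeling term: standard token-level cross-entropy.
        lm_loss = F.cross_entropy(lm_logits.flatten(0, 1), lm_labels.flatten(),
                                  ignore_index=-100)
        # Contrastive term: each query should match its own in-batch document.
        sims = F.normalize(query_vecs, dim=-1) @ F.normalize(doc_vecs, dim=-1).T
        targets = torch.arange(sims.size(0), device=sims.device)
        contrastive_loss = F.cross_entropy(sims / temperature, targets)
        return lm_loss + alpha * contrastive_loss

Both terms come out of the same forward pass, which is what lets the two objectives share a single training step.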
- Scalable Performance Analysis for Vision-Language Models [26.45624201546282]
Joint vision-language models have shown strong performance across a diverse set of tasks.
Our paper introduces a more scalable solution that relies on already annotated benchmarks.
We confirm previous findings that CLIP behaves like a bag of words model and performs better with nouns and verbs.
arXiv Detail & Related papers (2023-05-30T06:40:08Z)
- This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish [5.8090623549313944]
We introduce LEPISZCZE, a new, comprehensive benchmark for Polish NLP.
We use five datasets from the Polish benchmark and add eight novel datasets.
We provide insights and experiences learned while creating the benchmark for Polish as the blueprint to design similar benchmarks for other low-resourced languages.
arXiv Detail & Related papers (2022-11-23T16:51:09Z)
- GreenPLM: Cross-Lingual Transfer of Monolingual Pre-Trained Language Models at Almost No Cost [7.510253441699812]
This study proposes GreenPLM, a framework that uses bilingual lexicons to directly "translate" pre-trained language models into another language (a sketch of the embedding-translation step follows the entry).
We validate this approach on BERT models in 18 languages and show that this framework is comparable to, if not better than, other frameworks with high training costs.
In six out of seven tested languages, this framework outperforms the original monolingual language models with up to 200x less pre-training effort.
arXiv Detail & Related papers (2022-11-13T18:59:15Z)
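A minimal sketch of the lexicon-based "translation" described in the GreenPLM entry above: a target-language embedding table is initialized by copying each target word's source-translation vector. The data structures and mean-vector fallback are assumptions, not GreenPLM's exact procedure.

    import numpy as np

    def translate_embedding_table(src_emb, src_index, tgt_vocab, lexicon):
        # src_emb: (|V_src|, d) source embedding matrix
        # src_index: source word -> row index in src_emb
        # lexicon: target word -> source translation (bilingual dictionary)
        mean_vec = src_emb.mean(axis=0)
        tgt_emb = np.tile(mean_vec, (len(tgt_vocab), 1))  # fallback init
        for i, word in enumerate(tgt_vocab):
            src_word = lexicon.get(word)
            if src_word is not None and src_word in src_index:
                tgt_emb[i] = src_emb[src_index[src_word]]
        return tgt_emb

The remaining Transformer weights can be reused from the source model as-is, which is what keeps the transfer cost so low.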
- Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X.
These datasets cover over 10 programming languages.
We are able to assess the performance of code generation models in a multi-lingual fashion.
arXiv Detail & Related papers (2022-10-26T17:17:06Z)