Related papers: Overcoming linguistic barriers in code assistants: creating a QLoRA adapter to improve support for Russian-language code writing instructions

URL: http://arxiv.org/abs/2409.09353v1
Date: Sat, 14 Sep 2024 07:49:29 GMT
Title: Overcoming linguistic barriers in code assistants: creating a QLoRA adapter to improve support for Russian-language code writing instructions
Authors: C. B. Pronin, A. V. Volosova, A. V. Ostroukh, Yu. N. Strogov,
Abstract summary: An adapter for the "zephyr-7b-beta" language model was developed to improve the base model's performance in tasks related to programming and understanding the Russian language. The proposed adapter was trained using a large and diverse dataset, including question-answer pairs related to programming, as well as code-related texts in Russian. The obtained results showed significant improvement, both in tasks related to writing Python code and in processing the Russian language, confirming the effectiveness of the proposed adapter.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, an approach to training and evaluating an adapter model for the popular language model "zephyr-7b-beta" is described. The adapter was developed to improve the performance of the base model in tasks related to programming and understanding the Russian language. Considering the high quality of the original model on English-language tasks, the goal of the research was to expand its linguistic and technical spectrum. The proposed adapter was trained using a large and diverse dataset, including question-answer pairs related to programming, as well as code-related texts in Russian. The applied training methodology improves the quality of the model's answers when understanding and generating Python code based on Russian instructions. We evaluated the performance of the base model with the installed adapter using various metrics, comparing it to the base model as well as other state-of-the-art models in this field. The obtained results showed significant improvement, both in tasks related to writing Python code and in processing the Russian language, confirming the effectiveness of the proposed adapter.

Related papers

Teaching a Language Model to Speak the Language of Tools [0.0]
This work presents a methodology for adapting existing language models to enable robust tool use in any target language. The research introduces TUCAN, which achieves up to 28.75% improvement in function-calling accuracy over base models.
arXiv Detail & Related papers (2025-06-29T20:47:27Z)
Enhancing Multilingual Language Models for Code-Switched Input Data [0.0]
This research investigates whether pre-training Multilingual BERT (mBERT) on code-switched datasets improves the model's performance on critical NLP tasks. We use a dataset of Spanglish tweets for pre-training and evaluate the pre-trained model against a baseline model. Our findings show that our pre-trained mBERT model outperforms or matches the baseline model in the given tasks, with the most significant improvements seen for part-of-speech tagging.
arXiv Detail & Related papers (2025-03-11T02:49:41Z)
Vikhr: Constructing a State-of-the-art Bilingual Open-Source Instruction-Following Large Language Model for Russian [44.13635168077528]
Vikhr is a state-of-the-art bilingual open-source instruction-following LLM designed specifically for the Russian language. "Vikhr" refers to the name of the Mistral LLM series and means a "strong gust of wind".
arXiv Detail & Related papers (2024-05-22T18:58:58Z)
Leveraging Language Identification to Enhance Code-Mixed Text Classification [0.7340017786387767]
Existing deep-learning models do not take advantage of the implicit language information in code-mixed text. Our study aims to improve the performance of BERT-based models on low-resource code-mixed Hindi-English datasets.
arXiv Detail & Related papers (2023-06-08T06:43:10Z)
BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer [1.911678487931003]
Retrieval-based language models are increasingly employed in question-answering tasks. We develop the first Norwegian retrieval-based model by adapting the REALM framework. We show that this type of training improves the reader's performance on extractive question-answering.
arXiv Detail & Related papers (2023-04-19T13:40:47Z)
Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks. Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive. We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z)
Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages [86.08359401867577]
Back-translation is widely known for its effectiveness for neural machine translation when little to no parallel data is available. We propose performing back-translation via code summarization and generation. We show that our proposed approach performs competitively with state-of-the-art methods.
arXiv Detail & Related papers (2022-05-23T08:20:41Z)
Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models [53.95094814056337]
This paper presents Russian SuperGLUE 1.1, an updated benchmark styled after GLUE for Russian NLP models. The new version includes a number of technical, user experience and methodological improvements. We provide the integration of Russian SuperGLUE with a framework for industrial evaluation of the open-source models, MOROCCO.
arXiv Detail & Related papers (2022-02-15T23:45:30Z)
Efficient Test Time Adapter Ensembling for Low-resource Language Varieties [115.12997212870962]
Specialized language and task adapters have been proposed to facilitate cross-lingual transfer of multilingual pretrained models. An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance. In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters.
arXiv Detail & Related papers (2021-09-10T13:44:46Z)
Paraphrastic Representations at Scale [134.41025103489224]
We release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese. We train these models on large amounts of data, achieving significantly improved performance over the original papers.
arXiv Detail & Related papers (2021-04-30T16:55:28Z)
Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size. We propose a fully compositional output embedding layer for language models. To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.