One Adapter for All Programming Languages? Adapter Tuning for Code
Search and Summarization
- URL: http://arxiv.org/abs/2303.15822v1
- Date: Tue, 28 Mar 2023 08:49:54 GMT
- Title: One Adapter for All Programming Languages? Adapter Tuning for Code
Search and Summarization
- Authors: Deze Wang, Boxing Chen, Shanshan Li, Wei Luo, Shaoliang Peng, Wei
Dong, Xiangke Liao
- Abstract summary: We find that multilingual fine-tuning leads to performance degradation on the recent models UniXcoder and CodeT5.
To alleviate the potential catastrophic forgetting issue in multilingual models, we fix all pre-trained model parameters, insert parameter-efficient adapter modules, and fine-tune only those.
Our experiments on three probing tasks show that adapter tuning significantly outperforms full-model fine-tuning and effectively overcomes catastrophic forgetting.
- Score: 27.27985393610581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As pre-trained models automate many code intelligence tasks, a widely used
paradigm is to fine-tune a model on the task dataset for each programming
language. A recent study reported that multilingual fine-tuning benefits a
range of tasks and models. However, we find that multilingual fine-tuning
leads to performance degradation on the recent models UniXcoder and CodeT5.
To alleviate the potential catastrophic forgetting issue in multilingual
models, we fix all pre-trained model parameters, insert parameter-efficient
adapter modules, and fine-tune only those. Updating only 0.6% of the overall
parameters relative to full-model fine-tuning for each programming language,
adapter tuning yields consistent improvements on code search and summarization
tasks, achieving state-of-the-art results. In addition, we experimentally show
its effectiveness in cross-lingual and low-resource scenarios. Multilingual
fine-tuning with 200 samples per programming language approaches the results
of fine-tuning with the entire dataset on code summarization. Our experiments on
three probing tasks show that adapter tuning significantly outperforms
full-model fine-tuning and effectively overcomes catastrophic forgetting.
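
To make the approach described in the abstract more concrete, the following is a
minimal PyTorch sketch of bottleneck-adapter tuning: all pre-trained parameters
are frozen and only small adapter modules inserted into each layer are trained.
This is an illustrative sketch assuming the standard down-project / non-linearity /
up-project adapter design with a residual connection; the layer sizes, adapter
placement, and stand-in base layers are assumptions, not the authors' exact
UniXcoder/CodeT5 configuration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))  # residual connection

class AdaptedLayer(nn.Module):
    """Wraps a pre-trained layer and inserts a trainable adapter after it."""
    def __init__(self, pretrained_layer: nn.Module, hidden_size: int):
        super().__init__()
        self.layer = pretrained_layer
        self.adapter = Adapter(hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.layer(x))

def freeze_except_adapters(model: nn.Module) -> float:
    """Freeze every parameter whose name lacks 'adapter'; return trainable fraction."""
    trainable, total = 0, 0
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name
        total += param.numel()
        trainable += param.numel() if param.requires_grad else 0
    return trainable / total

if __name__ == "__main__":
    hidden = 768
    # Stand-in for a pre-trained encoder stack; a real setup would wrap the
    # transformer layers of a model such as UniXcoder or CodeT5.
    base_layers = [nn.Linear(hidden, hidden) for _ in range(12)]
    model = nn.Sequential(*(AdaptedLayer(layer, hidden) for layer in base_layers))
    fraction = freeze_except_adapters(model)
    print(f"trainable parameter fraction: {fraction:.3%}")
```

The printed fraction depends on the size of the stand-in backbone and will not
match the 0.6% reported in the abstract, which refers to the full pre-trained
models; the sketch only shows the mechanics of freezing the backbone and
training adapters per programming language.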
Related papers
- Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR [44.949146169903074] (arXiv 2024-01-17)
The heterogeneous nature and imbalanced data abundance of different languages may cause performance degradation.
Our proposed method brings 12.2% word error rate reduction on average and up to 37.5% on a single locale.
- On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based Multilingual Model [49.81429697921861] (arXiv 2023-11-14)
We study the interaction between parameter-efficient fine-tuning (PEFT) and cross-lingual tasks in multilingual autoregressive models.
We show that prompt tuning is more effective in enhancing the performance of low-resource languages than fine-tuning.
- Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation [40.695782736177264] (arXiv 2023-05-24)
Bactrian-X is a comprehensive multilingual parallel dataset of 3.4 million instruction-response pairs across 52 languages.
We train a set of adapters using low-rank adaptation (LoRA), lightweight components that integrate seamlessly with large language models (a rough sketch of this idea follows the list).
Experiments in various multilingual evaluation settings demonstrate that models derived from LoRA-based training over Bactrian-X outperform both the vanilla models and existing instruction-tuned models.
- Crosslingual Generalization through Multitask Finetuning [80.8822603322471] (arXiv 2022-11-03)
Multitask prompted finetuning (MTF) has been shown to help large language models generalize to new tasks in a zero-shot setting.
We apply MTF to the pretrained multilingual BLOOM and mT5 model families to produce finetuned variants called BLOOMZ and mT0.
We find that finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages.
- Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models [12.759281077118567] (arXiv 2022-05-12)
Massively multilingual Transformer-based language models have been observed to be surprisingly effective on zero-shot transfer across languages.
We build upon some of the existing techniques for predicting the zero-shot performance on a task by modeling it as a multi-task learning problem.
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949] (arXiv 2021-03-24)
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481] (arXiv 2020-10-20)
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
- Balancing Training for Multilingual Neural Machine Translation [130.54253367251738] (arXiv 2020-04-14)
Multilingual machine translation (MT) models can translate to/from multiple languages.
Standard practice is to up-sample less-resourced languages to increase their representation.
We propose a method that instead automatically learns how to weight training data through a data scorer.
- Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning [70.81910984985683] (arXiv 2020-04-08)
We propose an effective way to fine-tune multiple downstream generation tasks simultaneously using a single, large pre-trained model.
Experiments on five diverse language generation tasks show that by using only an additional 2-3% of parameters for each task, our model can maintain or even improve the performance of fine-tuning the whole model.
The related papers list above is automatically generated from the titles and abstracts of the papers on this site.