AdvFusion: Adapter-based Knowledge Transfer for Code Summarization on Code Language Models
- URL: http://arxiv.org/abs/2307.07854v3
- Date: Sat, 21 Dec 2024 01:41:13 GMT
- Title: AdvFusion: Adapter-based Knowledge Transfer for Code Summarization on Code Language Models
- Authors: Iman Saberi, Amirreza Esmaeili, Fatemeh Fard, Fuxiang Chen
- Abstract summary: We propose AdvFusion, a PEFT-based approach that effectively learns from other languages before adapting to the target task. We evaluate it on code summarization and method name prediction. It outperforms AdapterFusion by up to 1.7 points and surpasses LoRA with gains of 1.99, 1.26, and 2.16 for Ruby, JavaScript, and Go, respectively.
- Score: 0.3228451873135423
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Programming languages can benefit from one another by utilizing a pre-trained model for software engineering tasks such as code summarization and method name prediction. While full fine-tuning of Code Language Models (Code-LMs) has been explored for multilingual knowledge transfer, research on Parameter Efficient Fine-Tuning (PEFT) for this purpose is limited. AdapterFusion, a PEFT architecture, aims to enhance task performance by leveraging information from multiple languages but primarily focuses on the target language. To address this, we propose AdvFusion, a novel PEFT-based approach that effectively learns from other languages before adapting to the target task. Evaluated on code summarization and method name prediction, AdvFusion outperforms AdapterFusion by up to 1.7 points and surpasses LoRA with gains of 1.99, 1.26, and 2.16 for Ruby, JavaScript, and Go, respectively. We open-source our scripts for replication purposes.
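For readers unfamiliar with adapter fusion, the sketch below illustrates the general mechanism AdvFusion builds on: each source language gets a small bottleneck adapter, and a fusion layer attends over the adapters' outputs to decide how much each language contributes at every token position. This is a minimal, hypothetical PyTorch illustration; the module names, dimensions, and training details (including which parameters are frozen and AdvFusion's schedule of learning from other languages before adapting to the target task) are assumptions, not the authors' released code.

```python
# Hypothetical sketch of an AdapterFusion-style layer over per-language adapters.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Standard down-project / nonlinearity / up-project adapter with a residual."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(self.act(self.down(hidden)))


class AdapterFusionLayer(nn.Module):
    """Attends over the outputs of several (typically frozen) language adapters.

    The query comes from the transformer hidden state; keys and values come from
    each language adapter's output, so the fusion weights indicate how much each
    source language contributes per token.
    """

    def __init__(self, hidden_dim: int, language_adapters: nn.ModuleList):
        super().__init__()
        self.language_adapters = language_adapters
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.value = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim)
        adapter_outputs = torch.stack(
            [adapter(hidden) for adapter in self.language_adapters], dim=2
        )                                            # (batch, seq, langs, hidden)
        q = self.query(hidden).unsqueeze(2)          # (batch, seq, 1, hidden)
        k = self.key(adapter_outputs)                # (batch, seq, langs, hidden)
        v = self.value(adapter_outputs)
        scores = (q * k).sum(-1) / hidden.size(-1) ** 0.5   # (batch, seq, langs)
        weights = scores.softmax(dim=-1).unsqueeze(-1)
        fused = (weights * v).sum(dim=2)             # (batch, seq, hidden)
        return hidden + fused


if __name__ == "__main__":
    hidden_dim, langs = 768, ["ruby", "javascript", "go", "python"]
    adapters = nn.ModuleList(BottleneckAdapter(hidden_dim) for _ in langs)
    for p in adapters.parameters():      # language adapters stay frozen during fusion
        p.requires_grad_(False)
    fusion = AdapterFusionLayer(hidden_dim, adapters)
    states = torch.randn(2, 16, hidden_dim)
    print(fusion(states).shape)          # torch.Size([2, 16, 768])
```

In an AdapterFusion-style setup the per-language adapters are usually trained first and then frozen, and only the fusion parameters (the query/key/value projections above) are updated on the target task and language.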
Related papers
- Multi-Agent Collaboration for Multilingual Code Instruction Tuning [41.74155456003822]
We introduce a novel multi-agent collaboration framework to enhance multilingual instruction tuning for code LLMs.
Multiple language-specific intelligent agent components with generation memory work together to transfer knowledge from one language to another efficiently and effectively.
Experimental results on multilingual programming benchmarks demonstrate the superior performance of Qwen2.5-xCoder in sharing common knowledge.
arXiv Detail & Related papers (2025-02-11T11:46:38Z) - I Can't Share Code, but I need Translation -- An Empirical Study on Code Translation through Federated LLM [3.9373541926236766]
This study demonstrates that participants can collaboratively develop a FedLLM for efficient code translation.
Our findings indicate that FedLLM offers a collaborative approach to code translation and could serve as a promising direction for future research in this field.
arXiv Detail & Related papers (2025-01-10T05:43:36Z) - Unraveling the Potential of Large Language Models in Code Translation: How Far Are We? [4.616570111453259]
Large language models (LLMs) exhibit state-of-the-art performance on various tasks, but struggle with code translation.
We conduct a large-scale empirical study to investigate the capabilities and limitations of LLMs in code translation tasks.
We propose two methods: (1) intermediary translation which selects an intermediary language between the source and target ones; and (2) self-training which fine-tunes LLMs on self-generated parallel data.
arXiv Detail & Related papers (2024-10-13T12:20:12Z) - SpecTra: Enhancing the Code Translation Ability of Language Models by Generating Multi-Modal Specifications [17.60108067953814]
Large language models (LLMs) are increasingly being used for the task of automated code translation.
We propose SpecTra, a multi-stage approach that uses a novel self-consistency filter to first generate high-quality specifications.
arXiv Detail & Related papers (2024-05-28T20:48:30Z) - IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators [49.903001442804594]
This work investigates the prospect of leveraging compiler intermediate representations (IR) to improve the multilingual capabilities of Code-LMs.
We first compile SLTrans, a parallel dataset consisting of nearly 4M self-contained source code files.
Next, we carry out continued causal language modelling training on SLTrans, forcing the Code-LMs to learn the IR language.
Our resulting models, dubbed IRCoder, display sizeable and consistent gains across a wide variety of code generation tasks and metrics.
arXiv Detail & Related papers (2024-03-06T17:52:08Z) - UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset [69.33424532827608]
Open-source large language models (LLMs) have gained significant strength across diverse fields.
In this work, we construct an open-source multilingual supervised fine-tuning dataset.
The resulting UltraLink dataset comprises approximately 1 million samples across five languages.
arXiv Detail & Related papers (2024-02-07T05:05:53Z) - Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization [126.96113831681338]
In this paper, we propose to improve zero-shot cross-lingual transfer by composing language or task specialized parameters.
Our method composes language and task PEFT modules via element-wise arithmetic operations to leverage unlabeled data and English labeled data (a minimal sketch of this composition appears after this list).
arXiv Detail & Related papers (2023-11-15T20:04:58Z) - Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction [18.926993352330797]
We propose TransFusion, a framework in which models are fine-tuned to use English translations of low-resource language data.
GoLLIE-TF, a cross-lingual instruction-tuned LLM for IE tasks, is designed to close the performance gap between high and low-resource languages.
arXiv Detail & Related papers (2023-05-23T01:23:22Z) - MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning [5.434698132994918]
We propose MetaTPTrans, a meta learning approach for multilingual code representation learning.
We show that MetaTPTrans improves the F1 score of state-of-the-art approaches significantly.
arXiv Detail & Related papers (2022-06-13T20:36:42Z) - Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The proposed approach replaces the shared vocabulary with a small language-specific vocabulary and fine-tunes the new embeddings on the new language's parallel data.
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z) - Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters [66.7986513246294]
We study the compositionality of language and domain adapters in the context of Machine Translation.
We find that in the partial resource scenario a naive combination of domain-specific and language-specific adapters often results in 'catastrophic forgetting' of the missing languages.
arXiv Detail & Related papers (2021-10-18T18:55:23Z) - Composable Sparse Fine-Tuning for Cross-Lingual Transfer [56.86192078426372]
Fine-tuning all parameters of a pre-trained model has become the mainstream approach for transfer learning.
We introduce a new fine-tuning method that combines the modularity of adapters with the expressivity of sparse fine-tuning.
It outperforms adapters in zero-shot cross-lingual transfer by a large margin.
arXiv Detail & Related papers (2021-10-14T17:27:29Z) - UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
Because gold labels for the translated text are not directly available for more complex tasks, we further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z) - MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer [136.09386219006123]
We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages.
MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and causal commonsense reasoning.
arXiv Detail & Related papers (2020-04-30T18:54:43Z)
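As referenced in the "Language and Task Arithmetic" entry above, the sketch below shows one way element-wise composition of PEFT modules can look in practice. The specific composition rule (task weights plus a scaled difference of target- and source-language module weights), the function names, and the toy shapes are illustrative assumptions; the paper's exact recipe may differ.

```python
# Hypothetical element-wise composition of same-shaped PEFT modules (e.g., LoRA
# matrices or adapter weights). Everything here is a toy illustration.
import torch


def compose_peft_modules(task_sd, src_lang_sd, tgt_lang_sd, lam=1.0):
    """Element-wise arithmetic over state dicts of same-shaped PEFT modules."""
    composed = {}
    for name, task_w in task_sd.items():
        composed[name] = task_w + lam * (tgt_lang_sd[name] - src_lang_sd[name])
    return composed


if __name__ == "__main__":
    # Toy state dicts standing in for PEFT weights learned separately for a task
    # (on English labeled data) and for two languages (on unlabeled data).
    shapes = {"lora_A": (8, 768), "lora_B": (768, 8)}
    make = lambda: {k: torch.randn(*s) for k, s in shapes.items()}
    task, en_lang, de_lang = make(), make(), make()
    merged = compose_peft_modules(task, en_lang, de_lang, lam=0.5)
    print({k: tuple(v.shape) for k, v in merged.items()})
```

Any set of same-shaped PEFT modules can be merged this way and loaded back into the frozen base model, which is what makes such arithmetic attractive for zero-shot cross-lingual transfer.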