Utilization of Pre-trained Language Model for Adapter-based Knowledge
Transfer in Software Engineering
- URL: http://arxiv.org/abs/2307.08540v2
- Date: Tue, 6 Feb 2024 08:03:31 GMT
- Title: Utilization of Pre-trained Language Model for Adapter-based Knowledge
Transfer in Software Engineering
- Authors: Iman Saberi, Fatemeh Fard and Fuxiang Chen
- Abstract summary: We study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization.
These adapters are trained on code corpora and are inserted into a PLM that is pre-trained on English corpora or code corpora.
We observed an improvement in results using NL-PLM over a PLM without adapters, suggesting that adapters can transfer and utilize useful knowledge from NL-PLM to SE tasks.
- Score: 0.3963827913892984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software Engineering (SE) Pre-trained Language Models (PLMs), such as
CodeBERT, are pre-trained on large code corpora, and their learned knowledge has
shown success in transferring to downstream tasks (e.g., code clone detection)
through fine-tuning of the PLMs. In Natural Language Processing (NLP), an
alternative way of transferring the knowledge of PLMs is explored through the
use of adapters, compact and parameter-efficient modules that are inserted into
a PLM. Although the use of adapters has shown promising results on many NLP
downstream tasks, their application and exploration in SE downstream tasks are
limited.
Here, we study knowledge transfer using adapters on multiple downstream tasks,
including cloze test, code clone detection, and code summarization. These
adapters are trained on code corpora and are inserted into a PLM that is
pre-trained on English corpora or code corpora. We refer to these PLMs as NL-PLM
and C-PLM, respectively. We observed an improvement in results using NL-PLM over
a PLM without adapters, suggesting that adapters can transfer and utilize useful
knowledge from NL-PLM for SE tasks. The results are sometimes on par with or
exceed those of C-PLM, while being more efficient in terms of the number of
parameters and training time. Interestingly, adapters inserted into a C-PLM
generally yield better results than a traditionally fine-tuned C-PLM. Our
results open new directions for building more compact models for SE tasks.
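As a concrete illustration of the adapter mechanism described in the abstract, the following is a minimal sketch of a bottleneck adapter together with a helper that freezes the backbone so that only adapter parameters are trained. It assumes a PyTorch backbone with a RoBERTa/CodeBERT-style hidden size of 768; the class and function names are illustrative and are not taken from the paper's implementation.

```python
# Minimal sketch (assumptions: PyTorch backbone, hidden size 768 as in
# CodeBERT/RoBERTa-base). Names are illustrative, not the paper's code.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Down-project -> non-linearity -> up-project, with a residual connection."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual keeps the adapter close to the identity at initialization,
        # so the frozen PLM's behavior is preserved before adapter training starts.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


def freeze_backbone_except_adapters(model: nn.Module) -> None:
    # Only parameters whose names contain "adapter" receive gradients; the
    # pre-trained backbone stays frozen, which is what makes adapter training
    # cheaper than full fine-tuning in both parameters and training time.
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name.lower()
```

With a hidden size of 768 and a bottleneck of 64, each adapter adds roughly 100K parameters per layer, a small fraction of the roughly 125M parameters in a CodeBERT-sized backbone, which is where the parameter and training-time savings mentioned in the abstract come from.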
Related papers
- Leveraging Large Language Models for Wireless Symbol Detection via In-Context Learning [29.28683810366379]
We propose to leverage the in-context learning ability (a.k.a. prompting) of large language models (LLMs) to solve wireless tasks in the low data regime without any training or fine-tuning.
Our results reveal that using LLMs via ICL methods generally outperforms traditional DNNs on the symbol demodulation task.
arXiv Detail & Related papers (2024-08-28T17:19:20Z) - Exploring and Unleashing the Power of Large Language Models in Automated Code Translation [40.25727029618665]
This paper investigates diverse LLMs and learning-based transpilers for automated code translation tasks.
UniTrans is a Unified code Translation framework, applicable to various LLMs.
Three recent LLMs of diverse sizes are tested with UniTrans, and all achieve substantial improvements.
arXiv Detail & Related papers (2024-04-23T00:49:46Z) - CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data for instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z) - StepCoder: Improve Code Generation with Reinforcement Learning from
Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks.
FGO only optimizes the model by masking the unexecuted code segments, providing Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z) - Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large
Language Models [77.2078051555533]
We propose a novel and affordable solution, MMA, for the effective vision-language (VL) adaptation of large language models (LLMs).
Instead of using large neural networks to connect the image encoder and LLM, MMA adopts lightweight modules, i.e., adapters.
MMA is also equipped with a routing algorithm to help LLMs achieve an automatic shift between single- and multi-modal instructions.
arXiv Detail & Related papers (2023-05-24T11:06:15Z) - LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of
Large Language Models [75.25782573728677]
This paper presents a framework for adapter-based parameter-efficient fine-tuning (PEFT) of large language models (LLMs).
The framework includes state-of-the-art open-access LLMs such as LLaMA, BLOOM, and GPT-J, as well as widely used adapters such as Series adapters, Parallel adapters, Prompt-based learning, and Reparametrization-based methods (a minimal series-vs-parallel placement sketch appears after this list).
We evaluate the effectiveness of the adapters on fourteen datasets from two different reasoning tasks, Arithmetic Reasoning and Commonsense Reasoning.
arXiv Detail & Related papers (2023-04-04T16:31:37Z) - CHAPTER: Exploiting Convolutional Neural Network Adapters for
Self-supervised Speech Models [62.60723685118747]
Self-supervised learning (SSL) is a powerful technique for learning representations from unlabeled data.
We propose an efficient tuning method specifically designed for SSL speech models, by applying CNN adapters at the feature extractor.
We empirically found that adding CNN adapters to the feature extractor helps adaptation on emotion and speaker tasks.
arXiv Detail & Related papers (2022-12-01T08:50:12Z) - Selective Token Generation for Few-shot Natural Language Generation [19.015739016376532]
We develop a novel additive learning algorithm based on reinforcement learning (RL).
We show that the proposed selective token generation significantly outperforms the previous additive learning algorithms based on the PLMs.
arXiv Detail & Related papers (2022-09-17T00:48:52Z) - KALA: Knowledge-Augmented Language Model Adaptation [65.92457495576141]
We propose a novel domain adaptation framework for pre-trained language models (PLMs).
Knowledge-Augmented Language model Adaptation (KALA) modulates the intermediate hidden representations of PLMs with domain knowledge.
Results show that, despite being computationally efficient, our KALA largely outperforms adaptive pre-training.
arXiv Detail & Related papers (2022-04-22T08:11:59Z) - On The Cross-Modal Transfer from Natural Language to Code through
Adapter Modules [0.0]
We explore the knowledge transfer using adapters in software engineering.
Three programming languages, C/C++, Python, and Java, are studied along with extensive experiments on the best setup used for adapters.
Our results can open new directions to build smaller models for more software engineering tasks.
arXiv Detail & Related papers (2022-04-19T04:18:02Z)