On The Cross-Modal Transfer from Natural Language to Code through
Adapter Modules
- URL: http://arxiv.org/abs/2204.08653v1
- Date: Tue, 19 Apr 2022 04:18:02 GMT
- Title: On The Cross-Modal Transfer from Natural Language to Code through
Adapter Modules
- Authors: Divyam Goel, Ramansh Grover, Fatemeh H. Fard
- Abstract summary: We explore knowledge transfer using adapters in software engineering.
Three programming languages, C/C++, Python, and Java, are studied, along with extensive experiments on the best adapter setup.
Our results can open new directions for building smaller models for more software engineering tasks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained neural Language Models (PTLM), such as CodeBERT, have recently been used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g., code clone detection) via fine-tuning. In natural language processing (NLP), other alternatives for transferring the knowledge of PTLMs have been explored through adapters: compact, parameter-efficient modules inserted into the layers of the PTLM. Although adapters are known to ease adaptation to many downstream tasks compared to fine-tuning, which requires retraining all of the model's parameters -- owing to the adapters' plug-and-play nature and parameter efficiency -- their usage in software engineering has not been explored.
Here, we explore knowledge transfer using adapters, building on the Naturalness Hypothesis proposed by Hindle et al. \cite{hindle2016naturalness}. We study the bimodality of adapters on two tasks, cloze test and code clone detection, against their benchmarks from the CodeXGLUE platform. These adapters are trained on programming languages and are inserted into a PTLM that is pre-trained on English corpora (N-PTLM). Three programming languages, C/C++, Python, and Java, are studied, along with extensive experiments on the best setup used for adapters. Improving the results of the N-PTLM confirms the success of the adapters in transferring knowledge to software engineering; the results are sometimes on par with or exceed those of a PTLM trained on source code, while being more efficient in terms of the number of parameters, memory usage, and inference time. Our results can open new directions for building smaller models for more software engineering tasks. We open-source all the scripts and the trained adapters.
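For illustration, the following is a minimal sketch of the kind of bottleneck adapter this line of work builds on: a small down-projection, a non-linearity, an up-projection, and a residual connection, attached to each layer of a frozen PTLM. The choice of "roberta-base" as the English-pretrained N-PTLM, the bottleneck size, and the per-layer placement are assumptions made for this example; the paper's actual adapters and training scripts are the ones it open-sources.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BottleneckAdapter(nn.Module):
    """Illustrative bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # down-projection
        self.up = nn.Linear(bottleneck_size, hidden_size)    # up-projection
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Only these small projections are trained; the PTLM weights stay frozen.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Freeze an English-pretrained PTLM (the N-PTLM) and attach one adapter per layer;
# applying each adapter to its layer's output is left to the task-specific training loop.
model = AutoModel.from_pretrained("roberta-base")
for p in model.parameters():
    p.requires_grad = False
adapters = nn.ModuleList(
    [BottleneckAdapter(model.config.hidden_size) for _ in model.encoder.layer]
)
```

Only the adapters (a small fraction of the model's parameters) would be updated when training on a programming-language corpus, which is what makes the approach parameter-efficient.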
Related papers
- Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R [1.9799527196428242]
Large Language Models (LLMs) have gained a lot of attention in the Software Engineering (SE) community.
In this work, we empirically study two PEFT methods, LoRA and Compacter, on CodeT5 and CodeLlama.
We assess their performance compared to fully fine-tuned models, whether they can be used for knowledge transfer from natural language models to code, and their ability to adapt the learned knowledge to an unseen language (a minimal sketch of LoRA is given after this list).
arXiv Detail & Related papers (2024-03-16T03:12:45Z) - Utilization of Pre-trained Language Model for Adapter-based Knowledge
Transfer in Software Engineering [0.3963827913892984]
We study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization.
Adapters are trained on code corpora and are inserted into a PLM that is pre-trained on English corpora or code corpora.
We observed an improvement in results using NL-PLM over a PLM without adapters, suggesting that adapters can transfer and utilize useful knowledge from NL-PLM for SE tasks.
arXiv Detail & Related papers (2023-07-17T14:58:52Z) - Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large
Language Models [77.2078051555533]
We propose a novel and affordable solution for the effective VL adaptation of large language models (LLMs).
Instead of using large neural networks to connect the image encoder and the LLM, MMA adopts lightweight modules, i.e., adapters.
MMA is also equipped with a routing algorithm to help LLMs achieve an automatic shift between single- and multi-modal instructions.
arXiv Detail & Related papers (2023-05-24T11:06:15Z) - LeTI: Learning to Generate from Textual Interactions [60.425769582343506]
We explore LMs' potential to learn from textual interactions (LETI) that not only check their correctness with binary labels but also pinpoint and explain errors in their outputs through textual feedback.
Our focus is the code generation task, where the model produces code based on natural language instructions.
LETI iteratively fine-tunes the model, using the LM objective, on a concatenation of natural language instructions, LM-generated programs, and textual feedback.
arXiv Detail & Related papers (2023-05-17T15:53:31Z) - LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of
Large Language Models [75.25782573728677]
This paper presents a framework for adapter-based parameter-efficient fine-tuning (PEFT) of large language models (LLMs).
The framework includes state-of-the-art open-access LLMs such as LLaMA, BLOOM, and GPT-J, as well as widely used adapters such as Series adapters, Parallel adapters, Prompt-based learning, and Reparametrization-based methods.
We evaluate the effectiveness of the adapters on fourteen datasets from two different reasoning tasks, Arithmetic Reasoning and Commonsense Reasoning.
arXiv Detail & Related papers (2023-04-04T16:31:37Z) - CHAPTER: Exploiting Convolutional Neural Network Adapters for
Self-supervised Speech Models [62.60723685118747]
Self-supervised learning (SSL) is a powerful technique for learning representations from unlabeled data.
We propose an efficient tuning method specifically designed for SSL speech models, applying CNN adapters at the feature extractor.
We empirically found that adding CNN adapters to the feature extractor helps adaptation on emotion and speaker tasks.
arXiv Detail & Related papers (2022-12-01T08:50:12Z) - Selective Token Generation for Few-shot Natural Language Generation [19.015739016376532]
We develop a novel additive learning algorithm based on reinforcement learning (RL).
We show that the proposed selective token generation significantly outperforms previous additive learning algorithms based on PLMs.
arXiv Detail & Related papers (2022-09-17T00:48:52Z) - AdapterHub Playground: Simple and Flexible Few-Shot Learning with
Adapters [34.86139827292556]
Open-access dissemination of pretrained language models has led to a democratization of state-of-the-art natural language processing (NLP) research.
This also allows people outside of NLP to use such models and adapt them to specific use-cases, although doing so still requires considerable programming skill.
In this work, we aim to overcome this gap by providing a tool that allows researchers to leverage pretrained models without writing a single line of code.
arXiv Detail & Related papers (2021-08-18T11:56:01Z) - Exploiting Adapters for Cross-lingual Low-resource Speech Recognition [52.40623653290499]
Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language.
We investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation.
arXiv Detail & Related papers (2021-05-18T08:30:37Z) - AdapterHub: A Framework for Adapting Transformers [148.6877231725939]
AdapterHub is a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
Our framework enables scalable and easy access to sharing of task-specific models.
arXiv Detail & Related papers (2020-07-15T15:56:05Z)
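As a brief illustration of the LoRA method mentioned in the first related paper above, the following is a generic sketch of the core idea: a frozen pre-trained weight matrix augmented with a trainable low-rank update. The dimensions, rank, and scaling are assumptions for the example, not the cited papers' code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: y = W x + (alpha / r) * B A x, with W frozen and A, B trainable."""
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad = False                               # frozen pre-trained weights
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example with an assumed 768-wide projection; only lora_A and lora_B receive gradients.
layer = LoRALinear(768, 768)
out = layer(torch.randn(2, 10, 768))
```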