On The Cross-Modal Transfer from Natural Language to Code through
Adapter Modules
- URL: http://arxiv.org/abs/2204.08653v1
- Date: Tue, 19 Apr 2022 04:18:02 GMT
- Title: On The Cross-Modal Transfer from Natural Language to Code through
Adapter Modules
- Authors: Divyam Goel, Ramansh Grover, Fatemeh H. Fard
- Abstract summary: We explore knowledge transfer using adapters in software engineering.
Three programming languages, C/C++, Python, and Java, are studied, along with extensive experiments on the best adapter setup.
Our results can open new directions for building smaller models for more software engineering tasks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained neural Language Models (PTLM), such as CodeBERT, have recently been used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g., code clone detection) via fine-tuning. In natural language processing (NLP), other alternatives for transferring the knowledge of PTLMs have been explored through adapters: compact, parameter-efficient modules inserted into the layers of the PTLM. Although adapters are known to ease adaptation to many downstream tasks compared to fine-tuning, which requires retraining all of the model's parameters -- owing to the adapters' plug-and-play nature and parameter efficiency -- their usage in software engineering has not been explored.
Here, we explore knowledge transfer using adapters, building on the Naturalness Hypothesis proposed by Hindle et al. \cite{hindle2016naturalness}. We study the bimodality of adapters on two tasks, cloze test and code clone detection, against their benchmarks from the CodeXGLUE platform. These adapters are trained on programming languages and are inserted into a PTLM that is pre-trained on English corpora (N-PTLM). Three programming languages, C/C++, Python, and Java, are studied, along with extensive experiments on the best setup used for adapters. Improving the results of the N-PTLM confirms the success of the adapters in transferring knowledge to software engineering; the results are sometimes on par with or exceed those of a PTLM trained on source code, while being more efficient in terms of the number of parameters, memory usage, and inference time. Our results can open new directions for building smaller models for more software engineering tasks. We open-source all the scripts and the trained adapters.
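For illustration, the following is a minimal sketch of the kind of bottleneck adapter this line of work builds on: a small down-projection, a non-linearity, an up-projection, and a residual connection, attached to each layer of a frozen PTLM. The choice of "roberta-base" as the English-pretrained N-PTLM, the bottleneck size, and the per-layer placement are assumptions made for this example; the paper's actual adapters and training scripts are the ones it open-sources.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BottleneckAdapter(nn.Module):
    """Illustrative bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # down-projection
        self.up = nn.Linear(bottleneck_size, hidden_size)    # up-projection
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Only these small projections are trained; the PTLM weights stay frozen.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Freeze an English-pretrained PTLM (the N-PTLM) and attach one adapter per layer;
# applying each adapter to its layer's output is left to the task-specific training loop.
model = AutoModel.from_pretrained("roberta-base")
for p in model.parameters():
    p.requires_grad = False
adapters = nn.ModuleList(
    [BottleneckAdapter(model.config.hidden_size) for _ in model.encoder.layer]
)
```

Only the adapters (a small fraction of the model's parameters) would be updated when training on a programming-language corpus, which is what makes the approach parameter-efficient.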
Related papers
- Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R [1.9799527196428242]
Large Language Models (LLMs) have gained a lot of attention in the Software Engineering (SE) community.
In this work, we empirically study two PEFT methods, LoRA and Compacter, on CodeT5 and CodeLlama.
We assess their performance compared to fully fine-tuned models, whether they can be used for knowledge transfer from natural language models to code, and their ability to adapt the learned knowledge to an unseen language (a minimal sketch of LoRA is given after this list).
arXiv Detail & Related papers (2024-03-16T03:12:45Z) - Utilization of Pre-trained Language Model for Adapter-based Knowledge
Transfer in Software Engineering [0.3963827913892984]
We study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization.
Adapters are trained on code corpora and are inserted into a PLM that is pre-trained on English corpora or code corpora.
We observed an improvement in results using NL-PLM over a PLM without adapters, suggesting that adapters can transfer and utilize useful knowledge from NL-PLM for SE tasks.
arXiv Detail & Related papers (2023-07-17T14:58:52Z) - Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large
Language Models [77.2078051555533]
We propose a novel and affordable solution for the effective VL adaptation of large language models (LLMs).
Instead of using large neural networks to connect the image encoder and the LLM, MMA adopts lightweight modules, i.e., adapters.
MMA is also equipped with a routing algorithm to help LLMs achieve an automatic shift between single- and multi-modal instructions.
arXiv Detail & Related papers (2023-05-24T11:06:15Z) - LeTI: Learning to Generate from Textual Interactions [60.425769582343506]
We explore LMs' potential to learn from textual interactions (LETI) that not only check their correctness with binary labels but also pinpoint and explain errors in their outputs through textual feedback.
Our focus is the code generation task, where the model produces code based on natural language instructions.
LETI iteratively fine-tunes the model, using the LM objective, on a concatenation of natural language instructions, LM-generated programs, and textual feedback.
arXiv Detail & Related papers (2023-05-17T15:53:31Z) - LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of
Large Language Models [75.25782573728677]
This paper presents a framework for adapter-based parameter-efficient fine-tuning (PEFT) of large language models (LLMs).
The framework includes state-of-the-art open-access LLMs such as LLaMA, BLOOM, and GPT-J, as well as widely used adapters such as Series adapters, Parallel adapters, Prompt-based learning, and Reparametrization-based methods.
We evaluate the effectiveness of the adapters on fourteen datasets from two different reasoning tasks, Arithmetic Reasoning and Commonsense Reasoning.
arXiv Detail & Related papers (2023-04-04T16:31:37Z) - CHAPTER: Exploiting Convolutional Neural Network Adapters for
Self-supervised Speech Models [62.60723685118747]
Self-supervised learning (SSL) is a powerful technique for learning representations from unlabeled data.
We propose an efficient tuning method specifically designed for SSL speech models, applying CNN adapters at the feature extractor.
We empirically found that adding CNN adapters to the feature extractor helps adaptation on emotion and speaker tasks.
arXiv Detail & Related papers (2022-12-01T08:50:12Z) - Selective Token Generation for Few-shot Natural Language Generation [19.015739016376532]
We develop a novel additive learning algorithm based on reinforcement learning (RL).
We show that the proposed selective token generation significantly outperforms previous additive learning algorithms based on PLMs.
arXiv Detail & Related papers (2022-09-17T00:48:52Z) - AdapterHub Playground: Simple and Flexible Few-Shot Learning with
Adapters [34.86139827292556]
Open-access dissemination of pretrained language models has led to a democratization of state-of-the-art natural language processing (NLP) research.
This also allows people outside of NLP to use such models and adapt them to specific use-cases, although doing so still requires considerable programming skill.
In this work, we aim to overcome this gap by providing a tool that allows researchers to leverage pretrained models without writing a single line of code.
arXiv Detail & Related papers (2021-08-18T11:56:01Z) - Exploiting Adapters for Cross-lingual Low-resource Speech Recognition [52.40623653290499]
Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language.
We investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation.
arXiv Detail & Related papers (2021-05-18T08:30:37Z) - AdapterHub: A Framework for Adapting Transformers [148.6877231725939]
AdapterHub is a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
Our framework enables scalable and easy access to sharing of task-specific models.
arXiv Detail & Related papers (2020-07-15T15:56:05Z)
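As a brief illustration of the LoRA method mentioned in the first related paper above, the following is a generic sketch of the core idea: a frozen pre-trained weight matrix augmented with a trainable low-rank update. The dimensions, rank, and scaling are assumptions for the example, not the cited papers' code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: y = W x + (alpha / r) * B A x, with W frozen and A, B trainable."""
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad = False                               # frozen pre-trained weights
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example with an assumed 768-wide projection; only lora_A and lora_B receive gradients.
layer = LoRALinear(768, 768)
out = layer(torch.randn(2, 10, 768))
```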