Exploring Continual Learning for Code Generation Models
- URL: http://arxiv.org/abs/2307.02435v1
- Date: Wed, 5 Jul 2023 16:58:39 GMT
- Title: Exploring Continual Learning for Code Generation Models
- Authors: Prateek Yadav, Qing Sun, Hantian Ding, Xiaopeng Li, Dejiao Zhang, Ming
Tan, Xiaofei Ma, Parminder Bhatia, Ramesh Nallapati, Murali Krishna
Ramanathan, Mohit Bansal, Bing Xiang
- Abstract summary: Continual Learning (CL) is an important aspect that remains underexplored in the code domain.
We introduce a benchmark called CodeTask-CL that covers a wide range of tasks, including code generation, translation, summarization, and refinement.
We find that effective methods like Prompt Pooling (PP) suffer from catastrophic forgetting due to the unstable training of the prompt selection mechanism.
- Score: 80.78036093054855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale code generation models such as Codex and CodeT5 have achieved
impressive performance. However, libraries are upgraded or deprecated very
frequently and re-training large-scale language models is computationally
expensive. Therefore, Continual Learning (CL) is an important aspect that
remains underexplored in the code domain. In this paper, we introduce a
benchmark called CodeTask-CL that covers a wide range of tasks, including code
generation, translation, summarization, and refinement, with different input
and output programming languages. Next, on our CodeTask-CL benchmark, we
compare popular CL techniques from NLP and Vision domains. We find that
effective methods like Prompt Pooling (PP) suffer from catastrophic forgetting
due to the unstable training of the prompt selection mechanism caused by stark
distribution shifts in coding tasks. We address this issue with our proposed
method, Prompt Pooling with Teacher Forcing (PP-TF), that stabilizes training
by enforcing constraints on the prompt selection mechanism and leads to a
21.54% improvement over Prompt Pooling. Along with the benchmark, we establish
a training pipeline that can be used for CL on code models, which we believe
can motivate further development of CL methods for code models. Our code is
available at https://github.com/amazon-science/codetaskcl-pptf
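To make the abstract's prompt-selection idea concrete: below is a minimal, illustrative PyTorch sketch of an L2P-style prompt pool with an optional teacher-forced restriction on selection. This is one plausible reading of PP-TF, not the authors' implementation; the class, shapes, and the fixed task-to-prompt assignment are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptPool(nn.Module):
    """L2P-style prompt pool: learnable prompts, each paired with a
    learnable key that is matched against an input query at selection time."""

    def __init__(self, pool_size=20, prompt_len=5, embed_dim=768, top_k=3):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, embed_dim))
        self.keys = nn.Parameter(torch.randn(pool_size, embed_dim))
        self.top_k = top_k

    def forward(self, query, allowed=None):
        # query: (batch, embed_dim), e.g. a frozen encoder's pooled output.
        sim = F.cosine_similarity(query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
        if allowed is not None:
            # "Teacher forcing": restrict selection to prompts pre-assigned to
            # the current task, so stark cross-task distribution shifts cannot
            # destabilize the query-key matching.
            mask = torch.full_like(sim, float("-inf"))
            mask[:, allowed] = 0.0
            sim = sim + mask
        idx = sim.topk(self.top_k, dim=-1).indices   # (batch, top_k)
        selected = self.prompts[idx]                 # (batch, top_k, prompt_len, embed_dim)
        # Flatten into a prefix to prepend to the input token embeddings.
        return selected.flatten(1, 2)

# Hypothetical fixed task-to-prompt assignment used only during training:
task_prompts = {"generation": [0, 1, 2], "translation": [3, 4, 5]}
pool = PromptPool()
query = torch.randn(8, 768)
prefix = pool(query, allowed=task_prompts["translation"])
print(prefix.shape)  # torch.Size([8, 15, 768])
```

At inference, `allowed=None` recovers unconstrained top-k matching over the whole pool; the constraint is only needed while the keys are being trained.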
Related papers
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking long-sequence code generation into a Curriculum of Code Completion Subtasks.
FGO provides Fine-Grained Optimization by masking unexecuted code segments so that only executed code contributes to the update (see the masking sketch after this list).
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches on the corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
- Introducing Language Guidance in Prompt-based Continual Learning [95.03110230754423]
We propose Language Guidance for Prompt-based Continual Learning (LGCL) as a plug-in for prompt-based methods.
LGCL consistently improves the performance of prompt-based continual learning methods, setting a new state of the art.
arXiv Detail & Related papers (2023-08-30T08:03:49Z)
- CodeT5+: Open Code Large Language Models for Code Understanding and Generation [72.1638273937025]
Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence.
CodeT5+ is a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit a wide range of downstream code tasks.
We extensively evaluate CodeT5+ on over 20 code-related benchmarks in different settings, including zero-shot, finetuning, and instruction-tuning.
arXiv Detail & Related papers (2023-05-13T14:23:07Z)
- Execution-based Code Generation using Deep Reinforcement Learning [8.085533911328577]
PPOCoder is a new framework for code generation that combines pre-trained PL models with Proximal Policy Optimization.
PPOCoder seamlessly integrates external code-specific knowledge into the model optimization process.
PPOCoder is a task- and model-agnostic framework that can be applied across different code generation tasks and programming languages (see the execution-reward sketch after this list).
arXiv Detail & Related papers (2023-01-31T18:02:26Z)
- CLAWSAT: Towards Both Robust and Accurate Code Models [74.57590254102311]
We integrate contrastive learning (CL) with adversarial learning to co-optimize the robustness and accuracy of code models.
To the best of our knowledge, this is the first systematic study to explore and exploit the robustness and accuracy benefits of (multi-view) code obfuscations in code models.
arXiv Detail & Related papers (2022-11-21T18:32:50Z)
- CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation [36.47905744758698]
We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers.
Our model employs a unified framework to seamlessly support both code understanding and generation tasks and allows for multi-task learning.
arXiv Detail & Related papers (2021-09-02T12:21:06Z)
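For the fine-grained optimization idea in the StepCoder entry above, here is a minimal, illustrative Python sketch, not StepCoder's implementation: it traces which lines of a generated snippet actually execute, then masks the per-token loss so unexecuted segments contribute nothing. The `token_lines` mapping from tokens to source lines is an assumed input, and a real pipeline would sandbox execution rather than use bare `exec()`.

```python
import sys
import torch

def executed_lines(src: str) -> set:
    """Execute a generated snippet and record which 1-indexed source lines ran.
    Illustration only; real training would sandbox this and run unit tests."""
    hit = set()

    def tracer(frame, event, arg):
        # Only count lines belonging to the exec'd string, not library code.
        if event == "line" and frame.f_code.co_filename == "<string>":
            hit.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        exec(src, {})
    except Exception:
        pass  # a failing run still yields partial line coverage
    finally:
        sys.settrace(None)
    return hit

def masked_loss(token_loss, token_lines, src):
    # token_loss: (seq_len,) per-token losses for the generated code.
    # token_lines: assumed tokenizer-provided source line of each token.
    hit = executed_lines(src)
    mask = torch.tensor([float(ln in hit) for ln in token_lines])
    return (token_loss * mask).sum() / mask.sum().clamp(min=1.0)
```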
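For the execution feedback used by PPOCoder above, a minimal sketch of an execution-based reward: run the generated program against its unit tests and map the outcome to a scalar that an RL objective such as PPO can consume. The helper name and reward values are assumptions, not PPOCoder's, and production systems would isolate execution in a sandbox.

```python
import os
import subprocess
import sys
import tempfile

def execution_reward(code: str, tests: str, timeout: float = 5.0) -> float:
    """Illustrative reward: append the tests to the generated code, run the
    file, and score the outcome. Unsandboxed execution is for sketch only."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + tests)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path],
                              capture_output=True, timeout=timeout)
        return 1.0 if proc.returncode == 0 else -0.3  # pass vs. test/runtime failure
    except subprocess.TimeoutExpired:
        return -0.6  # likely non-terminating program
    finally:
        os.remove(path)

# Hypothetical usage: the scalar feeds the advantage estimate in a PPO loop.
reward = execution_reward("def add(a, b):\n    return a + b",
                          "assert add(2, 3) == 5")
print(reward)  # 1.0
```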
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.