Improving Tree-Structured Decoder Training for Code Generation via
Mutual Learning
- URL: http://arxiv.org/abs/2105.14796v1
- Date: Mon, 31 May 2021 08:44:13 GMT
- Title: Improving Tree-Structured Decoder Training for Code Generation via
Mutual Learning
- Authors: Binbin Xie, Jinsong Su, Yubin Ge, Xiang Li, Jianwei Cui, Junfeng Yao
and Bin Wang
- Abstract summary: Code generation aims to automatically generate a piece of code given an input natural language utterance.
We first thoroughly analyze the differences in context modeling between neural code generation models with different decodings.
We then propose a mutual learning framework to jointly train these models.
- Score: 27.080718377956693
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code generation aims to automatically generate a piece of code given an input
natural language utterance. Currently, among dominant models, it is treated as
a sequence-to-tree task, where a decoder outputs a sequence of actions
corresponding to the pre-order traversal of an Abstract Syntax Tree. However,
such a decoder exploits only the preceding actions determined by the pre-order
traversal, which are insufficient to ensure correct action predictions. In this
paper, we first thoroughly analyze the difference in context modeling between
neural code generation models whose decodings are based on different traversals
(pre-order traversal vs. breadth-first traversal), and then propose a mutual
learning framework to jointly train these models. Under this framework, we
continuously enhance both models via mutual distillation, which involves synchronous
executions of two one-to-one knowledge transfers at each training step. More
specifically, we alternately choose one model as the student and the other as
its teacher, and require the student to fit the training data and the action
prediction distributions of its teacher. By doing so, both models can fully
absorb knowledge from each other and thus be improved simultaneously.
Experimental results and in-depth analysis on several benchmark datasets
demonstrate the effectiveness of our approach. We release our code at
https://github.com/DeepLearnXMU/CGML.
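As a rough illustration of the mutual distillation described above, the sketch below shows one joint training step in PyTorch. It assumes both decoders score a shared action vocabulary at aligned steps (the actual alignment between pre-order and breadth-first action sequences is glossed over here), and the weight `alpha` is a hypothetical hyperparameter, not a value from the paper.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_step(logits_pre, logits_bfs, targets, alpha=0.5):
    """One training step of mutual distillation between two decoders.

    logits_pre / logits_bfs: (batch, steps, n_actions) action scores from
    the pre-order and breadth-first decoders, assumed aligned per step
    (an assumption made for this sketch).
    targets: (batch, steps) gold action indices.
    """
    n_actions = logits_pre.size(-1)

    def loss_for(student_logits, teacher_logits):
        # The student fits the training data ...
        ce = F.cross_entropy(student_logits.reshape(-1, n_actions),
                             targets.reshape(-1))
        # ... and the teacher's action prediction distribution. The
        # teacher is detached: each model only teaches here, it is not
        # updated through the other's loss.
        kl = F.kl_div(F.log_softmax(student_logits, dim=-1),
                      F.softmax(teacher_logits.detach(), dim=-1),
                      reduction="batchmean")
        return ce + alpha * kl

    # Two synchronous one-to-one knowledge transfers: each model
    # alternately plays student against the other as teacher.
    loss_pre = loss_for(logits_pre, logits_bfs)  # pre-order decoder as student
    loss_bfs = loss_for(logits_bfs, logits_pre)  # BFS decoder as student
    return loss_pre + loss_bfs
```

Because each teacher distribution is detached, the two transfers do not interfere with each other and can run within a single backward pass.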
Related papers
- Code Representation Learning At Scale [75.04686476303436]
We fuel code representation learning with a vast amount of code data via a two-stage pretraining scheme.
We first train the encoders via a mix that leverages both the randomness of masked language modeling and the structural aspect of programming languages.
We then enhance the representations via contrastive learning, with hard negatives and hard positives constructed in an unsupervised manner (a toy version of this objective is sketched below).
arXiv Detail & Related papers (2024-02-02T22:19:15Z)
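A minimal sketch of the second-stage contrastive objective summarized in the entry above: an InfoNCE-style loss with in-batch positives plus one explicit hard negative per anchor. The embeddings, temperature, and the way positives and negatives are constructed are placeholders, not the paper's actual recipe.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, hard_negative, temperature=0.05):
    """InfoNCE with in-batch negatives plus one explicit hard negative.

    anchor, positive, hard_negative: (batch, dim) embeddings of a code
    snippet, an unsupervised-constructed positive view, and a hard
    negative (construction details are placeholders in this sketch).
    """
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    n = F.normalize(hard_negative, dim=-1)

    # Similarities to all in-batch positives (batch x batch) and to the
    # dedicated hard negatives (batch x batch).
    sim_pos = a @ p.t() / temperature
    sim_neg = a @ n.t() / temperature
    logits = torch.cat([sim_pos, sim_neg], dim=1)

    # The diagonal of sim_pos holds each anchor's true positive.
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```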
- DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of existing generative DocRE models.
We propose to generate from the relation matrix a symbolic and ordered sequence that is deterministic and easier for the model to learn (a toy linearization is sketched below).
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)
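To make the "symbolic and ordered sequence" idea from the DORE entry above concrete, here is a toy linearization of a relation matrix in fixed row-major order; the token format is invented for illustration, not DORE's actual target vocabulary.

```python
def linearize_relation_matrix(entities, relation_matrix, no_relation="NA"):
    """Turn an entity-pair relation matrix into a deterministic sequence.

    entities: list of entity names, e.g. ["e1", "e2", "e3"].
    relation_matrix: relation_matrix[i][j] is the relation label from
    entities[i] to entities[j], or no_relation.

    Enumerating pairs in fixed row-major order makes the target sequence
    deterministic, so the model never has to guess an output order.
    """
    tokens = []
    for i, head in enumerate(entities):
        for j, tail in enumerate(entities):
            rel = relation_matrix[i][j]
            if i != j and rel != no_relation:
                tokens.extend(["<triple>", head, rel, tail])
    return " ".join(tokens)

# Example: a 3x3 matrix with two relations.
m = [["NA", "works_for", "NA"],
     ["NA", "NA", "located_in"],
     ["NA", "NA", "NA"]]
print(linearize_relation_matrix(["alice", "acme", "berlin"], m))
# -> "<triple> alice works_for acme <triple> acme located_in berlin"
```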
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results in in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Efficient Sub-structured Knowledge Distillation [52.5931565465661]
We propose an approach that is much simpler in its formulation and far more efficient for training than existing approaches.
We transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.
arXiv Detail & Related papers (2022-03-09T15:56:49Z)
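A minimal picture of the local matching described in the entry above, using a sequence labeler as the structured model: the distillation KL is summed over per-position label distributions rather than taken over the exponentially large space of whole label sequences. Shapes and the reduction are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def substructured_kd_loss(student_logits, teacher_logits):
    """Distill by locally matching per-position predictions.

    student_logits / teacher_logits: (batch, seq_len, n_labels).
    Matching each position's distribution is tractable; matching
    distributions over entire label sequences (n_labels ** seq_len
    outcomes) is not.
    """
    log_p_student = F.log_softmax(student_logits, dim=-1)
    p_teacher = F.softmax(teacher_logits.detach(), dim=-1)
    # KL per position, summed over positions, averaged over the batch.
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(-1)
    return kl.sum(-1).mean()
```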
- UniXcoder: Unified Cross-Modal Pre-training for Code Representation [65.6846553962117]
We present UniXcoder, a unified cross-modal pre-trained model for programming languages.
We propose a one-to-one mapping method to transform an AST into a sequence structure that retains all structural information from the tree.
We evaluate UniXcoder on five code-related tasks over nine datasets.
arXiv Detail & Related papers (2022-03-08T04:48:07Z)
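The "one-to-one mapping" in the UniXcoder entry above can be pictured as a bracketed flattening that keeps enough markers to rebuild the tree exactly; the marker token format below is invented for illustration, not UniXcoder's actual vocabulary.

```python
def flatten_ast(node):
    """Flatten an AST into a token sequence that preserves all structure.

    node: either a leaf token (str) or a pair (node_type, [children]).
    Wrapping every subtree in explicit open/close markers makes the
    mapping invertible: the original tree can be rebuilt from the
    sequence, so no structural information is lost.
    """
    if isinstance(node, str):               # leaf: a source-code token
        return [node]
    node_type, children = node
    tokens = [f"<{node_type}:left>"]        # subtree opens
    for child in children:
        tokens.extend(flatten_ast(child))
    tokens.append(f"<{node_type}:right>")   # subtree closes
    return tokens

# Example: return a + b
tree = ("Return", [("BinOp", ["a", "+", "b"])])
print(" ".join(flatten_ast(tree)))
# -> "<Return:left> <BinOp:left> a + b <BinOp:right> <Return:right>"
```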
- Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding [13.65914588243695]
We propose an approach to bridge pre-trained models and code-related tasks.
We exploit semantic-preserving transformations to enrich downstream data diversity.
We introduce curriculum learning to organize the transformed data in an easy-to-hard manner to fine-tune existing pre-trained models.
arXiv Detail & Related papers (2021-12-04T07:21:28Z)
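The easy-to-hard organization from the entry above can be pictured as a simple scheduler: score each transformed example with a difficulty measure and batch them in increasing order. The difficulty function here is a stand-in, not the paper's actual measure.

```python
def curriculum_batches(examples, difficulty, batch_size):
    """Yield batches of examples ordered from easy to hard.

    examples: list of training examples (e.g., semantic-preserving
    transformations of the original data).
    difficulty: callable mapping an example to a score; a placeholder
    for whatever measure a real setup would use (e.g., edit distance
    from the untransformed source).
    """
    ordered = sorted(examples, key=difficulty)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]

# Example with string length as a stand-in difficulty measure.
data = ["x=1", "def f(a, b): return a + b", "y = x"]
for batch in curriculum_batches(data, difficulty=len, batch_size=2):
    print(batch)
```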
- Parameter Decoupling Strategy for Semi-supervised 3D Left Atrium Segmentation [0.0]
We present a novel semi-supervised segmentation model based on a parameter decoupling strategy to encourage consistent predictions from diverse views.
Our method achieves a competitive result compared with state-of-the-art semi-supervised methods on the Atrial Challenge dataset.
arXiv Detail & Related papers (2021-09-20T14:51:42Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
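As a simplified reading of the Dynamic Blocking idea in the entry above: whenever the decoder has just emitted a token that occurs in the source, the token immediately following that occurrence in the source is blocked at the next step, nudging generation away from verbatim copying. Helper names and the integration point are assumptions of this sketch, not the paper's exact algorithm.

```python
def blocked_tokens(source_ids, last_generated_id):
    """Return token ids to block at the next decoding step.

    If the last generated token appears in the source, block every token
    that immediately follows one of its occurrences in the source, so
    the decoder cannot simply reproduce the source verbatim.
    """
    blocked = set()
    for i, tok in enumerate(source_ids[:-1]):
        if tok == last_generated_id:
            blocked.add(source_ids[i + 1])
    return blocked

# At each decoding step, a decoder would mask the blocked ids, e.g.:
# for tok_id in blocked_tokens(src_ids, prev_id):
#     logits[tok_id] = float("-inf")
```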