ECMG: Exemplar-based Commit Message Generation
- URL: http://arxiv.org/abs/2203.02700v1
- Date: Sat, 5 Mar 2022 10:55:15 GMT
- Title: ECMG: Exemplar-based Commit Message Generation
- Authors: Ensheng Shi, Yanlin Wang, Lun Du, Hongyu Zhang, Shi Han, Dongmei
Zhang, Hongbin Sun
- Abstract summary: Commit messages concisely describe the content of code diffs (i.e., code changes) and the intent behind them.
The information retrieval-based methods reuse the commit messages of similar code diffs, while the neural-based methods learn the semantic connection between code diffs and commit messages.
We propose a novel exemplar-based neural commit message generation model, which treats the similar commit message as an exemplar and leverages it to guide the neural network model to generate an accurate commit message.
- Score: 45.54414179533286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Commit messages concisely describe the content of code diffs (i.e., code
changes) and the intent behind them. Recently, many approaches have been
proposed to generate commit messages automatically. The information
retrieval-based methods reuse the commit messages of similar code diffs, while
the neural-based methods learn the semantic connection between code diffs and
commit messages. However, the reused commit messages might not accurately
describe the content/intent of code diffs, and neural-based methods tend to
generate high-frequency and repetitive tokens in the corpus. In this paper, we
combine the advantages of the two technical routes and propose a novel
exemplar-based neural commit message generation model, which treats the similar
commit message as an exemplar and leverages it to guide the neural network
model to generate an accurate commit message. We perform extensive experiments
and the results confirm the effectiveness of our model.
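The retrieval step described above, reusing the commit message of the most similar past code diff as an exemplar, can be sketched with a simple bag-of-tokens cosine similarity. The function names and the token-level similarity measure are illustrative assumptions, not the paper's exact implementation:

```python
import math
from collections import Counter

def cosine_similarity(a_tokens, b_tokens):
    """Cosine similarity between two token multisets."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve_exemplar(query_diff, corpus):
    """Return the commit message paired with the most similar diff.

    corpus: list of (diff_tokens, commit_message) pairs.
    """
    best = max(corpus, key=lambda pair: cosine_similarity(query_diff, pair[0]))
    return best[1]

# Toy corpus of (diff tokens, commit message) pairs.
corpus = [
    (["add", "null", "check", "in", "parser"], "Fix NPE in parser"),
    (["update", "readme"], "Update documentation"),
]
# The retrieved exemplar then guides generation: a neural model would
# consume both the query diff and the exemplar, e.g. as two input sequences.
exemplar = retrieve_exemplar(["add", "null", "check", "in", "lexer"], corpus)
```

In the full model, the exemplar is not copied verbatim (the weakness of pure retrieval methods noted above) but conditions the decoder, which can rewrite the parts that do not match the new diff.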
Related papers
- Commit Messages in the Age of Large Language Models [0.9217021281095906]
We evaluate the performance of OpenAI's ChatGPT for generating commit messages based on code changes.
We compare the results obtained with ChatGPT to previous automatic commit message generation methods that have been trained specifically on commit data.
arXiv Detail & Related papers (2024-01-31T06:47:12Z) - Delving into Commit-Issue Correlation to Enhance Commit Message
Generation Models [13.605167159285374]
Commit message generation is a challenging task in automated software engineering.
The proposed tool is a novel paradigm that introduces the correlation between commits and issues into the training phase of generation models.
The results show that compared with the original models, the performance of tool-enhanced models is significantly improved.
arXiv Detail & Related papers (2023-07-31T20:35:00Z) - LRANet: Towards Accurate and Efficient Scene Text Detection with
Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
arXiv Detail & Related papers (2023-06-27T02:03:46Z) - Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning in which the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - Jointly Learning to Repair Code and Generate Commit Message [78.4177637346384]
We construct a multilingual triple dataset including buggy code, fixed code, and commit messages for this novel task.
To deal with the error propagation problem of the cascaded method, the joint model is proposed that can both repair the code and generate the commit message.
Experimental results show that the enhanced cascaded model with teacher-student method and multitask-learning method achieves the best score on different metrics of automated code repair.
arXiv Detail & Related papers (2021-09-25T07:08:28Z) - Assessing the Effectiveness of Syntactic Structure to Learn Code Edit
Representations [2.1793134762413433]
We use structural information from Abstract Syntax Tree (AST) to represent source code edits.
Inspired by the code2seq approach, we evaluate how using structural information from AST can help with the task of code edit classification.
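The AST-based edit representation idea above can be illustrated with Python's stdlib `ast` module: flatten each version of a snippet into its sequence of AST node types and represent the edit by what changed structurally. This is a toy illustration, not the code2seq-style path encoding the paper actually evaluates:

```python
import ast

def node_type_sequence(source):
    """Flatten a snippet's AST into a pre-order sequence of node type names."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]

def edit_representation(before, after):
    """Represent an edit by the node types present in only one version."""
    b, a = set(node_type_sequence(before)), set(node_type_sequence(after))
    return {"removed": sorted(b - a), "added": sorted(a - b)}

# Wrapping an assignment's value in a conditional expression
# surfaces as an added IfExp node in the structural diff.
rep = edit_representation("x = 1", "x = 1 if flag else 2")
```

Token-level diffs would see only inserted text here; the AST view makes the kind of edit (introducing a conditional) explicit, which is what helps the edit-classification task.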
arXiv Detail & Related papers (2021-06-11T01:23:07Z) - Autoregressive Belief Propagation for Decoding Block Codes [113.38181979662288]
We revisit recent methods that employ graph neural networks for decoding error correcting codes.
Our method violates the symmetry conditions that enable the other methods to train exclusively with the zero-word.
Despite not having the luxury of training on a single word, and the inability to train on more than a small fraction of the relevant sample space, we demonstrate effective training.
arXiv Detail & Related papers (2021-01-23T17:14:55Z) - Conditioned Text Generation with Transfer for Closed-Domain Dialogue
Systems [65.48663492703557]
We show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder.
We introduce a new protocol called query transfer that allows leveraging a large unlabelled dataset.
arXiv Detail & Related papers (2020-11-03T14:06:10Z) - Retrieve and Refine: Exemplar-based Neural Comment Generation [27.90756259321855]
Comments of similar code snippets are helpful for comment generation.
We design a novel seq2seq neural network that takes the given code, its AST, its similar code, and its exemplar as input.
We evaluate our approach on a large-scale Java corpus, which contains about 2M samples.
arXiv Detail & Related papers (2020-10-09T09:33:10Z) - CoreGen: Contextualized Code Representation Learning for Commit Message
Generation [39.383390029545865]
We propose a novel Contextualized code representation learning strategy for commit message Generation (CoreGen)
Experiments on the benchmark dataset demonstrate the superior effectiveness of our model over the baseline models with at least 28.18% improvement in terms of BLEU-4 score.
arXiv Detail & Related papers (2020-07-14T09:43:26Z)
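BLEU-4, the metric cited in the CoreGen result above, scores a generated message by the geometric mean of modified n-gram precisions (n = 1..4) times a brevity penalty. A minimal single-reference stdlib sketch with add-one smoothing, not the exact evaluation script used by these papers:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu4(candidate, reference):
    """Single-reference sentence BLEU with uniform weights over 1..4-grams."""
    precisions = []
    for n in range(1, 5):
        cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        total = max(sum(cand.values()), 1)
        # Add-one smoothing keeps the geometric mean nonzero for short strings.
        precisions.append((overlap + 1) / (total + 1))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / 4)
    # Brevity penalty discourages overly short candidates.
    bp = min(1.0, math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * geo_mean

score = bleu4("fix null pointer in parser".split(),
              "fix null pointer exception in parser".split())
```

Reported relative improvements (such as the 28.18% figure above) are computed on corpus-level averages of scores like this one.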
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.