ECMG: Exemplar-based Commit Message Generation
- URL: http://arxiv.org/abs/2203.02700v1
- Date: Sat, 5 Mar 2022 10:55:15 GMT
- Title: ECMG: Exemplar-based Commit Message Generation
- Authors: Ensheng Shi, Yanlin Wang, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun
- Abstract summary: Commit messages concisely describe the content of code diffs (i.e., code changes) and the intent behind them.
The information retrieval-based methods reuse the commit messages of similar code diffs, while the neural-based methods learn the semantic connection between code diffs and commit messages.
We propose a novel exemplar-based neural commit message generation model, which treats a similar commit message as an exemplar and leverages it to guide the neural network model to generate an accurate commit message.
- Score: 45.54414179533286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Commit messages concisely describe the content of code diffs (i.e., code
changes) and the intent behind them. Recently, many approaches have been
proposed to generate commit messages automatically. The information
retrieval-based methods reuse the commit messages of similar code diffs, while
the neural-based methods learn the semantic connection between code diffs and
commit messages. However, the reused commit messages might not accurately
describe the content/intent of code diffs, and neural-based methods tend to
generate high-frequency and repetitive tokens in the corpus. In this paper, we
combine the advantages of the two technical routes and propose a novel
exemplar-based neural commit message generation model, which treats a similar
commit message as an exemplar and leverages it to guide the neural network
model to generate an accurate commit message. We perform extensive experiments
and the results confirm the effectiveness of our model.
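To make the retrieve-then-generate idea concrete, here is a minimal sketch of the retrieval step, assuming a TF-IDF retriever over historical diffs; the retriever, corpus, tokenization, and exemplar-concatenation scheme below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's implementation): retrieve the commit
# message of the most similar historical diff and use it as an exemplar that
# guides generation for the new diff.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical training corpus of (code diff, commit message) pairs.
corpus_diffs = [
    "- return a + b\n+ return a + b + c",
    "- timeout = 10\n+ timeout = 30",
]
corpus_messages = [
    "add third operand to sum",
    "increase default timeout",
]

vectorizer = TfidfVectorizer(token_pattern=r"\S+")
diff_matrix = vectorizer.fit_transform(corpus_diffs)

def retrieve_exemplar(new_diff: str) -> str:
    """Return the commit message of the most similar stored diff."""
    query = vectorizer.transform([new_diff])
    scores = cosine_similarity(query, diff_matrix)[0]
    return corpus_messages[scores.argmax()]

new_diff = "- timeout = 30\n+ timeout = 60"
exemplar = retrieve_exemplar(new_diff)

# The exemplar then guides generation, e.g. by concatenating it with the diff
# before passing both to a seq2seq model (the model call is omitted here).
generator_input = f"exemplar: {exemplar} </s> diff: {new_diff}"
print(generator_input)
```

Feeding the exemplar alongside the diff lets the generator borrow wording from a human-written message while still conditioning on the actual code change, which is the intuition behind combining the two technical routes.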
Related papers
- Commit Messages in the Age of Large Language Models [0.9217021281095906]
We evaluate the performance of OpenAI's ChatGPT for generating commit messages based on code changes.
We compare the results obtained with ChatGPT to previous automatic commit message generation methods that have been trained specifically on commit data.
arXiv Detail & Related papers (2024-01-31T06:47:12Z)
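As an illustration of the setup such an evaluation implies, the following sketch prompts a chat model with a code diff using the openai Python client; the model name, prompt wording, and corpus handling are assumptions for illustration, not details taken from the paper.

```python
# Illustrative prompt construction for LLM-based commit message generation.
# The model name and prompt template are assumptions, not the paper's setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_commit_message(diff: str) -> str:
    prompt = (
        "Write a concise, imperative-mood commit message for this diff:\n"
        f"{diff}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

print(generate_commit_message("- timeout = 30\n+ timeout = 60"))
```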
- Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models [13.605167159285374]
Commit message generation is a challenging task in automated software engineering.
The proposed tool is a novel paradigm that introduces the correlation between commits and issues into the training phase of models.
The results show that, compared with the original models, the performance of tool-enhanced models is significantly improved.
arXiv Detail & Related papers (2023-07-31T20:35:00Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
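As a generic illustration of the low-rank approximation idea behind this shape representation (not LRANet's actual parameterization), a set of text contours can be projected onto a small SVD basis so that each contour is described by a few coefficients:

```python
# Generic low-rank shape approximation sketch (illustrative only): represent
# each contour by a few coefficients in a basis obtained from the SVD of a
# matrix of training contours.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: 100 contours, each with 16 (x, y) points,
# flattened to 32-dimensional rows.
contours = rng.normal(size=(100, 32))

# Truncated SVD: keep the top-k right singular vectors as a shape basis.
k = 4
_, _, vt = np.linalg.svd(contours, full_matrices=False)
basis = vt[:k]                      # shape (k, 32)

# A new contour is encoded by k coefficients and decoded back approximately.
new_contour = rng.normal(size=(32,))
coeffs = basis @ new_contour        # shape (k,)
reconstruction = basis.T @ coeffs   # rank-k approximation of the contour

print(coeffs.shape, np.linalg.norm(new_contour - reconstruction))
```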
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Jointly Learning to Repair Code and Generate Commit Message [78.4177637346384]
We construct a multilingual triple dataset including buggy code, fixed code, and commit messages for this novel task.
To deal with the error propagation problem of the cascaded method, a joint model is proposed that can both repair the code and generate the commit message.
Experimental results show that the enhanced cascaded model with teacher-student method and multitask-learning method achieves the best score on different metrics of automated code repair.
arXiv Detail & Related papers (2021-09-25T07:08:28Z)
- Assessing the Effectiveness of Syntactic Structure to Learn Code Edit Representations [2.1793134762413433]
We use structural information from Abstract Syntax Tree (AST) to represent source code edits.
Inspired by the code2seq approach, we evaluate how using structural information from AST can help with the task of code edit classification.
arXiv Detail & Related papers (2021-06-11T01:23:07Z)
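A minimal sketch of using structural information from an AST to describe a code edit, assuming Python's built-in ast module and a simple node-type difference; the paper follows a code2seq-style path representation rather than this simplified bag of node types.

```python
# Minimal sketch: represent a code edit by the difference between the AST
# node-type multisets of the code before and after the change. This is a
# simplification for illustration; code2seq-style approaches use paths
# between AST leaves instead of a bag of node types.
import ast
from collections import Counter

def node_type_counts(source: str) -> Counter:
    """Count AST node types appearing in a Python snippet."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))

before = "def area(r):\n    return 3.14 * r * r\n"
after = "import math\n\ndef area(r):\n    return math.pi * r ** 2\n"

delta = node_type_counts(after) - node_type_counts(before)
print(delta)  # node types introduced by the edit, e.g. Import, Pow, Attribute
```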
- Autoregressive Belief Propagation for Decoding Block Codes [113.38181979662288]
We revisit recent methods that employ graph neural networks for decoding error correcting codes.
Our method violates the symmetry conditions that enable the other methods to train exclusively with the zero-word.
Despite not having the luxury of training on a single word, and the inability to train on more than a small fraction of the relevant sample space, we demonstrate effective training.
arXiv Detail & Related papers (2021-01-23T17:14:55Z)
- Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems [65.48663492703557]
We show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder.
We introduce a new protocol called query transfer that allows leveraging a large unlabelled dataset.
arXiv Detail & Related papers (2020-11-03T14:06:10Z)
- Retrieve and Refine: Exemplar-based Neural Comment Generation [27.90756259321855]
Comments of similar code snippets are helpful for comment generation.
We design a novel seq2seq neural network that takes the given code, its AST, its similar code, and its exemplar as input.
We evaluate our approach on a large-scale Java corpus, which contains about 2M samples.
arXiv Detail & Related papers (2020-10-09T09:33:10Z)
- CoreGen: Contextualized Code Representation Learning for Commit Message Generation [39.383390029545865]
We propose a novel Contextualized code representation learning strategy for commit message Generation (CoreGen).
Experiments on the benchmark dataset demonstrate the superior effectiveness of our model over the baseline models with at least 28.18% improvement in terms of BLEU-4 score.
arXiv Detail & Related papers (2020-07-14T09:43:26Z)
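For reference, BLEU-4 measures n-gram overlap (up to 4-grams) between a generated message and the reference; a minimal computation with NLTK is sketched below as an illustration of the metric, not of CoreGen's evaluation pipeline.

```python
# Minimal BLEU-4 illustration with NLTK (not CoreGen's evaluation script).
# Smoothing is applied because short commit messages often have zero counts
# for higher-order n-grams.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["increase", "default", "timeout", "to", "60", "seconds"]]
candidate = ["increase", "the", "default", "timeout"]

score = sentence_bleu(
    reference,
    candidate,
    weights=(0.25, 0.25, 0.25, 0.25),  # uniform weights over 1- to 4-grams
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU-4: {score:.4f}")
```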