COMET: Generating Commit Messages using Delta Graph Context Representation
- URL: http://arxiv.org/abs/2402.01841v1
- Date: Fri, 2 Feb 2024 19:01:52 GMT
- Title: COMET: Generating Commit Messages using Delta Graph Context Representation
- Authors: Abhinav Reddy Mandli, Saurabhsingh Rajput, and Tushar Sharma
- Abstract summary: Commit messages explain code changes in a commit and facilitate collaboration among developers.
We propose Comet, a novel approach that captures context of code changes using a graph-based representation.
Tests show Comet outperforms state-of-the-art techniques in terms of bleu-norm and meteor metrics.
- Score: 2.5899040911480182
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Commit messages explain code changes in a commit and facilitate collaboration
among developers. Several commit message generation approaches have been
proposed; however, they exhibit limited success in capturing the context of
code changes. We propose Comet (Context-Aware Commit Message Generation), a
novel approach that captures context of code changes using a graph-based
representation and leverages a transformer-based model to generate high-quality
commit messages. Our proposed method utilizes a delta graph that we developed to
effectively represent code differences. We also introduce a customizable
quality assurance module to identify optimal messages, mitigating subjectivity
in commit messages. Experiments show that Comet outperforms state-of-the-art
techniques in terms of the bleu-norm and meteor metrics while being comparable in
terms of rouge-l. Additionally, we compare the proposed approach with the
popular gpt-3.5-turbo model and with gpt-4-turbo, the most capable GPT
model, in zero-shot, one-shot, and multi-shot settings. We found that Comet
outperforms the two GPT models on five and four metrics, respectively, and provides
competitive results on the remaining metrics. The study has implications for
researchers, tool developers, and software developers. Software developers may
utilize Comet to generate context-aware commit messages. Researchers and tool
developers can apply the proposed delta graph technique in similar contexts,
like code review summarization.
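
The abstract does not spell out how the delta graph is built, so the following is only a toy illustration of the general idea of representing a code change as a difference between two graphs. It assumes Python source as input, uses parent-to-child AST edges as the graph, and labels edges as added, removed, or unchanged; the function names `ast_edges` and `toy_delta` are hypothetical and are not taken from the paper.

```python
import ast
from collections import Counter


def ast_edges(source: str) -> Counter:
    """Collect a multiset of (parent_type, child_type) edges from the AST."""
    tree = ast.parse(source)
    edges = Counter()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges[(type(parent).__name__, type(child).__name__)] += 1
    return edges


def toy_delta(old_src: str, new_src: str) -> dict:
    """Label edges as added, removed, or unchanged between two versions.

    Assumption: this is a simplified stand-in, not the paper's delta graph.
    """
    old, new = ast_edges(old_src), ast_edges(new_src)
    return {
        "added": new - old,      # edges that appear only after the change
        "removed": old - new,    # edges that disappeared with the change
        "unchanged": old & new,  # surrounding context shared by both versions
    }


old_code = "def f(x):\n    return x\n"
new_code = "def f(x):\n    if x:\n        return x\n    return 0\n"
delta = toy_delta(old_code, new_code)
for label, edges in delta.items():
    print(label, sorted(edges.items()))
```

Keeping the unchanged edges next to the added and removed ones is what gives a model the context around a change rather than only the changed lines, which is the motivation the abstract states.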
Related papers
- Towards Realistic Evaluation of Commit Message Generation by Matching Online and Offline Settings [77.20838441870151]
Commit message generation is a crucial task in software engineering that is challenging to evaluate correctly.
We use an online metric - the number of edits users introduce before committing the generated messages to the VCS - to select metrics for offline experiments.
Our results indicate that edit distance exhibits the highest correlation, whereas commonly used similarity metrics such as BLEU and METEOR demonstrate low correlation.
arXiv Detail & Related papers (2024-10-15T20:32:07Z)
- Commit Messages in the Age of Large Language Models [0.9217021281095906]
We evaluate the performance of OpenAI's ChatGPT for generating commit messages based on code changes.
We compare the results obtained with ChatGPT to previous automatic commit message generation methods that have been trained specifically on commit data.
arXiv Detail & Related papers (2024-01-31T06:47:12Z)
- Using Large Language Models for Commit Message Generation: A Preliminary Study [5.5784148764236114]
Large language models (LLMs) can be used to generate commit messages automatically and effectively.
In 78% of the 366 samples, the commit messages generated by LLMs were evaluated by humans as the best.
arXiv Detail & Related papers (2024-01-11T14:06:39Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- From Commit Message Generation to History-Aware Commit Message Completion [49.175498083165884]
We argue that if we could shift the focus from commit message generation to commit message completion, we could significantly improve the quality and the personal nature of the resulting commit messages.
Since the existing datasets lack historical data, we collect and share a novel dataset called CommitChronicle, containing 10.7M commits across 20 programming languages.
Our results show that in some contexts, commit message completion shows better results than generation, and that while in general GPT-3.5-turbo performs worse, it shows potential for long and detailed messages.
arXiv Detail & Related papers (2023-08-15T09:10:49Z)
- Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models [13.605167159285374]
Commit message generation is a challenging task in automated software engineering.
tool is a novel paradigm that can introduce the correlation between commits and issues into the training phase of models.
The results show that compared with the original models, the performance of tool-enhanced models is significantly improved.
arXiv Detail & Related papers (2023-07-31T20:35:00Z)
- T5Score: Discriminative Fine-tuning of Generative Evaluation Metrics [94.69907794006826]
We present a framework that combines the best of both worlds, using both supervised and unsupervised signals from whatever data we have available.
We operationalize this idea by training T5Score, a metric that uses these training signals with mT5 as the backbone.
T5Score achieves the best performance on all datasets against existing top-scoring metrics at the segment level.
arXiv Detail & Related papers (2022-12-12T06:29:04Z)
- ECMG: Exemplar-based Commit Message Generation [45.54414179533286]
Commit messages concisely describe the content of code diffs (i.e., code changes) and the intent behind them.
The information retrieval-based methods reuse the commit messages of similar code diffs, while the neural-based methods learn the semantic connection between code diffs and commit messages.
We propose a novel exemplar-based neural commit message generation model, which treats the similar commit message as an exemplar and leverages it to guide the neural network model to generate an accurate commit message.
arXiv Detail & Related papers (2022-03-05T10:55:15Z)
- GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization [0.0]
We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality.
The proposed methods achieve state-of-the-art performance in two code summarization datasets and across three automatic code summarization metrics.
arXiv Detail & Related papers (2021-11-17T02:51:37Z)
- CoreGen: Contextualized Code Representation Learning for Commit Message Generation [39.383390029545865]
We propose a novel Contextualized code representation learning strategy for commit message Generation (CoreGen).
Experiments on the benchmark dataset demonstrate the superior effectiveness of our model over the baseline models with at least 28.18% improvement in terms of BLEU-4 score.
arXiv Detail & Related papers (2020-07-14T09:43:26Z)
- Graph Convolution Machine for Context-aware Recommender System [59.50474932860843]
We extend the advantages of graph convolutions to context-aware recommender systems (CARS).
We propose Graph Convolution Machine (GCM), an end-to-end framework that consists of three components: an encoder, graph convolution layers, and a decoder.
We conduct experiments on three real-world datasets from Yelp and Amazon, validating the effectiveness of GCM and the benefits of performing graph convolutions for CARS.
arXiv Detail & Related papers (2020-01-30T15:32:08Z)
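
The first entry in the list above reports that edit distance correlates best with the number of edits users make before committing a generated message. As a rough illustration, the sketch below computes a character-level Levenshtein distance between a generated message and the message finally committed. The character-level granularity and the two example messages are assumptions made here for illustration; the listed paper may measure edits differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


# Hypothetical messages, made up for this example.
generated = "Fix null check in parser"
committed = "Fix null check in the parser"
distance = edit_distance(generated, committed)
print(distance)
```

A lower distance means the developer had to change less of the generated message, which is the intuition behind using it as an offline proxy for the online edit count.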