AILINKPREVIEWER: Enhancing Code Reviews with LLM-Powered Link Previews
- URL: http://arxiv.org/abs/2511.09223v1
- Date: Thu, 13 Nov 2025 01:41:33 GMT
- Title: AILINKPREVIEWER: Enhancing Code Reviews with LLM-Powered Link Previews
- Authors: Panya Trakoolgerntong, Tao Xiao, Masanari Kondo, Chaiyong Ragkhitwetsagul, Morakot Choetkiertikul, Pattaraporn Sangaroonsilp, Yasutaka Kamei,
- Abstract summary: Code review is a key practice in software engineering, where developers evaluate code changes to ensure quality and maintainability.<n> Links to issues and external resources are often included in Pull Requests (PRs) to provide additional context.<n>We present AIlinkPREVIEWER, a tool that generates previews of links in PRs using PR metadata, including titles, descriptions, comments, and link body content.
- Score: 4.664062055146575
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Code review is a key practice in software engineering, where developers evaluate code changes to ensure quality and maintainability. Links to issues and external resources are often included in Pull Requests (PRs) to provide additional context, yet they are typically discarded in automated tasks such as PR summarization and code review comment generation. This limits the richness of information available to reviewers and increases cognitive load by forcing context-switching. To address this gap, we present AILINKPREVIEWER, a tool that leverages Large Language Models (LLMs) to generate previews of links in PRs using PR metadata, including titles, descriptions, comments, and link body content. We analyzed 50 engineered GitHub repositories and compared three approaches: Contextual LLM summaries, Non-Contextual LLM summaries, and Metadata-based previews. The results in metrics such as BLEU, BERTScore, and compression ratio show that contextual summaries consistently outperform other methods. However, in a user study with seven participants, most preferred non-contextual summaries, suggesting a trade-off between metric performance and perceived usability. These findings demonstrate the potential of LLM-powered link previews to enhance code review efficiency and to provide richer context for developers and automation in software engineering. The video demo is available at https://www.youtube.com/watch?v=h2qH4RtrB3E, and the tool and its source code can be found at https://github.com/c4rtune/AILinkPreviewer.
Related papers
- RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis [78.32151470154422]
We introduce RAVEL, an agentic framework that enables the testers to autonomously plan and execute typical synthesis operations.<n>We present C3EBench, a benchmark comprising 1,258 samples derived from professional human writings.<n>By augmenting RAVEL with SOTA LLMs as operators, we find that such agentic text synthesis is dominated by the LLM's reasoning capability.
arXiv Detail & Related papers (2026-02-28T14:47:34Z) - Uncovering Code Insights: Leveraging GitHub Artifacts for Deeper Code Understanding [0.1358202049520503]
Large language models (LLMs) have shown promise in generating code explanations.<n>We propose a novel approach that leverages natural language artifacts from GitHub.<n>Our system consists of three components: one that extracts and structures relevant GitHub context, another that uses this context to generate high-level explanations of the code's purpose, and a third that validates the explanation.
arXiv Detail & Related papers (2025-11-05T15:31:42Z) - Code Review Without Borders: Evaluating Synthetic vs. Real Data for Review Recommendation [37.86790434630698]
Large Language Models (LLMs) are used to translate code changes from well-resourced languages into equivalent changes in underrepresented or emerging languages.<n>We compare the performance of these models against models trained on real labelled data.<n>This approach provides a scalable pathway to extend automated code review capabilities to rapidly evolving technology stacks.
arXiv Detail & Related papers (2025-09-05T05:17:14Z) - SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs)<n>Unlike traditional static benchmarks, SwingArena models the collaborative process of software by pairing LLMs as iterations, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z) - Is Compression Really Linear with Code Intelligence? [60.123628177110206]
textitFormat Annealing is a lightweight, transparent training methodology designed to assess the intrinsic capabilities of pre-trained models equitably.<n>Our empirical results reveal a fundamental logarithmic relationship between measured code intelligence and bits-per-character (BPC)<n>Our work provides a more nuanced understanding of compression's role in developing code intelligence and contributes a robust evaluation framework in the code domain.
arXiv Detail & Related papers (2025-05-16T16:59:14Z) - Code Summarization Beyond Function Level [0.213063058314067]
This study investigated the effectiveness of code summarization models beyond the function level.<n>The fine-tuned state-of-the-art CodeT5+ base model excelled in code summarization.<n> Repository-level summarization exhibited promising potential but requires significant computational resources.
arXiv Detail & Related papers (2025-02-23T20:31:21Z) - Automating Patch Set Generation from Code Review Comments Using Large Language Models [2.045040820541428]
We provide code contexts to five popular Large Language Models (LLMs)
We obtain the suggested code-changes (patch sets) derived from real-world code-review comments.
The performance of each model is meticulously assessed by comparing their generated patch sets against the historical data of human-generated patch-sets.
arXiv Detail & Related papers (2024-04-10T02:46:08Z) - Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z) - AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations [52.43593893122206]
Alignedcot is an in-context learning technique for invoking Large Language Models.
It achieves consistent and correct step-wise prompts in zero-shot scenarios.
We conduct experiments on mathematical reasoning and commonsense reasoning.
arXiv Detail & Related papers (2023-11-22T17:24:21Z) - ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z) - A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that despite the approach is simple, it outperforms the state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.