What to Cut? Predicting Unnecessary Methods in Agentic Code Generation
- URL: http://arxiv.org/abs/2602.17091v1
- Date: Thu, 19 Feb 2026 05:29:32 GMT
- Title: What to Cut? Predicting Unnecessary Methods in Agentic Code Generation
- Authors: Kan Watanabe, Tatsuya Shirai, Yutaro Kashiwa, Hajimu Iida
- Abstract summary: We propose a prediction model that identifies functions likely to be deleted during PR review. Our results show that functions deleted for different reasons exhibit distinct characteristics. These findings suggest that predictive approaches can help reviewers prioritize their efforts on essential code.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Agentic Coding, powered by autonomous agents such as GitHub Copilot and Cursor, enables developers to generate code, tests, and pull requests from natural language instructions alone. While this accelerates implementation, it produces larger volumes of code per pull request, shifting the burden from implementers to reviewers. In practice, a notable portion of AI-generated code is eventually deleted during review, yet reviewers must still examine such code before deciding to remove it. No prior work has explored methods to help reviewers efficiently identify code that will be removed. In this paper, we propose a prediction model that identifies functions likely to be deleted during PR review. Our results show that functions deleted for different reasons exhibit distinct characteristics, and our model achieves an AUC of 87.1%. These findings suggest that predictive approaches can help reviewers prioritize their efforts on essential code.
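The abstract's core idea, predicting which generated functions will be deleted during review and evaluating by AUC, can be sketched as a standard binary classifier over function-level features. This is a minimal illustration only: the feature names, synthetic data, and random-forest model below are assumptions for the sketch, not the paper's actual features or model.

```python
# Hedged sketch: classify generated functions as "likely deleted in review"
# and evaluate with ROC AUC, as the abstract describes. All features and
# labels here are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Invented per-function features: lines of code, incoming call sites,
# test coverage, and comment ratio.
X = np.column_stack([
    rng.integers(1, 200, n),   # lines of code
    rng.integers(0, 10, n),    # incoming call sites
    rng.random(n),             # test coverage
    rng.random(n),             # comment ratio
])
# Synthetic label: uncalled, poorly tested functions get deleted more often.
y = ((X[:, 1] == 0) & (X[:, 2] < 0.5)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"AUC: {auc:.3f}")
```

In practice the predicted deletion probability, not just the AUC, is what a reviewer-facing tool would surface, e.g. to sort functions in a PR by how likely they are to be cut.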
Related papers
- Understanding Dominant Themes in Reviewing Agentic AI-authored Code [6.183483850365225]
We analyze 19,450 inline review comments spanning 3,177 agent-authored PRs from real-world GitHub repositories.
We find that while AI agents can accelerate code production, there remain gaps requiring targeted human review oversight.
arXiv Detail & Related papers (2026-01-27T07:21:09Z)
- Can We Predict Before Executing Machine Learning Agents? [74.39460101251792]
We formalize the task of Data-centric Solution Preference and construct a comprehensive corpus of 18,438 pairwise comparisons.
We demonstrate that LLMs exhibit significant predictive capabilities when primed with a Verified Data Analysis Report.
We instantiate this framework in FOREAGENT, an agent that employs a Predict-then-Verify loop, achieving a 6x acceleration in convergence while surpassing execution-based baselines by +6%.
arXiv Detail & Related papers (2026-01-09T16:44:17Z)
- Early-Stage Prediction of Review Effort in AI-Generated Pull Requests [0.0]
We analyze 33,707 agent-authored PRs from the AIDev dataset across 2,807 repositories.
We propose a Circuit Breaker triage model that predicts high-review-effort PRs at creation time.
arXiv Detail & Related papers (2026-01-02T17:18:01Z)
- DeputyDev -- AI Powered Developer Assistant: Breaking the Code Review Logjam through Contextual AI to Boost Developer Productivity [38.585498338645856]
This study investigates the implementation and efficacy of DeputyDev, an AI-powered code review assistant developed to address inefficiencies in the software development process.
arXiv Detail & Related papers (2025-08-13T10:09:45Z)
- Leveraging Reward Models for Guiding Code Review Comment Generation [13.306560805316103]
Code review is a crucial component of modern software development, involving evaluating code quality, providing feedback on potential issues, and refining the code to address identified problems.
Deep learning techniques can tackle the generative aspect of code review by commenting on a given piece of code as a human reviewer would.
In this paper, we introduce CoRAL, a deep learning framework that automates review comment generation using reinforcement learning with a reward mechanism.
arXiv Detail & Related papers (2025-06-04T21:31:38Z)
- LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews [74.87393214734114]
This work introduces LazyReview, a dataset of peer-review sentences annotated with fine-grained lazy-thinking categories.
Large Language Models (LLMs) struggle to detect these instances in a zero-shot setting.
Instruction-based fine-tuning on our dataset significantly boosts performance by 10-20 points.
arXiv Detail & Related papers (2025-04-15T10:07:33Z)
- Automated Code Review In Practice [1.6271516689052665]
Several AI-assisted tools, such as Qodo, GitHub Copilot, and Coderabbit, provide automated reviews using large language models (LLMs).
This study examines the impact of LLM-based automated code review tools in an industrial setting.
arXiv Detail & Related papers (2024-12-24T16:24:45Z)
- Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach [66.51005288743153]
We investigate the legal and ethical issues of current neural code completion models.
We tailor a membership inference approach (termed CodeMI) that was originally crafted for classification tasks.
We evaluate the effectiveness of this adapted approach across a diverse array of neural code completion models.
arXiv Detail & Related papers (2024-04-22T15:54:53Z)
- CONCORD: Clone-aware Contrastive Learning for Source Code [64.51161487524436]
Self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks.
We argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning.
In particular, we propose CONCORD, a self-supervised, contrastive learning strategy to place benign clones closer in the representation space while moving deviants further apart.
arXiv Detail & Related papers (2023-06-05T20:39:08Z)
- Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
- Coder Reviewer Reranking for Code Generation [56.80381384717]
We propose Coder-Reviewer reranking, a method that samples diverse programs from a code language model and reranks them by model likelihood.
Experimental results show that Coder-Reviewer reranking leads to consistent and significant improvements over reranking with the Coder model alone.
Coder-Reviewer reranking is easy to implement by prompting, generalizes to different programming languages, and works well with off-the-shelf hyperparameters.
arXiv Detail & Related papers (2022-11-29T18:56:33Z)
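The Coder-Reviewer reranking entry above describes a concrete scoring rule: combine the Coder likelihood p(program | prompt) with a Reviewer likelihood p(prompt | program) and sort candidates by the summed log-probabilities. A minimal sketch follows; the two log-prob functions are toy stand-ins for real language-model scoring, and all names here are invented for illustration.

```python
# Hedged sketch of Coder-Reviewer reranking: candidates sampled from a
# "Coder" model are reranked by log p(program|prompt) + log p(prompt|program).
# The scoring functions below are toy stand-ins, not a real LM API.
import math

def coder_logprob(prompt: str, program: str) -> float:
    # Stand-in for log p(program | prompt): here, simply prefer shorter
    # programs (a real model would sum token log-probabilities).
    return -0.1 * len(program)

def reviewer_logprob(prompt: str, program: str) -> float:
    # Stand-in for log p(prompt | program): reward programs whose text
    # reflects words from the prompt.
    hits = sum(w in program.lower() for w in prompt.lower().split())
    return math.log(1 + hits)

def rerank(prompt: str, candidates: list[str]) -> list[str]:
    # Coder-Reviewer score: sum of the two log-likelihoods.
    def score(prog: str) -> float:
        return coder_logprob(prompt, prog) + reviewer_logprob(prompt, prog)
    return sorted(candidates, key=score, reverse=True)

prompt = "add two numbers"
candidates = [
    "def f(a, b): return a - b",
    "def add(numbers_a, numbers_b): return numbers_a + numbers_b",
    "def add(a, b): return a + b",
]
print(rerank(prompt, candidates)[0])  # → def add(a, b): return a + b
```

The combined score penalizes both improbable programs (Coder term) and programs that do not "explain" the prompt (Reviewer term), which is why the terse but on-topic third candidate wins here.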
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.