LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning
- URL: http://arxiv.org/abs/2507.16395v1
- Date: Tue, 22 Jul 2025 09:42:13 GMT
- Title: LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning
- Authors: Bo Hou, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, Li Zhang
- Abstract summary: We propose ColaUntangle, a new collaborative consultation framework for commit untangling. ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture. We construct multi-version Program Dependency Graphs (delta-PDG), enabling agents to reason over code relationships with both symbolic and semantic depth.
- Score: 20.147009997147798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Atomic commits, each of which addresses a single development concern, are a best practice in software development. However, developers frequently produce tangled commits that mix unrelated changes due to practical constraints or unclear boundaries, negatively impacting code review and maintenance. Although prior commit untangling approaches (rule-based, feature-based, or graph-based) have made progress, they often rely on shallow signals and fail to distinguish between explicit dependencies (e.g., control/data flow) and implicit ones (e.g., semantic or conceptual relationships). In this paper, we propose ColaUntangle, a new collaborative consultation framework for commit untangling that models both explicit and implicit dependencies among code changes. ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture: one agent specializes in explicit dependencies, another in implicit ones, and a reviewer agent synthesizes their perspectives through iterative consultation. To capture explicit and implicit contextual information, we construct multi-version Program Dependency Graphs (delta-PDG), enabling agents to reason over code relationships with both symbolic and semantic depth. We evaluate ColaUntangle on two widely used datasets (1,612 C# and 14k Java tangled commits). Experimental results show that ColaUntangle outperforms the best-performing baseline, achieving an improvement of 44% on the C# dataset and 100% on the Java dataset. These findings highlight the potential of LLM-based collaborative frameworks for advancing automated commit untangling tasks.
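The three-role structure described in the abstract (an explicit-dependency agent, an implicit-dependency agent, and a reviewer that reconciles them over several rounds) can be illustrated with a small sketch. This is not ColaUntangle's implementation: the paper's agents are LLM-driven and reason over delta-PDGs, whereas the stubs below are deterministic heuristics. Every name here (`Change`, `explicit_agent`, `implicit_agent`, `untangle`, the `symbols` and `topic` fields) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Change:
    """One changed hunk. Fields are illustrative, not the paper's schema."""
    id: str               # assumed unique per change
    symbols: frozenset    # identifiers touched (stand-in for control/data-flow edges)
    topic: str            # coarse intent label (stand-in for semantic reasoning)


def explicit_agent(changes):
    """Stub for the explicit-dependency agent: link changes that share a symbol."""
    return {frozenset((a.id, b.id)) for a, b in combinations(changes, 2)
            if a.symbols & b.symbols}


def implicit_agent(changes):
    """Stub for the implicit-dependency agent: link changes with the same topic."""
    return {frozenset((a.id, b.id)) for a, b in combinations(changes, 2)
            if a.topic == b.topic}


def untangle(changes, max_rounds=3):
    """Stub reviewer: merge both agents' proposed links until nothing changes."""
    parent = {c.id: c.id for c in changes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(max_rounds):
        merged = False
        for link in explicit_agent(changes) | implicit_agent(changes):
            a, b = (find(x) for x in link)
            if a != b:
                parent[a] = b
                merged = True
        if not merged:  # no new merge proposed in this round: stop consulting
            break

    groups = {}
    for c in changes:
        groups.setdefault(find(c.id), set()).add(c.id)
    return sorted(sorted(g) for g in groups.values())


demo = [
    Change("c1", frozenset({"parse"}), "parser"),
    Change("c2", frozenset({"parse", "Token"}), "parser"),
    Change("c3", frozenset({"log"}), "logging"),
    Change("c4", frozenset({"fmt"}), "logging"),
]
print(untangle(demo))
```

In this toy input, `c1` and `c2` are linked by both stubs, while `c3` and `c4` share no symbol and are grouped only by the implicit stub, which is the kind of relationship the abstract says shallow signals miss. With deterministic stubs the loop settles after one merging round; in an LLM-backed version each round could yield different proposals, which is where a bounded number of consultation rounds would matter.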
Related papers
- CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward [50.97588334916863]
We develop CompassVerifier, an accurate and robust lightweight verifier model for evaluation and outcome reward. It demonstrates multi-domain competency spanning math, knowledge, and diverse reasoning tasks, with the capability to process various answer types. We introduce VerifierBench benchmark comprising model outputs collected from multiple data sources, augmented through manual analysis of metaerror patterns to enhance CompassVerifier.
arXiv Detail & Related papers (2025-08-05T17:55:24Z)
- Data Dependency Inference for Industrial Code Generation Based on UML Sequence Diagrams [31.902404948282925]
We propose a novel step-by-step code generation framework named API2Dep. First, we introduce an enhanced Unified Modeling Language (UML) API diagram tailored for service-oriented architectures. Second, recognizing the critical role of data flow, we introduce a dedicated data dependency inference task.
arXiv Detail & Related papers (2025-08-05T12:28:23Z)
- SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs). Unlike traditional static benchmarks, SwingArena models the collaborative process of software by pairing LLMs as iterations, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z)
- EpiCoder: Encompassing Diversity and Complexity in Code Generation [49.170195362149386]
Existing methods for code generation use code snippets as seed data. We introduce a novel feature tree-based synthesis framework, which revolves around hierarchical code features. Our framework provides precise control over the complexity of the generated code, enabling functionalities that range from function-level operations to multi-file scenarios.
arXiv Detail & Related papers (2025-01-08T18:58:15Z)
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
- AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction [10.65417796726349]
Relation extraction (RE) in complex scenarios faces challenges such as diverse relation types and ambiguous relations between entities within a single sentence.
We propose an agent-based RE framework, namely AgentRE, which fully leverages the potential of large language models to achieve RE in complex scenarios.
arXiv Detail & Related papers (2024-09-03T12:53:05Z)
- On the Impacts of Contexts on Repository-Level Code Generation [5.641402231731082]
We present RepoExec, a novel benchmark designed to evaluate repository-level code generation. We focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts.
arXiv Detail & Related papers (2024-06-17T10:45:22Z)
- KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph [134.8631016845467]
We propose an autonomous LLM-based agent framework, called KG-Agent.
In KG-Agent, we integrate the LLM, multifunctional toolbox, KG-based executor, and knowledge memory.
To guarantee the effectiveness, we leverage program language to formulate the multi-hop reasoning process over the KG.
arXiv Detail & Related papers (2024-02-17T02:07:49Z)
- Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency [127.97467912117652]
Large language models (LLMs) have exhibited remarkable ability in code generation.
However, generating the correct solution in a single attempt still remains a challenge.
We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency.
arXiv Detail & Related papers (2023-09-29T14:23:26Z)
- Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models [13.605167159285374]
Commit message generation is a challenging task in automated software engineering.
tool is a novel paradigm that can introduce the correlation between commits and issues into the training phase of models.
The results show that compared with the original models, the performance of tool-enhanced models is significantly improved.
arXiv Detail & Related papers (2023-07-31T20:35:00Z)
- DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)