LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning
- URL: http://arxiv.org/abs/2507.16395v2
- Date: Wed, 05 Nov 2025 06:26:32 GMT
- Title: LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning
- Authors: Bo Hou, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, Li Zhang,
- Abstract summary: We propose ColaUntangle, a new collaborative consultation framework for commit untangling.<n>ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture.<n>We evaluate ColaUntangle on two widely-used datasets (1,612 C# and 14k Java tangled commits)
- Score: 15.20947984949809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Atomic commits, which address a single development concern, are a best practice in software development. In practice, however, developers often produce tangled commits that mix unrelated changes, complicating code review and maintenance. Prior untangling approaches (rule-based, feature-based, or graph-based) have made progress but typically rely on shallow signals and struggle to distinguish explicit dependencies (e.g., control/data flow) from implicit ones (e.g., semantic or conceptual relationships). In this paper, we propose ColaUntangle, a new collaborative consultation framework for commit untangling that models both explicit and implicit dependencies among code changes. ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture: one agent specializes in explicit dependencies, another in implicit ones, and a reviewer agent synthesizes their perspectives through iterative consultation. To capture structural and contextual information, we construct Explicit and Implicit Contexts, enabling agents to reason over code relationships with both symbolic and semantic depth. We evaluate ColaUntangle on two widely-used datasets (1,612 C# and 14k Java tangled commits). Experimental results show that ColaUntangle outperforms the best-performing baseline, achieving an improvement of 44% on the C# dataset and 82% on the Java dataset. These findings highlight the potential of LLM-based collaborative frameworks for advancing automated commit untangling tasks.
Related papers
- Theory of Code Space: Do Code Agents Understand Software Architecture? [0.0]
Theory of Code Space (ToCS) is a benchmark that evaluates ability to construct, maintain, and update coherent architectural beliefs.<n>ToCS requires agents to build structured belief states over module dependencies.<n>Preliminary Architectural Constraint Discovery is a code-specific evaluation.
arXiv Detail & Related papers (2026-02-28T11:40:17Z) - Do Not Treat Code as Natural Language: Implications for Repository-Level Code Generation and Beyond [13.550121154853715]
We present Hydra, a repository-level code generation framework that treats code as structured code rather than natural language.<n>We show that Hydra achieves state-of-the-art performance across open- and closed-source CodeLLMs.
arXiv Detail & Related papers (2026-02-12T07:44:00Z) - AlignCoder: Aligning Retrieval with Target Intent for Repository-Level Code Completion [55.21541958868449]
We propose AlignCoder, a repository-level code completion framework.<n>Our framework generates an enhanced query that bridges the semantic gap between the initial query and the target code.<n>We employ reinforcement learning to train an AlignRetriever that learns to leverage inference information in the enhanced query for more accurate retrieval.
arXiv Detail & Related papers (2026-01-27T15:23:14Z) - SolContractEval: A Benchmark for Evaluating Contract-Level Solidity Code Generation [17.116315125396696]
Solidity is the primary language for smart contracts.<n>Existing evaluations fall short of assessing models' capabilities in real-world contract development.<n>We introduce SolContractEval, the first contract-level benchmark for Solidity code generation.
arXiv Detail & Related papers (2025-09-28T11:53:41Z) - CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward [50.97588334916863]
We develop CompassVerifier, an accurate and robust lightweight verifier model for evaluation and outcome reward.<n>It demonstrates multi-domain competency spanning math, knowledge, and diverse reasoning tasks, with the capability to process various answer types.<n>We introduce VerifierBench benchmark comprising model outputs collected from multiple data sources, augmented through manual analysis of metaerror patterns to enhance CompassVerifier.
arXiv Detail & Related papers (2025-08-05T17:55:24Z) - Data Dependency Inference for Industrial Code Generation Based on UML Sequence Diagrams [31.902404948282925]
We propose a novel step-by-step code generation framework named API2Dep.<n>First, we introduce an enhanced Unified Modeling Language (UML) API diagram tailored for service-oriented architectures.<n>Second, recognizing the critical role of data flow, we introduce a dedicated data dependency inference task.
arXiv Detail & Related papers (2025-08-05T12:28:23Z) - Large Language Models are Good Relational Learners [55.40941576497973]
We introduce Rel-LLM, a novel architecture that utilizes a graph neural network (GNN)- based encoder to generate structured relational prompts for large language models (LLMs)<n>Unlike traditional text-based serialization approaches, our method preserves the inherent relational structure of databases while enabling LLMs to process and reason over complex entity relationships.
arXiv Detail & Related papers (2025-06-06T04:07:55Z) - SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs)<n>Unlike traditional static benchmarks, SwingArena models the collaborative process of software by pairing LLMs as iterations, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z) - Knowledge Graph Completion with Relation-Aware Anchor Enhancement [50.50944396454757]
We propose a relation-aware anchor enhanced knowledge graph completion method (RAA-KGC)<n>We first generate anchor entities within the relation-aware neighborhood of the head entity.<n>Then, by pulling the query embedding towards the neighborhoods of the anchors, it is tuned to be more discriminative for target entity matching.
arXiv Detail & Related papers (2025-04-08T15:22:08Z) - EpiCoder: Encompassing Diversity and Complexity in Code Generation [49.170195362149386]
Existing methods for code generation use code snippets as seed data.<n>We introduce a novel feature tree-based synthesis framework, which revolves around hierarchical code features.<n>Our framework provides precise control over the complexity of the generated code, enabling functionalities that range from function-level operations to multi-file scenarios.
arXiv Detail & Related papers (2025-01-08T18:58:15Z) - COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z) - Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z) - AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction [10.65417796726349]
relation extraction (RE) in complex scenarios faces challenges such as diverse relation types and ambiguous relations between entities within a single sentence.
We propose an agent-based RE framework, namely AgentRE, which fully leverages the potential of large language models to achieve RE in complex scenarios.
arXiv Detail & Related papers (2024-09-03T12:53:05Z) - On the Impacts of Contexts on Repository-Level Code Generation [5.641402231731082]
We present RepoExec, a novel benchmark designed to evaluate repository-level code generation.<n>We focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts.
arXiv Detail & Related papers (2024-06-17T10:45:22Z) - KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning
over Knowledge Graph [134.8631016845467]
We propose an autonomous LLM-based agent framework, called KG-Agent.
In KG-Agent, we integrate the LLM, multifunctional toolbox, KG-based executor, and knowledge memory.
To guarantee the effectiveness, we leverage program language to formulate the multi-hop reasoning process over the KG.
arXiv Detail & Related papers (2024-02-17T02:07:49Z) - Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency [127.97467912117652]
Large language models (LLMs) have exhibited remarkable ability in code generation.
However, generating the correct solution in a single attempt still remains a challenge.
We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency.
arXiv Detail & Related papers (2023-09-29T14:23:26Z) - Delving into Commit-Issue Correlation to Enhance Commit Message
Generation Models [13.605167159285374]
Commit message generation is a challenging task in automated software engineering.
tool is a novel paradigm that can introduce the correlation between commits and issues into the training phase of models.
The results show that compared with the original models, the performance of tool-enhanced models is significantly improved.
arXiv Detail & Related papers (2023-07-31T20:35:00Z) - DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix which is deterministic and easier for model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.