RepoTransAgent: Multi-Agent LLM Framework for Repository-Aware Code Translation
- URL: http://arxiv.org/abs/2508.17720v1
- Date: Mon, 25 Aug 2025 06:56:22 GMT
- Title: RepoTransAgent: Multi-Agent LLM Framework for Repository-Aware Code Translation
- Authors: Ziqi Guan, Xin Yin, Zhiyuan Peng, Chao Ni
- Abstract summary: RepoTransAgent is a novel multi-agent framework for repository-aware code translation. We evaluate RepoTransAgent on hundreds of Java-C# translation pairs from six popular open-source projects.
- Score: 6.2036957709296665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Repository-aware code translation is critical for modernizing legacy systems, enhancing maintainability, and enabling interoperability across diverse programming languages. While recent advances in large language models (LLMs) have improved code translation quality, existing approaches face significant challenges in practical scenarios: insufficient contextual understanding, inflexible prompt designs, and inadequate error correction mechanisms. These limitations severely hinder accurate and efficient translation of complex, real-world code repositories. To address these challenges, we propose RepoTransAgent, a novel multi-agent LLM framework for repository-aware code translation. RepoTransAgent systematically decomposes the translation process into specialized subtasks (context retrieval, dynamic prompt construction, and iterative code refinement), each handled by a dedicated agent. Our approach leverages retrieval-augmented generation (RAG) for contextual information gathering, employs adaptive prompts tailored to varying repository scenarios, and introduces a reflection-based mechanism for systematic error correction. We evaluate RepoTransAgent on hundreds of Java-C# translation pairs from six popular open-source projects. Experimental results demonstrate that RepoTransAgent significantly outperforms state-of-the-art baselines in both compile and pass rates, achieving a compile rate of up to 55.34% and a pass rate of up to 45.84%. Comprehensive analysis confirms the robustness and generalizability of RepoTransAgent across different LLMs, establishing its effectiveness for real-world repository-aware code translation.
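The abstract describes a three-agent pipeline (retrieval, prompting, refinement) but includes no code. Below is a minimal Python sketch of how such a loop could be wired together; all names (translate_method, repo_index, compile_and_test, etc.) are illustrative assumptions, not the authors' API.

```python
# Hypothetical sketch of the three-agent loop described in the abstract.
# None of these names come from the paper; they are placeholders for an
# LLM client, a repository index, and a C# build/test toolchain.

from dataclasses import dataclass

@dataclass
class TranslationResult:
    code: str
    compiled: bool
    tests_passed: bool

def build_prompt(java_method: str, context: list[str]) -> str:
    ctx = "\n".join(context)
    return (f"Translate this Java method to C#.\n"
            f"Relevant repository context:\n{ctx}\n"
            f"Java source:\n{java_method}\n")

def translate_method(java_method: str, repo_index, llm, compile_and_test,
                     max_rounds: int = 3) -> TranslationResult:
    # Agent 1: context retrieval -- RAG over the repository for the
    # signatures, types, and helpers the method depends on.
    context = repo_index.retrieve(java_method, top_k=5)

    # Agent 2: dynamic prompt construction -- adapt the prompt to the
    # retrieved scenario (e.g., include dependency stubs when present).
    prompt = build_prompt(java_method, context)
    csharp = llm.generate(prompt)

    # Agent 3: reflection-based refinement -- feed compiler/test errors
    # back into the model until the candidate compiles and passes.
    for _ in range(max_rounds):
        compiled, tests_passed, errors = compile_and_test(csharp)
        if compiled and tests_passed:
            return TranslationResult(csharp, True, True)
        csharp = llm.generate(prompt + f"\nFix these errors:\n{errors}")
    compiled, tests_passed, _ = compile_and_test(csharp)
    return TranslationResult(csharp, compiled, tests_passed)
```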
Related papers
- Finding the Translation Switch: Discovering and Exploiting the Task-Initiation Features in LLMs [69.28193153685893]
Large Language Models (LLMs) frequently exhibit strong translation abilities, even without task-specific fine-tuning. To demystify this process, we leverage Sparse Autoencoders (SAEs) and introduce a novel framework for identifying task-specific features. Our work not only decodes a core component of the translation mechanism in LLMs but also provides a blueprint for using internal model mechanisms to create more robust and efficient models.
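As a rough illustration of SAE-based feature discovery (an assumed simplification, not this paper's method), one can compare how often each SAE feature activates on translation prompts versus control prompts:

```python
# Illustrative sketch only: score SAE features by how selectively they
# fire on translation prompts. Assumes a pre-trained sparse autoencoder
# `sae` with an `encode` method mapping hidden states to feature
# activations; both the interface and the scoring rule are invented.

import torch

def translation_feature_scores(sae, hidden_translation, hidden_other):
    """hidden_*: [n_prompts, d_model] residual-stream activations."""
    acts_t = sae.encode(hidden_translation)  # [n, n_features], sparse
    acts_o = sae.encode(hidden_other)
    # A feature is "task-initiating" if it activates far more often
    # on translation prompts than on control prompts.
    freq_t = (acts_t > 0).float().mean(dim=0)
    freq_o = (acts_o > 0).float().mean(dim=0)
    return freq_t - freq_o  # higher = more translation-specific

# Usage sketch: top candidate features for the translation task.
# top = translation_feature_scores(sae, h_trans, h_ctrl).topk(10)
```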
arXiv Detail & Related papers (2026-01-16T06:29:07Z)
- Advancing Automated In-Isolation Validation in Repository-Level Code Translation [9.753507630426832]
Repository-level code translation aims to automatically migrate entire repositories across programming languages while preserving functionality. This paper proposes TRAM, which combines context-aware type resolution with mock-based in-isolation validation. TRAM demonstrates state-of-the-art performance in Java-to-Python translation, underscoring the effectiveness of integrating RAG-based type resolution with reliable in-isolation validation.
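A hedged sketch of what mock-based in-isolation validation can look like in practice (invented names, not TRAM's implementation): the translated unit is tested with its untranslated repository dependencies replaced by mocks.

```python
# Hypothetical illustration of in-isolation validation: test one
# translated function while mocking out dependencies that have not been
# translated yet. `translated_module`, its `db` attribute, and
# `parse_config` are invented names for the sake of the example.

from unittest import mock
import translated_module  # the single module translated so far

def validate_in_isolation():
    # Replace the not-yet-translated database layer with a mock that
    # returns a canned row, so the function under test runs alone.
    with mock.patch.object(translated_module, "db") as fake_db:
        fake_db.fetch_row.return_value = {"timeout": 30}
        result = translated_module.parse_config("svc")
        assert result.timeout == 30
        fake_db.fetch_row.assert_called_once_with("svc")
```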
arXiv Detail & Related papers (2025-11-26T19:53:46Z)
- Multi-Agent Systems for Dataset Adaptation in Software Engineering: Capabilities, Limitations, and Future Directions [8.97512410819274]
This paper presents the first empirical study on how state-of-the-art multi-agent systems perform in dataset adaptation tasks. We evaluate GitHub Copilot on adapting SE research artifacts from benchmark repositories including ROCODE and LogHub2.0. Results show that current systems can identify key files and generate partial adaptations but rarely produce correct implementations.
arXiv Detail & Related papers (2025-11-26T13:26:11Z)
- Connecting the Dots: Training-Free Visual Grounding via Agentic Reasoning [63.109585527799005]
GroundingAgent is a visual grounding framework that operates without task-specific fine-tuning. It achieves an average zero-shot grounding accuracy of 65.1% on widely-used benchmarks. It also offers strong interpretability, transparently illustrating each reasoning step.
arXiv Detail & Related papers (2025-11-24T03:11:08Z)
- Augmenting Multi-Agent Communication with State Delta Trajectory [31.127137626348098]
We propose a new communication protocol that transfers both natural language tokens and the token-wise state transition trajectory from one agent to another. We find that the sequence of state changes in an LLM after generating each token can better reflect the information hidden behind the inference process. Experimental results show that multi-agent systems with SDE achieve state-of-the-art performance compared to other communication protocols.
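A rough sketch of the idea, under the assumption that the sender records the change in its final hidden state after each generated token and transmits it alongside the text (all interfaces below are illustrative, not the paper's code):

```python
# Rough sketch (greedy decoding): alongside each generated token, record
# the delta in the model's last-layer hidden state, and hand both to the
# receiving agent. `model` is any causal LM that exposes hidden states
# (e.g., a Hugging Face transformer); this is not the paper's protocol.

import torch

@torch.no_grad()
def generate_with_state_deltas(model, tokenizer, prompt, max_new_tokens=32):
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    prev_state, deltas, new_tokens = None, [], []
    for _ in range(max_new_tokens):
        out = model(ids, output_hidden_states=True)
        state = out.hidden_states[-1][0, -1]    # last layer, last token
        if prev_state is not None:
            deltas.append(state - prev_state)   # token-wise state delta
        prev_state = state
        next_id = out.logits[0, -1].argmax()
        new_tokens.append(next_id.item())
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    # The receiving agent gets both the text and the delta trajectory.
    return tokenizer.decode(new_tokens), torch.stack(deltas)
```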
arXiv Detail & Related papers (2025-06-24T00:38:25Z)
- RustRepoTrans: Repository-level Code Translation Benchmark Targeting Rust [50.65321080814249]
RustRepoTrans is the first repository-level context code translation benchmark targeting incremental translation. We evaluate seven representative LLMs, analyzing their errors to assess limitations in complex translation scenarios.
arXiv Detail & Related papers (2024-11-21T10:00:52Z)
- AlphaTrans: A Neuro-Symbolic Compositional Approach for Repository-Level Code Translation and Validation [5.269923665485903]
We propose AlphaTrans, a neuro-symbolic approach to automate repository-level code translation. We leveraged AlphaTrans to translate ten real-world open-source projects consisting of 836 classes, 8575 methods, and 2719 tests.
arXiv Detail & Related papers (2024-10-31T16:46:52Z)
- CRAT: A Multi-Agent Framework for Causality-Enhanced Reflective and Retrieval-Augmented Translation with Large Language Models [59.8529196670565]
CRAT is a novel multi-agent translation framework that leverages RAG and causality-enhanced self-reflection to address translation challenges.
Our results show that CRAT significantly improves translation accuracy, particularly in handling context-sensitive terms and emerging vocabulary.
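The combination of RAG and self-reflection named above can be pictured as a retrieve-then-reflect loop; the sketch below is an invented approximation, not CRAT's actual pipeline.

```python
# Invented approximation of a retrieve-then-reflect translation loop.
# `llm` and `glossary_search` are assumed interfaces, not CRAT's API.

def translate_with_reflection(llm, glossary_search, source_text):
    # Retrieval: look up context-sensitive or emerging terms so the
    # model sees authoritative definitions before translating.
    terms = llm.generate(f"List ambiguous or novel terms in: {source_text}")
    notes = "\n".join(glossary_search(t) for t in terms.splitlines() if t)

    draft = llm.generate(f"Translate, using these notes:\n{notes}\n"
                         f"Text: {source_text}")

    # Self-reflection: ask the model to justify how each term was
    # rendered, then revise the draft in light of its own critique.
    critique = llm.generate(f"Check each term's translation and explain "
                            f"why it is (or is not) faithful:\n{draft}")
    return llm.generate(f"Revise the translation given this critique:\n"
                        f"{critique}\nDraft: {draft}")
```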
arXiv Detail & Related papers (2024-10-28T14:29:11Z)
- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory [96.35468670508476]
We introduce DelTA, a Document-levEL Translation Agent for large language models (LLMs). DelTA features a multi-level memory structure that stores information across various granularities and spans. Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality.
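Such a multi-level memory can be imagined as a set of stores keyed by granularity; the sketch below guesses at the shape of such a structure, with all field names invented (this is not DelTA's actual design).

```python
# Invented illustration of a multi-level translation memory: separate
# stores for a term glossary, recent sentence pairs, and a running
# document-level summary. Field names are placeholders.

from dataclasses import dataclass, field

@dataclass
class MultiLevelMemory:
    glossary: dict[str, str] = field(default_factory=dict)  # term -> rendering
    sentence_pairs: list[tuple[str, str]] = field(default_factory=list)
    doc_summary: str = ""                                   # running summary

    def update(self, src: str, tgt: str, new_terms: dict[str, str]):
        self.glossary.update(new_terms)      # keep terminology consistent
        self.sentence_pairs.append((src, tgt))
        # In a real system the summary would be refreshed by the LLM.

    def as_prompt_context(self, last_k: int = 3) -> str:
        recent = "\n".join(f"{s} => {t}"
                           for s, t in self.sentence_pairs[-last_k:])
        terms = "\n".join(f"{k} => {v}" for k, v in self.glossary.items())
        return (f"Document summary: {self.doc_summary}\n"
                f"Glossary:\n{terms}\nRecent sentences:\n{recent}")
```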
arXiv Detail & Related papers (2024-10-10T17:30:09Z)
- TRANSAGENT: An LLM-Based Multi-Agent System for Code Translation [16.46292795782835]
Code translation is crucial for software migration, system refactoring, and cross-platform development.
Traditional rule-based methods rely on manually-written rules, which can be time-consuming and often result in less readable code.
More recently, the advance of Large Language Models (LLMs) further boosts learning-based code translation.
We propose TRANSAGENT, a novel multi-agent system that enhances LLM-based code translation by fixing syntax errors and semantic errors.
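The split into syntax and semantic repair suggests a two-stage loop, sketched here with invented interfaces (not TRANSAGENT's code): compile errors are fixed first, then failing tests drive semantic fixes.

```python
# Invented two-stage repair loop: first make the translation compile
# (syntax), then make its tests pass (semantics). `llm`, `compiler`,
# and `tests` are assumed interfaces.

def repair_translation(llm, compiler, tests, candidate, max_iters=5):
    for _ in range(max_iters):
        ok, diagnostics = compiler.check(candidate)
        if not ok:
            # Syntax-fixer agent: repair compile errors first.
            candidate = llm.generate(
                f"Fix these compile errors:\n{diagnostics}\n\n{candidate}")
            continue
        passed, failures = tests.run(candidate)
        if passed:
            return candidate
        # Semantic-fixer agent: align behavior with failing test output.
        candidate = llm.generate(
            f"The code compiles but fails these tests:\n{failures}\n\n"
            f"{candidate}")
    return candidate  # best effort after max_iters
```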
arXiv Detail & Related papers (2024-09-30T02:53:03Z)
- (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts [56.7988577327046]
We introduce TransAgents, a novel multi-agent framework that simulates the roles and collaborative practices of a human translation company. Our findings highlight the potential of multi-agent collaboration in enhancing translation quality, particularly for longer texts.
arXiv Detail & Related papers (2024-05-20T05:55:08Z)
- Exploring and Unleashing the Power of Large Language Models in Automated Code Translation [40.25727029618665]
This paper investigates diverse LLMs and learning-based transpilers for automated code translation tasks.
UniTrans is a Unified code Translation framework, applicable to various LLMs.
Three recent LLMs of diverse sizes are tested with UniTrans, and all achieve substantial improvements.
arXiv Detail & Related papers (2024-04-23T00:49:46Z)
- ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench, for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench, for testing autonomous agents on end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z)