Related papers: Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development

URL: http://arxiv.org/abs/2510.16395v2
Date: Mon, 27 Oct 2025 12:09:29 GMT
Title: Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development
Authors: Xin Peng, Chong Wang
Abstract summary: We identify challenges from both software and large language model perspectives. We propose the Code Digital Twin, a framework that models both the physical and conceptual layers of software. Our vision positions it as a bridge between AI advancements and enterprise software realities.
Score: 8.821206496273842
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in large language models (LLMs) have demonstrated strong capabilities in software engineering tasks, raising expectations of revolutionary productivity gains. However, enterprise software development is largely driven by incremental evolution, where challenges extend far beyond routine coding and depend critically on tacit knowledge, including design decisions at different levels and historical trade-offs. To achieve effective AI-powered support for complex software development, we should align emerging AI capabilities with the practical realities of enterprise development. To this end, we systematically identify challenges from both software and LLM perspectives. Alongside these challenges, we outline opportunities where AI and structured knowledge frameworks can enhance decision-making in tasks such as issue localization and impact analysis. To address these needs, we propose the Code Digital Twin, a living framework that models both the physical and conceptual layers of software, preserves tacit knowledge, and co-evolves with the codebase. By integrating hybrid knowledge representations, multi-stage extraction pipelines, incremental updates, LLM-empowered applications, and human-in-the-loop feedback, the Code Digital Twin transforms fragmented knowledge into explicit and actionable representations. Our vision positions it as a bridge between AI advancements and enterprise software realities, providing a concrete roadmap toward sustainable, intelligent, and resilient development and evolution of ultra-complex systems.

Related papers

Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development. Benchmarks like SWE-bench revealed this task as profoundly difficult for large language models. This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z)
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence [150.3696990310269]
Large language models (LLMs) have transformed automated software development by enabling direct translation of natural language descriptions into functional code. We provide a comprehensive synthesis and practical guide (a series of analytic and probing experiments) about code LLMs. We analyze the code capability of the general LLMs (GPT-4, Claude, LLaMA) and code-specialized LLMs (StarCoder, Code LLaMA, DeepSeek-Coder, and QwenCoder).
arXiv Detail & Related papers (2025-11-23T17:09:34Z)
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents [63.03252293761656]
This paper systematically reviews the technologies, applications, and evaluation methods of industry agents based on large language models (LLMs). We examine the three key technological pillars that support the advancement of agent capabilities: Memory, Planning, and Tool Use. We provide an overview of the application of industry agents in real-world domains such as digital engineering, scientific discovery, embodied intelligence, collaborative business execution, and complex system simulation.
arXiv Detail & Related papers (2025-10-20T12:46:55Z)
A Survey of Vibe Coding with Large Language Models [93.88284590533242]
"Vibe Coding" is a development methodology where developers validate AI-generated implementations through outcome observation. Despite its transformative potential, the effectiveness of this emergent paradigm remains under-explored. This survey provides the first comprehensive and systematic review of Vibe Coding with large language models.
arXiv Detail & Related papers (2025-10-14T11:26:56Z)
Generative AI and the Transformation of Software Development Practices [0.0]
Generative AI is reshaping how software is designed, written, and maintained. This paper examines how AI-assisted techniques are changing software engineering practice.
arXiv Detail & Related papers (2025-10-12T22:02:10Z)
Vibe Coding as a Reconfiguration of Intent Mediation in Software Development: Definition, Implications, and Research Agenda [4.451779041553598]
Vibe coding is a software development paradigm where humans and generative AI engage in collaborative flow to co-create software artifacts. We show that vibe coding reconfigures cognitive work by redistributing labor between humans and machines. We identify key opportunities, including democratization, acceleration, and systemic leverage, alongside risks.
arXiv Detail & Related papers (2025-07-29T15:44:55Z)
Code Digital Twin: Empowering LLMs with Tacit Knowledge for Complex Software Development [8.821206496273842]
We identify challenges from both software and large language model perspectives. We propose the Code Digital Twin, a framework that models both the physical and conceptual layers of software. Our vision positions it as a bridge between AI advancements and enterprise software realities.
arXiv Detail & Related papers (2025-03-11T01:46:58Z)
LLMs: A Game-Changer for Software Engineers? [0.0]
Large Language Models (LLMs) like GPT-3 and GPT-4 have emerged as groundbreaking innovations with capabilities that extend far beyond traditional AI applications. Their potential to revolutionize software development has captivated the software engineering (SE) community. This paper argues that LLMs are not just reshaping how software is developed but are redefining the role of developers.
arXiv Detail & Related papers (2024-11-01T17:14:37Z)
Overview of Current Challenges in Multi-Architecture Software Engineering and a Vision for the Future [0.0]
The presented system architecture is based on the concept of dynamic, knowledge graph-based WebAssembly Twins. The resulting systems are to possess advanced autonomous capabilities, with full transparency and controllability by the end user.
arXiv Detail & Related papers (2024-10-28T13:03:09Z)
Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs). The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation. We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
Exploring the intersection of Generative AI and Software Development [0.0]
The synergy between generative AI and Software Engineering emerges as a transformative frontier. This whitepaper delves into the unexplored realm, elucidating how generative AI techniques can revolutionize software development. It serves as a guide for stakeholders, urging discussions and experiments in the application of generative AI in Software Engineering.
arXiv Detail & Related papers (2023-12-21T19:23:23Z)
ChatDev: Communicative Agents for Software Development [84.90400377131962]
ChatDev is a chat-powered software development framework in which specialized agents are guided in what to communicate. These agents actively contribute to the design, coding, and testing phases through unified language-based communication.
arXiv Detail & Related papers (2023-07-16T02:11:34Z)
Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. We have developed a proven systems engineering approach for machine learning development and deployment. Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.