Quantitative Analysis of Technical Debt and Pattern Violation in Large Language Model Architectures
- URL: http://arxiv.org/abs/2512.04273v1
- Date: Wed, 03 Dec 2025 21:24:02 GMT
- Title: Quantitative Analysis of Technical Debt and Pattern Violation in Large Language Model Architectures
- Authors: Tyler Slater
- Abstract summary: This study presents the first empirical framework to measure "Architectural Erosion" and the accumulation of Technical Debt in AI-synthesized systems. We find that while proprietary models achieve high architectural conformance, open-weights models exhibit critical divergence. These findings suggest that without automated architectural linting, utilizing smaller open-weights models for system scaffolding accelerates the accumulation of structural technical debt.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As Large Language Models (LLMs) transition from code completion tools to autonomous system architects, their impact on long-term software maintainability remains unquantified. While existing research benchmarks functional correctness (pass@k), this study presents the first empirical framework to measure "Architectural Erosion" and the accumulation of Technical Debt in AI-synthesized microservices. We conducted a comparative pilot study of three state-of-the-art models (GPT-5.1, Claude 4.5 Sonnet, and Llama 3 8B) by prompting them to implement a standardized Book Lending Microservice under strict Hexagonal Architecture constraints. Utilizing Abstract Syntax Tree (AST) parsing, we find that while proprietary models achieve high architectural conformance (0% violation rate for GPT-5.1), open-weights models exhibit critical divergence. Specifically, Llama 3 demonstrated an 80% Architectural Violation Rate, frequently bypassing interface adapters to create illegal circular dependencies between Domain and Infrastructure layers. Furthermore, we identified a phenomenon of "Implementation Laziness," where open-weights models generated 60% fewer Logical Lines of Code (LLOC) than their proprietary counterparts, effectively omitting complex business logic to satisfy token constraints. These findings suggest that without automated architectural linting, utilizing smaller open-weights models for system scaffolding accelerates the accumulation of structural technical debt.
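The paper's AST tooling is not published with this abstract, but the kind of check it describes can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration of detecting a Hexagonal Architecture layering violation via Python's `ast` module; the layer names (`domain`, `infrastructure`) and the rule table are assumptions, not the authors' actual ruleset.

```python
import ast

# Hypothetical hexagonal-architecture rule: modules in the "domain" layer
# must not import from the "infrastructure" layer (only via ports/adapters).
FORBIDDEN = {"domain": {"infrastructure"}}

def layer_of(module_path: str) -> str:
    # e.g. "domain.lending.book" -> "domain"
    return module_path.split(".")[0]

def find_violations(source: str, module_path: str) -> list[str]:
    """Return forbidden imports found in one module's source."""
    banned = FORBIDDEN.get(layer_of(module_path), set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            targets = [node.module]
        else:
            continue
        for target in targets:
            if layer_of(target) in banned:
                violations.append(f"{module_path}: illegal import of {target}")
    return violations

# A domain service reaching directly into an adapter implementation:
bad_source = "from infrastructure.db import BookRepositorySql\n"
print(find_violations(bad_source, "domain.lending.service"))
# -> ['domain.lending.service: illegal import of infrastructure.db']
```

A violation rate like the 80% reported for Llama 3 would then be the fraction of generated modules for which such a checker returns a non-empty list.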
Related papers
- A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development [4.146198197290144]
WebGIS development requires rigor, yet agentic AI frequently fails due to five large language model (LLM) limitations. We propose a dual-helix governance framework reframing these challenges as structural governance problems that model capacity alone cannot resolve. We implement the framework as a 3-track architecture (Knowledge, Behavior, Skills) that uses a knowledge graph substrate to stabilize execution.
arXiv Detail & Related papers (2026-03-04T18:53:25Z)
- Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition [53.50448142467294]
RAIM is a multi-design and architecture-aware framework for repository-level feature addition. It shifts away from linear patching by generating multiple diverse implementation designs. Experiments on the NoCode-bench Verified dataset demonstrate that RAIM establishes a new state-of-the-art performance.
arXiv Detail & Related papers (2026-03-02T12:50:40Z)
- Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration [73.40887151631088]
Large Language Models (LLMs) have achieved remarkable success across a wide spectrum of natural language processing tasks. Their ever-growing scale introduces significant barriers to real-world deployment, including substantial computational overhead, memory footprint, and inference latency. In this work, we explore structured pruning, which eliminates entire architectural components and maintains compatibility with standard hardware accelerators.
arXiv Detail & Related papers (2026-01-06T03:09:31Z)
- RL-Struct: A Lightweight Reinforcement Learning Framework for Reliable Structured Output in LLMs [0.08594140167290097]
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language generation and reasoning. Their integration into automated software ecosystems is often hindered by the "Structure Gap." We propose a lightweight, efficient Reinforcement Learning framework to bridge this gap.
arXiv Detail & Related papers (2025-11-29T04:47:14Z)
- Human-aligned AI Model Cards with Weighted Hierarchy Architecture [5.774549987076668]
The proliferation of Large Language Models (LLMs) has led to a burgeoning ecosystem of specialized, domain-specific models. Existing documentation frameworks, such as Model Cards and FactSheets, attempt to standardize reporting but are often static and predominantly qualitative. We introduce the Comprehensive Responsible AI Model Card Framework (CRAI-MCF), a novel approach that transitions from static disclosures to actionable, human-aligned documentation.
arXiv Detail & Related papers (2025-10-08T13:13:18Z)
- VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use [78.29315418819074]
We introduce VerlTool, a unified and modular framework that addresses limitations through systematic design principles. Our framework formalizes ARLT as multi-turn trajectories with multi-modal observation tokens (text/image/video), extending beyond single-turn RLVR paradigms. The modular plugin architecture enables rapid tool integration requiring only lightweight Python definitions.
arXiv Detail & Related papers (2025-09-01T01:45:18Z)
- Enhanced DeepONet for 1-D consolidation operator learning: an architectural investigation [1.1743167854433305]
Deep Operator Networks (DeepONets) have emerged as a powerful surrogate modeling framework for learning solution operators in PDE-governed systems. This study systematically evaluates several DeepONet architectures for the one-dimensional consolidation problem.
arXiv Detail & Related papers (2025-07-14T15:09:58Z)
- Elucidating the Design Space of Multimodal Protein Language Models [69.3650883370033]
Multimodal protein language models (PLMs) integrate sequence and token-based structural information. This paper systematically elucidates the design space of multimodal PLMs to overcome their limitations. Our advancements approach finer-grained supervision, demonstrating that token-based multimodal PLMs can achieve robust structural modeling.
arXiv Detail & Related papers (2025-04-15T17:59:43Z)
- Serving Deep Learning Model in Relational Databases [70.53282490832189]
Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains.
We highlight three pivotal paradigms: The state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks.
The potential UDF-centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the relational database management system (RDBMS).
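The UDF-centric paradigm described above can be sketched with SQLite's `create_function`, which registers a Python callable as a SQL function. In this hypothetical example the tiny linear `score` function stands in for a real DL model's inference call; the table and column names are invented for illustration.

```python
import sqlite3

# Hypothetical stand-in for a DL model: in a real UDF-centric deployment,
# this callable would invoke the model's inference routine.
def score(pages, year):
    return 0.01 * pages - 0.5 * (2024 - year)

conn = sqlite3.connect(":memory:")
# Register the computation as a UDF so it can be called from SQL,
# keeping the data inside the RDBMS.
conn.create_function("model_score", 2, score)
conn.execute("CREATE TABLE books (title TEXT, pages INT, year INT)")
conn.executemany("INSERT INTO books VALUES (?, ?, ?)",
                 [("A", 300, 2020), ("B", 120, 1999)])
rows = conn.execute(
    "SELECT title, model_score(pages, year) FROM books ORDER BY 2 DESC"
).fetchall()
print(rows)
```

The design point is that rows never leave the database engine; the DL-centric alternative would instead ship the table out to a dedicated framework for batch inference.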
arXiv Detail & Related papers (2023-10-07T06:01:35Z)
- Towards Automated Identification of Violation Symptoms of Architecture Erosion [2.915855887948474]
This paper explores the automated identification of violation symptoms from developer discussions in code reviews. We developed 15 machine learning-based classifiers using pre-trained word embeddings and evaluated them on code review comments. Results show that SVM with word2vec achieved the best ML/DL performance with an F1-score of 0.779, while fastText embeddings also yielded strong results.
arXiv Detail & Related papers (2023-06-14T16:20:59Z)
- Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts [65.84370471189676]
We look at large-scale intermediate pre-training of decomposition-based transformers using distant supervision from comparable texts.
We show that with such intermediate pre-training, developing robust decomposition-based models for a diverse range of tasks becomes more feasible.
arXiv Detail & Related papers (2022-10-30T15:38:03Z)
- Squeezeformer: An Efficient Transformer for Automatic Speech Recognition [99.349598600887]
Conformer is the de facto backbone model for various downstream speech tasks based on its hybrid attention-convolution architecture.
We propose the Squeezeformer model, which consistently outperforms the state-of-the-art ASR models under the same training schemes.
arXiv Detail & Related papers (2022-06-02T06:06:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.