Position: Intelligent Coding Systems Should Write Programs with Justifications
- URL: http://arxiv.org/abs/2508.06017v1
- Date: Fri, 08 Aug 2025 05:04:47 GMT
- Title: Position: Intelligent Coding Systems Should Write Programs with Justifications
- Authors: Xiangzhe Xu, Shiwei Feng, Zian Su, Chengpeng Wang, Xiangyu Zhang
- Abstract summary: We argue that these systems should not only generate code but also produce clear, consistent justifications that bridge model reasoning and user understanding. We advocate exploring neuro-symbolic approaches for justification generation, where symbolic constraints guide model behavior during training and program semantics are enriched through neural representations.
- Score: 9.304020701255093
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Intelligent coding systems are transforming software development by enabling users to specify code behavior in natural language. However, the opaque decision-making of AI-driven coders raises trust and usability concerns, particularly for non-expert users who cannot inspect low-level implementations. We argue that these systems should not only generate code but also produce clear, consistent justifications that bridge model reasoning and user understanding. To this end, we identify two critical justification properties (cognitive alignment and semantic faithfulness) and highlight the limitations of existing methods, including formal verification, static analysis, and post-hoc explainability. We advocate exploring neuro-symbolic approaches for justification generation, where symbolic constraints guide model behavior during training and program semantics are enriched through neural representations, enabling automated consistency checks at inference time.
Related papers
- CodeCircuit: Toward Inferring LLM-Generated Code Correctness via Attribution Graphs [13.488544043942495]
We aim to investigate whether the model's neural dynamics encode internally decodable signals that are predictive of logical validity during code generation. By decomposing complex residual flows, we aim to identify the structural signatures that distinguish sound reasoning from logical failure. Analysis across Python, C++, and Java confirms that intrinsic correctness signals are robust across diverse syntaxes.
arXiv Detail & Related papers (2026-02-06T03:49:15Z)
- From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models [77.04403907729738]
This survey charts the evolution of uncertainty from a passive diagnostic metric to an active control signal guiding real-time model behavior. We demonstrate how uncertainty is leveraged as an active control signal across three frontiers. This survey argues that mastering the new trend of uncertainty is essential for building the next generation of scalable, reliable, and trustworthy AI.
arXiv Detail & Related papers (2026-01-22T06:21:31Z)
- The Vibe-Check Protocol: Quantifying Cognitive Offloading in AI Programming [5.584060970507507]
"Vibe Coding" is a paradigm where developers articulate high-level intent through natural language and delegate implementation to AI agents. This paper proposes a theoretical framework to investigate the research question: Is Vibe Coding a better way to learn software engineering?
arXiv Detail & Related papers (2026-01-02T06:13:41Z)
- Truth-Aware Decoding: A Program-Logic Approach to Factual Language Generation [0.2864713389096699]
This paper introduces Truth-Aware Decoding (TAD), a verification-oriented decoding scheme that aligns neural language generation with knowledge bases. Our contributions include: (i) a constraint-based semantics that renders oracle filtering as a program-logic judgment, (ii) a proof that greedy selection enjoys local likelihood dominance under sound and complete guards, and (iii) an entropy-style invariant that quantifies factual risk via knowledge-aware safe mass.
arXiv Detail & Related papers (2025-10-03T22:11:15Z)
- Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
- Neuro-symbolic Weak Supervision: Theory and Semantics [5.455744338342196]
We propose a semantics for a neuro-symbolic framework that integrates Inductive Logic Programming (ILP). ILP defines a logical hypothesis space for label transitions, clarifies semantics, and establishes interpretable performance standards. This hybrid approach improves robustness, transparency, and accountability in weakly supervised settings.
arXiv Detail & Related papers (2025-03-24T10:02:51Z)
- Meta-Representational Predictive Coding: Biomimetic Self-Supervised Learning [51.22185316175418]
We present a new form of predictive coding that we call meta-representational predictive coding (MPC). MPC sidesteps the need for learning a generative model of sensory input by learning to predict representations of sensory input across parallel streams.
arXiv Detail & Related papers (2025-03-22T22:13:14Z)
- Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs [53.00384299879513]
In large language models (LLMs), code and reasoning reinforce each other. Code provides verifiable execution paths, enforces logical decomposition, and enables runtime validation. We identify key challenges and propose future research directions to strengthen this synergy.
arXiv Detail & Related papers (2025-02-26T18:55:42Z)
- Bridging LLM-Generated Code and Requirements: Reverse Generation technique and SBC Metric for Developer Insights [0.0]
This paper introduces a novel scoring mechanism called the SBC score. It is based on a reverse generation technique that leverages the natural language generation capabilities of Large Language Models. Unlike direct code analysis, our approach reconstructs system requirements from AI-generated code and compares them with the original specifications.
arXiv Detail & Related papers (2025-02-11T01:12:11Z)
- LatentQA: Teaching LLMs to Decode Activations Into Natural Language [72.87064562349742]
We introduce LatentQA, the task of answering open-ended questions about model activations in natural language. We propose Latent Interpretation Tuning (LIT), which finetunes a decoder LLM on a dataset of activations and associated question-answer pairs. Our decoder also specifies a differentiable loss that we use to control models, such as debiasing models on stereotyped sentences and controlling the sentiment of generations.
arXiv Detail & Related papers (2024-12-11T18:59:33Z)
- Interpretable Concept-Based Memory Reasoning [12.562474638728194]
Concept-based Memory Reasoner (CMR) is a novel CBM designed to provide a human-understandable and provably-verifiable task prediction process.
CMR achieves better accuracy-interpretability trade-offs than state-of-the-art CBMs, discovers logic rules consistent with ground truths, allows for rule interventions, and allows pre-deployment verification.
arXiv Detail & Related papers (2024-07-22T10:32:48Z)
- Automated Static Warning Identification via Path-based Semantic Representation [37.70518599085676]
This paper employs deep neural networks' powerful feature extraction and representation abilities to generate code semantics from control flow graph paths for warning identification.
We fine-tune the pre-trained language model to encode the path sequences and capture the semantic representations for model building.
arXiv Detail & Related papers (2023-06-27T15:46:45Z)
- Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
- A Critical Review of Inductive Logic Programming Techniques for Explainable AI [9.028858411921906]
Inductive Logic Programming (ILP) is a subfield of symbolic artificial intelligence.
ILP generates explainable first-order clausal theories from examples and background knowledge.
Existing ILP systems often have a vast solution space, and the induced solutions are very sensitive to noise and disturbances.
arXiv Detail & Related papers (2021-12-31T06:34:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.