Towards Trustworthy AI Software Development Assistance
- URL: http://arxiv.org/abs/2312.09126v2
- Date: Tue, 23 Jan 2024 09:37:39 GMT
- Title: Towards Trustworthy AI Software Development Assistance
- Authors: Daniel Maninger, Krishna Narasimhan, Mira Mezini
- Abstract summary: Current software development assistants tend to be unreliable, often producing incorrect, unsafe, or low-quality code.
We seek to resolve these issues by introducing a holistic architecture for constructing, training, and using trustworthy AI software development assistants.
- Score: 0.599251270168187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is expected that in the near future, AI software development assistants
will play an important role in the software industry. However, current software
development assistants tend to be unreliable, often producing incorrect,
unsafe, or low-quality code. We seek to resolve these issues by introducing a
holistic architecture for constructing, training, and using trustworthy AI
software development assistants. In the center of the architecture, there is a
foundational LLM trained on datasets representative of real-world coding
scenarios and complex software architectures, and fine-tuned on code quality
criteria beyond correctness. The LLM will make use of graph-based code
representations for advanced semantic comprehension. We envision a knowledge
graph integrated into the system to provide up-to-date background knowledge and
to enable the assistant to provide appropriate explanations. Finally, a modular
framework for constrained decoding will ensure that certain guarantees (e.g.,
for correctness and security) hold for the generated code.
Related papers
- AI Software Engineer: Programming with Trust [33.88230182444934]
Large Language Models (LLMs) have shown surprising proficiency in generating code snippets.
We argue that successfully deploying AI software engineers requires a level of trust equal to or even greater than the trust established by human-driven software engineering practices.
arXiv Detail & Related papers (2025-02-19T14:28:42Z) - Bridging LLM-Generated Code and Requirements: Reverse Generation technique and SBC Metric for Developer Insights [0.0]
This paper introduces a novel scoring mechanism called the SBC score.
It is based on a reverse generation technique that leverages the natural language generation capabilities of Large Language Models.
Unlike direct code analysis, our approach reconstructs system requirements from AI-generated code and compares them with the original specifications.
arXiv Detail & Related papers (2025-02-11T01:12:11Z) - Human-In-the-Loop Software Development Agents [12.830816751625829]
Large Language Models (LLMs)-based multi-agent paradigms for software engineering are introduced to automatically resolve software development tasks.
In this paper, we introduce a Human-in-the-loop LLM-based Agents framework (HULA) for software development.
We design, implement, and deploy the HULA framework into Atlassian for internal uses.
arXiv Detail & Related papers (2024-11-19T23:22:33Z) - RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph [63.87660059104077]
We present RepoGraph, a plug-in module that manages a repository-level structure for modern AI software engineering solutions.
RepoGraph substantially boosts the performance of all systems, leading to a new state-of-the-art among open-source frameworks.
arXiv Detail & Related papers (2024-10-03T05:45:26Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs)
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - How Far Have We Gone in Binary Code Understanding Using Large Language Models [51.527805834378974]
We propose a benchmark to evaluate the effectiveness of Large Language Models (LLMs) in binary code understanding.
Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis.
arXiv Detail & Related papers (2024-04-15T14:44:08Z) - Developer Experiences with a Contextualized AI Coding Assistant:
Usability, Expectations, and Outcomes [11.520721038793285]
This study focuses on the initial experiences of 62 participants who used a contextualized coding AI assistant -- named StackSpot AI -- in a controlled setting.
Assistants' use resulted in significant time savings, easier access to documentation, and the generation of accurate codes for internal APIs.
challenges associated with the knowledge sources necessary to make the coding assistant access more contextual information as well as variable responses and limitations in handling complex codes were observed.
arXiv Detail & Related papers (2023-11-30T10:52:28Z) - Enabling Automated Machine Learning for Model-Driven AI Engineering [60.09869520679979]
We propose a novel approach to enable Model-Driven Software Engineering and Model-Driven AI Engineering.
In particular, we support Automated ML, thus assisting software engineers without deep AI knowledge in developing AI-intensive systems.
arXiv Detail & Related papers (2022-03-06T10:12:56Z) - Empowered and Embedded: Ethics and Agile Processes [60.63670249088117]
We argue that ethical considerations need to be embedded into the (agile) software development process.
We put emphasis on the possibility to implement ethical deliberations in already existing and well established agile software development processes.
arXiv Detail & Related papers (2021-07-15T11:14:03Z) - Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.