Related papers: Exploring the Evidence-Based Beliefs and Behaviors of LLM-Based Programming Assistants

Exploring the Evidence-Based Beliefs and Behaviors of LLM-Based Programming Assistants

URL: http://arxiv.org/abs/2407.13900v1
Date: Thu, 18 Jul 2024 21:06:39 GMT
Title: Exploring the Evidence-Based Beliefs and Behaviors of LLM-Based Programming Assistants
Authors: Chris Brown, Jason Cusati,
Abstract summary: This study investigates the beliefs and behaviors of large language models (LLMs) used to support software development tasks. Our findings show that LLM-based programming assistants have ambiguous beliefs regarding research claims, lack credible evidence to support responses, and are incapable of adopting practices demonstrated by empirical SE research to support development tasks.
Score: 2.3480418671346164
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent innovations in artificial intelligence (AI), primarily powered by large language models (LLMs), have transformed how programmers develop and maintain software -- leading to new frontiers in software engineering (SE). The advanced capabilities of LLM-based programming assistants to support software development tasks have led to a rise in the adoption of LLMs in SE. However, little is known about the evidenced-based practices, tools and processes verified by research findings, supported and adopted by AI programming assistants. To this end, our work conducts a preliminary evaluation exploring the beliefs and behaviors of LLM used to support software development tasks. We investigate 17 evidence-based claims posited by empirical SE research across five LLM-based programming assistants. Our findings show that LLM-based programming assistants have ambiguous beliefs regarding research claims, lack credible evidence to support responses, and are incapable of adopting practices demonstrated by empirical SE research to support development tasks. Based on our results, we provide implications for practitioners adopting LLM-based programming assistants in development contexts and shed light on future research directions to enhance the reliability and trustworthiness of LLMs -- aiming to increase awareness and adoption of evidence-based SE research findings in practice.

Related papers

Junior Software Developers' Perspectives on Adopting LLMs for Software Engineering: a Systematic Literature Review [17.22501688824729]
This paper provides an overview of junior software developers' perspectives and use of Large Language Model-based tools for software engineering (LLM4SE) We conducted a systematic literature review following guidelines by Kitchenham et al. on 56 primary studies. Only 8.9% of the studies provide a clear definition for junior software developers, and there is no uniformity.
arXiv Detail & Related papers (2025-03-10T17:25:24Z)
LLMs' Reshaping of People, Processes, Products, and Society in Software Development: A Comprehensive Exploration with Early Adopters [3.4069804433026314]
Large language models (LLMs) like OpenAI ChatGPT, Google Gemini, and GitHub Copilot are rapidly gaining traction in the software industry. Our study provides a nuanced understanding of how LLMs are shaping the landscape of software development.
arXiv Detail & Related papers (2025-03-06T22:27:05Z)
Assessing LLMs for Front-end Software Architecture Knowledge [0.0]
Large Language Models (LLMs) have demonstrated significant promise in automating software development tasks. This study investigates the capabilities of an LLM in understanding, reproducing, and generating structures within the VIPER architecture. Experimental results, using ChatGPT 4 Turbo 2024-04-09, reveal that the LLM excelled in higher-order tasks like evaluating and creating, but faced challenges with lower-order tasks requiring precise retrieval of architectural details.
arXiv Detail & Related papers (2025-02-26T19:33:35Z)
From Selection to Generation: A Survey of LLM-based Active Learning [153.8110509961261]
Large Language Models (LLMs) have been employed for generating entirely new data instances and providing more cost-effective annotations. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques.
arXiv Detail & Related papers (2025-02-17T12:58:17Z)
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search [57.28671084993782]
Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains. Recent studies have shown that increasing test-time computation enhances LLMs' reasoning capabilities. We propose a two-stage training paradigm: 1) a small-scale format tuning stage to internalize the COAT reasoning format and 2) a large-scale self-improvement stage leveraging reinforcement learning.
arXiv Detail & Related papers (2025-02-04T17:26:58Z)
Large Language Models for Code Generation: The Practitioners Perspective [4.946128083535776]
Large Language Models (LLMs) have emerged as coding assistants, capable of generating source code from natural language prompts. We propose and develop a multi-model unified platform to generate and execute code based on natural language prompts. We conducted a survey with 60 software practitioners from 11 countries across four continents to evaluate the usability, performance, strengths, and limitations of each model.
arXiv Detail & Related papers (2025-01-28T14:52:16Z)
Experiences from Using LLMs for Repository Mining Studies in Empirical Software Engineering [12.504438766461027]
Large Language Models (LLMs) have transformed Software Engineering (SE) by providing innovative methods for analyzing software repositories. Our research packages a framework, coined Prompt Refinement and Insights for Mining Empirical Software repositories (PRIMES) Our findings indicate that standardizing prompt engineering and using PRIMES can enhance the reliability and accuracy of studies utilizing LLMs.
arXiv Detail & Related papers (2024-11-15T06:08:57Z)
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future [15.568939568441317]
We investigate the current practice and solutions for large language models (LLMs) and LLM-based agents for software engineering. In particular we summarise six key topics: requirement engineering, code generation, autonomous decision-making, software design, test generation, and software maintenance. We discuss the models and benchmarks used, providing a comprehensive analysis of their applications and effectiveness in software engineering.
arXiv Detail & Related papers (2024-08-05T14:01:15Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvements of Large Language Models. It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop. Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding [65.12464615430036]
This paper introduces a Video Understanding and Reasoning Framework (VURF) based on the reasoning power of Large Language Models (LLMs) Ours is a novel approach to extend the utility of LLMs in the context of video tasks. We harness their contextual learning capabilities to generate executable visual programs for video understanding.
arXiv Detail & Related papers (2024-03-21T18:00:00Z)
LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges. Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model. This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
An Empirical Study on Usage and Perceptions of LLMs in a Software Engineering Project [1.433758865948252]
Large Language Models (LLMs) represent a leap in artificial intelligence, excelling in tasks using human language(s) In this paper, we analyze the AI-generated code, prompts used for code generation, and the human intervention levels to integrate the code into the code base. Our findings suggest that LLMs can play a crucial role in the early stages of software development.
arXiv Detail & Related papers (2024-01-29T14:32:32Z)
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code) Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development. We introduce Experiential Co-Learning, a novel LLM-agent learning framework. Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z)
Software Testing with Large Language Models: Survey, Landscape, and Vision [32.34617250991638]
Pre-trained large language models (LLMs) have emerged as a breakthrough technology in natural language processing and artificial intelligence. This paper provides a comprehensive review of the utilization of LLMs in software testing.
arXiv Detail & Related papers (2023-07-14T08:26:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.