Evaluating the Effectiveness of Large Language Models in Solving Simple Programming Tasks: A User-Centered Study
- URL: http://arxiv.org/abs/2507.04043v1
- Date: Sat, 05 Jul 2025 13:52:31 GMT
- Title: Evaluating the Effectiveness of Large Language Models in Solving Simple Programming Tasks: A User-Centered Study
- Authors: Kai Deng
- Abstract summary: This study investigates how different interaction styles with ChatGPT-4o affect user performance on simple programming tasks. I conducted a within-subjects experiment in which fifteen high school students each completed three problems under three distinct versions of the model.
- Score: 1.0467092641687232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models (LLMs) become more common in educational tools and programming environments, questions arise about how these systems should interact with users. This study investigates how different interaction styles with ChatGPT-4o (passive, proactive, and collaborative) affect user performance on simple programming tasks. I conducted a within-subjects experiment in which fifteen high school students each completed three problems under three distinct versions of the model. Each version was designed to represent a specific style of AI support: responding only when asked, offering suggestions automatically, or engaging the user in back-and-forth dialogue. Quantitative analysis revealed that the collaborative interaction style significantly improved task completion time compared to the passive and proactive conditions. Participants also reported higher satisfaction and perceived helpfulness when working with the collaborative version. These findings suggest that the way an LLM communicates (how it guides, prompts, and responds) can meaningfully impact learning and performance. This research highlights the importance of designing LLMs that go beyond functional correctness to support more interactive, adaptive, and user-centered experiences, especially for novice programmers.
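To make the three experimental conditions concrete, here is a minimal sketch of how they might be implemented as system prompts over a chat-completions API. The prompt wording and the `ask` helper are my assumptions for illustration; the paper does not publish its exact prompts, and the only thing taken from the abstract is the passive/proactive/collaborative distinction.

```python
# Hypothetical sketch: the study's three interaction styles expressed as
# system prompts. The exact wording below is assumed, not taken from the paper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

INTERACTION_STYLES = {
    # Passive: the model responds only when explicitly asked.
    "passive": "Answer the student's programming questions directly. "
               "Do not volunteer hints, suggestions, or follow-up questions.",
    # Proactive: the model offers suggestions automatically.
    "proactive": "Whenever the student shares code, automatically point out "
                 "potential bugs and suggest improvements, even if not asked.",
    # Collaborative: the model engages the user in back-and-forth dialogue.
    "collaborative": "Work through the problem with the student step by step. "
                     "Ask guiding questions and respond to their reasoning.",
}

def ask(style: str, user_message: str) -> str:
    """Send one message to ChatGPT-4o under a given interaction style."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": INTERACTION_STYLES[style]},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(ask("collaborative", "My loop never terminates. Where should I start?"))
```

Under this framing, the within-subjects design amounts to each participant solving one problem under each entry in `INTERACTION_STYLES`, which is what makes completion times directly comparable across conditions.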
Related papers
- Experimental Analysis of Productive Interaction Strategy with ChatGPT: User Study on Function and Project-level Code Generation Tasks [5.659969270836789]
This study investigates the impact of Human-LLM Interactions (HLI) features on code generation productivity. Three out of 15 HLI features significantly impacted productivity in code generation. The study also presents a taxonomy of 29 runtime and logic errors that can occur during HLI processes, along with suggested mitigation plans.
arXiv Detail & Related papers (2025-08-06T06:48:48Z)
- Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners. In this work, we introduce a new task paradigm: proactive information gathering. We design a scalable framework that generates partially specified, real-world tasks, masking key information. Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
- Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny [79.56218230251953]
Students in computing education increasingly use large language models (LLMs) such as ChatGPT. This paper investigates how students interact with an LLM when solving formal verification exercises in Dafny.
arXiv Detail & Related papers (2025-06-27T16:34:13Z)
- Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild [10.23533525266164]
Large language models (LLMs) are used in complex writing tasks, with users steering generations to better fit their needs. We conduct a large-scale analysis of this collaborative behavior for users engaged in writing tasks in the wild. We identify prototypical behaviors in how users interact with LLMs in prompts following their original request.
arXiv Detail & Related papers (2025-05-21T21:13:01Z)
- Understanding Learner-LLM Chatbot Interactions and the Impact of Prompting Guidelines [9.834055425277874]
This study investigates learner-AI interactions through an educational experiment in which participants receive structured guidance on effective prompting. To assess user behavior and prompting efficacy, we analyze a dataset of 642 interactions from 107 users. Our findings provide a deeper understanding of how users engage with Large Language Models and the role of structured prompting guidance in enhancing AI-assisted communication.
arXiv Detail & Related papers (2025-04-10T15:20:43Z)
- Use Me Wisely: AI-Driven Assessment for LLM Prompting Skills Development [5.559706293891474]
Large language model (LLM)-powered chatbots have become popular across various domains, supporting a range of tasks and processes. Yet, prompting is highly task- and domain-dependent, limiting the effectiveness of generic approaches. In this study, we explore whether LLM-based methods can facilitate learning assessments by using ad-hoc guidelines and a minimal number of annotated prompt samples.
arXiv Detail & Related papers (2025-03-04T11:56:33Z)
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating the performance of proprietary and open-weight models.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Generating Situated Reflection Triggers about Alternative Solution Paths: A Case Study of Generative AI for Computer-Supported Collaborative Learning [3.2721068185888127]
We present a proof-of-concept application to offer students dynamic and contextualized feedback.
Specifically, we augment an Online Programming Exercise bot for a college-level Cloud Computing course with ChatGPT.
We demonstrate that LLMs can be used to generate highly situated reflection triggers that incorporate details of the collaborative discussion happening in context.
arXiv Detail & Related papers (2024-04-28T17:56:14Z)
- Narrative Action Evaluation with Prompt-Guided Multimodal Interaction [60.281405999483]
Narrative action evaluation (NAE) aims to generate professional commentary that evaluates the execution of an action.
NAE is a more challenging task because it requires both narrative flexibility and evaluation rigor.
We propose a prompt-guided multimodal interaction framework to facilitate the interaction between different modalities of information.
arXiv Detail & Related papers (2024-04-22T17:55:07Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks: next session prediction, utterance restoration, incoherence detection, and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experimental results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection (see the sketch after this entry).
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
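As a rough illustration of the multi-task setup described in the entry above, the sketch below combines a main response-selection loss with weighted auxiliary losses. It is simplified on purpose: a small GRU stands in for the pretrained language model, only two of the four auxiliary tasks get heads, and the model names, shapes, and loss weights are all assumptions rather than the paper's actual configuration.

```python
# Hypothetical sketch of multi-task training for response selection with
# auxiliary self-supervised tasks. Model, heads, and weights are assumed.
import torch
import torch.nn as nn

class MultiTaskMatcher(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        # Stand-in encoder; the paper uses a pretrained language model (PLM).
        self.encoder = nn.GRU(input_size=128, hidden_size=hidden, batch_first=True)
        self.select_head = nn.Linear(hidden, 2)     # main task: response selection
        self.session_head = nn.Linear(hidden, 2)    # aux task: next session prediction
        self.coherence_head = nn.Linear(hidden, 2)  # aux task: incoherence detection

    def forward(self, x):
        _, h = self.encoder(x)   # h: (num_layers, batch, hidden)
        h = h.squeeze(0)
        return self.select_head(h), self.session_head(h), self.coherence_head(h)

model = MultiTaskMatcher()
criterion = nn.CrossEntropyLoss()
x = torch.randn(8, 20, 128)  # dummy batch of encoded context-response pairs
y_sel, y_sess, y_coh = (torch.randint(0, 2, (8,)) for _ in range(3))

sel, sess, coh = model(x)
# Joint objective: main loss plus weighted auxiliary losses (0.5 is assumed).
loss = criterion(sel, y_sel) + 0.5 * criterion(sess, y_sess) + 0.5 * criterion(coh, y_coh)
loss.backward()
```

The point of the joint objective is that gradients from the auxiliary heads shape the shared encoder, so the representation used for response selection also has to capture session boundaries and coherence.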