Related papers: Integrating Large Language Models and Evaluating Student Outcomes in an Introductory Computer Science Course

URL: http://arxiv.org/abs/2510.18806v1
Date: Tue, 21 Oct 2025 16:59:54 GMT
Title: Integrating Large Language Models and Evaluating Student Outcomes in an Introductory Computer Science Course
Authors: Annapurna Vadaparty, David H. Smith IV, Samvrit Srinath, Mounika Padala, Christine Alvarado, Jamie Gorson Benario, Daniel Zingaro, Leo Porter
Abstract summary: We present the design and evaluation of a new CS1-LLM course at a large research-intensive university. We describe the design principles used to create our new CS1-LLM course, our new course objectives, and evaluation of student outcomes and perceptions throughout the course.
Score: 0.2810625954925814
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative AI (GenAI) models have broad implications for education in general, impacting the foundations of what we teach and how we assess. This is especially true in computing, where LLMs tuned for coding have demonstrated shockingly good performance on the types of assignments historically used in introductory CS (CS1) courses. As a result, CS1 courses will need to change what skills are taught and how they are assessed. Computing education researchers have begun to study student use of LLMs, but there remains much to be understood about the ways that these tools affect student outcomes. In this paper, we present the design and evaluation of a new CS1 course at a large research-intensive university that integrates the use of LLMs as a learning tool for students. We describe the design principles used to create our new CS1-LLM course, our new course objectives, and evaluation of student outcomes and perceptions throughout the course as measured by assessment scores and surveys. Our findings suggest that 1) student exam performance outcomes, including differences among demographic groups, are largely similar to historical outcomes for courses without integration of LLM tools, 2) large, open-ended projects may be particularly valuable in an LLM context, and 3) students predominantly found the LLM tools helpful, although some had concerns regarding over-reliance on the tools.

Related papers

Demystify, Use, Reflect: Preparing students to be informed LLM-users [12.70014939919203]
This course introduces Large Language Models (LLMs) in a structured, critical, and practical manner. It aims to help students develop the skills needed to engage meaningfully and responsibly with AI.
arXiv Detail & Related papers (2025-11-14T04:43:49Z)
From Selection to Generation: A Survey of LLM-based Active Learning [153.8110509961261]
Large Language Models (LLMs) have been employed for generating entirely new data instances and providing more cost-effective annotations. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques.
arXiv Detail & Related papers (2025-02-17T12:58:17Z)
Tool Learning with Large Language Models: A Survey [60.733557487886635]
Tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems. Despite growing attention and rapid advancements in this field, the existing literature remains fragmented and lacks systematic organization.
arXiv Detail & Related papers (2024-05-28T08:01:26Z)
CS1-LLM: Integrating LLMs into CS1 Instruction [0.6282171844772422]
This experience report describes a CS1 course at a large research-intensive university that fully embraces the use of Large Language Models. To incorporate the LLMs, the course was intentionally altered to reduce emphasis on syntax and writing code from scratch. Students were given three large, open-ended projects in three separate domains that allowed them to showcase their creativity.
arXiv Detail & Related papers (2024-04-17T14:44:28Z)
Analyzing LLM Usage in an Advanced Computing Class in India [4.580708389528142]
This study examines the use of large language models (LLMs) by undergraduate and graduate students for programming assignments in advanced computing classes. We conducted a comprehensive analysis involving 411 students from a Distributed Systems class at an Indian university.
arXiv Detail & Related papers (2024-04-06T12:06:56Z)
An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project [2.7325338323814328]
The purpose of this study is to explore computing students' experiences and approaches to using LLMs during a semester-long software engineering project. We collected data from a senior-level software engineering course at Purdue University. We analyzed the data to identify themes related to students' usage patterns and learning outcomes.
arXiv Detail & Related papers (2024-03-27T15:21:58Z)
Evaluating and Optimizing Educational Content with Large Language Model Judgments [52.33701672559594]
We use Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes. We introduce an instruction optimization approach in which one LM generates instructional materials using the judgments of another LM as a reward function. Human teachers' evaluations of these LM-generated worksheets show a significant alignment between the LM judgments and human teacher preferences.
arXiv Detail & Related papers (2024-03-05T09:09:15Z)
From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs [62.496139001509114]
We introduce a novel tool invocation pipeline designed to control massive real-world APIs. This pipeline mirrors the human task-solving process, addressing complicated real-life user queries. Empirical evaluations of our Sum2Act pipeline on the ToolBench benchmark show significant performance improvements.
arXiv Detail & Related papers (2024-02-28T08:42:23Z)
An Empirical Study on Usage and Perceptions of LLMs in a Software Engineering Project [1.433758865948252]
Large Language Models (LLMs) represent a leap in artificial intelligence, excelling in tasks using human language(s). In this paper, we analyze the AI-generated code, prompts used for code generation, and the human intervention levels to integrate the code into the code base. Our findings suggest that LLMs can play a crucial role in the early stages of software development.
arXiv Detail & Related papers (2024-01-29T14:32:32Z)
Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability. We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization. We evaluate CREATOR on MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models [78.08792285698853]
We present a large-scale empirical study on general language ability evaluation of pretrained language models (ElitePLM). Our empirical results demonstrate that: (1) PLMs with varying training objectives and strategies are good at different ability tests; (2) fine-tuning PLMs in downstream tasks is usually sensitive to the data size and distribution; and (3) PLMs have excellent transferability between similar tasks.
arXiv Detail & Related papers (2022-05-03T14:18:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.