Related papers: CSEPrompts: A Benchmark of Introductory Computer Science Prompts

URL: http://arxiv.org/abs/2404.02540v2
Date: Thu, 4 Apr 2024 04:17:27 GMT
Title: CSEPrompts: A Benchmark of Introductory Computer Science Prompts
Authors: Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Christian Newman, Tharindu Ranasinghe, Marcos Zampieri
Abstract summary: Recent advances in AI, machine learning, and NLP have led to the development of a new generation of Large Language Models (LLMs). Commercial applications have made this technology available to the general public, making it possible to use LLMs to produce high-quality texts for academic and professional purposes. Schools and universities are aware of the increasing use of AI-generated content by students, and they have been researching the impact of this new technology and its potential misuse.
Score: 11.665831944836118
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advances in AI, machine learning, and NLP have led to the development of a new generation of Large Language Models (LLMs) that are trained on massive amounts of data and often have trillions of parameters. Commercial applications (e.g., ChatGPT) have made this technology available to the general public, thus making it possible to use LLMs to produce high-quality texts for academic and professional purposes. Schools and universities are aware of the increasing use of AI-generated content by students and they have been researching the impact of this new technology and its potential misuse. Educational programs in Computer Science (CS) and related fields are particularly affected because LLMs are also capable of generating programming code in various programming languages. To help understand the potential impact of publicly available LLMs in CS education, we introduce CSEPrompts, a framework with hundreds of programming exercise prompts and multiple-choice questions retrieved from introductory CS and programming courses. We also provide experimental results on CSEPrompts to evaluate the performance of several LLMs with respect to generating Python code and answering basic computer science and programming questions.

Related papers

Evaluating Code Generation of LLMs in Advanced Computer Science Problems [0.0]
Large Language Models (LLMs) have become popular among programming students. We evaluate the ability of four LLM tools to solve programming assignments from advanced Computer Science courses.
arXiv Detail & Related papers (2025-04-21T08:45:23Z)
On the Opportunities of Large Language Models for Programming Process Data [6.023152721616896]
We discuss opportunities of using large language models for analyzing programming process data. To complement our discussion, we outline a case study where we have leveraged LLMs for automatically summarizing the programming process.
arXiv Detail & Related papers (2024-11-01T07:20:01Z)
Large Language Models in Computer Science Education: A Systematic Literature Review [7.240148550817106]
Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing (NLP) tasks. Recently, these models have extended their capabilities to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL).
arXiv Detail & Related papers (2024-10-21T17:49:50Z)
Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show that a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability. The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts. As the knowledge learned by an LLM-based Symbolic Program (LSP) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
Let's Ask AI About Their Programs: Exploring ChatGPT's Answers To Program Comprehension Questions [2.377308748205625]
We explore the capability of state-of-the-art LLMs in answering questions about learners' code (QLCs) that are generated from code that the LLMs have created. Our results show that although state-of-the-art LLMs can create programs and trace program execution when prompted, they easily succumb to errors similar to those previously recorded for novice programmers.
arXiv Detail & Related papers (2024-04-17T20:37:00Z)
QACP: An Annotated Question Answering Dataset for Assisting Chinese Python Programming Learners [10.90557801193242]
This paper proposes a new Chinese question-and-answer dataset for Python learners. It is designed to enhance the effectiveness and quality of online programming education.
arXiv Detail & Related papers (2024-01-30T13:11:23Z)
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code). Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
Recommender Systems in the Era of Large Language Models (LLMs) [62.0129013439038]
Large Language Models (LLMs) have revolutionized the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI). We conduct a comprehensive review of LLM-empowered recommender systems from various aspects including Pre-training, Fine-tuning, and Prompting.
arXiv Detail & Related papers (2023-07-05T06:03:40Z)
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability. We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization. We evaluate CREATOR on MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
Automatically Generating CS Learning Materials with Large Language Models [4.526618922750769]
Large Language Models (LLMs) enable software developers to generate code based on a natural language prompt. LLMs may enable students to interact with code in new ways while helping instructors scale their learning materials. LLMs also introduce new implications for academic integrity, curriculum design, and software engineering careers.
arXiv Detail & Related papers (2022-12-09T20:37:44Z)
A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs). For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
The ILASP system for Inductive Learning of Answer Set Programs [79.41112438865386]
Our system learns Answer Set Programs, including normal rules, choice rules and hard and weak constraints. We first give a general overview of ILASP's learning framework and its capabilities. This is followed by a comprehensive summary of the evolution of the ILASP system.
arXiv Detail & Related papers (2020-05-02T19:04:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.