dcc --help: Generating Context-Aware Compiler Error Explanations with
Large Language Models
- URL: http://arxiv.org/abs/2308.11873v2
- Date: Mon, 16 Oct 2023 03:05:35 GMT
- Title: dcc --help: Generating Context-Aware Compiler Error Explanations with
Large Language Models
- Authors: Andrew Taylor and Alexandra Vassar and Jake Renzella and Hammond
Pearce
- Abstract summary: dcc --help was deployed to our CS1 and CS2 courses, with 2,565 students using the tool over 64,000 times in ten weeks.
We found that the LLM-generated explanations were conceptually accurate in 90% of compile-time and 75% of run-time cases, but often disregarded the instruction not to provide solutions in code.
- Score: 53.04357141450459
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the challenging field of introductory programming, high enrollments and
failure rates drive us to explore tools and systems to enhance student
outcomes, especially automated tools that scale to large cohorts. This paper
presents and evaluates the dcc --help tool, an integration of a Large Language
Model (LLM) into the Debugging C Compiler (DCC) to generate unique,
novice-focused explanations tailored to each error. dcc --help prompts an LLM
with contextual information of compile- and run-time error occurrences,
including the source code, error location and standard compiler error message.
The LLM is instructed to generate novice-focused, actionable error explanations
and guidance, designed to help students understand and resolve problems without
providing solutions. dcc --help was deployed to our CS1 and CS2 courses, with
2,565 students using the tool over 64,000 times in ten weeks. We analysed a
subset of these error/explanation pairs to evaluate their properties, including
conceptual correctness, relevancy, and overall quality. We found that the
LLM-generated explanations were conceptually accurate in 90% of compile-time
and 75% of run-time cases, but often disregarded the instruction not to provide
solutions in code. Our findings, observations and reflections following
deployment indicate that dcc --help provides novel opportunities for scaffolding
students' introduction to programming.
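The abstract outlines the core mechanism: when a compile- or run-time error occurs, the tool bundles the student's source code, the error location, and the standard compiler message into a prompt, and instructs the LLM to explain the problem without supplying solution code. The sketch below is a minimal illustration of that idea only, not the paper's implementation: it assumes the OpenAI Python client, and the helper name explain_error, the prompt wording, and the model choice are placeholders introduced for the example.

```python
# Minimal sketch of a context-aware error-explanation prompt.
# Assumptions: the OpenAI Python client is installed and OPENAI_API_KEY is set;
# the prompt text and model choice are illustrative, not dcc --help's own.
from openai import OpenAI

client = OpenAI()

def explain_error(source_code: str, error_line: int, compiler_message: str) -> str:
    """Ask the model for a novice-focused explanation of a C error,
    explicitly asking it not to provide corrected code."""
    prompt = (
        "You are helping a first-year C programming student.\n"
        "Explain the compiler error below in plain language and suggest what "
        "to look at next, but do NOT provide corrected code.\n\n"
        f"Error (reported at line {error_line}): {compiler_message}\n\n"
        f"Student's source code:\n{source_code}\n"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example: a missing semicolon caught at compile time.
code = "int main(void) {\n    int x = 1\n    return 0;\n}\n"
print(explain_error(code, 2, "error: expected ';' at end of declaration"))
```

In the deployed tool, this context is captured by the Debugging C Compiler itself at the point the error occurs, which is what lets the generated explanation be tailored to each student's code rather than to a generic error message.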
Related papers
- SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs [77.79172008184415]
SpecTool is a new benchmark to identify error patterns in LLM output on tool-use tasks.
We show that even the most prominent LLMs exhibit these error patterns in their outputs.
Researchers can use the analysis and insights from SpecTool to guide their error mitigation strategies.
arXiv Detail & Related papers (2024-11-20T18:56:22Z)
- Not the Silver Bullet: LLM-enhanced Programming Error Messages are Ineffective in Practice [1.106787864231365]
Despite promising evidence on synthetic benchmarks, we found that GPT-4 generated error messages outperformed conventional compiler error messages in only 1 of the 6 tasks.
arXiv Detail & Related papers (2024-09-27T11:45:56Z)
- Scaling CS1 Support with Compiler-Integrated Conversational AI [43.77796322595561]
DCC Sidekick is a web-based AI tool that enhances an existing LLM-powered C/C++ compiler by generating educational programming error explanations.
We analyse usage data from a large Australian CS1 course, where 959 students engaged in 11,222 DCC Sidekick sessions, resulting in 17,982 error explanations over seven weeks.
arXiv Detail & Related papers (2024-07-22T10:53:55Z)
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions [72.56339136017759]
We introduce BigCodeBench, a benchmark that challenges Large Language Models (LLMs) to invoke multiple function calls as tools from 139 libraries and 7 domains for 1,140 fine-grained tasks.
Our evaluation shows that LLMs are not yet capable of following complex instructions to use function calls precisely, with scores up to 60%, significantly lower than the human performance of 97%.
We propose a natural-language-oriented variant of BigCodeBench, BigCodeBench-Instruct, that automatically transforms the original docstrings into short instructions only with essential information.
arXiv Detail & Related papers (2024-06-22T15:52:04Z)
- LLM-aided explanations of EDA synthesis errors [10.665347817363623]
Large Language Models (LLMs) have demonstrated text comprehension and question-answering capabilities.
We generate 936 error message explanations using three OpenAI LLMs over 21 different buggy code samples.
These are then graded for relevance and correctness, and we find that in approximately 71% of cases the LLMs give correct & complete explanations suitable for novice learners.
arXiv Detail & Related papers (2024-04-07T07:12:16Z)
- Patterns of Student Help-Seeking When Using a Large Language Model-Powered Programming Assistant [2.5949084781328744]
This study examines students' use of an innovative tool that provides on-demand programming assistance without revealing solutions directly.
We collected more than 2,500 queries submitted by students throughout the term.
We found that most queries requested immediate help with programming assignments, whereas fewer requests asked for help on related concepts or for deepening conceptual understanding.
arXiv Detail & Related papers (2023-10-25T20:36:05Z)
- Can Large Language Models Understand Real-World Complex Instructions? [54.86632921036983]
Large language models (LLMs) can understand human instructions, but struggle with complex instructions.
Existing benchmarks are insufficient to assess LLMs' ability to understand complex instructions.
We propose CELLO, a benchmark for evaluating LLMs' ability to follow complex instructions systematically.
arXiv Detail & Related papers (2023-09-17T04:18:39Z)
- CodeHelp: Using Large Language Models with Guardrails for Scalable Support in Programming Classes [2.5949084781328744]
Large language models (LLMs) have emerged recently and show great promise for providing on-demand help at a large scale.
We introduce CodeHelp, a novel LLM-powered tool designed with guardrails to provide on-demand assistance to programming students without directly revealing solutions.
Our findings suggest that CodeHelp is well-received by students who especially value its availability and help with resolving errors, and that for instructors it is easy to deploy and complements, rather than replaces, the support that they provide to students.
arXiv Detail & Related papers (2023-08-14T03:52:24Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback on student code for a new programming question from just a few instructor-provided examples.
Our approach was successfully deployed to deliver feedback on 16,000 student exam solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)