Handwritten Code Recognition for Pen-and-Paper CS Education
- URL: http://arxiv.org/abs/2408.07220v1
- Date: Wed, 7 Aug 2024 21:02:17 GMT
- Title: Handwritten Code Recognition for Pen-and-Paper CS Education
- Authors: Md Sazzad Islam, Moussa Koulako Bala Doumbouya, Christopher D. Manning, Chris Piech,
- Abstract summary: Teaching Computer Science (CS) by having students write programs by hand on paper has key pedagogical advantages.
However, a key obstacle is the current lack of teaching methods and support software for working with and running handwritten programs.
Our approach integrates two innovative methods. The first combines OCR with an indentation recognition module and a language model designed for post-OCR error correction without introducing hallucinations.
- Score: 33.53124589437863
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Teaching Computer Science (CS) by having students write programs by hand on paper has key pedagogical advantages: It allows focused learning and requires careful thinking compared to the use of Integrated Development Environments (IDEs) with intelligent support tools or "just trying things out". The familiar environment of pens and paper also lessens the cognitive load of students with no prior experience with computers, for whom the mere basic usage of computers can be intimidating. Finally, this teaching approach opens learning opportunities to students with limited access to computers. However, a key obstacle is the current lack of teaching methods and support software for working with and running handwritten programs. Optical character recognition (OCR) of handwritten code is challenging: Minor OCR errors, perhaps due to varied handwriting styles, easily make code not run, and recognizing indentation is crucial for languages like Python but is difficult to do due to inconsistent horizontal spacing in handwriting. Our approach integrates two innovative methods. The first combines OCR with an indentation recognition module and a language model designed for post-OCR error correction without introducing hallucinations. This method, to our knowledge, surpasses all existing systems in handwritten code recognition. It reduces error from 30\% in the state of the art to 5\% with minimal hallucination of logical fixes to student programs. The second method leverages a multimodal language model to recognize handwritten programs in an end-to-end fashion. We hope this contribution can stimulate further pedagogical research and contribute to the goal of making CS education universally accessible. We release a dataset of handwritten programs and code to support future research at https://github.com/mdoumbouya/codeocr
Related papers
- Integrating Natural Language Prompting Tasks in Introductory Programming Courses [3.907735250728617]
This report explores the inclusion of two prompt-focused activities in an introductory programming course.
The first requires students to solve computational problems by writing natural language prompts, emphasizing problem-solving over syntax.
The second involves students crafting prompts to generate code equivalent to provided fragments, to foster an understanding of the relationship between prompts and code.
arXiv Detail & Related papers (2024-10-04T01:03:25Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - Thinking beyond chatbots' threat to education: Visualizations to
elucidate the writing and coding process [0.0]
The landscape of educational practices for teaching and learning languages has been predominantly centered around outcome-driven approaches.
The recent accessibility of large language models has thoroughly disrupted these approaches.
This work presents a new set of visualization tools to summarize the inherent and taught capabilities of a learner's writing or programming process.
arXiv Detail & Related papers (2023-04-25T22:11:29Z) - An end-to-end, interactive Deep Learning based Annotation system for
cursive and print English handwritten text [0.0]
We present an innovative, complete end-to-end pipeline, that annotates offline handwritten manuscripts written in both print and cursive English.
This novel method involves an architectural combination of a detection system built upon a state-of-the-art text detection model, and a custom made Deep Learning model for the recognition system.
arXiv Detail & Related papers (2023-04-18T00:24:07Z) - Giving Feedback on Interactive Student Programs with Meta-Exploration [74.5597783609281]
Developing interactive software, such as websites or games, is a particularly engaging way to learn computer science.
Standard approaches require instructors to manually grade student-implemented interactive programs.
Online platforms that serve millions, like Code.org, are unable to provide any feedback on assignments for implementing interactive programs.
arXiv Detail & Related papers (2022-11-16T10:00:23Z) - Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z) - ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z) - Exploring a Handwriting Programming Language for Educational Robots [1.310461046819527]
This study presents the development of a handwriting-based programming language for educational robots.
It allows students to program a robot by drawing symbols with ordinary pens and paper.
The system was evaluated in a preliminary test with eight teachers, developers and educational researchers.
arXiv Detail & Related papers (2021-05-11T12:00:34Z) - Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech impaired people to communicate.
Skeleton-based recognition is becoming popular that it can be further ensembled with RGB-D based method to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation citejin 2020whole, we propose recognizing sign language based on the whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.