Handwriting recognition and automatic scoring for descriptive answers in
Japanese language tests
- URL: http://arxiv.org/abs/2201.03215v2
- Date: Thu, 30 Nov 2023 06:51:24 GMT
- Title: Handwriting recognition and automatic scoring for descriptive answers in
Japanese language tests
- Authors: Hung Tuan Nguyen, Cuong Tuan Nguyen, Haruki Oka, Tsunenori Ishioka,
Masaki Nakagawa
- Abstract summary: This paper presents an experiment on automatically scoring handwritten descriptive answers in the trial tests for the new Japanese university entrance examination.
Although all answers have been scored by human examiners, handwritten characters are not labeled.
We present our attempt to adapt deep neural network-based handwriting recognizers trained on a labeled handwriting dataset to this unlabeled answer set.
- Score: 7.489722641968594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an experiment on automatically scoring handwritten
descriptive answers in the trial tests for the new Japanese university entrance
examination, which were administered to about 120,000 examinees in 2017 and 2018.
There are about 400,000 answers containing more than 20 million characters.
Although all answers have been scored by human examiners, the handwritten
characters are not labeled. We present our attempt to adapt deep neural
network-based handwriting recognizers, trained on a labeled handwriting dataset,
to this unlabeled answer set. Our proposed method combines different training
strategies, ensembles multiple recognizers, and uses a language model built from
a large general corpus to avoid overfitting to specific data. In our experiment,
the proposed method achieves character accuracy of over 97% using about 2,000
verified labeled answers, which account for less than 0.5% of the dataset. The
recognized answers are then fed into a pre-trained automatic scoring system based
on the BERT model, without correcting misrecognized characters or providing
rubric annotations. The automatic scoring system achieves Quadratic Weighted
Kappa (QWK) values from 0.84 to 0.98. A QWK above 0.8 indicates acceptable
agreement between the automatic scoring system and the human examiners. These
results are promising for further research on end-to-end automatic scoring of
descriptive answers.
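For reference, the QWK agreement metric quoted above can be computed with scikit-learn. Below is a minimal sketch using made-up scores, not the authors' evaluation code.

```python
# Quadratic Weighted Kappa (QWK) between human and automatic scores.
# The scores here are hypothetical, for illustration only.
from sklearn.metrics import cohen_kappa_score

human_scores = [3, 2, 4, 0, 1, 3, 2, 4]    # hypothetical examiner scores
system_scores = [3, 2, 3, 0, 1, 4, 2, 4]   # hypothetical model outputs

qwk = cohen_kappa_score(human_scores, system_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")  # values above 0.8 are read as acceptable agreement
```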
Related papers
- SpellMapper: A non-autoregressive neural spellchecker for ASR
customization with candidate retrieval based on n-gram mappings [76.87664008338317]
Contextual spelling correction models are an alternative to shallow fusion for improving automatic speech recognition.
We propose a novel algorithm for candidate retrieval based on misspelled n-gram mappings.
Experiments on Spoken Wikipedia show 21.4% word error rate improvement compared to a baseline ASR system.
arXiv Detail & Related papers (2023-06-04T10:00:12Z)
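A toy sketch of the candidate-retrieval idea above: index custom-vocabulary phrases by their character n-grams plus likely misspelled variants, then rank candidates by n-gram overlap with the ASR hypothesis. The mapping table and names are illustrative assumptions, not SpellMapper's actual algorithm.

```python
# Toy candidate retrieval via misspelled n-gram mappings (illustrative only).
from collections import defaultdict

def char_ngrams(s, n=3):
    return {s[i:i + n] for i in range(len(s) - n + 1)}

# Hypothetical mapping from correct n-grams to frequent ASR confusions.
NGRAM_MAP = {"ein": {"ine", "ien"}, "pho": {"fon"}}

def build_index(vocab):
    index = defaultdict(set)
    for phrase in vocab:
        for g in char_ngrams(phrase):
            index[g].add(phrase)
            for m in NGRAM_MAP.get(g, ()):  # also index likely misspellings
                index[m].add(phrase)
    return index

def retrieve(hypothesis, index, top_k=3):
    scores = defaultdict(int)
    for g in char_ngrams(hypothesis):
        for phrase in index.get(g, ()):
            scores[phrase] += 1             # score = shared n-gram count
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

index = build_index(["einstein", "phonetics", "heisenberg"])
print(retrieve("inestein", index))          # -> ['einstein']
```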
- Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions [2.277447144331876]
We investigate a collection of models that account for the individual preferences and tendencies of each human scorer in the automated scoring task.
We conduct quantitative experiments and case studies to analyze the individual preferences and tendencies of scorers.
arXiv Detail & Related papers (2023-06-01T15:22:05Z)
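One simple way to account for individual scorer tendencies, as described above, is a shared scoring model plus a learned per-scorer severity offset. The sketch below is an assumed design for illustration, not the paper's architecture.

```python
# Sketch: shared answer scorer plus per-scorer bias (assumed design).
import torch
import torch.nn as nn

class ScorerAwareModel(nn.Module):
    def __init__(self, answer_dim, num_scorers):
        super().__init__()
        self.base = nn.Linear(answer_dim, 1)          # shared answer -> score
        self.severity = nn.Embedding(num_scorers, 1)  # per-scorer offset

    def forward(self, answer_vec, scorer_id):
        return self.base(answer_vec) + self.severity(scorer_id)

model = ScorerAwareModel(answer_dim=768, num_scorers=5)
x = torch.randn(2, 768)        # e.g., sentence-encoder answer embeddings
sid = torch.tensor([0, 3])     # which human scorer graded each answer
print(model(x, sid).shape)     # torch.Size([2, 1])
```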
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, register, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model that learns authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
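A minimal contrastive objective for authorship embeddings in the spirit of the entry above: embeddings of two documents by the same author are pulled together, while different authors are pushed apart. This InfoNCE-style sketch is generic, not PART's published objective or encoder.

```python
# Generic InfoNCE-style contrastive loss over authorship embeddings.
import torch
import torch.nn.functional as F

def authorship_contrastive_loss(anchor, positive, temperature=0.07):
    """anchor[i], positive[i]: embeddings of two documents by author i."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.T / temperature       # (B, B) cosine similarities
    targets = torch.arange(a.size(0))    # diagonal entries are positives
    return F.cross_entropy(logits, targets)

anchor = torch.randn(8, 256)    # document embeddings from some text encoder
positive = torch.randn(8, 256)  # second documents by the same 8 authors
print(authorship_contrastive_loss(anchor, positive))
```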
- AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications [26.610045625897275]
We propose a multimodal approach to automatically correct handwritten assignments.
The encoded representations of answers interact with the visual information of students' handwriting.
Experimental results show that AiM outperforms OCR-based methods by a large margin.
arXiv Detail & Related papers (2022-08-26T08:56:32Z)
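The entry above says encoded answer representations interact with the visual information of students' handwriting; one common way to realize such text-vision interaction is cross-attention, sketched below. This is an assumption for illustration, not AiM's published architecture.

```python
# Cross-attention: answer-text tokens attend over handwriting image features
# (an assumed interaction mechanism, not AiM's published design).
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)

text = torch.randn(2, 12, 256)    # encoded reference-answer tokens
visual = torch.randn(2, 49, 256)  # image patch features (e.g., a 7x7 grid)

fused, _ = attn(query=text, key=visual, value=visual)
print(fused.shape)                # torch.Size([2, 12, 256])
```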
- Writer Recognition Using Off-line Handwritten Single Block Characters [59.17685450892182]
We use personal identity numbers consisting of the six digits of the date of birth (DoB).
We evaluate two recognition approaches, one based on handcrafted features that compute directional measurements, and another based on deep features from a ResNet50 model.
Results show the presence of identity-related information in a piece of handwriting as small as the six digits of the DoB.
arXiv Detail & Related papers (2022-01-25T23:04:10Z)
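The two feature types compared above can be sketched as follows: a handcrafted directional descriptor (a magnitude-weighted histogram of gradient orientations) and deep features from a pretrained ResNet50. Details such as image sizes are illustrative assumptions.

```python
# Handcrafted directional features vs. ResNet50 deep features (illustrative).
import numpy as np
import torch
from torchvision import models

def directional_histogram(gray, bins=12):
    gy, gx = np.gradient(gray.astype(np.float32))
    angles = np.arctan2(gy, gx)                       # gradient directions
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi),
                           weights=np.hypot(gx, gy))  # magnitude-weighted
    return hist / (hist.sum() + 1e-8)

resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
resnet.fc = torch.nn.Identity()                       # expose 2048-d features
resnet.eval()

img = torch.randn(1, 3, 224, 224)                     # stand-in digit image
with torch.no_grad():
    deep = resnet(img)                                # shape (1, 2048)
print(directional_histogram(np.random.rand(64, 64)).shape, deep.shape)
```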
- Detecting Handwritten Mathematical Terms with Sensor Based Data [71.84852429039881]
We propose a solution to the UbiComp 2021 Challenge by Stabilo, in which handwritten mathematical terms are to be automatically classified.
The input data set contains data from different writers, with label strings constructed from a total of 15 different possible characters.
arXiv Detail & Related papers (2021-09-12T19:33:34Z)
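Since the entry above describes mapping pen-sensor time series to label strings over 15 possible characters, a common formulation is a recurrent network trained with CTC, sketched below. The channel count and sizes are assumptions, not the challenge specification or the authors' solution.

```python
# Sketch: sensor time series -> character string via BiLSTM + CTC (assumed setup).
import torch
import torch.nn as nn

NUM_CHARS = 15                                   # per the entry above
lstm = nn.LSTM(input_size=13, hidden_size=64,    # 13 channels is an assumption
               bidirectional=True, batch_first=True)
head = nn.Linear(128, NUM_CHARS + 1)             # +1 for the CTC blank
ctc = nn.CTCLoss(blank=NUM_CHARS)

x = torch.randn(2, 200, 13)                      # 2 samples, 200 time steps
out, _ = lstm(x)
log_probs = head(out).log_softmax(-1).transpose(0, 1)  # (T, N, C) for CTCLoss
targets = torch.randint(0, NUM_CHARS, (2, 5))          # dummy 5-char labels
loss = ctc(log_probs, targets,
           input_lengths=torch.full((2,), 200),
           target_lengths=torch.full((2,), 5))
print(loss)
```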
- Get It Scored Using AutoSAS -- An Automated System for Scoring Short Answers [63.835172924290326]
We present a fast, scalable, and accurate approach towards automated Short Answer Scoring (SAS).
We propose and explain the design and development of a system for SAS, namely AutoSAS.
AutoSAS shows state-of-the-art performance, improving results by over 8% on some of the question prompts.
arXiv Detail & Related papers (2020-12-21T10:47:30Z)
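A generic feature-based short-answer scorer in the spirit of AutoSAS; the features and classifier below are illustrative stand-ins, not the published system's feature set.

```python
# Toy feature-based short answer scoring (illustrative, not AutoSAS itself).
from sklearn.ensemble import RandomForestClassifier

def features(answer, prompt):
    a, p = set(answer.lower().split()), set(prompt.lower().split())
    words = answer.split()
    return [len(words),                   # answer length
            len(a & p) / max(len(p), 1),  # prompt-word overlap
            len(a) / max(len(words), 1)]  # lexical diversity

prompt = "Explain why the sky appears blue."
answers = ["Sunlight scatters off air molecules, and blue scatters the most.",
           "Because it is blue."]
scores = [2, 0]                           # hypothetical human scores

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit([features(a, prompt) for a in answers], scores)
print(clf.predict([features("Blue light is scattered most by air.", prompt)]))
```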
- Stacking Neural Network Models for Automatic Short Answer Scoring [0.0]
We propose a stacking model based on neural networks and XGBoost for the classification process, with sentence embeddings as features.
The best model obtained an F1-score of 0.821, exceeding previous work on the same dataset.
arXiv Detail & Related papers (2020-10-21T16:00:09Z)
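A sketch of the stacked setup described above: an MLP and XGBoost as base classifiers over sentence-embedding features, combined by a meta-classifier. Sizes and hyperparameters are illustrative, not the paper's configuration.

```python
# Stacking an MLP and XGBoost over sentence-embedding features (illustrative).
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier

np.random.seed(0)
X = np.random.randn(40, 384)     # stand-in sentence embeddings
y = np.random.randint(0, 2, 40)  # correct / incorrect labels

stack = StackingClassifier(
    estimators=[("mlp", MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)),
                ("xgb", XGBClassifier(n_estimators=50, eval_metric="logloss"))],
    final_estimator=LogisticRegression())
stack.fit(X, y)
print(stack.predict(X[:3]))
```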
- Text-independent writer identification using convolutional neural network [8.526559246026162]
We propose an end-to-end deep-learning method for text-independent writer identification.
Our method achieved over 91.81% accuracy in classifying writers.
arXiv Detail & Related papers (2020-09-10T14:18:03Z)
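A toy end-to-end CNN writer classifier over grayscale handwriting patches, standing in for the network described above; the architecture and class count are assumptions.

```python
# Minimal CNN writer classifier (toy stand-in, assumed sizes).
import torch
import torch.nn as nn

NUM_WRITERS = 100                    # assumed number of writer classes

cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, NUM_WRITERS),
)

patches = torch.randn(4, 1, 64, 64)  # batch of handwriting patches
print(cnn(patches).shape)            # torch.Size([4, 100])
```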
- Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable: even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models.
arXiv Detail & Related papers (2020-07-14T03:49:43Z)
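The overstability finding above can be probed with a simple test: append off-topic content amounting to a fraction of the essay and check whether the model's score moves. In the sketch below, `score_fn` is a placeholder for any AES model, and the toy scorer is purely illustrative.

```python
# Overstability probe: inject off-topic text, compare scores (illustrative).
def overstability_probe(essay, score_fn, off_topic, fraction=0.25):
    n_extra = max(1, int(len(essay.split()) * fraction))
    filler = (off_topic.split() * n_extra)[:n_extra]
    modified = essay + " " + " ".join(filler)
    return score_fn(essay), score_fn(modified)  # similar scores => overstable

def score_fn(text):                  # toy stand-in: length-based "model"
    return min(6, len(text.split()) // 20)

before, after = overstability_probe("word " * 100, score_fn,
                                    "penguins enjoy jazz on Tuesdays")
print(before, after)                 # a robust model should not reward filler
```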