Related papers: Stacking Neural Network Models for Automatic Short Answer Scoring

URL: http://arxiv.org/abs/2010.11092v1
Date: Wed, 21 Oct 2020 16:00:09 GMT
Title: Stacking Neural Network Models for Automatic Short Answer Scoring
Authors: Rian Adam Rajagede and Rochana Prih Hastuti
Abstract summary: We propose a stacking model based on a neural network and XGBoost for classification with sentence embedding features. The best model obtained an F1-score of 0.821, exceeding previous work on the same dataset.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automatic short answer scoring is a text classification problem in which students' exam answers are assessed automatically. Several challenges arise in building an automatic short answer scoring system, one of which is the quantity and quality of the data. Labeling the data is not easy because it requires a human annotator who is an expert in the field. Data imbalance is a further challenge because the number of labels for correct answers is always much smaller than the number for wrong answers. In this paper, we propose a stacking model based on a neural network and XGBoost for classification with sentence embedding features. We also propose a data upsampling method to handle imbalanced classes and a hyperparameter optimization algorithm to find a robust model automatically. We use the Ukara 1.0 Challenge dataset, and our best model obtained an F1-score of 0.821, exceeding previous work on the same dataset.
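
As a rough illustration of two ideas the abstract mentions, class upsampling and the F1-score, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say which upsampling method the paper uses, so plain random oversampling with replacement is assumed here, and the function names `upsample_minority` and `f1_score` and the toy data are hypothetical.

```python
import random
from collections import Counter


def upsample_minority(features, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest class (random oversampling)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [x for x, y in zip(features, labels) if y == cls]
        # Draw the missing rows with replacement from that pool.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy stand-ins for sentence embeddings; label 1 = correct answer (rare).
    x = [[0.1], [0.2], [0.3], [0.4]]
    y = [0, 0, 0, 1]
    x_up, y_up = upsample_minority(x, y)
    print(Counter(y_up))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In practice, upsampling would normally be applied to the training split only, since duplicating rows before splitting lets copies of the same answer appear in both training and evaluation data.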

Related papers

Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books. Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
Neural Networks Against (and For) Self-Training: Classification with Small Labeled and Large Unlabeled Sets [11.385682758047775]
One of the weaknesses of self-training is the semantic drift problem. We reshape the role of pseudo-labels and create a hierarchical order of information. A crucial step in self-training is to use the confidence prediction to select the best candidate pseudo-labels.
arXiv Detail & Related papers (2023-12-31T19:25:34Z)
Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box. This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
A Lightweight Method to Generate Unanswerable Questions in English [18.323248259867356]
We examine a simpler data augmentation method for unanswerable question generation in English. We perform antonym and entity swaps on answerable questions. Compared to the prior state-of-the-art, data generated with our training-free and lightweight strategy results in better models.
arXiv Detail & Related papers (2023-10-30T10:14:52Z)
Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions [2.277447144331876]
We investigate a collection of models that account for the individual preferences and tendencies of each human scorer in the automated scoring task. We conduct quantitative experiments and case studies to analyze the individual preferences and tendencies of scorers.
arXiv Detail & Related papers (2023-06-01T15:22:05Z)
Cost-Effective Online Contextual Model Selection [14.094350329970537]
We formulate this task as an online contextual active model selection problem, where at each round the learner receives an unlabeled data point along with a context. The goal is to output the best model for any given context without obtaining an excessive amount of labels. We propose a contextual active model selection algorithm (CAMS), which relies on a novel uncertainty sampling query criterion defined on a given policy class for adaptive model selection.
arXiv Detail & Related papers (2022-07-13T08:22:22Z)
Improving Passage Retrieval with Zero-Shot Question Generation [109.11542468380331]
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage.
arXiv Detail & Related papers (2022-04-15T14:51:41Z)
Less is More: Data-Efficient Complex Question Answering over Knowledge Bases [26.026065844896465]
We propose the Neural-Symbolic Complex Question Answering (NS-CQA) model, a data-efficient reinforcement learning framework for complex question answering. Our framework consists of a neural generator and a symbolic executor that transforms a natural-language question into a sequence of primitive actions. Our model is evaluated on two datasets: CQA, a recent large-scale complex question answering dataset, and WebQuestionsSP, a multi-hop question answering dataset.
arXiv Detail & Related papers (2020-10-29T18:42:44Z)
TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task [80.38130122127882]
TACRED is one of the largest, most widely used crowdsourced datasets in Relation Extraction (RE). In this paper, we investigate the questions: Have we reached a performance ceiling or is there still room for improvement? We find that label errors account for 8% absolute F1 test error, and that more than 50% of the examples need to be relabeled.
arXiv Detail & Related papers (2020-04-30T15:07:37Z)
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [55.390442067381755]
We show that grayscale data can be automatically constructed without human effort. Our method employs off-the-shelf response retrieval models and response generation models as automatic grayscale data generators. Experiments on three benchmark datasets and four state-of-the-art matching models show that the proposed approach brings significant and consistent performance improvements.
arXiv Detail & Related papers (2020-04-06T06:34:54Z)
Training Question Answering Models From Synthetic Data [26.91650323300262]
This work aims to narrow the gap between synthetic and human-generated question-answer pairs. We synthesize questions and answers from a synthetic corpus generated by an 8.3 billion parameter GPT-2 model. With no access to human supervision and only access to other models, we are able to train state of the art question answering networks on entirely model-generated data.
arXiv Detail & Related papers (2020-02-22T01:49:27Z)
Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label. Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data. This paper proposes a novel classifier framework that is flexible in its choice of model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.