The Effectiveness of a Dynamic Loss Function in Neural Network Based
Automated Essay Scoring
- URL: http://arxiv.org/abs/2305.10447v1
- Date: Mon, 15 May 2023 16:39:35 GMT
- Title: The Effectiveness of a Dynamic Loss Function in Neural Network Based
Automated Essay Scoring
- Authors: Oscar Morris
- Abstract summary: We present a dynamic loss function that creates an incentive for the model to predict with the correct distribution as well as to predict the correct values.
Our loss function achieves this goal without sacrificing performance, reaching a Quadratic Weighted Kappa score of 0.752 on the Automated Student Assessment Prize Automated Essay Scoring dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Neural networks and in particular the attention mechanism have brought
significant advances to the field of Automated Essay Scoring. Many of these
systems use a regression-based model, which may be prone to underfitting by
predicting only the mean of the training data. In this paper, we present a
dynamic loss function that creates an incentive for the model to predict with
the correct distribution as well as to predict the correct values. Our loss
function achieves this goal without sacrificing performance, reaching a
Quadratic Weighted Kappa score of 0.752 on the
Automated Student Assessment Prize Automated Essay Scoring dataset.
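The abstract does not reproduce the loss formulation, so the following is a minimal, hypothetical sketch of the stated idea: a regression loss plus a penalty on the mismatch between the predicted and target score distributions, with a weight that changes over training (one plausible reading of "dynamic"). The names and the exact form of the penalty are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F
from sklearn.metrics import cohen_kappa_score

def dynamic_loss(pred, target, epoch, total_epochs):
    """Hypothetical sketch: MSE plus a distribution-matching penalty.

    The penalty compares the batch mean and standard deviation of the
    predictions with those of the targets, discouraging the degenerate
    solution of predicting the training mean everywhere. The weight on
    the penalty is annealed over training (hence "dynamic").
    """
    mse = F.mse_loss(pred, target)
    dist_penalty = (pred.mean() - target.mean()) ** 2 \
                 + (pred.std() - target.std()) ** 2
    w = 1.0 - epoch / total_epochs
    return mse + w * dist_penalty

def qwk(y_true, y_pred):
    """Quadratic Weighted Kappa over integer essay scores, the metric
    reported above, via scikit-learn's quadratically weighted kappa."""
    return cohen_kappa_score(y_true, y_pred, weights="quadratic")
```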
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that existing machine unlearning techniques do not hold up in more challenging evaluation settings.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- Rubric-Specific Approach to Automated Essay Scoring with Augmentation Training [0.1227734309612871]
We propose a series of data augmentation operations for training and testing an automated scoring model, helping it learn features and functions overlooked by previous works.
We achieve state-of-the-art performance on the Automated Student Assessment Prize dataset.
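The augmentation operations themselves are not described in this snippet; as a purely hypothetical illustration of score-preserving essay augmentation (not the paper's actual operations), one might perturb essays while keeping their labels:

```python
import random

def augment_essay(text: str, drop_prob: float = 0.05) -> str:
    """Hypothetical augmentation: shuffle sentence order and drop a
    small fraction of words, keeping the assigned score unchanged."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    random.shuffle(sentences)
    shuffled = ". ".join(sentences) + "."
    kept = [w for w in shuffled.split() if random.random() > drop_prob]
    return " ".join(kept)
```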
arXiv Detail & Related papers (2023-09-06T05:51:19Z)
- Machine Unlearning for Causal Inference [0.6621714555125157]
It is important to enable a model to forget some of the information it has learned about a given user (machine unlearning).
This paper introduces the concept of machine unlearning for causal inference, particularly propensity score matching and treatment effect estimation.
The dataset used in the study is the Lalonde dataset, a widely used dataset for evaluating the effectiveness of job training programs.
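As context for the propensity score matching setting mentioned above, here is a generic sketch of the matching step itself (not the paper's unlearning method; all names are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def propensity_match(X, treated):
    """Generic propensity score matching: fit P(treated | X) with
    logistic regression, then pair each treated unit with the control
    unit whose estimated propensity score is closest."""
    treated = np.asarray(treated, dtype=bool)
    scores = LogisticRegression(max_iter=1000).fit(X, treated) \
        .predict_proba(X)[:, 1]
    t_idx, c_idx = np.where(treated)[0], np.where(~treated)[0]
    nearest = np.argmin(
        np.abs(scores[t_idx, None] - scores[None, c_idx]), axis=1)
    return dict(zip(t_idx, c_idx[nearest]))  # treated -> matched control
```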
arXiv Detail & Related papers (2023-08-24T17:27:01Z)
- Effect of Choosing Loss Function when Using T-batching for Representation Learning on Dynamic Networks [0.0]
T-batching is a valuable technique for training dynamic network models.
We identify a limitation in the training loss function used with t-batching.
We propose two alternative loss functions that overcome this issue, resulting in enhanced training performance.
arXiv Detail & Related papers (2023-08-13T23:34:36Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamics models, simulations, and actual hardware.
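A stripped-down version of the residual-learning idea (the paper couples it with an Unscented Kalman Filter; this sketch only fits the model-vs-reality gap and is an assumption-laden stand-in):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_residual_model(inputs, real_next, model_next):
    """Fit a regressor to the residual between the real system's next
    state and the analytic/simulator model's prediction (one output
    dimension at a time)."""
    return GradientBoostingRegressor().fit(inputs, real_next - model_next)

def corrected_prediction(residual_model, inputs, model_next):
    # corrected = model prediction + learned residual
    return model_next + residual_model.predict(inputs)
```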
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- Adversarial Robustness Assessment of NeuroEvolution Approaches [1.237556184089774]
We evaluate the robustness of models found by two NeuroEvolution approaches on the CIFAR-10 image classification task.
Our results show that when the evolved models are attacked with iterative methods, their accuracy usually drops to, or close to, zero.
Some of these techniques can exacerbate the perturbations added to the original inputs, potentially harming robustness.
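For reference, the "iterative methods" referred to above are typically of the PGD family; a minimal generic sketch (not the paper's exact attack configuration):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Projected gradient descent: repeatedly step in the direction of
    the loss gradient's sign, projecting back into an L-infinity ball
    of radius eps around the original inputs."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project to eps-ball
        x_adv = x_adv.clamp(0, 1)                 # stay a valid image
    return x_adv.detach()
```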
arXiv Detail & Related papers (2022-07-12T10:40:19Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low-churn training compared with a number of recent baselines.
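The distillation objective this result builds on is standard; a brief sketch of the usual recipe (the churn analysis itself is the paper's contribution):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T=2.0, alpha=0.5):
    """Cross-entropy on the labels plus KL divergence to the (frozen)
    teacher's temperature-softened outputs."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    return alpha * hard + (1.0 - alpha) * soft
```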
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
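The paper's model is a learned GNN discriminator over sampled instance pairs; as a drastically simplified stand-in for the underlying idea of scoring a node against its local context:

```python
import numpy as np

def anomaly_scores(X, adj):
    """Score each node by how much its attributes disagree with the
    mean attributes of its neighbourhood (a crude proxy for the
    learned node-vs-context agreement score).

    X: (n, d) node attributes; adj: (n, n) binary adjacency matrix.
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    context = adj @ X / deg  # neighbourhood mean attributes
    cos = (X * context).sum(axis=1) / (
        np.linalg.norm(X, axis=1) * np.linalg.norm(context, axis=1) + 1e-8)
    return 1.0 - cos  # higher = more anomalous
```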
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
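A toy illustration of the phenomenon (an assumption-laden sketch, not the paper's analysis): with two predictive features, gradient descent on the logistic loss concentrates almost all weight on the feature with the larger margin.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
y = rng.integers(0, 2, n) * 2 - 1             # labels in {-1, +1}
strong = y * 2.0 + rng.normal(0, 0.1, n)      # large-margin feature
weak = y * 0.5 + rng.normal(0, 0.1, n)        # predictive, smaller margin
X = np.stack([strong, weak], axis=1)

w = np.zeros(2)
for _ in range(2000):                         # plain GD on logistic loss
    p = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    w += 0.1 * ((1.0 - p) * y) @ X / n
print(w)  # weight on the strong feature dominates the weak one
```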
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
- Learning Queuing Networks by Recurrent Neural Networks [0.0]
We propose a machine-learning approach to derive performance models from data.
We exploit a deterministic approximation of their average dynamics in terms of a compact system of ordinary differential equations.
This allows for an interpretable structure of the neural network, which can be trained from system measurements to yield a white-box parameterized model.
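The mean-field/fluid idea behind such deterministic approximations, in its simplest single-queue form (an illustration, not the paper's model):

```python
import numpy as np

def fluid_queue(lam, mu, x0=0.0, T=50.0, dt=0.01):
    """Euler integration of the fluid ODE for a single-server queue:
    dx/dt = lam - mu * min(x, 1), i.e. arrivals minus a service rate
    that saturates once the server is busy."""
    xs = [x0]
    for _ in range(int(T / dt)):
        x = xs[-1]
        xs.append(max(0.0, x + dt * (lam - mu * min(x, 1.0))))
    return np.array(xs)

traj = fluid_queue(lam=0.8, mu=1.0)  # stable: settles near x = 0.8
```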
arXiv Detail & Related papers (2020-02-25T10:56:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.