UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia
- URL: http://arxiv.org/abs/2002.12540v1
- Date: Fri, 28 Feb 2020 04:32:16 GMT
- Title: UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia
- Authors: Ali Akbar Septiandri, Yosef Ardhito Winatmoko
- Abstract summary: We describe our third-place solution to the UKARA 1.0 challenge on automated essay scoring.
The task consists of a binary classification problem on two datasets: answers to two different questions.
We ended up using two different models for the two datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe our third-place solution to the UKARA 1.0 challenge on automated
essay scoring. The task consists of a binary classification problem on two
datasets: answers to two different questions. We ended up using two
different models for the two datasets. For task A, we applied a random forest
algorithm to features extracted using unigrams with latent semantic analysis
(LSA). For task B, we used only logistic regression on
TF-IDF features. Our models achieve an F1 score of 0.812.
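The two pipelines described in the abstract can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: the hyperparameters, the toy answers, and the labels below are all assumptions, and LSA is realized here as truncated SVD over unigram counts.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Task A: unigram counts -> LSA (truncated SVD) -> random forest.
task_a = Pipeline([
    ("unigram", CountVectorizer(ngram_range=(1, 1))),
    ("lsa", TruncatedSVD(n_components=2, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Task B: TF-IDF features -> logistic regression.
task_b = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("logreg", LogisticRegression(max_iter=1000)),
])

# Tiny hypothetical answers in Bahasa Indonesia (not from the UKARA data).
answers = [
    "banjir terjadi karena hujan deras",
    "warga mengungsi ke tempat aman",
    "tidak tahu",
    "jawaban asal saja",
]
labels = [1, 1, 0, 0]

task_a.fit(answers, labels)
task_b.fit(answers, labels)
print(task_a.predict(answers))
print(task_b.predict(answers))
```

Using separate pipelines per task mirrors the paper's finding that the two question datasets favored different feature/model combinations.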
Related papers
- Mavericks at ArAIEval Shared Task: Towards a Safer Digital Space -- Transformer Ensemble Models Tackling Deception and Persuasion [0.0]
We present our approaches for task 1-A and task 2-A of the shared task, which focus on persuasion technique detection and disinformation detection, respectively.
The tasks use multigenre snippets of tweets and news articles for the given binary classification problem.
We achieved a micro F1-score of 0.742 on task 1-A (8th rank on the leaderboard) and 0.901 on task 2-A (7th rank on the leaderboard).
arXiv Detail & Related papers (2023-11-30T17:26:57Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - Question-Answer Sentence Graph for Joint Modeling Answer Selection [122.29142965960138]
We train and integrate state-of-the-art (SOTA) models for computing scores between question-question, question-answer, and answer-answer pairs.
Online inference is then performed to solve the AS2 task on unseen queries.
arXiv Detail & Related papers (2022-02-16T05:59:53Z) - Sequence-level self-learning with multiple hypotheses [53.04725240411895]
We develop new self-learning techniques with an attention-based sequence-to-sequence (seq2seq) model for automatic speech recognition (ASR).
In contrast to conventional unsupervised learning approaches, we adopt the multi-task learning (MTL) framework.
Our experiment results show that our method can reduce the WER on the British speech data from 14.55% to 10.36% compared to the baseline model trained with the US English data only.
arXiv Detail & Related papers (2021-12-10T20:47:58Z) - Detecting Handwritten Mathematical Terms with Sensor Based Data [71.84852429039881]
We propose a solution to the UbiComp 2021 Challenge by Stabilo in which handwritten mathematical terms are supposed to be automatically classified.
The input data set contains data of different writers, with label strings constructed from a total of 15 different possible characters.
arXiv Detail & Related papers (2021-09-12T19:33:34Z) - Reference-based Weak Supervision for Answer Sentence Selection using Web Data [87.18646699292293]
We introduce Reference-based Weak Supervision (RWS), a fully automatic large-scale data pipeline.
RWS harvests high-quality weakly-supervised answers from abundant Web data.
Our experiments indicate that the produced data consistently bolsters TANDA.
arXiv Detail & Related papers (2021-04-18T19:41:17Z) - Modeling Context in Answer Sentence Selection Systems on a Latency Budget [87.45819843513598]
We present an approach to efficiently incorporate contextual information in AS2 models.
For each answer candidate, we first use unsupervised similarity techniques to extract relevant sentences from its source document.
Our best approach, which leverages a multi-way attention architecture to efficiently encode context, improves 6% to 11% over the non-contextual state of the art in AS2 with minimal impact on system latency.
arXiv Detail & Related papers (2021-01-28T16:24:48Z) - Detecting Hostile Posts using Relational Graph Convolutional Network [1.8734449181723827]
This work is based on the submission to competition conducted by AAAI@2021 for detection of hostile posts in Hindi on social media platforms.
Here, a model is presented for classification of hostile posts using Relational Graph Convolutional Networks.
The proposed model is performing at par with Google's XLM-RoBERTa on the given dataset.
Among all submissions to the challenge, our classification system with XLM-RoBERTa secured 2nd rank on fine-grained classification.
arXiv Detail & Related papers (2021-01-10T06:50:22Z) - Stacking Neural Network Models for Automatic Short Answer Scoring [0.0]
We propose the use of a stacking model based on neural network and XGBoost for classification process with sentence embedding feature.
The best model obtained an F1-score of 0.821, exceeding previous work on the same dataset.
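The stacking idea summarized above can be sketched with scikit-learn's StackingClassifier. This is an illustrative stand-in, not that paper's system: GradientBoostingClassifier substitutes for XGBoost, a small MLP substitutes for the neural network, and random vectors substitute for sentence embeddings.

```python
import numpy as np
from sklearn.ensemble import StackingClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 16))             # stand-in "sentence embeddings"
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic binary labels

# Base learners' out-of-fold predictions feed a logistic-regression meta-model.
stack = StackingClassifier(
    estimators=[
        ("mlp", MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                              random_state=0)),
        ("gbt", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=3,
)
stack.fit(X, y)
print(stack.score(X, y))
```

The meta-model sees only cross-validated predictions from the base learners, which is what lets stacking combine heterogeneous models without simply memorizing their training-set outputs.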
arXiv Detail & Related papers (2020-10-21T16:00:09Z) - Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID Twitter BERT and Bagging Ensemble Technique based on Plurality Voting [0.0]
We develop a system that automatically identifies whether an English Tweet related to the novel coronavirus (COVID-19) is informative or not.
Our final approach achieved an F1-score of 0.9037 and we were ranked sixth overall with F1-score as the evaluation criteria.
arXiv Detail & Related papers (2020-10-01T10:54:54Z) - Tha3aroon at NSURL-2019 Task 8: Semantic Question Similarity in Arabic [5.214494546503266]
We describe our team's effort on the semantic text question similarity task of NSURL 2019.
Our top performing system utilizes several innovative data augmentation techniques to enlarge the training data.
It takes ELMo pre-trained contextual embeddings of the data and feeds them into an ON-LSTM network with self-attention.
arXiv Detail & Related papers (2019-12-28T20:11:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.