IITK@LCP at SemEval 2021 Task 1: Classification for Lexical Complexity Regression Task
- URL: http://arxiv.org/abs/2104.01046v1
- Date: Fri, 2 Apr 2021 13:40:12 GMT
- Title: IITK@LCP at SemEval 2021 Task 1: Classification for Lexical Complexity Regression Task
- Authors: Neil Rajiv Shirude, Sagnik Mukherjee, Tushar Shandhilya, Ananta Mukherjee, Ashutosh Modi
- Abstract summary: We leverage the ELECTRA model and attempt to mirror the data annotation scheme.
This somewhat counter-intuitive approach achieved an MAE of 0.0654 on Sub-Task 1 and an MAE of 0.0811 on Sub-Task 2.
- Score: 1.5952305322416085
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper describes our contribution to SemEval 2021 Task 1: Lexical Complexity Prediction. In our approach, we leverage the ELECTRA model and attempt to mirror the data annotation scheme. Although the task is a regression task, we show that we can treat it as an aggregation of several classification and regression models. This somewhat counter-intuitive approach achieved an MAE of 0.0654 on Sub-Task 1 and an MAE of 0.0811 on Sub-Task 2. Additionally, we used the concept of weak supervision signals from Gloss-BERT in our work, and it significantly improved the MAE score on Sub-Task 1.
Related papers
- Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is preserving the interpretability of the reduced targets and features through aggregation by the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- The All-Seeing Project V2: Towards General Relation Comprehension of the Open World [58.40101895719467]
We present the All-Seeing Project V2, a new model and dataset designed for understanding object relations in images.
We propose the All-Seeing Model V2 that integrates the formulation of text generation, object localization, and relation comprehension into a relation conversation task.
Our model excels not only in perceiving and recognizing all objects within the image but also in grasping the intricate relation graph between them.
arXiv Detail & Related papers (2024-02-29T18:59:17Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation [0.0]
This paper addresses SemEval-2022 Task 3, PreTENS: Presupposed Taxonomies Evaluating Neural Network Semantics.
The goal of the task is to identify if a sentence is deemed acceptable or not, depending on the taxonomic relationship that holds between a noun pair contained in the sentence.
We propose an effective way to enhance the robustness and the generalizability of language models for better classification.
arXiv Detail & Related papers (2022-10-07T07:41:28Z) - RoBLEURT Submission for the WMT2021 Metrics Task [72.26898579202076]
We present our submission to the Shared Metrics Task: RoBLEURT.
Our model reaches state-of-the-art correlations with the WMT 2020 human annotations on 8 out of 10 to-English language pairs.
arXiv Detail & Related papers (2022-04-28T08:49:40Z)
- Team Enigma at ArgMining-EMNLP 2021: Leveraging Pre-trained Language Models for Key Point Matching [0.0]
We present the system description for our submission towards the Key Point Analysis Shared Task at ArgMining 2021.
We leveraged existing state-of-the-art pre-trained language models and incorporated additional data and features extracted from the inputs (topics, key points, and arguments) to improve performance.
We achieved mAP strict and mAP relaxed scores of 0.872 and 0.966, respectively, in the evaluation phase, securing 5th place on the leaderboard.
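For reference, mAP here is the mean over topics of the average precision of the ranked argument-to-key-point match scores; as we understand the task setup, strict and relaxed differ only in how undecided gold labels are counted. A generic sketch of per-topic average precision:

```python
import numpy as np

def average_precision(scores: np.ndarray, labels: np.ndarray) -> float:
    """AP of a ranking: labels are 1 for a true match, 0 otherwise."""
    order = np.argsort(-scores)                     # rank pairs by match score
    hits = labels[order].astype(float)
    prec_at_k = np.cumsum(hits) / (np.arange(len(hits)) + 1)
    return float((prec_at_k * hits).sum() / max(hits.sum(), 1.0))

# Toy usage: three argument-key-point pairs for one topic.
print(average_precision(np.array([0.9, 0.2, 0.7]), np.array([1, 0, 1])))  # 1.0
```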
arXiv Detail & Related papers (2021-10-24T07:10:39Z)
- TAPAS at SemEval-2021 Task 9: Reasoning over tables with intermediate pre-training [3.0079490585515343]
We learn two binary classification models: a first model to predict whether a statement is neutral or non-neutral, and a second to predict whether it is entailed or refuted.
We find that the artificial neutral examples are somewhat effective at training the first model, achieving 68.03 test F1 versus the 60.47 of a majority baseline.
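A schematic of such a two-stage cascade (the scorer functions are illustrative placeholders, not the TAPAS API):

```python
from typing import Callable

Scorer = Callable[[str, str], float]  # (statement, table) -> probability

def classify_statement(statement: str, table: str,
                       p_neutral: Scorer, p_entailed: Scorer) -> str:
    # Stage 1: is the statement neutral with respect to the table?
    if p_neutral(statement, table) >= 0.5:
        return "neutral"
    # Stage 2: among non-neutral statements, entailed or refuted?
    return "entailed" if p_entailed(statement, table) >= 0.5 else "refuted"

# Toy usage with constant stand-in scorers.
print(classify_statement("Paris is listed.", "<table>",
                         lambda s, t: 0.1, lambda s, t: 0.8))  # -> "entailed"
```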
arXiv Detail & Related papers (2021-04-02T15:47:08Z)
- Task Aligned Generative Meta-learning for Zero-shot Learning [64.16125851588437]
We propose a Task-aligned Generative Meta-learning model for Zero-shot learning (TGMZ)
TGMZ mitigates the potentially biased training and enables meta-ZSL to accommodate real-world datasets containing diverse distributions.
Our comparisons with state-of-the-art algorithms show improvements of 2.1%, 3.0%, 2.5%, and 7.6% achieved by TGMZ on the AWA1, AWA2, CUB, and aPY datasets.
arXiv Detail & Related papers (2021-03-03T05:18:36Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for the text classification task on several benchmark datasets.
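As a reminder of how simple such a baseline is, here is a minimal BiLSTM text classifier trained with cross-entropy (all sizes are illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """Plain BiLSTM text classifier; sizes here are illustrative."""

    def __init__(self, vocab=10_000, embed=128, hidden=256, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed)
        self.lstm = nn.LSTM(embed, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, classes)  # both directions concatenated

    def forward(self, token_ids):                   # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))     # (batch, seq_len, 2*hidden)
        return self.out(h.mean(dim=1))              # mean-pool, then classify

model = BiLSTMClassifier()
logits = model(torch.randint(0, 10_000, (8, 32)))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
```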
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
- ERNIE at SemEval-2020 Task 10: Learning Word Emphasis Selection by Pre-trained Language Model [18.41476971318978]
This paper describes the system designed by ERNIE Team which achieved the first place in SemEval-2020 Task 10: Emphasis Selection For Written Text in Visual Media.
We leverage unsupervised pre-trained models and fine-tune them on our task.
Our best model achieves the highest score of 0.823 and ranks first on all metrics.
arXiv Detail & Related papers (2020-09-08T12:51:22Z)
- Yseop at SemEval-2020 Task 5: Cascaded BERT Language Model for Counterfactual Statement Analysis [0.0]
We use a BERT base model for the classification task and build a hybrid BERT Multi-Layer Perceptron system to handle the sequence identification task.
Our experiments show that while introducing syntactic and semantic features does little to improve the system on the classification task, using these features as cascaded linear inputs to fine-tune the model's sequence-delimiting ability lets it outperform other similar-purpose complex systems, such as BiLSTM-CRF, on the second task; a sketch of such a hybrid head follows.
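A rough sketch of a hybrid head of this kind, where extra syntactic/semantic features enter as cascaded linear inputs alongside the encoder output (dimensions and feature choices are assumptions, not the authors' exact design):

```python
import torch
import torch.nn as nn

class HybridHead(nn.Module):
    """MLP over encoder output concatenated with hand-crafted features."""

    def __init__(self, encoder_dim=768, feat_dim=16, hidden=128, classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(encoder_dim + feat_dim, hidden),  # cascaded linear inputs
            nn.ReLU(),
            nn.Linear(hidden, classes),
        )

    def forward(self, encoder_out, features):
        return self.mlp(torch.cat([encoder_out, features], dim=-1))

# Toy usage: BERT-sized vectors plus a small feature vector per example.
head = HybridHead()
logits = head(torch.randn(4, 768), torch.randn(4, 16))
```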
arXiv Detail & Related papers (2020-05-18T08:19:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.