OCHADAI-KYODAI at SemEval-2021 Task 1: Enhancing Model Generalization
and Robustness for Lexical Complexity Prediction
- URL: http://arxiv.org/abs/2105.05535v2
- Date: Thu, 13 May 2021 01:20:17 GMT
- Title: OCHADAI-KYODAI at SemEval-2021 Task 1: Enhancing Model Generalization
and Robustness for Lexical Complexity Prediction
- Authors: Yuki Taya, Lis Kanashiro Pereira, Fei Cheng, Ichiro Kobayashi
- Abstract summary: We propose an ensemble model for predicting the lexical complexity of words and multiword expressions.
The model receives as input a sentence with a target word or MWE and outputs its complexity score.
Our model achieved competitive results and ranked among the top-10 systems in both sub-tasks.
- Score: 8.066349353140819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an ensemble model for predicting the lexical complexity of words
and multiword expressions (MWEs). The model receives as input a sentence with a
target word or MWE and outputs its complexity score. Given that a key challenge
with this task is the limited size of annotated data, our model relies on
pretrained contextual representations from different state-of-the-art
transformer-based language models (i.e., BERT and RoBERTa), and on a variety of
training methods for further enhancing model generalization and
robustness: multi-step fine-tuning, multi-task learning, and adversarial
training. Additionally, we propose to enrich contextual representations by
adding hand-crafted features during training. Our model achieved competitive
results and ranked among the top-10 systems in both sub-tasks.
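The sketch below is a minimal, illustrative rendering of the setup the abstract describes (not the authors' released code): a pretrained transformer encoder such as BERT or RoBERTa contextualizes the sentence, the target word/MWE representation is pooled, hand-crafted features are concatenated, and a small regression head outputs a complexity score. The class name, feature dimension, and head architecture are assumptions for illustration only.

```python
# Minimal sketch of a complexity regressor over a pretrained encoder.
# All hyperparameters and names here are illustrative assumptions,
# not values taken from the paper.
import torch
import torch.nn as nn
from transformers import AutoModel


class ComplexityRegressor(nn.Module):
    def __init__(self, model_name: str = "roberta-base", n_handcrafted: int = 8):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Regression head over [pooled target representation ; hand-crafted features].
        self.head = nn.Sequential(
            nn.Linear(hidden + n_handcrafted, 256),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # complexity scores lie in [0, 1]
        )

    def forward(self, input_ids, attention_mask, target_mask, handcrafted):
        # target_mask marks the subword positions belonging to the target word/MWE.
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        mask = target_mask.unsqueeze(-1).float()
        pooled = (hidden_states * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        return self.head(torch.cat([pooled, handcrafted], dim=-1)).squeeze(-1)


def ensemble_predict(models, batch):
    # An ensemble can simply average the scores of several independently
    # fine-tuned encoders (e.g., BERT- and RoBERTa-based models).
    with torch.no_grad():
        scores = torch.stack([m(**batch) for m in models], dim=0)
    return scores.mean(dim=0)
```

The abstract's adversarial training component is not shown here; a common realization adds a small gradient-based perturbation to the input embeddings and recomputes the loss, but the exact procedure the authors used is described in the paper itself.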
Related papers
- Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z) - A Large-Scale Evaluation of Speech Foundation Models [110.95827399522204]
We establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the foundation model paradigm for speech.
We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads.
arXiv Detail & Related papers (2024-04-15T00:03:16Z) - Split and Rephrase with Large Language Models [2.499907423888049]
The Split and Rephrase (SPRP) task consists of splitting complex sentences into a sequence of shorter grammatical sentences.
We evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics.
arXiv Detail & Related papers (2023-12-18T10:16:37Z) - NOWJ1@ALQAC 2023: Enhancing Legal Task Performance with Classic
Statistical Models and Pre-trained Language Models [4.329463429688995]
This paper describes the NOWJ1 Team's approach for the Automated Legal Question Answering Competition (ALQAC) 2023.
For the document retrieval task, we implement a pre-processing step to overcome input limitations and apply learning-to-rank methods to consolidate features from various models.
We incorporate state-of-the-art models to develop distinct systems for each sub-task, utilizing both classic statistical models and pre-trained Language Models.
arXiv Detail & Related papers (2023-09-16T18:32:15Z) - On Conditional and Compositional Language Model Differentiable Prompting [75.76546041094436]
Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks.
We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata into continuous prompts.
arXiv Detail & Related papers (2023-07-04T02:47:42Z) - PaLM-E: An Embodied Multimodal Language Model [101.29116156731762]
We propose embodied language models to incorporate real-world continuous sensor modalities into language models.
We train these encodings end-to-end, in conjunction with a pre-trained large language model, for multiple embodied tasks.
Our largest model, PaLM-E-562B with 562B parameters, is a visual-language generalist with state-of-the-art performance on OK-VQA.
arXiv Detail & Related papers (2023-03-06T18:58:06Z) - ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for
E-Commerce Product Search [4.220439000486713]
We propose a robust multilingual model to improve the quality of search results.
In the pre-training stage, we adopt an MLM task, a classification task, and a contrastive learning task.
In the fine-tuning stage, we use confident learning, the exponential moving average (EMA) method, adversarial training (FGM), and the regularized dropout strategy (R-Drop).
arXiv Detail & Related papers (2023-01-31T07:31:34Z) - OCHADAI at SemEval-2022 Task 2: Adversarial Training for Multilingual
Idiomaticity Detection [4.111899441919165]
We propose a multilingual adversarial training model for determining whether a sentence contains an idiomatic expression.
Our model relies on pre-trained contextual representations from different multi-lingual state-of-the-art transformer-based language models.
arXiv Detail & Related papers (2022-06-07T05:52:43Z) - BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with
Assembly Models [2.4815579733050153]
This paper describes a system submitted by team BigGreen to SemEval-2021 Task 1 for predicting the lexical complexity of English words in a given context.
We assemble a feature engineering-based model with a deep neural network model founded on BERT.
Our handcrafted features comprise a breadth of lexical, semantic, syntactic, and novel phonological measures.
arXiv Detail & Related papers (2021-04-19T04:05:50Z) - Masked Language Modeling and the Distributional Hypothesis: Order Word
Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: pre-trained MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)