UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification?
- URL: http://arxiv.org/abs/2301.01764v2
- Date: Thu, 5 Jan 2023 15:22:05 GMT
- Title: UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification?
- Authors: Dennis Aumiller and Michael Gertz
- Abstract summary: We describe a pipeline based on prompted GPT-3 responses that beats competing approaches by a wide margin in settings with few training instances.
Applied to the Spanish and Portuguese subset, we achieve state-of-the-art results with only minor modifications to the original prompts.
- Score: 2.931632009516441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous state-of-the-art models for lexical simplification consist of
complex pipelines with several components, each of which requires deep
technical knowledge and fine-tuned interaction to achieve its full potential.
As an alternative, we describe a frustratingly simple pipeline based on
prompted GPT-3 responses, beating competing approaches by a wide margin in
settings with few training instances. Our best-performing submission to the
English language track of the TSAR-2022 shared task consists of an "ensemble"
of six different prompt templates with varying context levels. As a
late-breaking result, we further detail a language transfer technique that
allows simplification in languages other than English. Applied to the Spanish
and Portuguese subset, we achieve state-of-the-art results with only minor
modification to the original prompts. Aside from detailing the implementation
and setup, we spend the remainder of this work discussing the particularities
of prompting and implications for future work. Code for the experiments is
available online at https://github.com/dennlinger/TSAR-2022-Shared-Task
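To make the prompting setup concrete, the following is a minimal sketch of how such a template ensemble can be wired together. It assumes the pre-1.0 `openai` Python client; the template wordings, model name, decoding parameters, and candidate parsing are illustrative placeholders rather than the authors' exact configuration, which is documented in the linked repository.

```python
# Minimal sketch of a prompt-template ensemble for lexical simplification.
# Templates, parameters, and parsing rules are illustrative assumptions;
# the paper's repository contains the actual six templates.
from collections import Counter

import openai  # pre-1.0 client, i.e. the legacy `openai.Completion` API

# Hypothetical templates with increasing context, from none to full sentence.
TEMPLATES = [
    "Give ten simpler synonyms for the word '{word}'.",
    "Context: {sentence}\nGive ten simpler synonyms for the word '{word}'.",
    "Context: {sentence}\nGive ten easier words that could replace '{word}' here.",
]

def candidates(prompt: str) -> list[str]:
    """Query the model once and split the reply into candidate substitutions."""
    response = openai.Completion.create(
        model="text-davinci-003",  # GPT-3-era engine; an assumption
        prompt=prompt,
        max_tokens=64,
        temperature=0.8,
    )
    text = response["choices"][0]["text"]
    # Split on commas/newlines and trim list numbering and punctuation.
    return [w.strip(" .,;1234567890-") for w in text.replace("\n", ",").split(",") if w.strip()]

def ensemble(word: str, sentence: str, k: int = 10) -> list[str]:
    """Aggregate candidates across all templates by vote count ('ensemble')."""
    votes = Counter()
    for template in TEMPLATES:
        for cand in candidates(template.format(word=word, sentence=sentence)):
            if cand.lower() != word.lower():
                votes[cand.lower()] += 1
    return [cand for cand, _ in votes.most_common(k)]

print(ensemble("compulsory", "Attendance at the training session is compulsory."))
```

Vote-count aggregation over templates is only one simple way to realize the "ensemble"; the combination actually submitted is available in the repository above.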
Related papers
- Rethinking and Improving Multi-task Learning for End-to-end Speech Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different times and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge [33.89616011003973]
We describe our proposed spoken semantic parsing system for the quality track (Track 1) of the Spoken Language Understanding Grand Challenge.
Strong automatic speech recognition (ASR) models like Whisper and pretrained language models (LMs) like BART are used inside our SLU framework to boost performance.
We also investigate output-level combination of various models, reaching an exact-match accuracy of 80.8, which won first place in the challenge.
arXiv Detail & Related papers (2023-05-02T17:25:19Z)
- Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification [12.33631648094732]
The TSAR-2022 shared task was organized as part of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), held in conjunction with EMNLP 2022.
The task called on the Natural Language Processing research community to contribute methods to advance the state of the art in multilingual lexical simplification for English, Portuguese, and Spanish.
Results of the shared task indicate new benchmarks in lexical simplification, with English quantitative results noticeably higher than those obtained for Spanish and (Brazilian) Portuguese.
arXiv Detail & Related papers (2023-02-06T15:53:51Z)
- Lexical Simplification using multi level and modular approach [1.9559144041082446]
This paper explains the work done by our team "teamPN" for the English subtask.
We created a modular pipeline which combines modern transformer-based models with traditional NLP methods.
arXiv Detail & Related papers (2023-02-03T15:57:54Z)
- MANTIS at TSAR-2022 Shared Task: Improved Unsupervised Lexical Simplification with Pretrained Encoders [31.64341800095214]
We present our contribution to the TSAR-2022 Shared Task on Lexical Simplification of the EMNLP 2022 Workshop on Text Simplification, Accessibility, and Readability.
Our approach builds on and extends the unsupervised lexical simplification system with pretrained encoders (LSBert).
Our best-performing system improves LSBert by 5.9% accuracy and ranks second out of 33 ranked solutions.
arXiv Detail & Related papers (2022-12-19T20:57:45Z)
- LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models [67.19124099815645]
We propose a novel Language-Aware Soft Prompting (LASP) learning method to alleviate base class overfitting.
LASP is inherently amenable to including, during training, virtual classes, i.e. class names for which no visual samples are available.
LASP matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for 8 out of 11 test datasets.
arXiv Detail & Related papers (2022-10-03T17:56:35Z)
- FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task [36.51221186190272]
We describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign.
Our system is built by leveraging transfer learning across modalities, tasks and languages.
arXiv Detail & Related papers (2021-07-14T19:43:44Z)
- "Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation [49.610188741500274]
An end-to-end speech-to-text translation (ST) system takes audio in a source language and outputs text in a target language.
Existing methods are limited by the amount of parallel data.
We build a system to fully utilize the signals in a parallel ST corpus.
arXiv Detail & Related papers (2020-09-21T09:19:07Z)
- Abstractive Summarization of Spoken and Written Instructions with BERT [66.14755043607776]
We present the first application of the BERTSum model to conversational language.
We generate abstractive summaries of narrated instructional videos across a wide variety of topics.
We envision this being integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request.
arXiv Detail & Related papers (2020-08-21T20:59:34Z)
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT-inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text-similarity tasks, including STS tasks, SemEval2013 Task 5(a), and some commonly used word-similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.