Evaluating German Transformer Language Models with Syntactic Agreement Tests
- URL: http://arxiv.org/abs/2007.03765v1
- Date: Tue, 7 Jul 2020 20:01:42 GMT
- Title: Evaluating German Transformer Language Models with Syntactic Agreement Tests
- Authors: Karolina Zaczynska, Nils Feldhus, Robert Schwarzenberg, Aleksandra Gabryszak, Sebastian Möller
- Abstract summary: Pre-trained transformer language models (TLMs) have recently refashioned natural language processing (NLP).
We design numerous agreement tasks, some of which consider peculiarities of the German language.
Our experimental results show that state-of-the-art German TLMs generally perform well on agreement tasks.
- Score: 63.760423764010376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained transformer language models (TLMs) have recently refashioned
natural language processing (NLP): Most state-of-the-art NLP models now operate
on top of TLMs to benefit from contextualization and knowledge induction. To
explain their success, the scientific community has conducted numerous analyses.
Among other methods, syntactic agreement tests have been used to analyse TLMs.
Most of these studies, however, were conducted for English. In this
work, we analyse German TLMs. To this end, we design numerous agreement tasks,
some of which consider peculiarities of the German language. Our experimental
results show that state-of-the-art German TLMs generally perform well on
agreement tasks, but we also identify and discuss syntactic structures that
push them to their limits.
Related papers
- Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance [6.907734681124986]
This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts.
We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across languages including English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada.
arXiv Detail & Related papers (2024-06-17T01:54:27Z) - Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z) - Speech Translation with Large Language Models: An Industrial Practice [64.5419534101104]
We introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained large language model (LLM).
By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations.
Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST.
arXiv Detail & Related papers (2023-12-21T05:32:49Z) - A Comparative Analysis of Pretrained Language Models for Text-to-Speech [13.962029761484022]
State-of-the-art text-to-speech (TTS) systems have utilized pretrained language models (PLMs) to enhance prosody and create more natural-sounding speech.
While PLMs have been extensively researched for natural language understanding (NLU), their impact on TTS has been overlooked.
This is the first study of its kind to investigate the impact of different PLMs on TTS.
arXiv Detail & Related papers (2023-09-04T13:02:27Z) - How Does Pretraining Improve Discourse-Aware Translation? [41.20896077662125]
We introduce a probing task to interpret the ability of pretrained language models to capture discourse relation knowledge.
We validate three state-of-the-art PLMs across encoder-, decoder-, and encoder-decoder-based models.
Our findings are instructive to understand how and when discourse knowledge in PLMs should work for downstream tasks.
arXiv Detail & Related papers (2023-05-31T13:36:51Z) - Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks.
We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT).
We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z) - Document-Level Machine Translation with Large Language Models [91.03359121149595]
Large language models (LLMs) can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
This paper provides an in-depth evaluation of LLMs' ability on discourse modeling.
arXiv Detail & Related papers (2023-04-05T03:49:06Z) - LERT: A Linguistically-motivated Pre-trained Language Model [67.65651497173998]
We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original pre-training task.
We carried out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT could bring significant improvements.
arXiv Detail & Related papers (2022-11-10T05:09:16Z) - Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation [71.70562795158625]
Traditional NLP has long held (supervised) syntactic parsing to be necessary for successful higher-level semantic language understanding (LU).
The recent advent of end-to-end neural models, self-supervised via language modeling (LM), and their success on a wide range of LU tasks call this belief into question.
We empirically investigate the usefulness of supervised parsing for semantic LU in the context of LM-pretrained transformer networks.
arXiv Detail & Related papers (2020-08-15T21:03:36Z)