Language Modelling as a Multi-Task Problem
- URL: http://arxiv.org/abs/2101.11287v1
- Date: Wed, 27 Jan 2021 09:47:42 GMT
- Title: Language Modelling as a Multi-Task Problem
- Authors: Lucas Weber, Jaap Jumelet, Elia Bruni and Dieuwke Hupkes
- Abstract summary: We investigate whether language models adhere to learning principles of multi-task learning during training.
Experiments demonstrate that a multi-task setting naturally emerges within the objective of the more general task of language modelling.
- Score: 12.48699285085636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose to study language modelling as a multi-task
problem, bringing together three strands of research: multi-task learning,
linguistics, and interpretability. Based on hypotheses derived from linguistic
theory, we investigate whether language models adhere to learning principles of
multi-task learning during training. To showcase the idea, we analyse the
generalisation behaviour of language models as they learn the linguistic
concept of Negative Polarity Items (NPIs). Our experiments demonstrate that a
multi-task setting naturally emerges within the objective of the more general
task of language modelling. We argue that this insight is valuable for
multi-task learning, linguistics and interpretability research and can lead to
exciting new findings in all three domains.
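The NPI analysis lends itself to a simple acceptability probe: compare the log-probability a language model assigns to a sentence with a licensed NPI against a minimally different unlicensed one. Below is a minimal sketch of such a probe, assuming any Hugging Face causal LM; the model name and the minimal pair are illustrative, not taken from the paper.

```python
# Hedged sketch: scoring a minimal NPI pair with a causal LM.
# The model name and sentence pair are illustrative, not from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # assumption: any causal LM exposes the same interface
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability the model assigns to a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # .loss is the mean NLL per predicted token; rescale to a sum
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

# 'ever'/'any' are licensed by negation ('nobody') but not by 'somebody'
good = "Nobody has ever seen any unicorns."
bad = "Somebody has ever seen any unicorns."
print(sentence_logprob(good) - sentence_logprob(bad))  # expected > 0
```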
Related papers
- Visually Grounded Language Learning: a review of language games, datasets, tasks, and models [60.2604624857992]
Many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.
In this work, we provide a systematic literature review of several tasks and models proposed in the V+L field.
arXiv Detail & Related papers (2023-12-05T02:17:29Z)
- Exploring the Maze of Multilingual Modeling [2.0849578298972835]
We present a comprehensive evaluation of three popular multilingual language models: mBERT, XLM-R, and GPT-3.
Our findings reveal that while the amount of language-specific pretraining data plays a crucial role in model performance, other factors, such as general resource availability, language family, and script type, are also important.
arXiv Detail & Related papers (2023-10-09T04:48:14Z)
- Multilingual Multi-Figurative Language Detection [14.799109368073548]
Figurative language understanding is highly understudied in a multilingual setting.
We introduce multilingual multi-figurative language modelling, and provide a benchmark for sentence-level figurative language detection.
We develop a framework for figurative language detection based on template-based prompt learning.
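As a rough illustration of template-based prompting for figurative language detection, classification can be recast as cloze-filling with a masked LM, comparing the probabilities of label words. A minimal sketch, assuming a Hugging Face fill-mask pipeline; the template, verbalizers, and model choice are assumptions, not the paper's setup.

```python
# Hedged sketch: zero-shot template-based detection with a masked LM.
# Template, verbalizers, and model are assumptions for illustration;
# verbalizers should ideally be single tokens in the model vocabulary.
from transformers import pipeline

fill = pipeline("fill-mask", model="xlm-roberta-base")

def detect(sentence: str) -> str:
    template = f"{sentence} The previous sentence is {fill.tokenizer.mask_token}."
    # restrict predictions to the label words (verbalizers)
    scores = fill(template, targets=["literal", "ironic"])
    return max(scores, key=lambda s: s["score"])["token_str"]

print(detect("Oh great, another Monday."))
```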
arXiv Detail & Related papers (2023-05-31T18:52:41Z)
- Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation [0.04199844472131922]
We present and apply a GUI-assisted framework that allows us to easily probe a massive number of languages.
Most of the regularities revealed in the mBERT model are typical of Western European languages.
Our framework can be integrated with the existing probing toolboxes, model cards, and leaderboards.
arXiv Detail & Related papers (2022-10-24T13:41:17Z)
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
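The probing recipe behind such checkpoint analyses is fairly standard: freeze the model, extract hidden states, and fit a lightweight classifier on a linguistic task. A minimal single-checkpoint sketch, assuming xlm-roberta-base as a stand-in for one XLM-R pretraining checkpoint; the toy labels are placeholders for real annotations.

```python
# Hedged sketch: a linear probe on frozen hidden states.
# Real experiments would loop over pretraining checkpoints and use
# annotated data (e.g. POS tags) instead of these toy labels.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

name = "xlm-roberta-base"  # assumption: stand-in for one checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).eval()

sentences = ["The cat sleeps.", "Dogs bark loudly.",
             "Birds fly south.", "She reads books."]
labels = [0, 1, 0, 1]  # toy labels standing in for a linguistic task

with torch.no_grad():
    batch = tok(sentences, padding=True, return_tensors="pt")
    # mean-pool the last hidden layer into one vector per sentence
    hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    feats = (hidden * mask).sum(1) / mask.sum(1)

probe = LogisticRegression(max_iter=1000).fit(feats.numpy(), labels)
print(probe.score(feats.numpy(), labels))  # in-sample probe accuracy
```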
arXiv Detail & Related papers (2022-05-24T03:35:00Z)
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four bias-related tasks: diagnosis, identification, extraction, and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representations from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
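In spirit, the grouping step is a clustering of per-language vectors: derive a representation for each language from the pretrained model, cluster, and treat each cluster as a representation sprachbund. A toy sketch below, with random vectors standing in for real language embeddings; KMeans and the language list are illustrative choices, not the paper's exact procedure.

```python
# Hedged sketch: cluster per-language vectors into candidate
# "representation sprachbunds". Vectors here are random placeholders;
# in practice they would be derived from a multilingual encoder.
import numpy as np
from sklearn.cluster import KMeans

languages = ["en", "de", "nl", "hi", "ur", "fi", "et", "ja"]
rng = np.random.default_rng(42)
lang_vecs = rng.normal(size=(len(languages), 128))  # placeholder embeddings

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(lang_vecs)
for cluster in range(3):
    members = [l for l, c in zip(languages, kmeans.labels_) if c == cluster]
    print(f"sprachbund {cluster}: {members}")
```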
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks.
For languages rarely or never seen by these models, directly using such models often results in suboptimal representations or suboptimal use of data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z)
- Are pre-trained text representations useful for multilingual and multi-dimensional language proficiency modeling? [6.294759639481189]
This paper describes our experiments and observations about the role of pre-trained and fine-tuned multilingual embeddings in performing multi-dimensional, multilingual language proficiency classification.
Our results indicate that while fine-tuned embeddings are useful for multilingual proficiency modeling, none of the features achieve consistently best performance for all dimensions of language proficiency.
arXiv Detail & Related papers (2021-02-25T16:23:52Z)
- Meta-Learning for Effective Multi-task and Multilingual Modelling [23.53779501937046]
We propose a meta-learning approach to learn the interactions between both tasks and languages.
We present experiments on five different tasks and six different languages from the XTREME multilingual benchmark dataset.
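Meta-learning over (task, language) pairs can be sketched with a first-order scheme: adapt a copy of the model on one sampled pair, then nudge the shared initialisation toward the adapted weights. This is a generic Reptile-style sketch on dummy data, not the paper's actual algorithm; the model, episode sampler, and hyperparameters are all assumptions.

```python
# Hedged sketch: first-order (Reptile-style) meta-learning over
# sampled (task, language) episodes. Dummy regression data stands in
# for real XTREME tasks; this is not the paper's exact method.
import copy
import torch
import torch.nn as nn

meta_model = nn.Linear(16, 1)          # placeholder shared model
inner_lr, meta_lr, inner_steps = 1e-2, 0.1, 5

def sample_episode():
    """Stand-in for sampling a (task, language) batch."""
    x = torch.randn(32, 16)
    w = torch.randn(16, 1)             # each episode has its own target map
    return x, x @ w

for step in range(100):
    learner = copy.deepcopy(meta_model)            # adapt a fresh copy
    opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
    x, y = sample_episode()
    for _ in range(inner_steps):                   # inner-loop adaptation
        opt.zero_grad()
        nn.functional.mse_loss(learner(x), y).backward()
        opt.step()
    # outer update: move shared weights toward the adapted weights
    with torch.no_grad():
        for p_meta, p_task in zip(meta_model.parameters(),
                                  learner.parameters()):
            p_meta += meta_lr * (p_task - p_meta)
```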
arXiv Detail & Related papers (2021-01-25T19:30:26Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
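SVCCA itself is compact enough to sketch: reduce each representation matrix with an SVD, then take canonical correlations between the reduced subspaces. A minimal NumPy version under the usual formulation (samples in rows); the toy matrices are placeholders for real language representations.

```python
# Hedged sketch: singular vector CCA between two representation
# matrices of shape (n_samples, n_features). Toy data only.
import numpy as np

def svcca(X, Y, keep=0.99):
    """Mean canonical correlation after SVD-based dimension reduction."""
    def svd_reduce(A):
        A = A - A.mean(axis=0)
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        # keep enough singular directions to explain `keep` of variance
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), keep) + 1
        return U[:, :k] * s[:k]
    Xr, Yr = svd_reduce(X), svd_reduce(Y)
    # canonical correlations are the singular values of Qx^T Qy
    Qx, _ = np.linalg.qr(Xr)
    Qy, _ = np.linalg.qr(Yr)
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return rho.mean()

# toy usage: a linear transform preserves the subspace, noise does not
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(200, 64)), rng.normal(size=(200, 64))
print(svcca(X, X @ rng.normal(size=(64, 64))))  # high: same subspace
print(svcca(X, Y))                              # lower: unrelated data
```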
arXiv Detail & Related papers (2020-04-30T16:25:39Z)