Multi-Sense Language Modelling
- URL: http://arxiv.org/abs/2012.05776v1
- Date: Thu, 10 Dec 2020 16:06:05 GMT
- Title: Multi-Sense Language Modelling
- Authors: Andrea Lekkas, Peter Schneider-Kamp, Isabelle Augenstein
- Abstract summary: We propose a language model which not only predicts the next word, but also its sense in context.
This higher prediction granularity may be useful for end tasks such as assistive writing.
For sense prediction, we utilise a Graph Attention Network, which encodes definitions and example uses of word senses.
- Score: 19.396806939258806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The effectiveness of a language model is influenced by its token
representations, which must encode contextual information and handle the same
word form having a plurality of meanings (polysemy). Currently, none of the
common language modelling architectures explicitly model polysemy. We propose a
language model which not only predicts the next word, but also its sense in
context. We argue that this higher prediction granularity may be useful for end
tasks such as assistive writing, and allow for a more precise linking of
language models with knowledge bases. We find that multi-sense language
modelling requires architectures that go beyond standard language models, and
here propose a structured prediction framework that decomposes the task into a
word followed by a sense prediction task. For sense prediction, we utilise a
Graph Attention Network, which encodes definitions and example uses of word
senses. Overall, we find that multi-sense language modelling is a highly
challenging task, and suggest that future work focus on the creation of more
annotated training datasets.
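As a rough illustration of the structured prediction framework described in the abstract, the sketch below decomposes prediction into a word step followed by a sense step restricted to that word's candidate senses. It is an assumption-laden sketch rather than the authors' implementation: the LSTM encoder, the dimensions, and the learnable matrix standing in for sense embeddings (which the paper derives from a Graph Attention Network over sense definitions and example uses) are all illustrative choices.

```python
# Minimal sketch of a "word, then sense" language model (illustrative assumptions only).
import torch
import torch.nn as nn

class WordThenSenseLM(nn.Module):
    def __init__(self, vocab_size, num_senses, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.word_head = nn.Linear(dim, vocab_size)   # step 1: next-word logits
        # Stand-in for sense representations; in the paper these would come from
        # a Graph Attention Network over sense definitions and example uses.
        self.sense_vectors = nn.Parameter(torch.randn(num_senses, dim))

    def forward(self, context_ids, candidate_sense_ids):
        h, _ = self.encoder(self.embed(context_ids))   # (batch, time, dim)
        state = h[:, -1]                               # context state at the last position
        word_logits = self.word_head(state)            # (batch, vocab_size)
        # step 2: score only the candidate senses of the predicted/target word
        cand = self.sense_vectors[candidate_sense_ids]          # (batch, K, dim)
        sense_logits = torch.einsum("bd,bkd->bk", state, cand)
        return word_logits, sense_logits

# toy usage
model = WordThenSenseLM(vocab_size=1000, num_senses=5000)
ctx = torch.randint(0, 1000, (2, 8))       # two contexts of eight tokens
senses = torch.randint(0, 5000, (2, 4))    # four candidate senses per example
word_logits, sense_logits = model(ctx, senses)
```

Restricting the second step to the candidate senses of the chosen word is what makes this a word-followed-by-sense decomposition rather than a flat prediction over every sense in the inventory.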
Related papers
- Enhanced Auto Language Prediction with Dictionary Capsule -- A Novel Approach [0.0]
The paper presents a novel Auto Language Prediction Dictionary Capsule framework for language prediction and machine translation.
The model uses a combination of neural networks and symbolic representations to predict the language of a given input text and then translate it to a target language using pre-built dictionaries.
arXiv Detail & Related papers (2024-03-09T18:43:48Z)
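To make the two-stage pipeline in the entry above concrete, here is a minimal sketch, assuming a toy word-overlap language detector and hand-built dictionaries; neither corresponds to the paper's actual capsule components.

```python
# Illustrative two-stage pipeline: identify the source language, then translate
# word by word with a pre-built dictionary. Both stages are toy placeholders.
def detect_language(text, lang_vocab):
    """Pick the language whose known word list overlaps the input the most."""
    words = set(text.lower().split())
    return max(lang_vocab, key=lambda lang: len(words & lang_vocab[lang]))

def translate(text, source_lang, target_lang, dictionaries):
    lookup = dictionaries[(source_lang, target_lang)]
    return " ".join(lookup.get(w, w) for w in text.lower().split())

# toy data (assumptions made up for this example)
vocab = {"en": {"the", "and", "cat"}, "de": {"und", "der", "katze"}}
dicts = {("de", "en"): {"hund": "dog", "und": "and", "katze": "cat"}}
text = "hund und katze"
lang = detect_language(text, vocab)
print(lang, "->", translate(text, lang, "en", dicts))   # de -> dog and cat
```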
- Visually Grounded Language Learning: a review of language games, datasets, tasks, and models [60.2604624857992]
Many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.
In this work, we provide a systematic literature review of several tasks and models proposed in the V+L field.
arXiv Detail & Related papers (2023-12-05T02:17:29Z)
- Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages [3.716965622352967]
We propose new criteria to evaluate the quality of lexical representation and vocabulary overlap observed in sub-word tokenizers.
Our findings show that vocabulary overlap across languages can actually be detrimental to certain downstream tasks.
arXiv Detail & Related papers (2023-05-26T18:06:49Z)
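The overlap idea from the entry above can be pictured with a simple Jaccard-style measure over the sub-word types a shared tokenizer actually uses for each language; the paper's own criteria are more refined, so treat this as an illustrative assumption.

```python
# Illustrative only: how much of the sub-word vocabulary used for language A
# is shared with language B (Jaccard index over observed token types).
def used_vocab(corpus, tokenize):
    return {tok for sentence in corpus for tok in tokenize(sentence)}

def vocabulary_overlap(corpus_a, corpus_b, tokenize):
    a, b = used_vocab(corpus_a, tokenize), used_vocab(corpus_b, tokenize)
    return len(a & b) / len(a | b)

# toy usage; str.split stands in for a trained sub-word tokenizer
en = ["the cat sat", "the dog ran"]
de = ["die katze sitzt", "der hund lief"]
print(vocabulary_overlap(en, de, str.split))
```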
- Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models [57.08925810659545]
We conduct a comparative analysis of the visual representations in existing vision-and-language models and vision-only models.
Our empirical observations suggest that vision-and-language models are better at label prediction tasks.
We hope our study sheds light on the role of language in visual learning, and serves as an empirical guide for various pretrained models.
arXiv Detail & Related papers (2022-12-01T05:00:18Z)
- Language Models are not Models of Language [0.0]
Transfer learning has enabled large deep neural networks trained on the language modeling task to vastly improve performance on downstream tasks.
We argue that the term language model is misleading because deep learning models are not theoretical models of language.
arXiv Detail & Related papers (2021-12-13T22:39:46Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this inductive bias from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
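As a hedged illustration of ranking spelling candidates with a character language model, the sketch below trains an add-alpha-smoothed character bigram model on a tiny amount of text and scores candidate corrections with it; the paper's neural and n-gram models are more sophisticated.

```python
# Illustrative character-bigram language model for ranking spelling candidates.
import math
from collections import Counter

def train_char_bigrams(text):
    text = "#" + text.replace(" ", "#") + "#"      # '#' marks word boundaries
    return Counter(zip(text, text[1:])), Counter(text[:-1])

def score(word, model, alpha=1.0, charset_size=30):
    bigrams, contexts = model
    word = "#" + word + "#"
    return sum(math.log((bigrams[(a, b)] + alpha) /
                        (contexts[a] + alpha * charset_size))
               for a, b in zip(word, word[1:]))     # add-alpha smoothed log-probability

# usage: pick the candidate the character LM prefers
model = train_char_bigrams("the quick brown fox jumps over the lazy dog")
print(max(["teh", "the", "thw"], key=lambda w: score(w, model)))   # -> the
```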
- Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision [110.66085917826648]
We develop a technique that extrapolates multimodal alignments to language-only data by contextually mapping language tokens to their related images.
The "vokenization" mapping is trained on relatively small image captioning datasets and then applied to generate vokens for large language corpora.
Trained with these contextually generated vokens, our visually-supervised language models show consistent improvements over self-supervised alternatives on multiple pure-language tasks.
arXiv Detail & Related papers (2020-10-14T02:11:51Z)
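The token-to-image mapping described above can be pictured as nearest-neighbour retrieval between token embeddings and a bank of image embeddings; the random matrices below are stand-ins (assumptions) for the trained contextual token encoder and image encoder, and only the retrieval step is shown.

```python
# Illustrative retrieval step of vokenization: assign each token the id of the
# image whose embedding is most similar (cosine) to the token's embedding.
import numpy as np

rng = np.random.default_rng(0)
dim, vocab_size, n_images = 64, 1000, 500
token_encoder = rng.normal(size=(vocab_size, dim))   # stand-in for a contextual encoder
image_bank = rng.normal(size=(n_images, dim))        # stand-in for image embeddings

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def vokenize(token_ids):
    sims = normalize(token_encoder[token_ids]) @ normalize(image_bank).T
    return sims.argmax(axis=-1)                      # one voken (image id) per token

print(vokenize(np.array([5, 17, 42])))               # three image ids
```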
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed thereafter, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
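A rough sketch, under assumed pooling and dimensions, of what a compositional output layer can look like: each word's output embedding is composed from the embeddings of its characters, so no parameter matrix grows with the training vocabulary and unseen words can still be scored.

```python
# Illustrative compositional output layer: word output embeddings are composed
# from character embeddings, so no parameter depends on the vocabulary size.
import torch
import torch.nn as nn

class CompositionalOutput(nn.Module):
    def __init__(self, n_chars=128, dim=256):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, dim)

    def word_embedding(self, word):
        ids = torch.tensor([ord(c) % 128 for c in word])
        return self.char_embed(ids).mean(dim=0)       # mean-pooled character embeddings

    def logits(self, hidden, candidate_words):
        # hidden: decoder state of shape (dim,); candidates may be unseen words
        emb = torch.stack([self.word_embedding(w) for w in candidate_words])
        return emb @ hidden                           # one score per candidate word

out = CompositionalOutput()
print(out.logits(torch.randn(256), ["cat", "cats", "catamaran"]))
```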