Cross-Lingual Fine-Grained Entity Typing
- URL: http://arxiv.org/abs/2110.07837v1
- Date: Fri, 15 Oct 2021 03:22:30 GMT
- Title: Cross-Lingual Fine-Grained Entity Typing
- Authors: Nila Selvaraj, Yasumasa Onoe, and Greg Durrett
- Abstract summary: We present a unified cross-lingual fine-grained entity typing model capable of handling over 100 languages.
We analyze this model's ability to generalize to languages and entities unseen during training.
- Score: 26.973783464706447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growth of cross-lingual pre-trained models has enabled NLP tools to
rapidly generalize to new languages. While these models have been applied to
tasks involving entities, their ability to explicitly predict typological
features of these entities across languages has not been established. In this
paper, we present a unified cross-lingual fine-grained entity typing model
capable of handling over 100 languages and analyze this model's ability to
generalize to languages and entities unseen during training. We train this
model on cross-lingual training data collected from Wikipedia hyperlinks in
multiple languages (training languages). During inference, our model takes an
entity mention and context in a particular language (test language, possibly
not in the training languages) and predicts fine-grained types for that entity.
Generalizing to new languages and unseen entities are the fundamental
challenges of this entity typing setup, so we focus our evaluation on these
settings and compare against simple yet powerful string match baselines.
Experimental results show that our approach outperforms the baselines on unseen
languages such as Japanese, Tamil, Arabic, Serbian, and Persian. In addition,
our approach substantially improves performance on unseen entities (even in
unseen languages) over the baselines, and human evaluation shows a strong
ability to predict relevant types in these settings.
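To make the inference setup concrete, below is a minimal sketch of mention-level fine-grained typing with a multilingual encoder. It assumes bert-base-multilingual-cased from Hugging Face transformers, a hypothetical four-type inventory, and an untrained linear multi-label head; the paper's actual encoder, input format, and type vocabulary may differ.
```python
import torch
from transformers import AutoTokenizer, AutoModel

# Multilingual encoder; the paper's exact architecture is not reproduced here.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")

TYPES = ["person", "athlete", "organization", "location"]            # hypothetical type inventory
type_head = torch.nn.Linear(encoder.config.hidden_size, len(TYPES))  # untrained, interface only

def predict_types(mention: str, context: str, threshold: float = 0.5):
    # Pair the mention with its context; one common encoding, not necessarily the paper's.
    inputs = tokenizer(mention, context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
        pooled = hidden[:, 0]                          # [CLS] representation
        probs = torch.sigmoid(type_head(pooled)).squeeze(0)
    # Multi-label decision: keep every type whose probability clears the threshold.
    return [t for t, p in zip(TYPES, probs) if p > threshold]

# The same interface applies to any language the tokenizer covers, e.g. a Japanese mention:
print(predict_types("田中将大", "田中将大はプロ野球選手である。"))
```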
Related papers
- Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation [19.624330093598996]
Training monolingual language models for low- and mid-resource languages is made challenging by limited and often inadequate pretraining data.
By generalizing over a word translation dictionary encompassing both the source and target languages, we map tokens from the target tokenizer to semantically similar tokens from the source language tokenizer.
We conduct experiments to convert high-resource models to mid- and low-resource languages, namely Dutch and Frisian.
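As a rough illustration of the initialization idea, the sketch below copies embeddings from a toy source-language table into a target vocabulary via a word-translation dictionary, with a mean-embedding fallback for uncovered tokens; the vocabularies, dictionary, and fallback rule are illustrative assumptions, not the paper's exact procedure.
```python
import numpy as np

# Toy source-language (e.g. English) embedding table -- stands in for a real model's embeddings.
src_vocab = {"house": 0, "water": 1, "language": 2, "[UNK]": 3}
src_emb = np.random.default_rng(0).normal(size=(len(src_vocab), 8))

# Word-translation dictionary: target (here Dutch) word -> source word.
dictionary = {"huis": "house", "water": "water", "taal": "language"}
tgt_vocab = ["huis", "water", "taal", "fiets"]  # "fiets" has no dictionary entry

def init_target_embeddings(tgt_vocab, dictionary, src_vocab, src_emb):
    """Initialize each target token from a semantically similar source token
    found via the dictionary; fall back to the mean source embedding otherwise."""
    fallback = src_emb.mean(axis=0)
    rows = []
    for tok in tgt_vocab:
        src_tok = dictionary.get(tok)
        rows.append(src_emb[src_vocab[src_tok]] if src_tok in src_vocab else fallback)
    return np.stack(rows)

tgt_emb = init_target_embeddings(tgt_vocab, dictionary, src_vocab, src_emb)
print(tgt_emb.shape)  # (4, 8) -- one initialized row per target token
```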
arXiv Detail & Related papers (2023-10-05T11:45:29Z)
- On the Impact of Language Selection for Training and Evaluating Programming Language Models [16.125924759649106]
We evaluate the similarity of programming languages by analyzing their representations using a CodeBERT-based model.
Our experiments reveal that token representations in languages such as C++, Python, and Java exhibit proximity to one another, whereas the same tokens in languages such as Mathematica and R display significant dissimilarity.
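As a rough illustration of this kind of analysis, the sketch below embeds two short snippets with microsoft/codebert-base and compares them with cosine similarity; it mean-pools over the sequence rather than reproducing the paper's token-level methodology.
```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

def snippet_embedding(code: str) -> torch.Tensor:
    """Mean-pooled CodeBERT representation of a code snippet."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
    return hidden.mean(dim=1).squeeze(0)

a = snippet_embedding("for (int i = 0; i < n; i++) { sum += i; }")  # C++/Java-style loop
b = snippet_embedding("for i in range(n):\n    total += i")         # Python loop
print(torch.cosine_similarity(a, b, dim=0).item())
```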
arXiv Detail & Related papers (2023-08-25T12:57:59Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
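A minimal sketch of the probing recipe: fit a linear classifier on frozen representations for a linguistic task. Random vectors stand in for features extracted from an XLM-R checkpoint and the labels are placeholders; the paper's probing suite is considerably richer.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Random vectors stand in for frozen representations extracted from an XLM-R checkpoint.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 768))      # one representation per example
y = rng.integers(0, 2, size=500)     # placeholder labels for a binary linguistic probe

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))  # near chance here; meaningful with real features
```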
arXiv Detail & Related papers (2022-05-24T03:35:00Z)
- Predicting the Performance of Multilingual NLP Models [16.250791929966685]
This paper proposes an alternative approach to evaluating a model across languages that makes use of the model's existing performance scores on the languages for which a particular task has test sets.
We train a predictor on these performance scores and use this predictor to predict the model's performance in different evaluation settings.
Our results show that our method is effective in filling the gaps in the evaluation for an existing set of languages, but might require additional improvements if we want it to generalize to unseen languages.
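A small sketch of the idea, assuming hypothetical per-configuration features (pretraining data size, typological distance, task data size) and illustrative scores; the paper's feature set and predictor may differ.
```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical features per (task, language) configuration:
# [log10 pretraining tokens, typological distance to English, log10 task training size]
X = np.array([
    [9.5, 0.0, 5.0],   # English
    [9.0, 0.3, 4.5],   # German
    [8.5, 0.6, 4.0],   # Hindi
    [8.0, 0.7, 3.5],   # Swahili
])
y = np.array([0.91, 0.86, 0.74, 0.62])  # observed task scores (illustrative, not real results)

predictor = GradientBoostingRegressor(random_state=0).fit(X, y)
print(predictor.predict([[8.7, 0.65, 4.2]]))  # estimated score for an unevaluated language
```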
arXiv Detail & Related papers (2021-10-17T17:36:53Z)
- mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models [15.873069955407406]
We train a multilingual language model with entity representations in 24 languages.
We show the model consistently outperforms word-based pretrained models in various cross-lingual transfer tasks.
We also evaluate the model with a multilingual cloze prompt task with the mLAMA dataset.
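A minimal sketch of a multilingual cloze probe in the spirit of mLAMA, using a generic multilingual masked LM (bert-base-multilingual-cased) rather than mLUKE itself, which additionally uses entity embeddings.
```python
from transformers import pipeline

# Generic multilingual masked LM stands in for mLUKE in this sketch.
fill = pipeline("fill-mask", model="bert-base-multilingual-cased")
for candidate in fill("La capitale de la France est [MASK].", top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```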
arXiv Detail & Related papers (2021-10-15T15:28:38Z)
- Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
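A sketch of the prompting setup: a few labeled English examples serve as in-context demonstrations, followed by a non-English test sample, and the label is read off the model's continuation. The task, examples, and template here are illustrative assumptions.
```python
# A few labeled English demonstrations followed by a non-English query;
# the resulting prompt is fed to a pre-trained LM (e.g. GPT or T5) for classification.
english_examples = [
    ("The movie was fantastic.", "positive"),
    ("I will never buy this again.", "negative"),
]
query = "La película fue aburrida."  # Spanish test sample

prompt = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in english_examples)
prompt += f"\nReview: {query}\nSentiment:"
print(prompt)
```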
arXiv Detail & Related papers (2021-09-16T03:08:22Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers.
We show that some features from various linguistic types can be predicted reliably.
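A minimal sketch of such a predictor: byte embeddings feed a 1D convolution and a GRU, whose final state is mapped to per-feature logits. Hyperparameters and the binarized output encoding are illustrative assumptions, not the paper's configuration.
```python
import torch
import torch.nn as nn

class WalsFeaturePredictor(nn.Module):
    """Byte embeddings -> 1D convolution -> GRU -> one logit per (binarized) WALS feature."""
    def __init__(self, n_features: int = 10, emb_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.byte_emb = nn.Embedding(256, emb_dim)       # one embedding per byte value
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=5, padding=2)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        x = self.byte_emb(byte_ids)          # (batch, seq, emb)
        x = self.conv(x.transpose(1, 2))     # (batch, hidden, seq)
        _, h = self.rnn(x.transpose(1, 2))   # final hidden state: (1, batch, hidden)
        return self.out(h.squeeze(0))        # (batch, n_features)

text = "Ein Beispielsatz in der Zielsprache."
byte_ids = torch.tensor([list(text.encode("utf-8"))])
print(WalsFeaturePredictor()(byte_ids).shape)  # torch.Size([1, 10])
```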
arXiv Detail & Related papers (2020-04-30T21:00:53Z)