Cross-Cultural Similarity Features for Cross-Lingual Transfer Learning
of Pragmatically Motivated Tasks
- URL: http://arxiv.org/abs/2006.09336v2
- Date: Thu, 8 Apr 2021 08:31:54 GMT
- Authors: Jimin Sun, Hwijeen Ahn, Chan Young Park, Yulia Tsvetkov, David R.
Mortensen
- Abstract summary: We introduce three linguistic features that capture cross-cultural similarities that manifest in linguistic patterns and quantify distinct aspects of language pragmatics.
Our analyses show that the proposed pragmatic features do capture cross-cultural similarities and align well with existing work in sociolinguistics and linguistic anthropology.
- Score: 30.580822082075475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Much work in cross-lingual transfer learning has explored how to select
better transfer languages for multilingual tasks, primarily focusing on typological
and genealogical similarities between languages. We hypothesize that these
measures of linguistic proximity are not enough when working with
pragmatically-motivated tasks, such as sentiment analysis. As an alternative,
we introduce three linguistic features that capture cross-cultural similarities
that manifest in linguistic patterns and quantify distinct aspects of language
pragmatics: language context-level, figurative language, and the lexification
of emotion concepts. Our analyses show that the proposed pragmatic features do
capture cross-cultural similarities and align well with existing work in
sociolinguistics and linguistic anthropology. We further corroborate the
effectiveness of pragmatically-driven transfer in the downstream task of
choosing transfer languages for cross-lingual sentiment analysis.
Related papers
- Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis [18.25948580496853]
Cross-lingual transfer-learning is widely used in Event Extraction for low-resource languages.
This paper studies whether the typological similarity between source and target languages impacts the performance of cross-lingual transfer.
arXiv Detail & Related papers (2024-04-09T15:35:41Z)
- Can Machine Translation Bridge Multilingual Pretraining and Cross-lingual Transfer Learning? [8.630930380973489]
This paper investigates the potential benefits of employing machine translation as a continued training objective to enhance language representation learning.
Our results show that, contrary to expectations, machine translation as a continued training objective fails to enhance cross-lingual representation learning.
We conclude that explicit sentence-level alignment in the cross-lingual scenario is detrimental to cross-lingual transfer pretraining.
arXiv Detail & Related papers (2024-03-25T13:53:04Z)
- A study of conceptual language similarity: comparison and evaluation [0.3093890460224435]
An interesting line of research in natural language processing (NLP) aims to incorporate linguistic typology to bridge linguistic diversity.
Recent work has introduced a novel approach to defining language similarity based on how languages represent basic concepts.
In this work, we study the conceptual similarity in detail and evaluate it extensively on a binary classification task.
arXiv Detail & Related papers (2023-05-22T18:28:02Z)
- Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer? [50.48082721476612]
Multilingual BERT (mBERT) has demonstrated considerable cross-lingual syntactic ability.
We investigate the distributions of grammatical relations induced from mBERT in the context of 24 typologically different languages.
arXiv Detail & Related papers (2022-12-21T09:44:08Z)
- Cross-lingual Lifelong Learning [53.06904052325966]
We present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm.
We provide insights into what makes multilingual sequential learning particularly challenging.
The implications of this analysis include a recipe for how to measure and balance different cross-lingual continual learning desiderata.
arXiv Detail & Related papers (2022-05-23T09:25:43Z)
- Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
- Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence.
Our main conclusion is that the contributions of constituent order and word co-occurrence are limited, while composition is more crucial to the success of cross-lingual transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments on cross-lingual benchmarks show significant improvements over strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification [16.369477141866405]
We present a neural model for Slavic language identification in speech signals.
We analyze its emergent representations to investigate whether they reflect objective measures of language relatedness.
arXiv Detail & Related papers (2020-10-22T18:18:19Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.