Capturing the diversity of multilingual societies
- URL: http://arxiv.org/abs/2105.02570v1
- Date: Thu, 6 May 2021 10:27:43 GMT
- Title: Capturing the diversity of multilingual societies
- Authors: Thomas Louf, David Sanchez and Jose J. Ramasco
- Abstract summary: We consider the processes at work in language shift through a conjunction of theoretical and data-driven perspectives.
A large-scale empirical study of spatial patterns of languages in multilingual societies using Twitter and census data yields a wide diversity.
We propose a model in which coexistence of languages may be reached when learning the other language is facilitated and when bilinguals favor the use of the endangered language.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cultural diversity encoded within languages of the world is at risk, as many
languages have become endangered in the last decades in a context of growing
globalization. To preserve this diversity, it is first necessary to understand
what drives language extinction, and which mechanisms might enable coexistence.
Here, we consider the processes at work in language shift through a conjunction
of theoretical and data-driven perspectives. A large-scale empirical study of
spatial patterns of languages in multilingual societies using Twitter and
census data yields a wide diversity. It ranges from an almost complete mixing
of language speakers, including multilinguals, to segregation with a neat
separation of the linguistic domains and with multilinguals mainly at their
boundaries. To understand how these different states can emerge and,
especially, become stable, we propose a model in which coexistence of languages
may be reached when learning the other language is facilitated and when
bilinguals favor the use of the endangered language. Simulations carried out in
a metapopulation framework highlight the importance of spatial interactions
arising from people mobility to explain the stability of a mixed state or the
presence of a boundary between two linguistic regions. Changes in the
parameters regulating the relation between the languages can destabilize a
system, which undergoes global transitions. According to our model, the
evolution of the system once it undergoes a transition is highly
history-dependent. It is easy to change the status quo but going back to a
previous state may not be simple or even possible.
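The abstract describes the coexistence mechanism only in words, so the following is a minimal toy sketch of it, not the authors' model. It assumes three groups (monolinguals in A, monolinguals in B, and bilinguals) whose shares sum to one, and three hypothetical parameters that do not appear in the source: `learn` (how easy it is to learn the other language), `pref` (how strongly bilinguals favor using language B, taken here as the endangered one), and `prestige` (relative prestige of B). The rate expressions and the forward Euler integration are illustrative assumptions, and the spatial metapopulation coupling is left out.

```python
def step(a, b, ab, learn, pref, prestige, dt):
    """One forward Euler step of a toy three-group model (a + b + ab == 1).

    All rate expressions are illustrative assumptions, not taken from the paper.
    """
    use_a = a + (1.0 - pref) * ab  # share of language use happening in A
    use_b = b + pref * ab          # share of language use happening in B

    a_to_ab = learn * prestige * use_b                  # A monolinguals learning B
    b_to_ab = learn * (1.0 - prestige) * use_a          # B monolinguals learning A
    ab_to_a = (1.0 - learn) * (1.0 - prestige) * use_a  # bilinguals dropping B
    ab_to_b = (1.0 - learn) * prestige * use_b          # bilinguals dropping A

    da = ab * ab_to_a - a * a_to_ab
    db = ab * ab_to_b - b * b_to_ab
    dab = -(da + db)  # flows only move people between groups
    return a + dt * da, b + dt * db, ab + dt * dab


def simulate(a, b, ab, learn, pref, prestige=0.5, dt=0.01, steps=20000):
    """Integrate the toy model and return the final shares (a, b, ab)."""
    for _ in range(steps):
        a, b, ab = step(a, b, ab, learn, pref, prestige, dt)
    return a, b, ab


if __name__ == "__main__":
    # Hypothetical starting point: B is the minority (endangered) language.
    for learn, pref in [(0.2, 0.5), (0.8, 0.5), (0.8, 0.9)]:
        shares = simulate(0.7, 0.2, 0.1, learn=learn, pref=pref)
        print(learn, pref, [round(x, 3) for x in shares])
```

Sweeping `learn` and `pref` in such a sketch gives a rough feel for how easier learning and bilingual preference can shift the long-run shares; any quantitative conclusion should be taken from the paper itself.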
Related papers
- ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Contrastive Framework [79.72910257530795]
ShifCon is a Shift-based Contrastive framework that aligns the internal forward process of other languages toward that of the dominant one.
It shifts the representations of non-dominant languages into the dominant language subspace, allowing them to access relatively rich information encoded in the model parameters.
Experiments demonstrate that our ShifCon framework significantly enhances the performance of non-dominant languages.
arXiv Detail & Related papers (2024-10-25T10:28:59Z)
- A Roadmap for Multilingual, Multimodal Domain Independent Deception Detection [2.1506382989223782]
Deception, a prevalent aspect of human communication, has undergone a significant transformation in the digital age.
Recent studies have shown the possibility of the existence of universal linguistic cues to deception across domains within the English language.
The practical task of deception detection in low-resource languages is not a well-studied problem due to the lack of labeled data.
arXiv Detail & Related papers (2024-05-07T00:38:34Z)
- Towards Bridging the Digital Language Divide [4.234367850767171]
Multilingual language processing systems often exhibit a hardwired, yet usually involuntary and hidden representational preference towards certain languages.
We show that biased technology is often the result of research and development methodologies that do not do justice to the complexity of the languages being represented.
We present a new initiative that aims at reducing linguistic bias through both technological design and methodology.
arXiv Detail & Related papers (2023-07-25T10:53:20Z)
- Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
Figurative language permeates human communication, but is relatively understudied in NLP.
We create a dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba.
Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region.
All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data.
arXiv Detail & Related papers (2023-05-25T15:30:31Z)
- Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence.
Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while the composition is more crucial to the success of cross-linguistic transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z)
- When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer [15.578267998149743]
We show that the absence of sub-word overlap significantly affects zero-shot transfer when languages differ in their word order.
There is a strong correlation between transfer performance and word embedding alignment between languages.
Our results call for focus in multilingual models on explicitly improving word embedding alignment between languages.
arXiv Detail & Related papers (2021-10-27T21:25:39Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- How individuals change language [1.2437226707039446]
We introduce a very general mathematical model that encompasses a wide variety of individual-level linguistic behaviours.
We compare the likelihood of empirically-attested changes in definite and indefinite articles in multiple languages under different assumptions.
We find that accounts of language change that appeal primarily to errors in childhood language acquisition are very weakly supported by the historical data.
arXiv Detail & Related papers (2021-04-20T19:02:49Z)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- On the coexistence of competing languages [0.0]
We revisit the question of language competition, with an emphasis on uncovering the ways in which coexistence might emerge.
We find that this emergence is related to symmetry breaking, and explore two particular scenarios.
For each of these, the investigation of paradigmatic situations leads us to a quantitative understanding of the conditions leading to language coexistence.
arXiv Detail & Related papers (2020-03-10T14:06:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.