Creolizing the Web
- URL: http://arxiv.org/abs/2102.12382v1
- Date: Wed, 24 Feb 2021 16:08:45 GMT
- Title: Creolizing the Web
- Authors: Abhinav Tamaskar, Roy Rinberg, Sunandan Chakraborty, Bud Mishra
- Abstract summary: We present a method for detecting evolutionary patterns in a sociological model of language evolution.
We develop a minimalistic model that provides a rigorous base for any generalized evolutionary model for language based on communication between individuals.
We present empirical results and their interpretations on a real-world dataset from Reddit to identify communities and echo chambers for opinions.
- Score: 2.393911349115195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The evolution of language has been a hotly debated subject with contradicting
hypotheses and unreliable claims. Drawing from signalling games, dynamic
population mechanics, machine learning and algebraic topology, we present a
method for detecting evolutionary patterns in a sociological model of language
evolution. We develop a minimalistic model that provides a rigorous base for
any generalized evolutionary model for language based on communication between
individuals. We also discuss theoretical guarantees of this model, ranging from
stability of language representations to fast convergence of language by
temporal communication and language drift in an interactive setting. Further, we
present empirical results and their interpretations on a real-world dataset
from Reddit to identify communities and echo chambers for opinions, thus exposing
obstructions to reliable communication among communities.
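To make the signalling-game ingredient concrete, here is a minimal sender-receiver (Lewis) game with Roth-Erev reinforcement. This is an illustrative sketch of the class of models the abstract draws on, not the authors' actual model; all sizes and parameters are invented.

```python
import random

# Minimal Lewis signalling game: a sender maps world states to signals and
# a receiver maps signals back to actions; both reinforce on success
# (Roth-Erev urns).
STATES = SIGNALS = ACTIONS = range(3)

sender = {s: {m: 1.0 for m in SIGNALS} for s in STATES}     # urn weights
receiver = {m: {a: 1.0 for a in ACTIONS} for m in SIGNALS}  # urn weights

def draw(weights):
    # Sample a key with probability proportional to its weight.
    total = sum(weights.values())
    r, acc = random.uniform(0, total), 0.0
    for key, w in weights.items():
        acc += w
        if r <= acc:
            return key
    return key  # guard against floating-point round-off

for _ in range(20000):
    state = random.choice(STATES)
    signal = draw(sender[state])
    action = draw(receiver[signal])
    if action == state:  # successful communication: reinforce both choices
        sender[state][signal] += 1.0
        receiver[signal][action] += 1.0

# A stable "language" emerges: each state maps (almost) deterministically
# to one signal.
for s in STATES:
    print(f"state {s} -> signal {max(sender[s], key=sender[s].get)}")
```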
Related papers
- Analyzing The Language of Visual Tokens [48.62180485759458]
We take a natural-language-centric approach to analyzing discrete visual languages.
We show that higher token innovation drives greater entropy and lower compression, with tokens predominantly representing object parts.
We also show that visual languages lack cohesive grammatical structures, leading to higher perplexity and weaker hierarchical organization compared to natural languages.
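As a toy illustration of the entropy and compression measures this kind of analysis relies on (not the paper's code; the two token sequences below are invented):

```python
import math, zlib
from collections import Counter

def entropy(tokens):
    # Shannon entropy (bits/token) of the empirical token distribution.
    counts, n = Counter(tokens), len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def compression_ratio(tokens):
    # Compressed size / raw size; repetitive sequences compress well.
    raw = " ".join(map(str, tokens)).encode()
    return len(zlib.compress(raw)) / len(raw)

repetitive = [1, 2, 3] * 100   # low token "innovation"
innovative = list(range(300))  # every token is new

for name, seq in [("repetitive", repetitive), ("innovative", innovative)]:
    print(name, round(entropy(seq), 2), round(compression_ratio(seq), 2))
```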
arXiv Detail & Related papers (2024-11-07T18:59:28Z)
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We deem that retrieval-augmented language models have the inherent capability of supplying responses according to both contextual and parametric knowledge.
Inspired by aligning language models with human preference, we take a first step towards aligning retrieval-augmented language models to a state where they respond relying solely on external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z)
- Language Models as Models of Language [0.0]
This chapter critically examines the potential contributions of modern language models to theoretical linguistics.
I review a growing body of empirical evidence suggesting that language models can learn hierarchical syntactic structure and exhibit sensitivity to various linguistic phenomena.
I conclude that closer collaboration between theoretical linguists and computational researchers could yield valuable insights.
arXiv Detail & Related papers (2024-08-13T18:26:04Z)
- NeLLCom-X: A Comprehensive Neural-Agent Framework to Simulate Language Learning and Group Communication [2.184775414778289]
The recently introduced NeLLCom framework allows agents to first learn an artificial language and then use it to communicate.
We extend this framework by introducing more realistic role-alternating agents and group communication.
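A bare-bones sketch of role-alternating group communication, assuming nothing about NeLLCom-X's actual neural agents: simple lexicon-table agents stand in for the neural ones, and the alignment rule is invented.

```python
import random

MEANINGS, SIGNALS = range(3), "xyz"

class Agent:
    # Each agent can both speak and listen (role alternation).
    def __init__(self):
        self.lex = {m: random.choice(SIGNALS) for m in MEANINGS}
    def speak(self, meaning):
        return self.lex[meaning]
    def listen(self, signal):
        # Interpret a signal as the first meaning it maps to, if any.
        for m, s in self.lex.items():
            if s == signal:
                return m
        return random.choice(list(MEANINGS))
    def align(self, meaning, signal):
        self.lex[meaning] = signal  # adopt the partner's usage

group = [Agent() for _ in range(4)]  # group communication, not just pairs
for _ in range(2000):
    speaker, listener = random.sample(group, 2)  # roles alternate per round
    meaning = random.choice(list(MEANINGS))
    signal = speaker.speak(meaning)
    if listener.listen(signal) != meaning:
        listener.align(meaning, signal)  # failed round: align lexicons

print([a.lex for a in group])  # lexicons tend to align across the group
```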
arXiv Detail & Related papers (2024-07-19T03:03:21Z)
- Modeling language contact with the Iterated Learning Model [0.0]
Iterated learning models are agent-based models of language change.
A recently introduced type of iterated learning model, the Semi-Supervised ILM, is used to simulate language contact.
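For concreteness, a bare-bones iterated-learning loop; this shows only the classic supervised generational cycle, not the semi-supervised variant the paper uses, and all names and parameters are invented.

```python
import random

MEANINGS = range(4)
SIGNALS = "abcd"

def speak(lexicon, meaning, noise=0.05):
    # Produce a signal for a meaning, with a small production error.
    if random.random() < noise:
        return random.choice(SIGNALS)
    return lexicon[meaning]

def learn(utterances):
    # Learner adopts the most frequent signal it heard for each meaning.
    lexicon = {}
    for m in MEANINGS:
        heard = [s for mm, s in utterances if mm == m]
        lexicon[m] = max(set(heard), key=heard.count) if heard else random.choice(SIGNALS)
    return lexicon

teacher = {m: random.choice(SIGNALS) for m in MEANINGS}  # generation 0
for generation in range(10):
    data = [(m, speak(teacher, m)) for m in random.choices(list(MEANINGS), k=40)]
    teacher = learn(data)  # the learner becomes the next generation's teacher
    print(generation, teacher)
```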
arXiv Detail & Related papers (2024-06-11T01:43:23Z)
- Language Evolution with Deep Learning [49.879239655532324]
Computational modeling plays an essential role in the study of language emergence.
It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language.
This chapter explores another class of computational models that have recently revolutionized the field of machine learning: deep learning models.
arXiv Detail & Related papers (2024-03-18T16:52:54Z)
- From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought [124.40905824051079]
We propose rational meaning construction, a computational framework for language-informed thinking.
We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought.
We show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings.
We extend our framework to integrate cognitively-motivated symbolic modules.
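A toy version of the idea, with everything invented for illustration: a sentence meaning is a truth condition that conditions a generative world model, and inference proceeds by rejection sampling (the paper itself uses LLMs to translate into probabilistic programs).

```python
import random

def world_model():
    # A tiny generative model of a world: 10 marbles, each red or blue.
    return [random.random() < 0.5 for _ in range(10)]  # True = red

def meaning_most_red(world):
    # "Most of the marbles are red" as a truth condition on a world.
    return sum(world) > len(world) / 2

# Condition the world model on the sentence's meaning (rejection sampling)
# and query the posterior expected number of red marbles.
samples = [w for w in (world_model() for _ in range(20000)) if meaning_most_red(w)]
posterior_mean = sum(sum(w) for w in samples) / len(samples)
print(f"E[#red | 'most are red'] ~ {posterior_mean:.2f}")
```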
arXiv Detail & Related papers (2023-06-22T05:14:00Z)
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
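A sketch of how training pairs for the two objectives might be constructed; the templates, triples, and helper names below are invented for illustration, not the paper's actual pipeline.

```python
# Turning commonsense triples into the two self-supervised objectives the
# summary mentions. Templates and triples are invented examples.
TEMPLATES = {
    "Causes": "{head} causes {tail}.",
    "UsedFor": "{head} is used for {tail}.",
}

def mask_infilling_example(head, relation, tail, mask="<mask>"):
    # Objective 1: hide the tail entity; the LM must infill it.
    text = TEMPLATES[relation].format(head=head, tail=tail)
    return text.replace(tail, mask), tail

def relation_prediction_example(head, relation, tail):
    # Objective 2: given head and tail, predict the linking relation.
    return (head, tail), relation

print(mask_infilling_example("rain", "Causes", "wet streets"))
print(relation_prediction_example("a knife", "UsedFor", "cutting"))
```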
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- Language Model Evaluation Beyond Perplexity [47.268323020210175]
We analyze whether text generated from language models exhibits the statistical tendencies present in the human-generated text on which they were trained.
We find that neural language models appear to learn only a subset of the tendencies considered, but align much more closely with empirical trends than proposed theoretical distributions.
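One statistical tendency such an evaluation can check is Zipf's rank-frequency law; a minimal sketch, assuming two hypothetical corpus files (human.txt, generated.txt).

```python
import math
from collections import Counter

def zipf_slope(tokens):
    # Fit log(frequency) ~ slope * log(rank) by least squares; natural
    # language typically yields a slope near -1 (Zipf's law).
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)

human = open("human.txt").read().split()      # hypothetical corpora
model = open("generated.txt").read().split()
print("human slope:", zipf_slope(human), "model slope:", zipf_slope(model))
```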
arXiv Detail & Related papers (2021-05-31T20:13:44Z)
- The Rediscovery Hypothesis: Language Models Need to Meet Linguistics [8.293055016429863]
We study whether linguistic knowledge is a necessary condition for good performance of modern language models.
We show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures.
This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates the language modeling objective to linguistic information.
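Probing is the operative technique here; a minimal sketch with synthetic stand-in representations (real work would probe actual LM hidden states for properties such as part of speech).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Probing sketch: can a linear classifier read a linguistic label out of
# frozen representations? High probe accuracy is the kind of evidence the
# rediscovery hypothesis appeals to. All data here is synthetic.
rng = np.random.default_rng(0)

direction = rng.normal(size=64)            # a linearly encoded property
reps = rng.normal(size=(1000, 64))         # stand-in token representations
labels = (reps @ direction > 0).astype(int)

probe = LogisticRegression(max_iter=1000).fit(reps[:800], labels[:800])
print("probe accuracy:", probe.score(reps[800:], labels[800:]))
```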
arXiv Detail & Related papers (2021-03-02T15:57:39Z)