Tracking the perspectives of interacting language models
- URL: http://arxiv.org/abs/2406.11938v1
- Date: Mon, 17 Jun 2024 17:20:16 GMT
- Title: Tracking the perspectives of interacting language models
- Authors: Hayden Helm, Brandon Duderstadt, Youngser Park, Carey E. Priebe
- Abstract summary: Large language models (LLMs) are capable of producing high-quality information at unprecedented rates.
As these models continue to entrench themselves in society, the content they produce will become increasingly pervasive in databases.
- Score: 11.601000749578647
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) are capable of producing high-quality information at unprecedented rates. As these models continue to entrench themselves in society, the content they produce will become increasingly pervasive in databases that are, in turn, incorporated into the pre-training data, fine-tuning data, retrieval data, etc. of other language models. In this paper we formalize the idea of a communication network of LLMs and introduce a method for representing the perspective of individual models within a collection of LLMs. Given these tools, we systematically study information diffusion in the communication network of LLMs in various simulated settings.
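The abstract's setup lends itself to a toy simulation. Below is a minimal sketch, not the authors' code: each model in a directed communication network is reduced to a "perspective" vector (e.g., the mean embedding of its responses to a fixed set of probe prompts), and each communication round mixes a model's perspective with those of the models it consumes content from. The graph, embedding dimension, and mixing rate `alpha` are all illustrative assumptions.

```python
# Toy sketch of information diffusion in a communication network of LLMs.
# Every quantity here is a stand-in; a real experiment would embed actual
# model responses instead of random vectors.
import numpy as np

rng = np.random.default_rng(0)

n_models, dim = 5, 16
perspectives = rng.normal(size=(n_models, dim))   # one vector per LLM

# adjacency[i, j] = 1 means model i trains/retrieves on content from model j
adjacency = (rng.random((n_models, n_models)) < 0.4).astype(float)
np.fill_diagonal(adjacency, 0.0)

alpha = 0.2  # assumed mixing rate per communication round

def communicate(P: np.ndarray, A: np.ndarray, alpha: float) -> np.ndarray:
    """One diffusion round: move each perspective toward its neighbors' mean."""
    row_sums = A.sum(axis=1, keepdims=True)
    W = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)
    return (1 - alpha) * P + alpha * W @ P

for t in range(50):
    perspectives = communicate(perspectives, adjacency, alpha)

# pairwise distances shrink as information diffuses through the network
print(np.round(np.linalg.norm(perspectives[0] - perspectives, axis=1), 3))
```

Under this averaging dynamic the perspectives contract toward a consensus, which is the kind of diffusion behavior the paper studies in richer, LLM-backed settings.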
Related papers
- A Survey of Multimodal Large Language Model from A Data-centric Perspective [46.57232264950785]
Multimodal large language models (MLLMs) enhance the capabilities of standard large language models by integrating and processing data from multiple modalities.
Data plays a pivotal role in the development and refinement of these models.
arXiv Detail & Related papers (2024-05-26T17:31:21Z)
- Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges [47.45993726498343]
Data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection.
This survey explores the transformative impact of large language models (LLMs) on DA, particularly addressing the unique challenges and opportunities they present in the context of natural language processing (NLP) and beyond.
arXiv Detail & Related papers (2024-03-05T14:11:54Z)
- Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions [11.786387517781328]
Vision-Language Models (VLMs) are advanced models that can tackle more intricate tasks such as image captioning and visual question answering.
Our classification organizes VLMs into three distinct categories: models dedicated to vision-language understanding, models that process multimodal inputs to generate unimodal (textual) outputs, and models that both accept and produce multimodal inputs and outputs.
We meticulously dissect each model, offering an extensive analysis of its foundational architecture, training data sources, as well as its strengths and limitations wherever possible.
arXiv Detail & Related papers (2024-02-20T18:57:34Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Let Models Speak Ciphers: Multiagent Debate through Embeddings [84.20336971784495]
We introduce CIPHER (Communicative Inter-Model Protocol Through Embedding Representation) to address this issue.
By deviating from natural language, CIPHER offers the advantage of encoding a broader spectrum of information without any modification to the model weights.
This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
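A rough sketch of the core idea, under the assumption that the sender transmits the expectation of the token-embedding table under its output distribution rather than a single sampled token; the vocabulary size, dimension, and logits below are made up for illustration.

```python
# Toy contrast between a natural-language channel (commit to one token) and
# an embedding-space channel (send the probability-weighted mean embedding).
import numpy as np

rng = np.random.default_rng(1)
vocab_size, dim = 100, 32
embedding_table = rng.normal(size=(vocab_size, dim))  # shared embedding matrix

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

logits = rng.normal(size=vocab_size)  # stand-in for a model's next-token logits
probs = softmax(logits)

# natural-language channel: collapse the distribution to one token's embedding
token_message = embedding_table[probs.argmax()]

# embedding channel: send the weighted average, preserving every candidate token
cipher_message = probs @ embedding_table

print(token_message.shape, cipher_message.shape)  # both are dim-vectors
```

The weighted-average message retains information about the full output distribution that a single sampled token discards.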
arXiv Detail & Related papers (2023-10-10T03:06:38Z)
- Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning.
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
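In this latent-variable reading (a sketch of the framing, not a quote from the paper), the demonstrations implicitly pin down a latent task variable, and prediction marginalizes over it:

```latex
% D = demonstration set, x = query, \theta = latent task variable.
\[
  p(y \mid x, D) = \int p(y \mid x, \theta)\, p(\theta \mid D)\, d\theta
\]
```

Good demonstrations are then those that concentrate the posterior p(θ | D) on the task actually being queried.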
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
- Towards Continual Entity Learning in Language Models for Conversational Agents [0.5330240017302621]
We introduce entity-aware language models (EALM), where we integrate entity models trained on catalogues of entities into pre-trained LMs.
Our combined language model adaptively adds information from the entity models into the pre-trained LM depending on the sentence context.
We show significant perplexity improvements on task-oriented dialogue datasets, especially on long-tailed utterances.
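A schematic sketch of one plausible mechanism, assumed here rather than taken from the EALM paper: a context-dependent gate mixes the next-token distributions of the pre-trained LM and the entity model, so entity-heavy contexts draw more probability mass from the entity catalogue.

```python
# Toy gated mixture of a pre-trained LM and an entity model; the gate weights
# and context features are random stand-ins for learned quantities.
import numpy as np

rng = np.random.default_rng(2)
vocab_size, feat_dim = 50, 8

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# stand-ins for the two component models' next-token distributions
lm_dist = softmax(rng.normal(size=vocab_size))
entity_dist = softmax(rng.normal(size=vocab_size))

# gate: a logistic function of context features (untrained in this sketch)
w = rng.normal(size=feat_dim)
context_features = rng.normal(size=feat_dim)
lam = 1.0 / (1.0 + np.exp(-context_features @ w))

mixed = lam * entity_dist + (1.0 - lam) * lm_dist
print(lam, mixed.sum())  # the mixture is still a valid distribution (sums to 1)
```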
arXiv Detail & Related papers (2021-07-30T21:10:09Z)
- Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
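A minimal sketch of the idea with an assumed pipeline, not the paper's exact method: a pretrained LM paraphrases each seed utterance, and only generations that still contain the original slot values are kept so the slot annotations transfer. The `paraphrase` stub below stands in for sampling from a fine-tuned LM.

```python
# Toy augmentation loop: generate paraphrases, keep those whose slot values
# survive, and copy the original annotation onto the kept generations.
import random

random.seed(0)

seed_data = [
    {"text": "book a flight to boston on friday",
     "slots": {"destination": "boston", "date": "friday"}},
]

def paraphrase(text: str) -> str:
    # stub generator; a real system would sample from a pretrained LM
    templates = [
        "i need a flight to boston on friday",
        "can you get me to boston this friday",
        "find flights to boston for friday",
    ]
    return random.choice(templates)

augmented = []
for ex in seed_data:
    for _ in range(5):
        candidate = paraphrase(ex["text"])
        # keep only generations in which every slot value still appears
        if all(v in candidate for v in ex["slots"].values()):
            augmented.append({"text": candidate, "slots": ex["slots"]})

print(len(augmented), augmented[0]["text"])
```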
arXiv Detail & Related papers (2020-04-29T04:07:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.