A Mathematical Abstraction for Balancing the Trade-off Between
Creativity and Reality in Large Language Models
- URL: http://arxiv.org/abs/2306.02295v1
- Date: Sun, 4 Jun 2023 08:12:34 GMT
- Title: A Mathematical Abstraction for Balancing the Trade-off Between
Creativity and Reality in Large Language Models
- Authors: Ritwik Sinha, Zhao Song, Tianyi Zhou
- Abstract summary: Large Language Models are increasingly being used in domains such as generating prose, poetry or art.
This work provides a mathematical abstraction to describe creativity and reality based on certain losses.
A model trained on these losses balances the trade-off between creativity and reality.
- Score: 35.25919932657958
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models have become popular for their remarkable capabilities
in human-oriented tasks and traditional natural language processing tasks. Their
efficient functioning is attributed to the attention mechanism in the
Transformer architecture, which enables them to concentrate on particular
aspects of the input.
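Concretely, the mechanism referred to here is scaled dot-product attention; below is a minimal single-head NumPy sketch, without masking, multiple heads, or learned projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: weight each value by how well its key
    matches the query, letting the model focus on relevant positions."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```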
LLMs are increasingly being used in domains such as generating prose, poetry,
or art, which require the model to be creative (e.g., Adobe Firefly). LLMs
possess advanced language generation abilities that enable them to generate
distinctive and captivating content. This utilization of LLMs in generating
narratives shows their flexibility and potential for use in domains that extend
beyond conventional natural language processing duties.
In different contexts, we may expect the LLM to generate factually correct
answers that match reality, e.g., in question-answering systems or online
assistants. In such situations, being correct is critical to LLMs being trusted
in practice. The Bing Chatbot provides its users with the flexibility to select
one of the three output modes: creative, balanced, and precise. Each mode
emphasizes creativity and factual accuracy differently.
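How Bing implements these modes is not public; a common way such a knob is exposed in practice is the sampling temperature. The mapping below is therefore purely illustrative, and the temperature values are assumptions:

```python
import numpy as np

# Hypothetical mapping of output modes to sampling temperatures. How Bing
# actually implements its modes is not public; this is only an illustration.
MODES = {"precise": 0.3, "balanced": 0.7, "creative": 1.2}

def sample_token(logits, mode, rng):
    """Higher temperature flattens the next-token distribution (more
    surprising choices); lower temperature sharpens it toward the argmax."""
    z = logits / MODES[mode]
    p = np.exp(z - z.max())
    p /= p.sum()
    return rng.choice(len(p), p=p)

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])
print({m: int(sample_token(logits, m, rng)) for m in MODES})
```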
In this work, we provide a mathematical abstraction to describe creativity
and reality based on certain losses. A model trained on these losses balances
the trade-off between creativity and reality.
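The abstract does not spell the losses out, so the sketch below is only one plausible instantiation, not the paper's formulation: a cross-entropy "reality" term mixed with an entropy "creativity" bonus, traded off by a weight lam:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def creativity_reality_loss(logits, target, lam=0.5):
    """Illustrative combined objective, not the paper's exact losses:
    cross-entropy to the ground-truth token pulls the model toward
    reality, while the negative-entropy term acts as a creativity bonus
    (minimizing it pushes toward a diverse, high-entropy distribution).
    lam in [0, 1] trades one off against the other."""
    p = softmax(logits)
    reality = -np.log(p[target])                 # cross-entropy term
    neg_entropy = np.sum(p * np.log(p + 1e-12))  # equals -H(p)
    return lam * reality + (1.0 - lam) * neg_entropy

logits = np.array([2.0, 1.0, 0.5, -1.0])
print(creativity_reality_loss(logits, target=0, lam=0.9))  # reality-leaning
print(creativity_reality_loss(logits, target=0, lam=0.1))  # creativity-leaning
```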
Related papers
- WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge [17.74988145184004]
Large language models (LLMs) have emerged as powerful tools in natural language processing (NLP).
This paper presents a novel LLM for education named WisdomBot, which combines the power of LLMs with educational theories.
We introduce two key enhancements at inference time: local knowledge base retrieval augmentation and search engine retrieval augmentation.
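A hypothetical sketch of how these two augmentations could be combined at inference time; the function names, fallback policy, and prompt format are illustrative assumptions, not WisdomBot's actual pipeline:

```python
# Hypothetical sketch; names, fallback policy, and prompt format are
# assumptions for illustration, not WisdomBot's actual pipeline.
def retrieve_context(query, local_kb, search_engine):
    """Prefer the curated local knowledge base; fall back to web search."""
    hits = [doc for doc in local_kb if query.lower() in doc.lower()]
    return hits if hits else search_engine(query)

def augmented_prompt(query, local_kb, search_engine):
    context = "\n".join(retrieve_context(query, local_kb, search_engine))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

kb = ["Bloom's taxonomy orders cognitive skills.",
      "Spaced repetition improves long-term recall."]
fake_search = lambda q: [f"(web result for: {q})"]  # stand-in search engine
print(augmented_prompt("spaced repetition", kb, fake_search))
```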
arXiv Detail & Related papers (2025-01-22T13:36:46Z)
- MaestroMotif: Skill Design from Artificial Intelligence Feedback [67.17724089381056]
We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and adaptable agents.
arXiv Detail & Related papers (2024-12-11T16:59:31Z)
- Dynamic Universal Approximation Theory: The Basic Theory for Transformer-based Large Language Models [9.487731634351787]
Large-scale Transformer networks have quickly become the leading approach for advancing natural language processing algorithms.
This paper explores the theoretical foundations of large language models (LLMs).
It offers a theoretical backdrop, shedding light on the mechanisms that underpin these advancements.
arXiv Detail & Related papers (2024-07-01T04:29:35Z)
- Creativity Has Left the Chat: The Price of Debiasing Language Models [1.223779595809275]
We investigate the unintended consequences of Reinforcement Learning from Human Feedback on the creativity of Large Language Models (LLMs).
Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation.
arXiv Detail & Related papers (2024-06-08T22:14:51Z)
- Characterising the Creative Process in Humans and Large Language Models [6.363158395541767]
We provide an automated method to characterise how humans and LLMs explore semantic spaces on the Alternate Uses Task.
We use sentence embeddings to identify response categories and compute semantic similarities, which we use to generate jump profiles.
Our results corroborate earlier work in humans reporting both persistent (deep search in few semantic spaces) and flexible (broad search across multiple semantic spaces) pathways to creativity.
Though LLMs as a population match human profiles, their relationship with creativity differs: the more flexible models score higher on creativity.
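A toy illustration of the jump-profile idea described above, assuming a simple cosine-similarity threshold between consecutive response embeddings (the paper's actual categorization is more involved):

```python
import numpy as np

def jump_profile(embeddings, threshold=0.5):
    """Toy jump profile: flag a 'jump' whenever the cosine similarity
    between consecutive response embeddings drops below a threshold,
    i.e. the response moved to a new semantic neighborhood. True marks
    a flexible jump; False marks persistent within-category search."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = np.sum(e[:-1] * e[1:], axis=1)  # consecutive cosine similarities
    return sims < threshold

responses = np.random.default_rng(2).standard_normal((6, 16))  # fake embeddings
print(jump_profile(responses))
```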
arXiv Detail & Related papers (2024-05-01T23:06:46Z)
- Finetuning an LLM on Contextual Knowledge of Classics for Q&A [0.0]
This project is an attempt to merge the knowledge of Classics with the capabilities of artificial intelligence.
The goal of this project is to develop an LLM that not only reproduces contextual knowledge accurately but also exhibits a consistent "personality".
arXiv Detail & Related papers (2023-12-13T02:32:01Z)
- Democratizing Reasoning Ability: Tailored Learning from Large Language Model [97.4921006089966]
We propose a tailored learning approach to distill such reasoning ability to smaller LMs.
We exploit the potential of LLM as a reasoning teacher by building an interactive multi-round learning paradigm.
To exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from self-made mistakes.
arXiv Detail & Related papers (2023-10-20T07:50:10Z)
- Let Models Speak Ciphers: Multiagent Debate through Embeddings [84.20336971784495]
We introduce CIPHER (Communicative Inter-Model Protocol Through Embedding Representation), a method for multiagent debate through embeddings.
By deviating from natural language, CIPHER can encode a broader spectrum of information without any modification to the model weights.
This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
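A toy sketch in the spirit of this protocol: instead of committing to one sampled token, the speaker sends the probability-weighted average of token embeddings, which preserves its full belief over the vocabulary. Details necessarily differ from the paper:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy message in the spirit of CIPHER: the probability-weighted average of
# token embeddings, rather than a single sampled token.
vocab_embeddings = np.random.default_rng(1).standard_normal((5, 8))  # 5 tokens, dim 8
logits = np.array([2.0, 1.0, 0.0, -1.0, -2.0])  # speaker's next-token logits

message = softmax(logits) @ vocab_embeddings  # expected embedding
print(message.shape)  # (8,) -- a dense vector the listener model can consume
```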
arXiv Detail & Related papers (2023-10-10T03:06:38Z)
- Empowering Language Models with Knowledge Graph Reasoning for Question Answering [117.79170629640525]
We propose the knOwledge REasOning empowered Language Model (OREO-LM).
OREO-LM consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs.
We show significant performance gains, achieving state-of-the-art results in the Closed-Book setting.
arXiv Detail & Related papers (2022-11-15T18:26:26Z)
- Inner Monologue: Embodied Reasoning through Planning with Language Models [81.07216635735571]
Large Language Models (LLMs) can be applied to domains beyond natural language processing.
LLMs planning in embodied environments need to consider not just which skills to use, but also how and when to use them.
We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios.
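A minimal closed-loop sketch of this idea: environment feedback is appended to a running prompt (the "inner monologue") so that the next planning step can react to it. The planner and executor below are stand-ins, not the paper's robot stack:

```python
def control_loop(plan, execute, task, max_steps=3):
    """Append environment feedback to the prompt after every action so
    the planner can replan; a toy stand-in for the paper's robot stack."""
    monologue = f"Task: {task}"
    for _ in range(max_steps):
        action = plan(monologue)
        feedback = execute(action)
        monologue += f"\nAction: {action}\nFeedback: {feedback}"
    return monologue

plan = lambda m: "pick up the apple"   # stand-in for an LLM planner
execute = lambda a: "success"          # stand-in for the robot/environment
print(control_loop(plan, execute, "put the apple in the drawer"))
```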
arXiv Detail & Related papers (2022-07-12T15:20:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The quality of this automatically generated information is not guaranteed, and this site is not responsible for any consequences of its use.