A Mathematical Abstraction for Balancing the Trade-off Between
Creativity and Reality in Large Language Models
- URL: http://arxiv.org/abs/2306.02295v1
- Date: Sun, 4 Jun 2023 08:12:34 GMT
- Title: A Mathematical Abstraction for Balancing the Trade-off Between
Creativity and Reality in Large Language Models
- Authors: Ritwik Sinha, Zhao Song, Tianyi Zhou
- Abstract summary: Large Language Models are increasingly being used in domains such as generating prose, poetry or art.
This work provides a mathematical abstraction to describe creativity and reality based on certain losses.
A model trained on these losses balances the trade-off between creativity and reality.
- Score: 35.25919932657958
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models have become popular for their remarkable capabilities
in human-oriented tasks and traditional natural language processing tasks.
Their efficient functioning is attributed to the attention mechanism of the
Transformer architecture, which enables them to concentrate on particular
aspects of the input.
LLMs are increasingly being used in domains such as generating prose, poetry,
or art, which require the model to be creative (e.g., Adobe Firefly). LLMs
possess advanced language generation abilities that enable them to generate
distinctive and captivating content. This utilization of LLMs in generating
narratives shows their flexibility and potential for use in domains that extend
beyond conventional natural language processing tasks.
In other contexts, we expect the LLM to generate factually correct answers
that match reality, e.g., in question-answering systems or online assistants.
In such situations, factual correctness is critical for LLMs to be trusted in
practice. The Bing Chatbot lets its users select one of three output modes:
creative, balanced, and precise. Each mode weighs creativity and factual
accuracy differently.
In this work, we provide a mathematical abstraction that describes creativity
and reality in terms of certain losses. A model trained on these losses
balances the trade-off between creativity and reality.
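The abstract does not spell out the concrete form of these losses, so the sketch below is only an illustration under assumed definitions: a "reality" loss given by cross-entropy against reference tokens, a "creativity" loss given by a negative-entropy penalty on the output distribution, and a hypothetical trade-off weight lam. None of these names or the weighting scheme come from the paper.

```python
# Minimal sketch only: the paper's abstract does not define the concrete losses,
# so "reality" is approximated here by cross-entropy against reference tokens and
# "creativity" by a negative-entropy penalty that rewards more diverse output
# distributions. The trade-off weight `lam` is a hypothetical knob.
import torch
import torch.nn.functional as F


def reality_loss(logits, target_ids):
    # Standard next-token cross-entropy: penalizes deviation from the reference answer.
    return F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))


def creativity_loss(logits):
    # Negative entropy of the predictive distribution: smaller (more negative)
    # when probability mass is spread over more tokens, i.e. more "creative".
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)
    return -entropy.mean()


def combined_loss(logits, target_ids, lam=0.5):
    # lam = 1.0 trains purely for creativity, lam = 0.0 purely for reality.
    return lam * creativity_loss(logits) + (1.0 - lam) * reality_loss(logits, target_ids)


# Toy usage: a batch of 2 sequences of length 5 over a 100-token vocabulary.
logits = torch.randn(2, 5, 100, requires_grad=True)
targets = torch.randint(0, 100, (2, 5))
combined_loss(logits, targets, lam=0.5).backward()
```

A mode selector in the spirit of Bing Chat's creative/balanced/precise settings could simply map each mode to a different value of lam.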
Related papers
- Universal Approximation Theory: The Basic Theory for Transformer-based Large Language Models [9.487731634351787]
Large-scale Transformer networks have quickly become the leading approach for advancing natural language processing algorithms.
This paper explores the theoretical foundations of large language models (LLMs).
It offers a theoretical backdrop, shedding light on the mechanisms that underpin these advancements.
arXiv Detail & Related papers (2024-07-01T04:29:35Z) - Creativity Has Left the Chat: The Price of Debiasing Language Models [1.223779595809275]
We investigate the unintended consequences of Reinforcement Learning from Human Feedback on the creativity of Large Language Models (LLMs).
Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation.
arXiv Detail & Related papers (2024-06-08T22:14:51Z) - Characterising the Creative Process in Humans and Large Language Models [6.363158395541767]
We provide an automated method to characterise how humans and LLMs explore semantic spaces on the Alternate Uses Task.
We use sentence embeddings to identify response categories and compute semantic similarities, which we use to generate jump profiles (a minimal sketch of this similarity step appears after this list).
Our results corroborate earlier work in humans reporting both persistent (deep search in few semantic spaces) and flexible (broad search across multiple semantic spaces) pathways to creativity.
Though LLMs as a population match human profiles, their relationship with creativity differs: the more flexible models score higher on creativity.
arXiv Detail & Related papers (2024-05-01T23:06:46Z) - Finetuning an LLM on Contextual Knowledge of Classics for Q&A [0.0]
This project is an attempt to merge the knowledge of Classics with the capabilities of artificial intelligence.
The goal of this project is to develop an LLM that not only reproduces contextual knowledge accurately but also exhibits a consistent "personality".
arXiv Detail & Related papers (2023-12-13T02:32:01Z) - LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language
Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z) - Democratizing Reasoning Ability: Tailored Learning from Large Language
Model [97.4921006089966]
We propose a tailored learning approach to distill the reasoning ability of LLMs into smaller LMs.
We exploit the potential of the LLM as a reasoning teacher by building an interactive multi-round learning paradigm.
To exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from its own mistakes.
arXiv Detail & Related papers (2023-10-20T07:50:10Z) - Let Models Speak Ciphers: Multiagent Debate through Embeddings [84.20336971784495]
We introduce CIPHER (Communicative Inter-Model Protocol Through Embedding Representation), which lets LLMs communicate through embedding representations rather than natural-language tokens.
By deviating from natural language, CIPHER offers the advantage of encoding a broader spectrum of information without any modification to the model weights.
This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
arXiv Detail & Related papers (2023-10-10T03:06:38Z) - User-Controlled Knowledge Fusion in Large Language Models: Balancing
Creativity and Hallucination [5.046007553593371]
Large Language Models (LLMs) generate diverse, relevant, and creative responses.
Striking a balance between the LLM's imaginative capabilities and its adherence to factual information is a key challenge.
This paper presents an innovative user-controllable mechanism that modulates the balance between an LLM's imaginative capabilities and its adherence to factual information.
arXiv Detail & Related papers (2023-07-30T06:06:35Z) - Empowering Language Models with Knowledge Graph Reasoning for Question
Answering [117.79170629640525]
We propose the knOwledge REasOning empowered Language Model (OREO-LM).
OREO-LM consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs.
We show significant performance gains, achieving state-of-the-art results in the Closed-Book setting.
arXiv Detail & Related papers (2022-11-15T18:26:26Z) - Inner Monologue: Embodied Reasoning through Planning with Language
Models [81.07216635735571]
Large Language Models (LLMs) can be applied to domains beyond natural language processing.
LLMs planning in embodied environments need to consider not just which skills to perform, but also how and when to perform them.
We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios.
arXiv Detail & Related papers (2022-07-12T15:20:48Z)
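As referenced in the "Characterising the Creative Process in Humans and Large Language Models" entry above, a rough sketch of the embedding-similarity step might look as follows. The sentence-embedding model, the similarity threshold, and the jump/persist labelling are illustrative assumptions, not that paper's actual pipeline.

```python
# Rough sketch of the embedding-similarity idea: compare consecutive responses
# on the Alternate Uses Task and flag low-similarity transitions as "jumps" to
# a new semantic category. Model name and threshold are illustrative only.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

responses = [
    "use a brick as a paperweight",
    "use a brick as a doorstop",
    "grind a brick into pigment for paint",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
embeddings = model.encode(responses)

# Similarity between each response and the one before it; a low value suggests
# a flexible "jump", a high value suggests persistent search in the same space.
THRESHOLD = 0.5  # illustrative cut-off
consecutive_sims = cosine_similarity(embeddings[:-1], embeddings[1:]).diagonal()
for prev, curr, sim in zip(responses, responses[1:], consecutive_sims):
    label = "jump" if sim < THRESHOLD else "persist"
    print(f"{prev!r} -> {curr!r}: similarity={sim:.2f} ({label})")
```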
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.