Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend
Actions in Text Games
- URL: http://arxiv.org/abs/2311.07687v1
- Date: Mon, 13 Nov 2023 19:12:49 GMT
- Title: Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend
Actions in Text Games
- Authors: Arjun Vaithilingam Sudhakar, Prasanna Parthasarathi, Janarthanan
Rajendran, Sarath Chandar
- Abstract summary: Large Language Models (LLMs) have demonstrated superior performance in language understanding benchmarks.
CALM, a popular approach, leverages the linguistic priors of an LLM -- GPT-2 -- to recommend action candidates and improve performance in text games.
CALM adapts GPT-2 with annotated human gameplays and keeps the LLM fixed while the text-based game is learned.
- Score: 16.281640651021434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated superior performance in
language understanding benchmarks. CALM, a popular approach, leverages
linguistic priors of LLMs -- GPT-2 -- for action candidate recommendations to
improve the performance in text games in Jericho without environment-provided
actions. However, CALM adapts GPT-2 with annotated human gameplays and keeps
the LLM fixed while the text-based games are learned. In this work, we
explore and evaluate also updating the LLM used for candidate recommendation
during the learning of the text-based game, to mitigate the reliance on
human-annotated gameplays, which are costly to acquire. We observe that by
updating the LLM during learning using carefully selected in-game transitions,
we can reduce the dependency on human-annotated gameplays for fine-tuning the
LLMs. A further analysis of the transferability of the updated LLMs showed
that transferring in-game trained models to other games did not yield
consistent transfer.
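To make the recommend-then-update loop concrete, the following is a minimal sketch of the idea, assuming a Hugging Face GPT-2 as the candidate generator; the helper names (recommend_actions, update_lm_on_transitions, select_transitions), the "[ACTION]" prompt format, and the reward-based transition selection are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch, not the paper's code: GPT-2 proposes action candidates for an
# RL agent and is itself fine-tuned on selected in-game transitions, reducing
# the need for human-annotated gameplays. All helper names are hypothetical.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
lm = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(lm.parameters(), lr=1e-5)


def recommend_actions(observation: str, k: int = 8) -> list[str]:
    """Sample k action candidates for the current game observation."""
    ids = tokenizer(observation + " [ACTION]", return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm.generate(
            ids, do_sample=True, top_k=40, max_new_tokens=6,
            num_return_sequences=k, pad_token_id=tokenizer.eos_token_id,
        )
    completions = out[:, ids.shape[1]:]  # keep only the newly generated tokens
    return [tokenizer.decode(c, skip_special_tokens=True).strip() for c in completions]


def update_lm_on_transitions(transitions: list[tuple[str, str]]) -> None:
    """Fine-tune the LM on selected (observation, action) in-game transitions,
    standing in for the annotated human gameplays CALM is trained on."""
    lm.train()
    for obs, action in transitions:
        ids = tokenizer(obs + " [ACTION] " + action, return_tensors="pt").input_ids
        loss = lm(ids, labels=ids).loss / len(transitions)  # causal-LM loss
        loss.backward()  # accumulate gradients over the selected batch
    optimizer.step()
    optimizer.zero_grad()
    lm.eval()


# One illustrative selection rule ("carefully selected" is underspecified here):
# keep only transitions whose action produced a positive score change.
def select_transitions(trajectory: list[tuple[str, str, float]]) -> list[tuple[str, str]]:
    return [(obs, act) for obs, act, reward in trajectory if reward > 0]
```

In practice an RL agent such as DRRN would score the returned candidates with its Q-function, act in the game, and periodically call update_lm_on_transitions on the selected transitions; the selection criterion actually used in the paper may differ from the positive-reward filter sketched here.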
Related papers
- Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary [5.1244906826828736]
We introduce a novel commentary method that combines Reinforcement Learning (RL) and large language models (LLMs).
Our system leverages RL to generate intricate card-playing scenarios and employs LLMs to generate corresponding commentary text.
We showcase the substantial enhancement in performance achieved by the proposed commentary framework when applied to open-source LLMs.
arXiv Detail & Related papers (2024-06-23T11:58:26Z)
- Exploring Design Choices for Building Language-Specific LLMs [36.32622880071991]
We study building language-specific language models by adapting monolingual and multilingual models.
We find that the initial performance of an LLM does not always correlate with its final performance after adaptation.
arXiv Detail & Related papers (2024-06-20T18:47:43Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- Large Language Models as Agents in Two-Player Games [12.303405412105187]
This paper delineates the parallels between the training methods of large language models (LLMs) and the strategies employed for the development of agents in two-player games.
We propose a re-conceptualization of LLM learning processes in terms of agent learning in language-based games.
arXiv Detail & Related papers (2024-02-12T21:44:32Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback [65.84061725174269]
Recent large language models (LLMs) leverage human feedback to improve their generation quality.
We propose LLMRefine, an inference-time optimization method to refine an LLM's output.
We conduct experiments on three text generation tasks, including machine translation, long-form question answering (QA), and topical summarization.
LLMRefine consistently outperforms all baseline approaches, achieving improvements of up to 1.7 MetricX points on translation tasks, 8.1 ROUGE-L points on ASQA, and 2.2 ROUGE-L points on topical summarization.
arXiv Detail & Related papers (2023-11-15T19:52:11Z)
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
- Evaluating Large Language Models at Evaluating Instruction Following [54.49567482594617]
We introduce LLMBar, a challenging meta-evaluation benchmark designed to test the ability of an LLM evaluator to discern instruction-following outputs.
We discover that different evaluators exhibit distinct performance on LLMBar and even the highest-scoring ones have substantial room for improvement.
arXiv Detail & Related papers (2023-10-11T16:38:11Z)
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback [61.83548032416181]
We present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages.
Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research.
arXiv Detail & Related papers (2023-07-29T18:01:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.