ChessGPT: Bridging Policy Learning and Language Modeling
- URL: http://arxiv.org/abs/2306.09200v2
- Date: Thu, 21 Dec 2023 16:59:44 GMT
- Title: ChessGPT: Bridging Policy Learning and Language Modeling
- Authors: Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun
Shao, David Mguni, Yali Du, Jun Wang
- Abstract summary: ChessGPT is a GPT model bridging policy learning and language modeling.
We build a large-scale game and language dataset related to chess.
We showcase two model examples, ChessCLIP and ChessGPT, integrating policy learning and language modeling.
- Score: 17.85415939196955
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When solving decision-making tasks, humans typically rely on information
from two key sources: (1) historical policy data, which provides interaction
replay from the environment, and (2) analytical insights in natural-language
form, which expose the underlying thought process or strategic considerations.
Despite this, most prior research focuses on only one source: it either uses
historical replay exclusively to learn policy or value functions directly, or
trains language models on a language corpus alone. In this paper, we argue that
a powerful autonomous agent should cover both sources. Thus, we propose
ChessGPT, a GPT model bridging policy learning and language modeling by
integrating data from both sources in chess games. Specifically, we build a
large-scale game and language dataset related to chess. Leveraging this
dataset, we showcase two model examples, ChessCLIP and ChessGPT, that integrate
policy learning and language modeling. Finally, we propose a full framework for
evaluating a language model's chess ability. Experimental results validate the
effectiveness of our models and dataset.
We open source our code, model, and dataset at
https://github.com/waterhorse1/ChessGPT.
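As a rough illustration of the CLIP-style pairing that ChessCLIP's name suggests, the hypothetical sketch below aligns embeddings of game records with embeddings of commentary via a symmetric contrastive loss; the encoders (stubbed here with random tensors), dimensions, and names are assumptions, not the released implementation.

```python
# Hypothetical sketch of a CLIP-style objective pairing chess game records
# with natural-language commentary. Random tensors stand in for the real
# game and text encoders; nothing here is the paper's actual code.
import torch
import torch.nn.functional as F

def contrastive_loss(game_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE: game i should match commentary i in the batch."""
    game_emb = F.normalize(game_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = game_emb @ text_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

batch, dim = 8, 256
loss = contrastive_loss(torch.randn(batch, dim), torch.randn(batch, dim))
print(f"toy contrastive loss: {loss.item():.3f}")
```

A model trained this way can retrieve commentary relevant to a given position, complementing the generative ChessGPT.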
Related papers
- Learning to Play Chess from Textbooks (LEAP): a Corpus for Evaluating Chess Moves based on Sentiment Analysis [4.314956204483074]
This paper examines chess textbooks as a new knowledge source for enabling machines to learn how to play chess.
We developed the LEAP corpus, a new heterogeneous dataset combining structured data (chess move notations and board states) with unstructured text.
We performed empirical experiments that assess the performance of various transformer-based baseline models for sentiment analysis.
arXiv Detail & Related papers (2023-10-31T08:26:02Z)
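As a hedged illustration of the kind of baseline the LEAP experiments evaluate, the sketch below runs an off-the-shelf transformer sentiment classifier over chess-move commentary; the checkpoint is a generic SST-2 model, not one trained on the LEAP corpus.

```python
# Sketch of a transformer sentiment baseline applied to chess-move
# commentary, in the spirit of the LEAP experiments. The checkpoint is a
# generic off-the-shelf sentiment model, not one trained on LEAP data.
from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")

annotations = [
    "14. Bxh7+! A brilliant sacrifice that rips open the king's shelter.",
    "22...Qd8? A passive retreat that hands White the initiative.",
]
for text, result in zip(annotations, classifier(annotations)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {text}")
```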
- Large Language Models on the Chessboard: A Study on ChatGPT's Formal Language Comprehension and Complex Reasoning Skills [4.138999291282392]
This paper probes the chess-playing performance of ChatGPT, a sophisticated language model by OpenAI.
We assess ChatGPT's understanding of the chessboard, adherence to chess rules, and strategic decision-making abilities.
Our study also reveals ChatGPT's propensity for a coherent strategy in its gameplay and a noticeable uptick in decision-making assertiveness.
arXiv Detail & Related papers (2023-08-29T08:36:30Z)
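A minimal sketch of the rule-adherence probing described above, assuming access to some chat model behind a stubbed `ask_model` function (hypothetical, not the study's harness): each proposed move is validated against the legal moves of the current position with python-chess.

```python
# Sketch of a rule-adherence probe: each move a chat model proposes is
# checked against the legal moves of the current position. `ask_model`
# is a hypothetical stand-in for an API call.
import chess

def ask_model(fen: str) -> str:
    """Placeholder for querying a language model; returns a SAN move."""
    return "e4"  # hypothetical reply, sensible only for the first ply

board = chess.Board()
illegal = 0
for _ in range(10):
    san = ask_model(board.fen())
    try:
        board.push_san(san)      # raises ValueError if not legal here
    except ValueError:
        illegal += 1
        board.push(next(iter(board.legal_moves)))  # substitute a legal move
print(f"illegal moves proposed: {illegal}/10")
```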
- Textually Pretrained Speech Language Models [107.10344535390956]
We propose TWIST, a method for training SpeechLMs using a warm start from pretrained textual language models.
We show using both automatic and human evaluations that TWIST outperforms a cold-start SpeechLM across the board.
arXiv Detail & Related papers (2023-05-22T13:12:16Z)
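The warm-start idea is easy to sketch: initialize a speech-unit LM from a pretrained text LM, then swap in a vocabulary of discrete speech units. The checkpoint and unit count below are illustrative assumptions, not TWIST's actual configuration.

```python
# Sketch of the warm-start idea behind TWIST: start from a pretrained text
# LM, then replace its vocabulary with discrete speech units. Checkpoint
# and unit count are illustrative assumptions.
from transformers import AutoModelForCausalLM

NUM_SPEECH_UNITS = 500  # e.g., k-means clusters over acoustic features

model = AutoModelForCausalLM.from_pretrained("gpt2")  # textual warm start
model.resize_token_embeddings(NUM_SPEECH_UNITS)       # speech-unit vocabulary
# From here, training continues with next-unit prediction on tokenized
# speech; a cold-start baseline trains the same architecture from scratch.
```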
- A Commonsense-Infused Language-Agnostic Learning Framework for Enhancing Prediction of Political Polarity in Multilingual News Headlines [0.0]
We use translation and retrieval to acquire inferential knowledge in the target language.
We then employ an attention mechanism to emphasise important inferences.
We present a dataset of over 62.6K multilingual news headlines in five European languages annotated with their respective political polarities.
arXiv Detail & Related papers (2022-12-01T06:07:01Z)
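As a toy illustration of the attention step described above, the sketch below softmax-weights embeddings of retrieved commonsense inferences by their relevance to a headline embedding; all shapes and inputs are assumptions.

```python
# Toy sketch of attention over retrieved inferences: softmax-weighted
# pooling emphasises the inferences most relevant to the headline.
# All shapes and inputs are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
headline = rng.normal(size=64)            # headline representation
inferences = rng.normal(size=(5, 64))     # five retrieved inferences

scores = inferences @ headline            # dot-product relevance scores
weights = np.exp(scores - scores.max())
weights /= weights.sum()                  # softmax attention weights
context = weights @ inferences            # weighted commonsense summary
print(np.round(weights, 3))
```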
- Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general-purpose preference model.
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z)
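A minimal sketch of bi-encoder preference scoring, assuming an off-the-shelf sentence encoder as a stand-in for the paper's trained reward model: critique and candidate continuations are embedded separately, and candidates are ranked by cosine similarity.

```python
# Sketch of bi-encoder preference scoring: embed a critique and candidate
# story continuations separately, then rank candidates by cosine
# similarity. A generic sentence encoder stands in for the reward model.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

critique = "The story should build suspense before revealing the villain."
candidates = [
    "Footsteps echoed in the empty hall, and the lights flickered out.",
    "The villain introduced himself immediately and explained his plan.",
]
scores = util.cos_sim(encoder.encode(critique), encoder.encode(candidates))
best = int(scores.argmax())
print(f"preferred continuation: {candidates[best]}")
```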
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
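A hypothetical sketch of the framework described above: goal and observation vectors are projected into a sequence of embeddings, fed through a pretrained LM backbone, and a small head predicts the next action. Dimensions, projections, and the action space are assumptions.

```python
# Sketch of a policy that feeds goal and observation embeddings through a
# pretrained LM backbone, with a small head over actions. Dimensions and
# the action space are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import GPT2Model

class LMPolicy(nn.Module):
    def __init__(self, obs_dim: int = 32, num_actions: int = 6):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")
        hidden = self.backbone.config.n_embd
        self.goal_proj = nn.Linear(obs_dim, hidden)  # goal -> embedding
        self.obs_proj = nn.Linear(obs_dim, hidden)   # observation -> embedding
        self.action_head = nn.Linear(hidden, num_actions)

    def forward(self, goal: torch.Tensor, obs: torch.Tensor) -> torch.Tensor:
        seq = torch.stack([self.goal_proj(goal), self.obs_proj(obs)], dim=1)
        out = self.backbone(inputs_embeds=seq).last_hidden_state
        return self.action_head(out[:, -1])          # logits over actions

policy = LMPolicy()
logits = policy(torch.randn(1, 32), torch.randn(1, 32))
print(logits.shape)  # torch.Size([1, 6])
```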
- Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [80.446770909975]
Linguistic knowledge is of great benefit to scene text recognition.
How to effectively model linguistic rules in end-to-end deep networks remains a research challenge.
We propose ABINet, an autonomous, bidirectional, and iterative network for scene text recognition.
arXiv Detail & Related papers (2021-03-11T06:47:45Z)
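As a toy illustration of the iterative refinement idea, the sketch below repeatedly fuses "vision" character probabilities with a language prior and feeds the result back; the lexicon-lookup prior is purely illustrative, not ABINet's bidirectional language model.

```python
# Toy sketch of iterative refinement: vision probabilities over characters
# are repeatedly fused with a language prior and fed back. The lexicon
# lookup is an illustrative assumption, not ABINet's language model.
import numpy as np

CHARS = "abcdefghijklmnopqrstuvwxyz"
LEXICON = ["hello", "world"]

def language_prior(word: str) -> np.ndarray:
    """Toy prior: near-one-hot probabilities of the closest lexicon word."""
    closest = min(LEXICON, key=lambda w: sum(a != b for a, b in zip(w, word)))
    probs = np.full((len(word), len(CHARS)), 1e-3)
    for i, ch in enumerate(closest):
        probs[i, CHARS.index(ch)] = 1.0
    return probs / probs.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
vision = rng.random((5, len(CHARS)))                         # noisy scores
vision[range(5), [CHARS.index(c) for c in "helxo"]] += 2.0   # mostly "hello"
probs = vision / vision.sum(axis=1, keepdims=True)

for _ in range(3):                                           # refine loop
    word = "".join(CHARS[i] for i in probs.argmax(axis=1))
    probs = 0.5 * probs + 0.5 * language_prior(word)
print("".join(CHARS[i] for i in probs.argmax(axis=1)))       # -> "hello"
```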
- Learning Chess Blindfolded: Evaluating Language Models on State Tracking [69.3794549747725]
We consider the task of language modeling for the game of chess.
Unlike natural language, chess notations describe a simple, constrained, and deterministic domain.
We find that transformer language models can learn to track pieces and predict legal moves with high accuracy when trained solely on move sequences.
arXiv Detail & Related papers (2021-02-26T01:16:23Z)
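A minimal sketch of this evaluation, with a stubbed `predict_next` standing in for a trained language model: the board state is reconstructed from the move prefix alone, and the predicted move is checked for legality with python-chess.

```python
# Sketch of the state-tracking evaluation: given only a move-sequence
# prefix, a model must name a legal next move. `predict_next` is a
# hypothetical stub for a trained language model.
import chess

def predict_next(prefix: list[str]) -> str:
    return "g1f3"  # hypothetical model output in UCI notation

prefix = ["e2e4", "e7e5"]
board = chess.Board()
for uci in prefix:                 # reconstruct state from moves alone
    board.push_uci(uci)
move = chess.Move.from_uci(predict_next(prefix))
print("legal" if move in board.legal_moves else "illegal")
```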
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction starting from almost no training examples, with models improving as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
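As a hedged illustration of the character-language-model side of this comparison, the sketch below scores candidate corrections with an add-one-smoothed trigram model built from a toy corpus; the paper's neural models and data are not reproduced here.

```python
# Sketch of ranking candidate spelling corrections with a character
# trigram language model. The corpus is a toy assumption; the paper's
# neural models are not reproduced here.
import math
from collections import Counter

corpus = ["colour", "color", "colander", "cooler"]
trigrams, bigrams = Counter(), Counter()
for word in corpus:
    padded = f"^^{word}$"
    for i in range(len(padded) - 2):
        trigrams[padded[i:i+3]] += 1
        bigrams[padded[i:i+2]] += 1

def log_prob(word: str) -> float:
    """Add-one smoothed trigram log-probability of a word."""
    padded = f"^^{word}$"
    total = 0.0
    for i in range(len(padded) - 2):
        num = trigrams[padded[i:i+3]] + 1
        den = bigrams[padded[i:i+2]] + 28   # letters plus boundary markers
        total += math.log(num / den)
    return total

candidates = ["colur", "color", "colour"]
print(max(candidates, key=log_prob))  # the model's preferred normalization
```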
- Navigating Human Language Models with Synthetic Agents [7.99536002595393]
We train a version of GPT-2 on a corpus of historical chess games, and then "launch" clusters of synthetic agents into the model.
We find that the per-piece move percentages produced by the model are substantially similar to human patterns.
arXiv Detail & Related papers (2020-08-10T14:39:53Z)
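A small sketch of the measurement described above, with a uniform-random policy standing in for the fine-tuned GPT-2: sample games, then tally what fraction of moves each piece type makes.

```python
# Sketch of the per-piece tally: sample games from a policy and count what
# fraction of moves each piece type makes. A uniform-random policy stands
# in for the fine-tuned GPT-2; only the bookkeeping is the point.
import random
from collections import Counter
import chess

random.seed(0)
counts = Counter()
for _ in range(50):                      # 50 synthetic games
    board = chess.Board()
    for _ in range(40):                  # up to 40 plies each
        moves = list(board.legal_moves)
        if not moves:
            break
        move = random.choice(moves)      # stand-in for a GPT-2 sample
        counts[board.piece_at(move.from_square).piece_type] += 1
        board.push(move)

total = sum(counts.values())
for piece_type, n in counts.most_common():
    print(f"{chess.piece_name(piece_type):>6}: {100 * n / total:.1f}%")
```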