LLM should think and action as a human
- URL: http://arxiv.org/abs/2502.13475v1
- Date: Wed, 19 Feb 2025 06:58:34 GMT
- Title: LLM should think and action as a human
- Authors: Haun Leung, ZiNan Wang
- Abstract summary: In a multi-turn conversation, for each user prompt, the large language model thinks based on elements such as chat history, thinking context, action calls, memory, and knowledge.
Our experimental results show that the reasoning and planning abilities of the large language model are enhanced, and the issues in multi-turn conversation are resolved.
- Score: 0.0
- Abstract: It has become popular to train large language models as chat assistants, but in conversations between the user and the chat assistant there are prompts that require multiple turns between the two. However, multi-turn conversation raises a number of issues: the chat assistant's responses are prone to errors and fail to help users achieve their goals; it is difficult for the chat assistant to generate responses that follow different processes, according to actual needs, for the same command or request; the chat assistant needs to use tools, but the current approach is neither elegant nor efficient, and the number of tool calls that can be supported is limited. The main reason for these issues is that large language models lack human-like thinking ability, lack reasoning and planning ability, and lack the ability to execute plans. To solve these issues, we propose a thinking method based on a built-in chain of thought: in a multi-turn conversation, for each user prompt, the large language model thinks based on elements such as chat history, thinking context, action calls, memory, and knowledge; makes detailed reasoning and plans; and acts according to the plan. We also explore how the large language model enhances its thinking ability through this method: collect training datasets according to the thinking method and fine-tune the large language model with supervised learning; train a consistency reward model and use it as the reward function to fine-tune the large language model with reinforcement learning, so that the reinforced model produces outputs that follow this way of thinking. Our experimental results show that the reasoning and planning abilities of the large language model are enhanced, and the issues in multi-turn conversation are resolved.
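The abstract describes this pipeline in prose only. As a rough illustration, the per-turn "think, then act" loop it outlines could be structured as below; this is a minimal sketch under assumed interfaces (call_llm, run_action, and every other name are illustrative placeholders, not the paper's actual API).

```python
# Minimal sketch of the built-in chain-of-thought turn described in the abstract.
# Assumptions: call_llm(prompt) returns a dict with "thought", "actions", and
# "response" fields, and run_action(action) executes one tool/action call.
# All names here are illustrative, not taken from the paper.

from dataclasses import dataclass, field


@dataclass
class TurnState:
    chat_history: list[str] = field(default_factory=list)      # prior user/assistant turns
    thinking_context: list[str] = field(default_factory=list)  # reasoning carried across turns
    memory: list[str] = field(default_factory=list)            # long-term facts about the user/task
    knowledge: list[str] = field(default_factory=list)         # retrieved domain knowledge


def build_thinking_prompt(state: TurnState, user_prompt: str, action_results: list[str]) -> str:
    """Assemble everything the model is supposed to think over for this turn."""
    sections = [
        "## Chat history\n" + "\n".join(state.chat_history),
        "## Thinking context\n" + "\n".join(state.thinking_context),
        "## Action results\n" + "\n".join(action_results),
        "## Memory\n" + "\n".join(state.memory),
        "## Knowledge\n" + "\n".join(state.knowledge),
        "## User prompt\n" + user_prompt,
        "Reason step by step, produce a plan, then list the actions to execute.",
    ]
    return "\n\n".join(sections)


def think_and_act(state: TurnState, user_prompt: str, call_llm, run_action, max_steps: int = 8) -> str:
    """One conversation turn: think (reason and plan), act, then respond to the user."""
    action_results: list[str] = []
    output = {"thought": "", "actions": [], "response": ""}
    for _ in range(max_steps):                    # cap the think/act cycles per turn
        prompt = build_thinking_prompt(state, user_prompt, action_results)
        output = call_llm(prompt)
        state.thinking_context.append(output["thought"])
        if not output["actions"]:                 # nothing left to execute: plan is complete
            break
        for action in output["actions"]:          # execute the planned tool/action calls
            action_results.append(run_action(action))
    state.chat_history += [f"user: {user_prompt}", f"assistant: {output['response']}"]
    return output["response"]
```

For the reinforcement learning stage, the abstract only states that a consistency reward model is trained and used as the reward function. One common way to realize such a scalar reward model is a linear head on top of a text encoder; the sketch below assumes a Hugging Face-style backbone and is again only illustrative, since the paper does not specify the architecture.

```python
# Toy consistency reward model: scores how well a sampled (prompt, thought,
# actions, response) sequence follows the prescribed way of thinking, so the
# score can serve as the reward signal in RL fine-tuning (e.g., PPO).
# Hypothetical structure; not taken from the paper.

import torch
import torch.nn as nn


class ConsistencyRewardModel(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_size: int):
        super().__init__()
        self.encoder = encoder                      # e.g., a Hugging Face transformer backbone
        self.score_head = nn.Linear(hidden_size, 1)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Assumes the backbone returns an object with .last_hidden_state (HF convention).
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        last_token = hidden[:, -1, :]               # pool on the final token
        return self.score_head(last_token).squeeze(-1)  # one scalar reward per sequence
```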
Related papers
- Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue [73.69510478736483]
Large language models (LLMs) can generate fluent, coherent, and diverse responses.
However, they lack a crucial ability: communication skills.
This article aims to empower LLMs with communication skills through inner monologues.
Experimental results show that the proposed CSIM strategy improves the backbone models and outperforms the baselines.
arXiv Detail & Related papers (2023-11-13T16:19:42Z) - Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages [40.37822682459469]
We introduce the concept of $\textit{chat vector}$ to equip pre-trained language models with instruction following and human value alignment.
By simply adding the chat vector to a continual pre-trained model's weights, we can endow the model with chat capabilities without the need for further training.
arXiv Detail & Related papers (2023-10-07T13:34:21Z) - ChatDev: Communicative Agents for Software Development [84.90400377131962]
ChatDev is a chat-powered software development framework in which specialized agents are guided in what to communicate.
These agents actively contribute to the design, coding, and testing phases through unified language-based communication.
arXiv Detail & Related papers (2023-07-16T02:11:34Z) - ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large
Language Models [125.7209927536255]
We propose ChatCoT, a tool-augmented chain-of-thought reasoning framework for chat-based LLMs.
In ChatCoT, we model the chain-of-thought (CoT) reasoning as multi-turn conversations, to utilize tools in a more natural way through chatting.
Our approach can effectively leverage the multi-turn conversation ability of chat-based LLMs, and integrate the thought chain following and tools manipulation in a unified way.
arXiv Detail & Related papers (2023-05-23T17:54:33Z) - ChatLLM Network: More brains, More intelligence [42.65167827451101]
We propose ChatLLM network that allows multiple dialogue-based language models to interact, provide feedback, and think together.
We show that our network attains significant improvements in problem-solving, leading to observable progress for each member.
arXiv Detail & Related papers (2023-04-24T08:29:14Z) - ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large
Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as some of the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Compared with previous models, our extensive experimental results show that ChatGPT performs worse across different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z) - Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought.
Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z) - Investigating Effect of Dialogue History in Multilingual Task Oriented
Dialogue Systems [2.695466667982714]
As of December 2021, Alexa, one of the most popular smart speakers in the world, supports 9 different languages.
Training a virtual assistant in other languages is often more difficult, especially for low-resource languages.
We devise an efficient and effective training solution for multilingual task-oriented dialogue systems.
arXiv Detail & Related papers (2021-12-23T02:27:10Z) - Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z)