The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning
- URL: http://arxiv.org/abs/2502.15214v1
- Date: Fri, 21 Feb 2025 05:01:30 GMT
- Title: The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning
- Authors: Sheila Schoepp, Masoud Jafaripour, Yingyue Cao, Tianpei Yang, Fatemeh Abdollahi, Shadan Golestan, Zahin Sufiyan, Osmar R. Zaiane, Matthew E. Taylor
- Abstract summary: Reinforcement learning (RL) has shown impressive results in sequential decision-making tasks. Large Language Models (LLMs) and Vision-Language Models (VLMs) have emerged, exhibiting impressive capabilities in multimodal understanding and reasoning. We present a taxonomy that categorizes these LLM/VLM-assisted RL approaches into three roles: agent, planner, and reward.
- Score: 7.092323056639036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has shown impressive results in sequential decision-making tasks. Meanwhile, Large Language Models (LLMs) and Vision-Language Models (VLMs) have emerged, exhibiting impressive capabilities in multimodal understanding and reasoning. These advances have led to a surge of research integrating LLMs and VLMs into RL. In this survey, we review representative works in which LLMs and VLMs are used to overcome key challenges in RL, such as lack of prior knowledge, long-horizon planning, and reward design. We present a taxonomy that categorizes these LLM/VLM-assisted RL approaches into three roles: agent, planner, and reward. We conclude by exploring open problems, including grounding, bias mitigation, improved representations, and action advice. By consolidating existing research and identifying future directions, this survey establishes a framework for integrating LLMs and VLMs into RL, advancing approaches that unify natural language and visual understanding with sequential decision-making.
Related papers
- Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models [22.796496516709514]
This survey systematically reviews recent advances in RL-based reasoning for Multimodal Large Language Models.
We highlight two main RL paradigms--value-free and value-based methods--and analyze how RL enhances reasoning abilities.
We provide an extensive overview of benchmark datasets, evaluation protocols, and existing limitations.
arXiv Detail & Related papers (2025-04-30T03:14:28Z)
- Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey [2.572335031488049]
This survey explores the development of meta-thinking capabilities in Large Language Models (LLMs) from a Multi-Agent Reinforcement Learning (MARL) perspective.
By exploring reward mechanisms, self-play, and continuous learning methods in MARL, this survey gives a comprehensive roadmap to building introspective, adaptive, and trustworthy LLMs.
arXiv Detail & Related papers (2025-04-20T07:34:26Z)
- Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search [57.28671084993782]
Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains.
Recent studies have shown that increasing test-time computation enhances LLMs' reasoning capabilities.
We propose a two-stage training paradigm: 1) a small-scale format tuning stage to internalize the COAT reasoning format and 2) a large-scale self-improvement stage leveraging reinforcement learning.
arXiv Detail & Related papers (2025-02-04T17:26:58Z)
- Large Language Model Enhanced Recommender Systems: Taxonomy, Trend, Application and Future [31.31030891846837]
This paper presents a survey of the latest research efforts aimed at leveraging Large Language Models (LLMs) to enhance recommender systems (RS). We identify a critical shift in the field with the move towards incorporating LLMs into the online system, notably by avoiding their use during inference.
arXiv Detail & Related papers (2024-12-18T02:07:21Z)
- Reinforcement Learning Enhanced LLMs: A Survey [45.57586245741664]
We present a systematic review of the current state of knowledge on RL-enhanced large language models (LLMs).
Specifically, we (1) detail the basics of RL; (2) introduce popular RL-enhanced LLMs; and (3) review research on two widely used reward model-based RL techniques: Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF).
arXiv Detail & Related papers (2024-12-05T16:10:42Z)
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding the decoding process of LLMs with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
- Reinforcement Learning Problem Solving with Large Language Models [0.0]
Large Language Models (LLMs) have an extensive amount of world knowledge, and this has enabled their application in various domains to improve the performance of Natural Language Processing (NLP) tasks.
This has also facilitated a more accessible paradigm of conversation-based interactions between humans and AI systems to solve intended problems.
We show the practicality of our approach through two detailed case studies: "Research Scientist" and "Legal Matter Intake".
arXiv Detail & Related papers (2024-04-29T12:16:08Z)
- A Survey on Self-Evolution of Large Language Models [116.54238664264928]
Large language models (LLMs) have significantly advanced in various fields and intelligent agent applications.
To address this issue, self-evolution approaches that enable LLMs to autonomously acquire, refine, and learn from experiences generated by the model itself are rapidly growing.
arXiv Detail & Related papers (2024-04-22T17:43:23Z)
- LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model that has far fewer parameters and take its answers as heuristic answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z)
- Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study [1.3597551064547502]
We employ a teacher-student learning framework to tackle problems of Large Language Models (LLMs) and reinforcement learning (RL) models.
Within this framework, the LLM acts as a teacher, while the RL model acts as a student.
We propose a practical algorithm to address the problem and conduct empirical experiments to evaluate the effectiveness of our method.
arXiv Detail & Related papers (2024-01-12T14:35:57Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Integration of Large Language Models and Federated Learning [58.9876604258949]
We propose a research framework, dividing the fusion of LLMs and FL into three parts.
We first provide a review of the current state of research in the domain of LLMs combined with FL, including their typical applications.
We then discuss the practical applications of the combination of LLMs and FL in critical scenarios such as healthcare, finance, and education.
arXiv Detail & Related papers (2023-07-18T02:09:14Z)
- A Comprehensive Overview of Large Language Models [68.22178313875618]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
arXiv Detail & Related papers (2023-07-12T20:01:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.