Navigating Motion Agents in Dynamic and Cluttered Environments through LLM Reasoning
- URL: http://arxiv.org/abs/2503.07323v2
- Date: Thu, 05 Jun 2025 12:17:03 GMT
- Title: Navigating Motion Agents in Dynamic and Cluttered Environments through LLM Reasoning
- Authors: Yubo Zhao, Qi Wu, Yifan Wang, Yu-Wing Tai, Chi-Keung Tang
- Abstract summary: This paper advances motion agents empowered by large language models (LLMs) toward autonomous navigation in dynamic and cluttered environments. Our training-free framework supports multi-agent coordination, closed-loop replanning, and dynamic obstacle avoidance without retraining or fine-tuning.
- Score: 69.5875073447454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper advances motion agents empowered by large language models (LLMs) toward autonomous navigation in dynamic and cluttered environments, significantly surpassing the first and more recent seminal but limited studies of LLMs' spatial reasoning, in which movement is restricted to four directions in simple, static environments containing only a single agent rather than multiple agents. Specifically, we investigate LLMs as spatial reasoners to overcome these limitations by uniformly encoding environments (e.g., real indoor floorplans), agents (which can act as dynamic obstacles), and their paths as discrete tokens akin to language tokens. Our training-free framework supports multi-agent coordination, closed-loop replanning, and dynamic obstacle avoidance without retraining or fine-tuning. We show that LLMs can generalize across agents, tasks, and environments using only text-based interactions, opening new possibilities for semantically grounded, interactive navigation in both simulation and embodied systems.
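The abstract's two core ideas, encoding the environment, agents, and paths as text tokens and replanning in a closed loop around dynamic obstacles, can be illustrated with a minimal sketch. This is not the authors' implementation: the token format, every function name, and the breadth-first `stub_planner` (a stand-in for the LLM call) are assumptions made purely for illustration.

```python
from collections import deque

# Hypothetical token vocabulary; the paper's actual encoding is not given here.
FREE, WALL = ".", "#"

def encode_grid(grid, agents):
    """Serialize an occupancy grid and agent positions into one text prompt."""
    rows = ["".join(WALL if cell else FREE for cell in row) for row in grid]
    agent_lines = [f"{name}@({r},{c})" for name, (r, c) in sorted(agents.items())]
    return "\n".join(rows + agent_lines)

def encode_path(path):
    """Serialize a path as a space-separated sequence of waypoint tokens."""
    return " ".join(f"({r},{c})" for r, c in path)

def first_collision(path, grid, blocked):
    """Index of the first waypoint that is out of bounds, a wall, or blocked."""
    for i, (r, c) in enumerate(path):
        in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if not in_bounds or grid[r][c] or (r, c) in blocked:
            return i
    return None

def stub_planner(grid, start, goal, blocked):
    """Breadth-first search standing in for the LLM; allows 8-connected moves."""
    moves = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    parent = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for dr, dc in moves:
            nxt = (cur[0] + dr, cur[1] + dc)
            if nxt not in parent and first_collision([nxt], grid, blocked) is None:
                parent[nxt] = cur
                queue.append(nxt)
    return None

def navigate(grid, start, goal, obstacle_track, max_steps=50):
    """Closed loop: take one step per tick, replanning when the next waypoint is blocked."""
    pos, trace = start, [start]
    plan = stub_planner(grid, pos, goal, {obstacle_track[0]})
    for t in range(1, max_steps + 1):
        if pos == goal:
            return trace
        # Observed position of the moving obstacle at tick t.
        blocked = {obstacle_track[min(t, len(obstacle_track) - 1)]}
        stale = plan is None or len(plan) < 2
        if stale or first_collision(plan[1:2], grid, blocked) is not None:
            plan = stub_planner(grid, pos, goal, blocked)
            if plan is None or len(plan) < 2:
                plan = [pos, pos]  # no route right now: wait one tick
        pos, plan = plan[1], plan[1:]
        trace.append(pos)
    return trace if pos == goal else None
```

In a real system the `stub_planner` call would be replaced by a prompt built from `encode_grid` and a parser for the model's `encode_path`-style reply, with `first_collision` used to reject invalid replies before execution.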
Related papers
- RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms [15.891423894740045]
We develop a Role-Adaptive LLM-Driven Yoked navigation algorithm RALLY. RALLY uses structured natural language for efficient semantic communication and collaborative reasoning. Experiments show that RALLY outperforms conventional approaches in terms of task coverage, convergence speed, and generalization.
arXiv Detail & Related papers (2025-07-02T05:44:17Z) - Enhancing Large Language Models for Mobility Analytics with Semantic Location Tokenization [29.17336622418242]
We propose QT-Mob, a novel framework that significantly enhances Large Language Models (LLMs) for mobility analytics. QT-Mob introduces a location tokenization module that learns compact, semantically rich tokens to represent locations. Experiments on three real-world datasets demonstrate superior performance in both next-location prediction and mobility recovery tasks.
arXiv Detail & Related papers (2025-06-08T02:17:50Z) - Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap [51.198001060683296]
Large Language Models (LLMs) offer transformative potential to address transportation challenges.
This survey first presents LLM4TR, a novel conceptual framework that systematically categorizes the roles of LLMs in transportation.
For each role, our review spans diverse applications, from traffic prediction and autonomous driving to safety analytics and urban mobility optimization.
arXiv Detail & Related papers (2025-03-27T11:56:27Z) - EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks [24.41705039390567]
EmbodiedVSR (Embodied Visual Spatial Reasoning) is a novel framework that integrates dynamic scene graph-guided Chain-of-Thought (CoT) reasoning. Our method enables zero-shot spatial reasoning without task-specific fine-tuning. Experiments demonstrate that our framework significantly outperforms existing MLLM-based methods in accuracy and reasoning coherence.
arXiv Detail & Related papers (2025-03-14T05:06:07Z) - Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Multi-Agent Path Finding in Continuous Spaces with Projected Diffusion Models [57.45019514036948]
Multi-Agent Path Finding (MAPF) is a fundamental problem in robotics. This work proposes a novel approach that integrates constrained optimization with diffusion models for MAPF in continuous spaces.
arXiv Detail & Related papers (2024-12-23T21:27:19Z) - MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.
We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.
We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z) - DynaSaur: Large Language Agents Beyond Predefined Actions [108.75187263724838]
Existing LLM agent systems typically select actions from a fixed and predefined set at every step. We propose an LLM agent framework that can dynamically create and compose actions as needed. In this framework, the agent interacts with its environment by generating and executing programs written in a general-purpose programming language.
arXiv Detail & Related papers (2024-11-04T02:08:59Z) - LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning [78.2390460278551]
Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation.
Here, we present LLM3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface.
Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning.
arXiv Detail & Related papers (2024-03-18T08:03:47Z) - LgTS: Dynamic Task Sampling using LLM-generated sub-goals for Reinforcement Learning Agents [10.936460061405157]
We propose LgTS (LLM-guided Teacher-Student learning), a novel approach that explores the planning abilities of LLMs.
Our approach does not assume access to a proprietary or a fine-tuned LLM, nor does it require pre-trained policies that achieve the sub-goals proposed by the LLM.
arXiv Detail & Related papers (2023-10-14T00:07:03Z) - LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [87.1164964709168]
This work employs Large Language Models (LLMs) as a decision-making component for complex autonomous driving scenarios.
Extensive experiments demonstrate that our proposed method not only consistently surpasses baseline approaches in single-vehicle tasks, but also helps handle complex driving behaviors, including multi-vehicle coordination.
arXiv Detail & Related papers (2023-10-04T17:59:49Z) - CARL: Controllable Agent with Reinforcement Learning for Quadruped Locomotion [0.0]
We present CARL, a quadruped agent that can be controlled with high-level directives and react naturally to dynamic environments.
We use Generative Adversarial Networks to adapt high-level controls, such as speed and heading, to action distributions that correspond to the original animations.
Further fine-tuning through deep reinforcement learning enables the agent to recover from unseen external perturbations while producing smooth transitions.
arXiv Detail & Related papers (2020-05-07T07:18:57Z) - Learning to Move with Affordance Maps [57.198806691838364]
The ability to autonomously explore and navigate a physical space is a fundamental requirement for virtually any mobile autonomous agent.
Traditional SLAM-based approaches for exploration and navigation largely focus on leveraging scene geometry.
We show that learned affordance maps can be used to augment traditional approaches for both exploration and navigation, providing significant improvements in performance.
arXiv Detail & Related papers (2020-01-08T04:05:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.