Related papers: Advances in Embodied Navigation Using Large Language Models: A Survey

URL: http://arxiv.org/abs/2311.00530v4
Date: Fri, 7 Jun 2024 13:13:41 GMT
Title: Advances in Embodied Navigation Using Large Language Models: A Survey
Authors: Jinzhou Lin, Han Gao, Xuxiang Feng, Rongtao Xu, Changwei Wang, Man Zhang, Li Guo, Shibiao Xu
Abstract summary: The article offers an exhaustive summary of the symbiosis between Large Language Models and Embodied Intelligence. It reviews state-of-the-art models, research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field.
Score: 16.8165925743264
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The application of LLMs with Embodied Intelligence has emerged as a significant area of focus. Among the myriad applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers an exhaustive summary of the symbiosis between LLMs and embodied intelligence with a focus on navigation. It reviews state-of-the-art models, research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field. A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN.

Related papers

Multimodal Large Language Models Meet Multimodal Emotion Recognition and Reasoning: A Survey [40.20905051575087]
In AI for Science, multimodal emotion recognition and reasoning has become a rapidly growing frontier. This paper is the first attempt to comprehensively survey the intersection of MLLMs with multimodal emotion recognition and reasoning.
arXiv Detail & Related papers (2025-09-29T06:13:14Z)
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval [26.797683195019246]
Large language models (LLMs) have demonstrated capabilities that surpass human performance in various language-related tasks. This paper explores the transformative potential of LLM agents in enhancing recommender and search systems. We highlight the immense potential of LLM agents in addressing current challenges in recommendation and search.
arXiv Detail & Related papers (2025-03-07T18:20:30Z)
LLM Post-Training: A Deep Dive into Reasoning Large Language Models [131.10969986056]
Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Post-training methods enable LLMs to refine their knowledge, improve reasoning, enhance factual accuracy, and align more effectively with user intents and ethical considerations.
arXiv Detail & Related papers (2025-02-28T18:59:54Z)
From Selection to Generation: A Survey of LLM-based Active Learning [153.8110509961261]
Large Language Models (LLMs) have been employed for generating entirely new data instances and providing more cost-effective annotations. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques.
arXiv Detail & Related papers (2025-02-17T12:58:17Z)
Lifelong Learning of Large Language Model based Agents: A Roadmap [39.01532420650279]
Lifelong learning, also known as continual or incremental learning, is a crucial component for advancing Artificial General Intelligence (AGI). This survey is the first to systematically summarize the potential techniques for incorporating lifelong learning into large language models (LLMs). We highlight how these pillars collectively enable continuous adaptation, mitigate catastrophic forgetting, and improve long-term performance.
arXiv Detail & Related papers (2025-01-13T12:42:04Z)
LLM4PR: Improving Post-Ranking in Search Engine with Large Language Models [9.566432486156335]
We introduce a novel paradigm named Large Language Models for Post-Ranking in search engine (LLM4PR).
arXiv Detail & Related papers (2024-11-02T08:36:16Z)
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks. Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs. In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems. The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness. This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models [30.685419129265252]
We bridge the divide between VLN-specialized models and LLM-based navigation paradigms. We exploit a way to incorporate LLMs and navigation policy networks for effective action predictions and navigational reasoning.
arXiv Detail & Related papers (2024-07-17T07:44:26Z)
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models [71.25225058845324]
Large Language Models (LLMs) have demonstrated revolutionary abilities in language understanding and generation. Retrieval-Augmented Generation (RAG) can offer reliable and up-to-date external knowledge. RA-LLMs have emerged to harness external and authoritative knowledge bases, rather than relying on the model's internal knowledge.
arXiv Detail & Related papers (2024-05-10T02:48:45Z)
Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation. We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques. We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
Large Language Models Meet Computer Vision: A Brief Survey [0.0]
Large Language Models (LLMs) and Computer Vision (CV) have emerged as a pivotal area of research, driving significant advancements in the field of Artificial Intelligence (AI). This survey paper delves into the latest progressions in the domain of transformers, emphasizing their potential to revolutionize Vision Transformers (ViTs) and LLMs. The survey is concluded by highlighting open directions in the field, suggesting potential venues for future research and development.
arXiv Detail & Related papers (2023-11-28T10:39:19Z)
How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
Large language models (LLM) have shown impressive general intelligence and human-like capabilities. We conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
arXiv Detail & Related papers (2023-06-09T11:31:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.