DMWM: Dual-Mind World Model with Long-Term Imagination
- URL: http://arxiv.org/abs/2502.07591v1
- Date: Tue, 11 Feb 2025 14:40:57 GMT
- Title: DMWM: Dual-Mind World Model with Long-Term Imagination
- Authors: Lingyi Wang, Rashed Shelim, Walid Saad, Naren Ramakrishnan
- Abstract summary: We propose a novel dual-mind world model (DMWM) framework that integrates logical reasoning to enable imagination with logical consistency.
The proposed framework is evaluated on benchmark tasks that require long-term planning from the DMControl suite.
- Score: 53.98633183204453
- Abstract: Imagination in world models is crucial for enabling agents to learn long-horizon policies in a sample-efficient manner. Existing recurrent state-space model (RSSM)-based world models depend on single-step statistical inference to capture the environment dynamics, and, hence, they are unable to perform long-term imagination tasks due to the accumulation of prediction errors. Inspired by the dual-process theory of human cognition, we propose a novel dual-mind world model (DMWM) framework that integrates logical reasoning to enable imagination with logical consistency. DMWM is composed of two components: an RSSM-based System 1 (RSSM-S1) component that handles state transitions in an intuitive manner and a logic-integrated neural network-based System 2 (LINN-S2) component that guides the imagination process through hierarchical deep logical reasoning. An inter-system feedback mechanism is designed to ensure that the imagination process follows the logical rules of the real environment. The proposed framework is evaluated on benchmark tasks that require long-term planning from the DMControl suite. Extensive experimental results demonstrate that the proposed framework yields significant improvements in terms of logical coherence, trial efficiency, data efficiency, and long-term imagination over state-of-the-art world models.
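To make the dual-mind interaction described in the abstract concrete, here is a minimal sketch of how an imagination rollout could alternate between the two systems. The class names, the `imagine_step`/`consistency` interfaces, and the argmax feedback rule are illustrative assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RSSMSystem1(nn.Module):
    """Stand-in for the intuitive System 1: a stochastic one-step latent transition."""
    def __init__(self, state_dim=32, action_dim=4):
        super().__init__()
        self.cell = nn.GRUCell(action_dim, state_dim)

    def imagine_step(self, state, action):
        mean = self.cell(action, state)
        return mean + 0.1 * torch.randn_like(mean)   # sampled, "intuitive" transition

class LINNSystem2(nn.Module):
    """Stand-in for System 2: scores how logically consistent a transition is."""
    def __init__(self, state_dim=32, action_dim=4):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(2 * state_dim + action_dim, 64),
                                   nn.ReLU(), nn.Linear(64, 1))

    def consistency(self, state, action, next_state):
        return self.score(torch.cat([state, action, next_state], dim=-1)).squeeze(-1)

def imagine_rollout(s1, s2, policy, state, horizon=15, n_candidates=8):
    """System 1 proposes candidate transitions; System 2's consistency score is
    fed back to select the branch, keeping the rollout logically coherent."""
    trajectory = [state]
    for _ in range(horizon):
        action = policy(state)
        candidates = [s1.imagine_step(state, action) for _ in range(n_candidates)]
        scores = torch.stack([s2.consistency(state, action, c) for c in candidates])
        state = candidates[int(torch.argmax(scores))]
        trajectory.append(state)
    return trajectory

state = torch.zeros(1, 32)                            # batch of one latent state
policy = nn.Sequential(nn.Linear(32, 4), nn.Tanh())   # toy policy head
traj = imagine_rollout(RSSMSystem1(), LINNSystem2(), policy, state)
```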
Related papers
- Zero-Shot Interactive Text-to-Image Retrieval via Diffusion-Augmented Representations [7.439049772394586]
Diffusion Augmented Retrieval (DAR) is a paradigm-shifting framework that bypasses MLLM finetuning entirely.
DAR synergizes Large Language Model (LLM)-guided query refinement with Diffusion Model (DM)-based visual synthesis to create contextually enriched intermediate representations.
arXiv Detail & Related papers (2025-01-26T03:29:18Z)
- InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions [104.90258030688256]
This project introduces disentangled streaming perception, reasoning, and memory mechanisms, enabling real-time interaction with streaming video and audio input.
This project simulates human-like cognition, enabling multimodal large language models to provide continuous and adaptive service over time.
arXiv Detail & Related papers (2024-12-12T18:58:30Z)
- Provable Benefits of Complex Parameterizations for Structured State Space Models [51.90574950170374]
Structured state space models (SSMs) are linear dynamical systems adhering to a specified structure.
In contrast to typical neural network modules, whose parameterizations are real, SSMs often use complex parameterizations.
This paper takes a step towards explaining the benefits of complex parameterizations for SSMs by establishing formal gaps between real and complex diagonal SSMs.
arXiv Detail & Related papers (2024-10-17T22:35:50Z)
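As a concrete reading of the real-versus-complex distinction (a toy illustration, not the paper's construction), the diagonal SSM recurrence can be rolled out with real or complex eigenvalues; complex entries let the hidden state oscillate as well as decay. The dimensions and parameter ranges below are arbitrary choices.

```python
import numpy as np

def diagonal_ssm(a_diag, b, c, inputs):
    """Roll out x_{t+1} = diag(a) x_t + b u_t, y_t = Re(c . x_t)."""
    x = np.zeros_like(a_diag)
    ys = []
    for u in inputs:
        x = a_diag * x + b * u
        ys.append(np.real(np.dot(c, x)))
    return np.array(ys)

rng = np.random.default_rng(0)
u = rng.standard_normal(64)

# Real diagonal parameterization: eigenvalues restricted to the real axis.
y_real = diagonal_ssm(rng.uniform(-0.9, 0.9, 8),
                      rng.standard_normal(8), rng.standard_normal(8), u)

# Complex diagonal parameterization: eigenvalues anywhere inside the unit disk,
# so the state can rotate (oscillate) as well as decay.
a_complex = rng.uniform(0.5, 0.9, 8) * np.exp(1j * rng.uniform(0, 2 * np.pi, 8))
y_complex = diagonal_ssm(a_complex,
                         rng.standard_normal(8).astype(complex),
                         rng.standard_normal(8).astype(complex), u)
```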
- Binarized Diffusion Model for Image Super-Resolution [61.963833405167875]
Binarization, an ultra-compression algorithm, offers the potential to effectively accelerate advanced diffusion models (DMs).
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z)
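To illustrate the kind of ultra-compression the summary refers to, here is the generic sign-and-scale weight binarization used by many binarized networks; it is shown only to convey the idea and is not BI-DiffSR's specific design.

```python
import torch

def binarize_weight(w: torch.Tensor) -> torch.Tensor:
    """Map full-precision weights to {-1, +1} times a per-output-channel scale.

    Generic 1-bit scheme for illustration; BI-DiffSR's SR-specific modules
    are not shown here.
    """
    scale = w.abs().mean(dim=tuple(range(1, w.dim())), keepdim=True)
    return torch.sign(w) * scale

w = torch.randn(16, 3, 3, 3)      # a conv weight: out_ch x in_ch x kH x kW
w_1bit = binarize_weight(w)       # each value is +/- the channel's mean magnitude
```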
- Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems [57.41621687431203]
Large language models (LLMs) and foundation models have been recently touted as a game-changer for 6G systems.
This paper presents a comprehensive vision on how to design universal foundation models tailored towards the deployment of artificial intelligence (AI)-native networks.
arXiv Detail & Related papers (2024-01-30T00:21:41Z)
- A Biologically-Inspired Dual Stream World Model [0.456877715768796]
The medial temporal lobe (MTL) is hypothesized to be an experience-construction system in mammals.
We propose a novel variant, the Dual Stream World Model (DSWM), which learns from high-dimensional observations and dissociates them into context and content streams.
We show that this representation is useful as a reinforcement learning basis function, and that the generative model can be used to aid the policy learning process using Dyna-like updates.
arXiv Detail & Related papers (2022-09-16T16:27:48Z)
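For readers unfamiliar with the term, a "Dyna-like update" interleaves learning from real transitions with learning from transitions replayed out of a learned model. The tabular sketch below is the textbook Dyna-Q pattern, not DSWM's actual procedure.

```python
import random
from collections import defaultdict

def dyna_q_step(Q, model, s, a, r, s_next, actions=(0, 1),
                alpha=0.1, gamma=0.99, n_planning=10):
    """One Dyna-Q step: learn from a real transition, then from imagined ones."""
    # Direct RL update from the real transition.
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s_next, b)] for b in actions) - Q[(s, a)])
    # Model learning: remember what the environment did for this state-action pair.
    model[(s, a)] = (r, s_next)
    # Planning ("Dyna-like"): replay imagined transitions drawn from the model.
    for _ in range(n_planning):
        (ps, pa), (pr, pn) = random.choice(list(model.items()))
        Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(pn, b)] for b in actions) - Q[(ps, pa)])

Q, model = defaultdict(float), {}
dyna_q_step(Q, model, s=0, a=1, r=1.0, s_next=2)   # toy transition in a toy MDP
```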
- One-shot Visual Reasoning on RPMs with an Application to Video Frame Prediction [1.0932251830449902]
Raven's Progressive Matrices (RPMs) are frequently used to evaluate human visual reasoning ability.
We propose a One-shot Human-Understandable ReaSoner (Os-HURS) to tackle the challenges of real-world visual recognition and subsequent logical reasoning tasks.
arXiv Detail & Related papers (2021-11-24T06:51:38Z)
- Improving Coherence and Consistency in Neural Sequence Models with Dual-System, Neuro-Symbolic Reasoning [49.6928533575956]
We use neural inference to mediate between the neural System 1 and the logical System 2.
Results in robust story generation and grounded instruction-following show that this approach can increase the coherence and accuracy of neurally-based generations.
arXiv Detail & Related papers (2021-07-06T17:59:49Z)
- Relational State-Space Model for Stochastic Multi-Object Systems [24.234120525358456]
This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model.
R-SSM makes use of graph neural networks (GNNs) to simulate the joint state transitions of multiple correlated objects.
The utility of R-SSM is empirically evaluated on synthetic and real time-series datasets.
arXiv Detail & Related papers (2020-01-13T03:45:21Z)
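To make "GNNs that simulate the joint state transitions of multiple correlated objects" concrete, the sketch below applies one message-passing step to per-object latent states. The layer sizes and update rule are illustrative assumptions rather than R-SSM's actual architecture.

```python
import torch
import torch.nn as nn

class GraphTransition(nn.Module):
    """One message-passing step over per-object latent states.

    Each object's next state depends on its own state and on messages from the
    objects it is connected to; the exact layers here are placeholders.
    """
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.message = nn.Sequential(nn.Linear(2 * state_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, state_dim))
        self.update = nn.GRUCell(state_dim, state_dim)

    def forward(self, states: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # states: (num_objects, state_dim); adj: (num_objects, num_objects) in {0, 1}
        n = states.size(0)
        pairs = torch.cat([states.unsqueeze(1).expand(n, n, -1),
                           states.unsqueeze(0).expand(n, n, -1)], dim=-1)
        msgs = (self.message(pairs) * adj.unsqueeze(-1)).sum(dim=1)  # aggregate neighbors
        return self.update(msgs, states)  # jointly advance all object states

states = torch.randn(5, 16)                   # 5 objects, 16-dim latent each
adj = (torch.rand(5, 5) > 0.5).float()        # toy interaction graph
next_states = GraphTransition(16)(states, adj)
```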
This list is automatically generated from the titles and abstracts of the papers on this site.