LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models
- URL: http://arxiv.org/abs/2411.08027v1
- Date: Tue, 12 Nov 2024 18:56:58 GMT
- Title: LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models
- Authors: Anoop Cherian, Radu Corcodel, Siddarth Jain, Diego Romeres
- Abstract summary: We propose a new physical reasoning task and a dataset, dubbed TraySim.
Our task involves predicting the dynamics of several objects on a tray that is given an external impact.
We present LLMPhy, a zero-shot black-box optimization framework that leverages the physics knowledge and program synthesis abilities of LLMs.
Our results show that the combination of the LLM and the physics engine leads to state-of-the-art zero-shot physical reasoning performance.
- Score: 35.01842161084472
- Abstract: Physical reasoning is an important skill needed for robotic agents when operating in the real world. However, solving such reasoning problems often involves hypothesizing and reflecting over complex multi-body interactions under the effect of a multitude of physical forces and thus learning all such interactions poses a significant hurdle for state-of-the-art machine learning frameworks, including large language models (LLMs). To study this problem, we propose a new physical reasoning task and a dataset, dubbed TraySim. Our task involves predicting the dynamics of several objects on a tray that is given an external impact -- the domino effect of the ensued object interactions and their dynamics thus offering a challenging yet controlled setup, with the goal of reasoning being to infer the stability of the objects after the impact. To solve this complex physical reasoning task, we present LLMPhy, a zero-shot black-box optimization framework that leverages the physics knowledge and program synthesis abilities of LLMs, and synergizes these abilities with the world models built into modern physics engines. Specifically, LLMPhy uses an LLM to generate code to iteratively estimate the physical hyperparameters of the system (friction, damping, layout, etc.) via an implicit analysis-by-synthesis approach using a (non-differentiable) simulator in the loop and uses the inferred parameters to imagine the dynamics of the scene towards solving the reasoning task. To show the effectiveness of LLMPhy, we present experiments on our TraySim dataset to predict the steady-state poses of the objects. Our results show that the combination of the LLM and the physics engine leads to state-of-the-art zero-shot physical reasoning performance, while demonstrating superior convergence against standard black-box optimization methods and better estimation of the physical parameters.
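The simulator-in-the-loop parameter estimation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-dimensional sliding-block simulator, the function names (`simulate_slide`, `propose`, `estimate_friction`), and the parameter bounds are all assumptions made for illustration, and a simple step-halving local search stands in for the LLM that proposes candidate parameters in LLMPhy. It shows only the general loop: propose parameters, run a black-box simulator, compare against the observation, and refine.

```python
from typing import Callable, Tuple

G = 9.81  # gravitational acceleration in m/s^2


def simulate_slide(mu: float, v0: float = 2.0, dt: float = 1e-3) -> float:
    """Toy black-box simulator (assumption): distance a block slides
    on a flat surface before Coulomb friction brings it to rest."""
    x, v = 0.0, v0
    while v > 0.0:
        x += v * dt       # explicit Euler position update
        v -= mu * G * dt  # constant friction deceleration
    return x


def propose(mu: float, step: float) -> Tuple[float, float]:
    """Hypothetical stand-in for the LLM proposer: return two
    neighbouring guesses, clamped to an assumed plausible range."""
    lo, hi = 0.05, 1.0
    return max(lo, mu - step), min(hi, mu + step)


def estimate_friction(
    observed: float,
    simulator: Callable[[float], float],
    mu0: float = 0.5,
    step: float = 0.25,
    iters: int = 60,
) -> Tuple[float, float]:
    """Analysis-by-synthesis loop: simulate each candidate, keep the one
    whose outcome best matches the observation, shrink the step otherwise."""
    best_mu = mu0
    best_loss = abs(simulator(best_mu) - observed)
    for _ in range(iters):
        scored = [(abs(simulator(c) - observed), c) for c in propose(best_mu, step)]
        loss, cand = min(scored)
        if loss < best_loss:
            best_mu, best_loss = cand, loss  # accept the better candidate
        else:
            step *= 0.5  # no improvement: search more finely
    return best_mu, best_loss


if __name__ == "__main__":
    # Pretend the "real" scene has friction 0.3 and we only see the outcome.
    observed = simulate_slide(0.3)
    mu_hat, loss = estimate_friction(observed, simulate_slide)
    print(f"estimated mu = {mu_hat:.4f}, residual = {loss:.2e}")
```

The search never needs gradients of the simulator, which is the property that makes this kind of loop usable with a non-differentiable physics engine. In the paper's setting the proposer would be an LLM generating code and the simulator a full physics engine with many parameters, so this sketch should be read only as a minimal analogue.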
Related papers
- Differentiable Physics-based System Identification for Robotic Manipulation of Elastoplastic Materials [43.99845081513279]
This work introduces a novel Differentiable Physics-based System Identification (DPSI) framework that enables a robot arm to infer the physics parameters of elastoplastic materials and the environment.
With only a single real-world interaction, the estimated parameters can accurately simulate visually and physically realistic behaviours.
arXiv Detail & Related papers (2024-11-01T13:04:25Z)
- Exploring Failure Cases in Multimodal Reasoning About Physical Dynamics [5.497036643694402]
We construct a simple simulated environment and demonstrate examples of where, in a zero-shot setting, both text and multimodal LLMs display atomic world knowledge about various objects but fail to compose this knowledge in correct solutions for an object manipulation and placement task.
We also use BLIP, a vision-language model trained with more sophisticated cross-modal attention, to identify cases relevant to object physical properties that the model fails to ground.
arXiv Detail & Related papers (2024-02-24T00:01:01Z)
- ContPhy: Continuum Physical Concept Learning and Reasoning from Videos [86.63174804149216]
ContPhy is a novel benchmark for assessing machine physical commonsense.
We evaluated a range of AI models and found that they still struggle to achieve satisfactory performance on ContPhy.
We also introduce an oracle model (ContPRO) that marries the particle-based physical dynamic models with the recent large language models.
arXiv Detail & Related papers (2024-02-09T01:09:21Z)
- DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation [81.11585774044848]
We present DeepSimHO, a novel deep-learning pipeline that combines forward physics simulation and backward gradient approximation with a neural network.
Our method noticeably improves the stability of the estimation and achieves superior efficiency over test-time optimization.
arXiv Detail & Related papers (2023-10-11T05:34:36Z)
- UniQuadric: A SLAM Backend for Unknown Rigid Object 3D Tracking and Light-Weight Modeling [7.626461564400769]
We propose a novel SLAM backend that unifies ego-motion tracking, rigid object motion tracking, and modeling.
Our system showcases the potential application of object perception in complex dynamic scenes.
arXiv Detail & Related papers (2023-09-29T07:50:09Z)
- Physics-Based Task Generation through Causal Sequence of Physical Interactions [3.2244944291325996]
Performing tasks in a physical environment is a crucial yet challenging problem for AI systems operating in the real world.
We present a systematic approach for defining a physical scenario using a causal sequence of physical interactions between objects.
We then propose a methodology for generating tasks in a physics-simulating environment using defined scenarios as inputs.
arXiv Detail & Related papers (2023-08-05T10:15:18Z)
- Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
- PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics [89.81550748680245]
We introduce a new differentiable physics benchmark called PlasticineLab.
In each task, the agent uses manipulators to deform the plasticine into the desired configuration.
We evaluate several existing reinforcement learning (RL) methods and gradient-based methods on this benchmark.
arXiv Detail & Related papers (2021-04-07T17:59:23Z)
- Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling [86.9726984929758]
We focus on the integration of incomplete physics models into deep generative models.
We propose a VAE architecture in which a part of the latent space is grounded by physics.
We demonstrate generative performance improvements over a set of synthetic and real-world datasets.
arXiv Detail & Related papers (2021-02-25T20:28:52Z)
- Scalable Differentiable Physics for Learning and Control [99.4302215142673]
Differentiable physics is a powerful approach to learning and control problems that involve physical objects and environments.
We develop a scalable framework for differentiable physics that can support a large number of objects and their interactions.
arXiv Detail & Related papers (2020-07-04T19:07:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.