Object-centric Denoising Diffusion Models for Physical Reasoning
- URL: http://arxiv.org/abs/2507.04920v1
- Date: Mon, 07 Jul 2025 12:06:24 GMT
- Title: Object-centric Denoising Diffusion Models for Physical Reasoning
- Authors: Moritz Lange, Raphael C. Engelhardt, Wolfgang Konen, Andrew Melnik, Laurenz Wiskott
- Abstract summary: Reasoning about trajectories of interacting objects is integral to physical reasoning tasks in machine learning. We propose an object-centric denoising diffusion model architecture for physical reasoning that is translation equivariant over time. We demonstrate how this model can solve tasks with multiple conditions and examine its performance when changing object numbers.
- Score: 0.14348906950833226
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reasoning about the trajectories of multiple, interacting objects is integral to physical reasoning tasks in machine learning. This involves conditions imposed on the objects at different time steps, for instance initial states or desired goal states. Existing approaches in physical reasoning generally rely on autoregressive modeling, which can only be conditioned on initial states, but not on later states. In fields such as planning for reinforcement learning, similar challenges are being addressed with denoising diffusion models. In this work, we propose an object-centric denoising diffusion model architecture for physical reasoning that is translation equivariant over time, permutation equivariant over objects, and can be conditioned on arbitrary time steps for arbitrary objects. We demonstrate how this model can solve tasks with multiple conditions and examine its performance when changing object numbers and trajectory lengths during inference.
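The three properties named in the abstract can be illustrated with a toy example. The sketch below is not the paper's architecture: `toy_layer`, `apply_conditions`, the weight names, and the tensor layout `(objects, time steps, features)` are all assumptions made for illustration. It uses a circular temporal convolution (so equivariance holds exactly for circular shifts, a simplification) and mean pooling over objects, and it imposes conditions by overwriting selected entries of a trajectory, in the spirit of inpainting-style conditioning.

```python
import numpy as np

def toy_layer(x, w_self, w_pool, kernel):
    """Toy layer on trajectories x of shape (num_objects, num_steps, dim).

    Hypothetical illustration only, not the model from the paper.
    """
    # Circular temporal convolution: the same 3-tap kernel is applied at
    # every time step and to every object.
    smoothed = sum(k * np.roll(x, shift, axis=1)
                   for shift, k in zip((-1, 0, 1), kernel))
    # Mean over objects is symmetric, so object order does not matter here.
    pooled = smoothed.mean(axis=0, keepdims=True)
    return np.tanh(smoothed @ w_self + pooled @ w_pool)

def apply_conditions(x, values, mask):
    """Overwrite entries of x with known values wherever mask is True."""
    return np.where(mask, values, x)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 10, 3))            # 4 objects, 10 steps, 3 features
w_self, w_pool = rng.normal(size=(2, 3, 3))
kernel = rng.normal(size=3)

out = toy_layer(x, w_self, w_pool, kernel)

# Permuting objects before the layer equals permuting its output.
perm = rng.permutation(4)
out_perm = toy_layer(x[perm], w_self, w_pool, kernel)

# Shifting in time before the layer equals shifting its output.
out_shift = toy_layer(np.roll(x, 2, axis=1), w_self, w_pool, kernel)

# Condition object 0 at the first step and object 2 at the last step.
mask = np.zeros((4, 10, 1), dtype=bool)
mask[0, 0] = True
mask[2, -1] = True
targets = np.ones_like(x)
conditioned = apply_conditions(x, targets, mask)
```

In a diffusion sampler, a step like `apply_conditions` would typically be repeated after each denoising update so that the conditioned entries stay fixed; whether the paper conditions in exactly this way is not stated in the abstract.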
Related papers
- EqCollide: Equivariant and Collision-Aware Deformable Objects Neural Simulator [6.056458618771203]
We introduce EqCollide, the first end-to-end equivariant neural fields simulator for deformable objects and their collisions. Experimental results show that EqCollide achieves accurate, stable, and scalable simulations across diverse object configurations.
arXiv Detail & Related papers (2025-06-06T06:49:58Z) - Generative Perception of Shape and Material from Differential Motion [17.090405682103167]
We introduce a novel conditional denoising-diffusion model that generates shape-and-material maps from a short video of an object undergoing differential motions. Our work suggests a generative perception approach for improving visual reasoning in physically-embodied systems.
arXiv Detail & Related papers (2025-06-03T05:43:20Z) - Object-centric architectures enable efficient causal representation learning [51.6196391784561]
We show that when the observations are of multiple objects, the generative function is no longer injective and disentanglement fails in practice.
We develop an object-centric architecture that leverages weak supervision from sparse perturbations to disentangle each object's properties.
This approach is more data-efficient in the sense that it requires significantly fewer perturbations than a comparable approach that encodes to a Euclidean space.
arXiv Detail & Related papers (2023-10-29T16:01:03Z) - 6-DoF Stability Field via Diffusion Models [9.631625582146537]
We present 6-DoFusion, a generative model capable of generating 3D poses of an object that produces a stable configuration of a given scene.
We evaluate our model on different object placement and stacking tasks, demonstrating its ability to construct stable scenes.
arXiv Detail & Related papers (2023-10-26T17:59:12Z) - Neural Lumped Parameter Differential Equations with Application in Friction-Stir Processing [2.158307833088858]
Lumped parameter methods aim to simplify the evolution of spatially-extended or continuous physical systems.
We build upon the notion of the Universal Differential Equation to construct data-driven models for reducing dynamics to that of a lumped parameter.
arXiv Detail & Related papers (2023-04-18T15:11:27Z) - Learning Physical Dynamics with Subequivariant Graph Neural Networks [99.41677381754678]
Graph Neural Networks (GNNs) have become a prevailing tool for learning physical dynamics.
Physical laws abide by symmetry, which is a vital inductive bias accounting for model generalization.
Our model achieves on average over 3% enhancement in contact prediction accuracy across 8 scenarios on Physion and 2X lower rollout MSE on RigidFall.
arXiv Detail & Related papers (2022-10-13T10:00:30Z) - Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding [93.82542533426766]
We propose a Suspected Object Transformation mechanism (SOT) to encourage the target object selection among the suspected ones.
SOT can be seamlessly integrated into existing CNN and Transformer-based one-stage visual grounders.
Extensive experiments demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-10T06:41:07Z) - A Bayesian Treatment of Real-to-Sim for Deformable Object Manipulation [59.29922697476789]
We propose a novel methodology for extracting state information from image sequences via a technique to represent the state of a deformable object as a distribution embedding.
Our experiments confirm that we can estimate posterior distributions of physical properties, such as elasticity, friction and scale of highly deformable objects, such as cloth and ropes.
arXiv Detail & Related papers (2021-12-09T17:50:54Z) - Scalable Differentiable Physics for Learning and Control [99.4302215142673]
Differentiable physics is a powerful approach to learning and control problems that involve physical objects and environments.
We develop a scalable framework for differentiable physics that can support a large number of objects and their interactions.
arXiv Detail & Related papers (2020-07-04T19:07:51Z) - Visual Grounding of Learned Physical Models [66.04898704928517]
Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions.
We present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors.
Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
arXiv Detail & Related papers (2020-04-28T17:06:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.