Multi-Object Manipulation via Object-Centric Neural Scattering Functions
- URL: http://arxiv.org/abs/2306.08748v1
- Date: Wed, 14 Jun 2023 21:14:10 GMT
- Title: Multi-Object Manipulation via Object-Centric Neural Scattering Functions
- Authors: Stephen Tian, Yancheng Cai, Hong-Xing Yu, Sergey Zakharov, Katherine
Liu, Adrien Gaidon, Yunzhu Li, Jiajun Wu
- Abstract summary: We propose using object-centric neural scattering functions (OSFs) as object representations in a model-predictive control framework.
OSFs model per-object light transport, enabling compositional scene re-rendering under object rearrangement and varying lighting conditions.
- Score: 40.45919680959231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned visual dynamics models have proven effective for robotic manipulation
tasks. Yet, it remains unclear how best to represent scenes involving
multi-object interactions. Current methods decompose a scene into discrete
objects, but they struggle with precise modeling and manipulation amid
challenging lighting conditions because they only encode appearance tied to
specific illuminations. In this work, we propose using object-centric neural
scattering functions (OSFs) as object representations in a model-predictive
control framework. OSFs model per-object light transport, enabling
compositional scene re-rendering under object rearrangement and varying
lighting conditions. By combining this approach with inverse parameter
estimation and graph-based neural dynamics models, we demonstrate improved
model-predictive control performance and generalization in compositional
multi-object environments, even in previously unseen scenarios and harsh
lighting conditions.
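The abstract describes a sampling-based model-predictive control loop on top of a learned graph dynamics model. Below is a minimal PyTorch sketch of that loop; `GraphDynamics`, the random-shooting planner, and the pose-matching cost are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch: graph-based neural dynamics rolled out inside a
# random-shooting MPC loop. All names and shapes are assumptions.
import torch
import torch.nn as nn


class GraphDynamics(nn.Module):
    """One round of message passing over a fully connected object graph,
    predicting per-object state deltas from object states and an action."""

    def __init__(self, state_dim=6, action_dim=3, hidden=64):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.node_mlp = nn.Sequential(
            nn.Linear(state_dim + hidden + action_dim, hidden),
            nn.ReLU(), nn.Linear(hidden, state_dim))

    def forward(self, states, action):
        # states: (N, state_dim), action: (action_dim,)
        n = states.shape[0]
        src = states.unsqueeze(1).expand(n, n, -1)   # sender features
        dst = states.unsqueeze(0).expand(n, n, -1)   # receiver features
        messages = self.edge_mlp(torch.cat([src, dst], dim=-1)).sum(dim=0)
        act = action.unsqueeze(0).expand(n, -1)
        return states + self.node_mlp(torch.cat([states, messages, act], dim=-1))


def random_shooting_mpc(model, states, goal, horizon=5, samples=256):
    """Score random action sequences under the learned dynamics and return
    the first action of the best one (standard sampling-based MPC)."""
    actions = torch.randn(samples, horizon, 3)       # candidate action sequences
    best_cost, best_action = float("inf"), None
    with torch.no_grad():
        for seq in actions:
            s = states
            for a in seq:
                s = model(s, a)                      # roll out learned dynamics
            cost = ((s - goal) ** 2).sum().item()    # e.g. match goal object poses
            if cost < best_cost:
                best_cost, best_action = cost, seq[0]
    return best_action


model = GraphDynamics()
states = torch.randn(4, 6)   # four objects, 6-DoF state each (hypothetical)
goal = torch.zeros(4, 6)     # hypothetical goal configuration
print(random_shooting_mpc(model, states, goal))
```

In the paper's setting, the object states fed to such a planner would come from inverse parameter estimation against OSF renderings rather than direct observation.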
Related papers
- ArmGS: Composite Gaussian Appearance Refinement for Modeling Dynamic Urban Environments [22.371417505012566]
This work focuses on modeling dynamic urban environments for autonomous driving simulation.
We propose a new approach named ArmGS that exploits composite driving Gaussian splatting with multi-granularity appearance refinement.
This not only models global scene appearance variations between frames and camera viewpoints, but also models local fine-grained photorealistic changes of background and objects.
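A hedged sketch of what multi-granularity appearance refinement could look like: a global per-frame color transform plus small per-Gaussian offsets. This decomposition is our reading of the abstract, not the paper's implementation.

```python
# Global (per-frame) and local (per-Gaussian) appearance refinement for a
# Gaussian splatting scene. Both parameter sets are illustrative assumptions.
import torch

n_frames, n_gaussians = 100, 50_000
global_affine = torch.nn.Parameter(torch.zeros(n_frames, 3, 4))  # per-frame 3x4
local_offset = torch.nn.Parameter(torch.zeros(n_gaussians, 3))   # per-Gaussian

def refine_colors(colors, frame_idx):
    """colors: (G, 3) base Gaussian colors -> refined colors for one frame."""
    A = global_affine[frame_idx, :, :3] + torch.eye(3)  # start near identity
    b = global_affine[frame_idx, :, 3]
    return colors @ A.t() + b + local_offset            # global then local

print(refine_colors(torch.rand(n_gaussians, 3), 0).shape)  # (50000, 3)
```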
arXiv Detail & Related papers (2025-07-05T03:54:40Z)
- Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos [30.367498271886866]
We develop a neural dynamics framework that combines object particles and spatial grids in a hybrid representation.
We demonstrate that our model learns the dynamics of diverse objects from sparse-view RGB-D recordings of robot-object interactions.
Our approach outperforms state-of-the-art learning-based and physics-based simulators, particularly in scenarios with limited camera views.
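A minimal sketch of the hybrid idea: per-particle features are averaged into a coarse spatial grid so a dynamics model can reason over both representations. The function below is an assumption for illustration, not the paper's code.

```python
# Scatter particle features into a regular 3D grid (particle-to-grid step).
import torch

def particles_to_grid(positions, features, grid_size=8):
    """Average particle features into grid cells.
    positions: (N, 3) in [0, 1); features: (N, F)."""
    cells = (positions * grid_size).long().clamp(0, grid_size - 1)
    flat = cells[:, 0] * grid_size**2 + cells[:, 1] * grid_size + cells[:, 2]
    grid = torch.zeros(grid_size**3, features.shape[1])
    count = torch.zeros(grid_size**3, 1)
    grid.index_add_(0, flat, features)                 # sum features per cell
    count.index_add_(0, flat, torch.ones(len(flat), 1))
    return grid / count.clamp(min=1)                   # mean per occupied cell

pos = torch.rand(500, 3)        # 500 object particles (toy data)
feat = torch.randn(500, 16)     # per-particle latent features
print(particles_to_grid(pos, feat).shape)  # (512, 16)
```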
arXiv Detail & Related papers (2025-06-18T17:59:38Z)
- ObjectMover: Generative Object Movement with Video Prior [69.75281888309017]
We present ObjectMover, a generative model that can perform object movement in challenging scenes.
We show that with this approach, our model is able to adjust to complex real-world scenarios.
We propose a multi-task learning strategy that enables training on real-world video data to improve the model generalization.
arXiv Detail & Related papers (2025-03-11T04:42:59Z)
- DifFRelight: Diffusion-Based Facial Performance Relighting [12.909429637057343]
We present a novel framework for free-viewpoint facial performance relighting using diffusion-based image-to-image translation.
We train a diffusion model for precise lighting control, enabling high-fidelity relit facial images from flat-lit inputs.
The model accurately reproduces complex lighting effects like eye reflections, subsurface scattering, self-shadowing, and translucency.
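A schematic of diffusion-based image-to-image relighting: a denoiser conditioned on the flat-lit input and a lighting code, sampled with standard DDPM ancestral steps. The denoiser signature and noise schedule below are assumptions, not the paper's configuration.

```python
# DDPM-style ancestral sampling, conditioned on a flat-lit input image.
import torch

def relight(denoiser, flat_lit, light_code, steps=50):
    """denoiser(x_t, t, flat_lit, light_code) -> predicted noise.
    Shapes, schedule, and conditioning scheme are illustrative."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn_like(flat_lit)                  # start from pure noise
    for t in reversed(range(steps)):
        eps = denoiser(x, t, flat_lit, light_code)  # predict added noise
        coef = betas[t] / torch.sqrt(1 - alpha_bar[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise     # ancestral sampling step
    return x

dummy = lambda x, t, img, light: torch.zeros_like(x)  # stand-in denoiser
print(relight(dummy, torch.rand(3, 64, 64), torch.rand(8)).shape)
```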
arXiv Detail & Related papers (2024-10-10T17:56:44Z)
- DENSER: 3D Gaussians Splatting for Scene Reconstruction of Dynamic Urban Environments [0.0]
We propose DENSER, a framework that significantly enhances the representation of dynamic objects.
The proposed approach outperforms state-of-the-art methods by a wide margin.
arXiv Detail & Related papers (2024-09-16T07:11:58Z)
- Curved Diffusion: A Generative Model With Optical Geometry Control [56.24220665691974]
The influence of different optical systems on the final scene appearance is frequently overlooked.
This study introduces a framework that tightly integrates a text-to-image diffusion model with the particular lens used in image rendering.
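One plausible way to couple a diffusion model with lens geometry is to condition the denoiser on a per-pixel map derived from the lens model; the equidistant fisheye model and channel stacking below are assumptions for illustration, not the paper's method.

```python
# Build a per-pixel incidence-angle map from a simple fisheye lens model and
# stack it with the noisy image as extra conditioning channels.
import torch

def fisheye_angle_map(h, w, f=0.5):
    """Angle from the optical axis under an equidistant fisheye model
    (theta = r / f), with r the normalized radius from the image center."""
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    r = torch.sqrt(xs**2 + ys**2)
    return (r / f).unsqueeze(0)              # (1, H, W) conditioning channel

noisy = torch.randn(3, 64, 64)
cond_input = torch.cat([noisy, fisheye_angle_map(64, 64)], dim=0)
print(cond_input.shape)                      # (4, 64, 64), fed to the denoiser
```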
arXiv Detail & Related papers (2023-11-29T13:06:48Z)
- UniQuadric: A SLAM Backend for Unknown Rigid Object 3D Tracking and Light-Weight Modeling [7.626461564400769]
We propose a novel SLAM backend that unifies ego-motion tracking, rigid object motion tracking, and modeling.
Our system showcases the potential application of object perception in complex dynamic scenes.
arXiv Detail & Related papers (2023-09-29T07:50:09Z)
- Learning Object-Centric Neural Scattering Functions for Free-Viewpoint Relighting and Scene Composition [28.533032162292297]
We propose Object-Centric Neural Scattering Functions for learning to reconstruct object appearance from only images.
OSFs not only support free-viewpoint object relighting but can also model both opaque and translucent objects.
Experiments on real and synthetic data show that OSFs accurately reconstruct appearances for both opaque and translucent objects.
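A minimal sketch of the OSF interface as the abstract describes it: a network maps a point in the object frame, an incoming light direction, and an outgoing view direction to a volume density and a scattered-light fraction. The architecture is an assumption, not the released implementation.

```python
# Toy object-centric neural scattering function: per-object light transport
# queried at sample points along camera rays.
import torch
import torch.nn as nn

class OSF(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 3 + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4))                 # density + RGB transfer

    def forward(self, x, light_dir, view_dir):
        out = self.net(torch.cat([x, light_dir, view_dir], dim=-1))
        sigma = torch.relu(out[..., :1])          # volume density
        transfer = torch.sigmoid(out[..., 1:])    # scattered fraction per channel
        return sigma, transfer

osf = OSF()
pts = torch.rand(1024, 3)                         # sample points along rays
sigma, transfer = osf(pts, torch.rand(1024, 3), torch.rand(1024, 3))
print(sigma.shape, transfer.shape)                # (1024, 1) (1024, 3)
```

Because each object carries its own OSF in its own frame, composing a scene amounts to transforming query points into each object's frame, which is what enables re-rendering under rearrangement and new lighting.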
arXiv Detail & Related papers (2023-03-10T18:55:46Z)
- Robust Dynamic Radiance Fields [79.43526586134163]
Dynamic radiance field reconstruction methods aim to model the time-varying structure and appearance of a dynamic scene.
Existing methods, however, assume that accurate camera poses can be reliably estimated by Structure from Motion (SfM) algorithms.
We address this robustness issue by jointly estimating the static and dynamic radiance fields along with the camera parameters.
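The key move is to treat camera poses as free parameters that receive gradients through the photometric loss, rather than fixing them from an SfM preprocess. A toy PyTorch sketch follows; the field and pose parameterization are stand-ins, not the paper's models.

```python
# Jointly optimize camera pose corrections and a (toy) radiance field.
import torch

n_frames = 10
poses = torch.nn.Parameter(torch.zeros(n_frames, 6))   # se(3) pose corrections
field = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 3))    # toy radiance field
opt = torch.optim.Adam([{"params": field.parameters(), "lr": 1e-3},
                        {"params": [poses], "lr": 1e-4}])

def render(frame_idx, pts):
    # Stand-in: shift sample points by the frame's translation correction
    # before querying the field (a real system composes the full SE(3)).
    return field(pts + poses[frame_idx, :3])

target = torch.rand(128, 3)                            # observed pixel colors
opt.zero_grad()
loss = ((render(0, torch.rand(128, 3)) - target) ** 2).mean()  # photometric
loss.backward()                                        # gradients reach poses too
opt.step()
print(loss.item())
```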
arXiv Detail & Related papers (2023-01-05T18:59:51Z)
- MoCo-Flow: Neural Motion Consensus Flow for Dynamic Humans in Stationary Monocular Cameras [98.40768911788854]
We introduce MoCo-Flow, a representation that models the dynamic scene using a 4D continuous time-variant function.
At the heart of our work lies a novel optimization formulation, which is constrained by a motion consensus regularization on the motion flow.
We extensively evaluate MoCo-Flow on several datasets that contain human motions of varying complexity.
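A hedged reading of a motion-consensus-style regularizer on the flow: penalize forward/backward cycle inconsistency and disagreement among nearby points. This is our interpretation of the abstract, not the paper's exact formulation.

```python
# Toy consensus regularizer for a 4D flow field flow(x, t) -> displacement.
import torch

def consensus_loss(flow, pts, t, dt=0.1, eps=1e-2):
    fwd = flow(pts, t)                          # displacement over +dt
    # Cycle consistency: stepping forward then backward should return to pts.
    back = flow(pts + fwd, t + dt)
    cycle = (fwd + back).pow(2).mean()
    # Local consensus: nearby points should move together.
    jitter = pts + eps * torch.randn_like(pts)
    smooth = (flow(jitter, t) - fwd).pow(2).mean()
    return cycle + smooth

toy_flow = lambda x, t: 0.01 * torch.ones_like(x)  # stand-in flow network
print(consensus_loss(toy_flow, torch.rand(64, 3), 0.0))
```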
arXiv Detail & Related papers (2021-06-08T16:03:50Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
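A compact sketch of the instance-aware projection step: back-projected points are moved by the 6-DoF transform of their object instance plus ego-motion before re-projection into the next frame for a photometric consistency loss. Names and shapes are illustrative assumptions.

```python
# Warp back-projected points with per-instance rigid motion plus ego-motion.
import torch

def warp_points(pts_cam, instance_ids, obj_R, obj_t, ego_R, ego_t):
    """pts_cam: (N, 3) back-projected points; instance_ids: (N,) object index
    per point; obj_R/obj_t: per-object rotation and translation; ego_R/ego_t:
    camera ego-motion."""
    moved = torch.einsum("nij,nj->ni", obj_R[instance_ids], pts_cam)
    moved = moved + obj_t[instance_ids]                     # object motion
    return torch.einsum("ij,nj->ni", ego_R, moved) + ego_t  # ego-motion

n_obj = 3
obj_R = torch.eye(3).expand(n_obj, 3, 3).clone()  # identity object rotations
obj_t = torch.zeros(n_obj, 3)
obj_t[1] = torch.tensor([0.1, 0.0, 0.0])          # one object translates
pts = torch.rand(100, 3)
ids = torch.randint(0, n_obj, (100,))
print(warp_points(pts, ids, obj_R, obj_t, torch.eye(3), torch.zeros(3)).shape)
```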
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- Learning Predictive Representations for Deformable Objects Using Contrastive Estimation [83.16948429592621]
We propose a new learning framework that jointly optimizes both the visual representation model and the dynamics model.
We show substantial improvements over standard model-based learning techniques across our rope and cloth manipulation suite.
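A minimal InfoNCE-style sketch of jointly training an encoder and a latent dynamics model: the predicted next latent should match the true next embedding against in-batch negatives. The architectures below are generic assumptions, not the paper's exact models.

```python
# Joint contrastive training of an image encoder and a latent dynamics model.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # image encoder
dyn = nn.Linear(128 + 4, 128)                                   # latent dynamics

def contrastive_step(obs, action, next_obs, opt, tau=0.1):
    z = enc(obs)                                  # (B, 128) current latents
    z_next = enc(next_obs)                        # (B, 128) positives
    z_pred = dyn(torch.cat([z, action], dim=-1))  # predicted next latents
    logits = z_pred @ z_next.t() / tau            # (B, B) similarity matrix
    labels = torch.arange(len(obs))               # diagonal = positive pairs
    loss = F.cross_entropy(logits, labels)        # InfoNCE objective
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

opt = torch.optim.Adam(list(enc.parameters()) + list(dyn.parameters()), lr=1e-3)
print(contrastive_step(torch.rand(8, 3, 32, 32), torch.rand(8, 4),
                       torch.rand(8, 3, 32, 32), opt))
```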
arXiv Detail & Related papers (2020-03-11T17:55:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.