Related papers: My House, My Rules: Learning Tidying Preferences with Graph Neural Networks

URL: http://arxiv.org/abs/2111.03112v1
Date: Thu, 4 Nov 2021 19:17:19 GMT
Title: My House, My Rules: Learning Tidying Preferences with Graph Neural Networks
Authors: Ivan Kapelyukh and Edward Johns
Abstract summary: We present NeatNet: a novel Variational Autoencoder architecture using Graph Neural Network layers. We extract a low-dimensional latent preference vector from a user by observing how they arrange scenes. Given any set of objects, this vector can then be used to generate an arrangement which is tailored to that user's spatial preferences.
Score: 8.57914821832517
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robots that arrange household objects should do so according to the user's preferences, which are inherently subjective and difficult to model. We present NeatNet: a novel Variational Autoencoder architecture using Graph Neural Network layers, which can extract a low-dimensional latent preference vector from a user by observing how they arrange scenes. Given any set of objects, this vector can then be used to generate an arrangement which is tailored to that user's spatial preferences, with word embeddings used for generalisation to new objects. We develop a tidying simulator to gather rearrangement examples from 75 users, and demonstrate empirically that our method consistently produces neat and personalised arrangements across a variety of rearrangement scenarios.

Related papers

DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement [53.86523017756224]
We present DegustaBot, an algorithm for visual preference learning that solves household multi-object rearrangement tasks according to personal preference. We collect a large dataset of naturalistic personal preferences in a simulated table-setting task. We find that 50% of our model's predictions are likely to be found acceptable by at least 20% of people.
arXiv Detail & Related papers (2024-07-11T21:28:02Z)
Knolling Bot: Learning Robotic Object Arrangement from Tidy Demonstrations [11.873522421121173]
This paper introduces a self-supervised learning framework that allows robots to understand and replicate the concept of tidiness. We leverage a transformer neural network to predict the placement of subsequent objects. Our method not only trains a generalizable concept of tidiness, but it can also incorporate human preferences to generate customized tidy tables.
arXiv Detail & Related papers (2023-10-06T20:13:07Z)
AnyDoor: Zero-shot Object-level Image Customization [63.44307304097742]
This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations. Our model is trained only once and effortlessly generalizes to diverse object-scene combinations at the inference stage.
arXiv Detail & Related papers (2023-07-18T17:59:02Z)
Modeling Dynamic Environments with Scene Graph Memory [46.587536843634055]
We present a new type of link prediction problem: link prediction on partially observable dynamic graphs. Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges. We propose a novel state representation -- Scene Graph Memory (SGM) -- which captures the agent's accumulated set of observations. We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen at homes.
arXiv Detail & Related papers (2023-05-27T17:39:38Z)
Modeling Dynamic User Preference via Dictionary Learning for Sequential Recommendation [133.8758914874593]
Capturing the dynamics in user preference is crucial to better predict user future behaviors because user preferences often drift over time. Many existing recommendation algorithms -- including both shallow and deep ones -- often model such dynamics independently. This paper considers the problem of embedding a user's sequential behavior into the latent space of user preferences.
arXiv Detail & Related papers (2022-04-02T03:23:46Z)
Learning Multi-Object Dynamics with Compositional Neural Radiance Fields [63.424469458529906]
We present a method to learn compositional predictive models from image observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and graph neural networks. NeRFs have become a popular choice for representing scenes due to their strong 3D prior. For planning, we utilize RRTs in the learned latent space, where we can exploit our model and the implicit object encoder to make sampling the latent space informative and more efficient.
arXiv Detail & Related papers (2022-02-24T01:31:29Z)
Predicting Stable Configurations for Semantic Placement of Novel Objects [37.18437299513799]
Our goal is to enable robots to repose previously unseen objects according to learned semantic relationships in novel environments. We build our models and training from the ground up to be tightly integrated with our proposed planning algorithm for semantic placement of unknown objects. Our approach enables motion planning for semantic rearrangement of unknown objects in scenes with varying geometry from only RGB-D sensing.
arXiv Detail & Related papers (2021-08-26T23:05:05Z)
RICE: Refining Instance Masks in Cluttered Environments with Graph Neural Networks [53.15260967235835]
We propose a novel framework that refines the output of such methods by utilizing a graph-based representation of instance masks. We train deep networks capable of sampling smart perturbations to the segmentations, and a graph neural network, which can encode relations between objects, to evaluate the segmentations. We demonstrate an application that uses uncertainty estimates generated by our method to guide a manipulator, leading to efficient understanding of cluttered scenes.
arXiv Detail & Related papers (2021-06-29T20:29:29Z)
One-Shot Object Localization Using Learnt Visual Cues via Siamese Networks [0.7832189413179361]
In this work, a visual cue is used to specify a novel object of interest which must be localized in new environments. An end-to-end neural network equipped with a Siamese network is used to learn the cue, infer the object of interest, and then to localize it in new environments.
arXiv Detail & Related papers (2020-12-26T07:40:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.