Structured Latent Variable Models for Articulated Object Interaction
- URL: http://arxiv.org/abs/2305.16567v1
- Date: Fri, 26 May 2023 01:22:35 GMT
- Title: Structured Latent Variable Models for Articulated Object Interaction
- Authors: Emily Liu, Michael Noseworthy, Nicholas Roy
- Abstract summary: We investigate a scenario in which a robot learns a low-dimensional representation of a door given a video of the door opening or closing.
This representation can be used to infer door-related parameters and predict the outcomes of interacting with the door.
- Score: 24.97457132614502
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate a scenario in which a robot learns a
low-dimensional representation of a door given a video of the door opening or
closing. This representation can be used to infer door-related parameters and
predict the outcomes of interacting with the door. Current machine-learning
approaches in the door domain rely primarily on labelled datasets.
However, the large quantity of available door data suggests the feasibility of
a semisupervised approach based on pretraining. To exploit the hierarchical
structure of the dataset where each door has multiple associated images, we
pretrain with a structured latent variable model known as a neural
statistician. The neural statistician enforces separation between shared
context-level variables (common across all images associated with the same
door) and instance-level variables (unique to each individual image). We first
demonstrate that the neural statistician is able to learn an embedding that
enables reconstruction and sampling of realistic door images. Then, we evaluate
the correspondence of the learned embeddings to human-interpretable parameters
in a series of supervised inference tasks. It was found that a pretrained
neural statistician encoder outperformed analogous context-free baselines when
predicting door handedness, size, angle location, and configuration from door
images. Finally, in a visual bandit door-opening task with a variety of door
configurations, we found that neural statistician embeddings achieve lower
regret than context-free baselines.
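The separation between context-level and instance-level variables described in the abstract can be illustrated with a small toy. This is a minimal deterministic sketch, not the paper's model: an actual neural statistician is variational and learns its encoders, whereas here the encoder is a fixed random linear map, the context is a plain mean over per-image features, and the instance part is the residual. All names (`encode_image`, `context_embedding`, `instance_embeddings`) and the toy dimensions are assumptions made for illustration.

```python
import random


def encode_image(image, weights):
    """Hypothetical per-image encoder: a single linear map, features = W @ image."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def context_embedding(features):
    """Mean-pool per-image features so the result does not depend on image order."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def instance_embeddings(features, context):
    """Toy instance-level part: each image's features minus the shared context."""
    return [[f - c for f, c in zip(feat, context)] for feat in features]


rng = random.Random(0)
image_dim, latent_dim = 6, 3

# Fixed random weights stand in for a learned encoder (assumption).
weights = [[rng.gauss(0, 1) for _ in range(image_dim)] for _ in range(latent_dim)]

# One "door" is a set of flattened toy images, standing in for video frames.
door_images = [[rng.random() for _ in range(image_dim)] for _ in range(4)]

features = [encode_image(img, weights) for img in door_images]
context = context_embedding(features)               # shared across the door's images
instances = instance_embeddings(features, context)  # one vector per image

print("context:", [round(c, 3) for c in context])
print("number of instance vectors:", len(instances))
```

Because the context is a mean over the set, reordering the images of a door leaves it unchanged, which is the property that lets one vector describe the door rather than any single frame.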
Related papers
- Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors.
We show that this score can be an indicator for the presence of a backdoor despite the models having different architectures.
This technique allows for the detection of backdoors on models designed for open-set classification tasks, which is little studied in the literature.
arXiv Detail & Related papers (2024-02-28T21:29:16Z)
- The Contextual Lasso: Sparse Linear Models via Deep Neural Networks [5.607237982617641]
We develop a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features.
An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso.
arXiv Detail & Related papers (2023-02-02T05:00:29Z)
- Unsupervised learning of features and object boundaries from local prediction [0.0]
We introduce a layer of feature maps with a pairwise Markov random field model in which each factor is paired with an additional binary variable, which switches the factor on or off.
We can learn both the features and the parameters of the Markov random field factors from images without further supervision signals.
We show that computing predictions across space aids both segmentation and feature learning, and models trained to optimize these predictions show similarities to the human visual system.
arXiv Detail & Related papers (2022-05-27T18:54:10Z)
- Category-Independent Articulated Object Tracking with Factor Graphs [14.574389906480867]
Articulated objects come with unexpected articulation mechanisms that are inconsistent with categorical priors.
We propose a category-independent framework for predicting the articulation models of unknown objects from sequences of RGB-D images.
We demonstrate that our visual perception and factor graph modules outperform baselines on simulated data and show the applicability of our factor graph on real world data.
arXiv Detail & Related papers (2022-05-07T20:59:44Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)
- Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application [63.10266319378212]
We propose a method for measuring complex variables on a continuous, interval spectrum by combining supervised deep learning with the Constructing Measures approach to faceted Rasch item response theory (IRT).
We demonstrate this new method on a dataset of 50,000 social media comments sourced from YouTube, Twitter, and Reddit and labeled by 11,000 U.S.-based Amazon Mechanical Turk workers.
arXiv Detail & Related papers (2020-09-22T02:15:05Z)
- Visual Neural Decomposition to Explain Multivariate Data Sets [13.117139248511783]
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers.
We propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables.
arXiv Detail & Related papers (2020-09-11T15:53:37Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
arXiv Detail & Related papers (2020-05-24T04:02:18Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)