Disentangling Shape and Pose for Object-Centric Deep Active Inference
Models
- URL: http://arxiv.org/abs/2209.09097v1
- Date: Fri, 16 Sep 2022 12:53:49 GMT
- Title: Disentangling Shape and Pose for Object-Centric Deep Active Inference
Models
- Authors: Stefano Ferraro, Toon Van de Maele, Pietro Mazzaglia, Tim Verbelen and
Bart Dhoedt
- Abstract summary: We consider the problem of 3D object representation, and focus on different instances of the ShapeNet dataset.
We propose a model that factorizes object shape, pose and category, while still learning a representation for each factor using a deep neural network.
- Score: 4.298360054690217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active inference is a first-principles approach for understanding the brain
in particular, and sentient agents in general, with the single imperative of
minimizing free energy. As such, it provides a computational account for
modelling artificial intelligent agents, by defining the agent's generative
model and inferring the model parameters, actions and hidden state beliefs.
However, the exact specification of the generative model and the hidden state
space structure is left to the experimenter, whose design choices influence the
resulting behaviour of the agent. Recently, deep learning methods have been
proposed to learn a hidden state space structure purely from data, relieving
the experimenter of this tedious design task, but resulting in an entangled,
non-interpretable state space. In this paper, we hypothesize that such a
learnt, entangled state space does not necessarily yield the best model in
terms of free energy, and that enforcing different factors in the state space
can yield a lower model complexity. In particular, we consider the problem of
3D object representation, and focus on different instances of the ShapeNet
dataset. We propose a model that factorizes object shape, pose and category,
while still learning a representation for each factor using a deep neural
network. We show that the models with the best disentanglement properties
perform best when adopted by an active agent in reaching preferred observations.
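The abstract's claim that enforcing separate factors can lower model complexity can be made concrete with a small sketch. In variational treatments, free energy is commonly written as a complexity term (a KL divergence between posterior and prior beliefs) plus an inaccuracy term (a negative log-likelihood). The code below is a minimal illustration under assumptions that do not come from the paper: diagonal Gaussian posteriors, a standard-normal prior per factor, a unit-variance Gaussian likelihood, and the hypothetical helper names `kl_to_standard_normal` and `free_energy`.

```python
import numpy as np


def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ) for a diagonal Gaussian."""
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mean**2 - 1.0 - log_var)


def free_energy(factors, observation, reconstruction):
    """Return (free energy, complexity, inaccuracy) for a factorized posterior.

    factors: dict mapping a factor name (e.g. "shape", "pose") to a
             (mean, log_var) pair. The standard-normal prior is an
             illustrative assumption, not the paper's prior.
    """
    # Complexity: with independent factors, the KL terms simply add up.
    complexity = sum(kl_to_standard_normal(m, lv) for m, lv in factors.values())
    # Inaccuracy: negative log-likelihood under a unit-variance Gaussian,
    # dropping the additive constant.
    err = np.asarray(observation, dtype=float) - np.asarray(reconstruction, dtype=float)
    inaccuracy = 0.5 * np.sum(err**2)
    return complexity + inaccuracy, complexity, inaccuracy


if __name__ == "__main__":
    beliefs = {
        "shape": ([1.0], [0.0]),  # posterior mean shifted away from the prior
        "pose": ([0.0], [0.0]),   # posterior equal to the prior, so zero KL
    }
    f, c, i = free_energy(beliefs, observation=[1.0, 2.0], reconstruction=[1.0, 1.0])
    print(f"free energy={f:.2f} complexity={c:.2f} inaccuracy={i:.2f}")
```

Because the per-factor KL terms are reported separately from the reconstruction error, a sketch like this makes it easy to see which factor contributes most to complexity.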
Related papers
- Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [65.82564074712836]
We introduce DIFfusionHOI, a new HOI detector shedding light on text-to-image diffusion models.
We first devise an inversion-based strategy to learn the expression of relation patterns between humans and objects in embedding space.
These learned relation embeddings then serve as textual prompts to steer diffusion models to generate images that depict specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z)
- SOLD: Reinforcement Learning with Slot Object-Centric Latent Dynamics [16.020835290802548]
Slot-Attention for Object-centric Latent Dynamics is a novel algorithm that learns object-centric dynamics models from pixel inputs.
We demonstrate that the structured latent space not only improves model interpretability but also provides a valuable input space for behavior models to reason over.
Our results show that SOLD outperforms DreamerV3, a state-of-the-art model-based RL algorithm, across a range of benchmark robotic environments.
arXiv Detail & Related papers (2024-10-11T14:03:31Z)
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction [19.59151245929067]
We study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning.
We find this problem is best solved hierarchically by modelling items at a higher level of state abstraction than pixels.
We make use of this to propose a fully model-based algorithm that learns a discriminative world model.
arXiv Detail & Related papers (2024-08-21T17:59:31Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Interpreting Black-box Machine Learning Models for High Dimensional Datasets [40.09157165704895]
We train a black-box model on a high-dimensional dataset to learn the embeddings on which the classification is performed.
We then approximate the behavior of the black-box model by means of an interpretable surrogate model on the top-k feature space.
Our approach outperforms state-of-the-art methods like TabNet and XGBoost when tested on different datasets.
arXiv Detail & Related papers (2022-08-29T07:36:17Z)
- Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features.
The target model aims to mimic the local structure of the source representation space using a contrastive loss.
CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
arXiv Detail & Related papers (2022-01-06T04:58:31Z)
- imGHUM: Implicit Generative Models of 3D Human Shape and Articulated Pose [42.4185273307021]
We present imGHUM, the first holistic generative model of 3D human shape and articulated pose.
We model the full human body implicitly as a function zero-level-set and without the use of an explicit template mesh.
arXiv Detail & Related papers (2021-08-24T17:08:28Z)
- Model-Invariant State Abstractions for Model-Based Reinforcement Learning [54.616645151708994]
We introduce a new type of state abstraction called model-invariance.
This allows for generalization to novel combinations of unseen values of state variables.
We prove that an optimal policy can be learned over this model-invariance state abstraction.
arXiv Detail & Related papers (2021-02-19T10:37:54Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Intrinsic Relationship Reasoning for Small Object Detection [44.68289739449486]
Small objects in images and videos are usually not independent individuals. Instead, they more or less present some semantic and spatial layout relationships with each other.
We propose a novel context reasoning approach for small object detection which models and infers the intrinsic semantic and spatial layout relationships between objects.
arXiv Detail & Related papers (2020-09-02T06:03:05Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
Generative Adversarial Network (GAN) and multi-objectives are used to furnish a plausible attack to the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.