Disentangled Representation Learning Using ($\beta$-)VAE and GAN
- URL: http://arxiv.org/abs/2208.04549v1
- Date: Tue, 9 Aug 2022 05:37:06 GMT
- Title: Disentangled Representation Learning Using ($\beta$-)VAE and GAN
- Authors: Mohammad Haghir Ebrahimabadi
- Abstract summary: The dSprites dataset provided the desired features for the required experiments.
After training the VAE combined with a Generative Adversarial Network (GAN), each dimension of the hidden vector was disrupted to explore the disentanglement in each dimension.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Given a dataset of images containing objects that differ in
features such as shape, size, rotation, and x-y position, the task of interest
in this paper was to use a Variational Autoencoder (VAE) to create a
disentangled encoding of these features in the VAE's hidden space vector. The
dSprites dataset provided the desired features for the required experiments.
After training the VAE combined with a Generative Adversarial Network (GAN),
each dimension of the hidden vector was perturbed to explore the
disentanglement in that dimension. The GAN was used to improve the quality of
the reconstructed output images.
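The procedure in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The function names, the Bernoulli reconstruction term, the value `beta=4.0`, and the latent size of 6 are all assumptions made for illustration, and the GAN discriminator loss is omitted. The sketch shows the standard β-VAE objective (reconstruction loss plus β times the KL divergence to a standard normal prior) and a latent traversal that varies one hidden dimension while holding the others fixed.

```python
import numpy as np


def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Mean beta-VAE loss over a batch (hypothetical helper, not from the paper).

    x, x_recon: arrays of shape (batch, pixels) with values in [0, 1].
    mu, log_var: encoder outputs of shape (batch, latent_dim).
    """
    # Bernoulli reconstruction term (binary cross-entropy), summed over pixels.
    eps = 1e-7
    x_recon = np.clip(x_recon, eps, 1.0 - eps)
    recon = -np.sum(x * np.log(x_recon) + (1.0 - x) * np.log(1.0 - x_recon), axis=1)
    # Closed-form KL divergence between N(mu, diag(exp(log_var))) and N(0, I).
    kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return float(np.mean(recon + beta * kl))


def traverse_latent(z, dim, values):
    """Return one copy of z per entry in `values`, with coordinate `dim` replaced."""
    z = np.asarray(z, dtype=float)
    out = np.tile(z, (len(values), 1))  # new array; z itself is not modified
    out[:, dim] = values
    return out


# Assumed setup: a 6-dimensional latent code, with dimension 2 swept over [-3, 3].
z = np.zeros(6)
grid = traverse_latent(z, dim=2, values=np.linspace(-3.0, 3.0, 7))
```

Each row of `grid` would then be passed through a trained decoder. If the encoding is disentangled, the decoded images should change in only one factor (for example, only x position) along the sweep.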
Related papers
- Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
- Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image [141.10227079090419]
We propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect salient objects in 360° omnidirectional images.
MPFR-Net uses the equirectangular projection image and four corresponding cube-unfolding images as inputs.
Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-12-23T14:50:40Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks [23.729853358582506]
We propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model.
We jointly learn the multiple-object representation and segmentation in 3D via Variational Autoencoders (VAE).
Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information.
arXiv Detail & Related papers (2021-06-10T09:20:16Z)
- Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery [39.25541709228373]
We propose the rotation equivariant feature image pyramid network (REFIPN), an image pyramid network based on rotation equivariance convolution.
The proposed pyramid network extracts features in a wide range of scales and orientations by using novel convolution filters.
The detection performance of the proposed model is validated on two commonly used aerial benchmarks.
arXiv Detail & Related papers (2021-06-02T01:33:49Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- Inference for Generative Capsule Models [4.454557728745761]
Capsule networks aim to encode knowledge and reason about the relationship between an object and its parts.
Data is generated from multiple geometric objects at arbitrary translations, rotations and scales.
We derive a variational algorithm for inferring the transformation of each object and the assignments of points to parts of the objects.
arXiv Detail & Related papers (2021-03-11T14:10:29Z)
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose Geometry-Disentangled Attention Network (GDANet) for 3D image processing.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted respectively by sharp and gentle variation components.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art performance with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- Fixed-size Objects Encoding for Visual Relationship Detection [16.339394922532282]
We propose a fixed-size object encoding method (FOE-VRD) to improve the performance of visual relationship detection tasks.
It uses one fixed-size vector to encode all objects in each input image to assist the process of relationship detection.
Experimental results on the VRD database show that the proposed method works well on both predicate classification and relationship detection.
arXiv Detail & Related papers (2020-05-29T14:36:25Z)