A Deeper Look at the Unsupervised Learning of Disentangled
Representations in $\beta$-VAE from the Perspective of Core Object
Recognition
- URL: http://arxiv.org/abs/2005.07114v1
- Date: Sat, 25 Apr 2020 08:14:03 GMT
- Title: A Deeper Look at the Unsupervised Learning of Disentangled
Representations in $\beta$-VAE from the Perspective of Core Object
Recognition
- Authors: Harshvardhan Sikka
- Abstract summary: The ability to recognize objects despite differences in appearance, known as Core Object Recognition, forms a critical part of human perception.
Various computational perceptual models have been built to attempt to tackle the object identification task in an artificial perceptual setting.
This thesis constitutes a research project exploring a generalization of the Variational Autoencoder (VAE), $\beta$-VAE, that aims to learn disentangled representations using variational inference.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to recognize objects despite differences in appearance, known as Core Object Recognition, forms a critical part of human perception. While it is understood that the brain accomplishes Core Object Recognition through feedforward, hierarchical computations along the visual stream, the underlying algorithms that allow invariant representations to form downstream are still not well understood. (DiCarlo et al., 2012) Various computational perceptual models have been built to attempt to tackle the
object identification task in an artificial perceptual setting. Artificial
Neural Networks, computational graphs consisting of weighted edges and
mathematical operations at vertices, are loosely inspired by neural networks in
the brain and have proven effective at various visual perceptual tasks,
including object characterization and identification. (Pinto et al., 2008)
(DiCarlo et al., 2012) For many data analysis tasks, learning representations
where each dimension is statistically independent and thus disentangled from
the others is useful. If the underlying generative factors of the data are also
statistically independent, Bayesian inference of latent variables can form
disentangled representations. This thesis constitutes a research project
exploring a generalization of the Variational Autoencoder (VAE), $\beta$-VAE,
that aims to learn disentangled representations using variational inference.
$\beta$-VAE incorporates the hyperparameter $\beta$, and enforces conditional
independence of its bottleneck neurons, which is in general not compatible with
the statistical independence of latent variables. This text examines this architecture and provides analytical and numerical arguments, with the goal of demonstrating that this incompatibility leads to non-monotonic inference performance in $\beta$-VAE with a finite optimal $\beta$.
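For concreteness, the following is a minimal sketch of the $\beta$-VAE objective the abstract refers to, written in PyTorch: an encoder producing a diagonal-Gaussian posterior $q(z|x)$, a reparameterized sample, and a loss whose KL term is scaled by the hyperparameter $\beta$ (with $\beta = 1$ recovering the standard VAE). The network sizes, Bernoulli pixel likelihood, and names used here are illustrative assumptions, not the thesis's actual implementation.

```python
# Hypothetical beta-VAE sketch (illustrative only, not the thesis's code).
# The encoder outputs a diagonal Gaussian q(z|x); the loss is a reconstruction
# term plus a KL term weighted by beta. Larger beta pressures the bottleneck
# toward (conditionally) independent coordinates at the cost of reconstruction.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BetaVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=10, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)   # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar


def beta_vae_loss(x, x_recon_logits, mu, logvar, beta):
    # Reconstruction term: Bernoulli negative log-likelihood, assuming pixel
    # values in [0, 1]; x_recon_logits are the decoder's unnormalized outputs.
    recon = F.binary_cross_entropy_with_logits(x_recon_logits, x, reduction="sum")
    # KL( q(z|x) || N(0, I) ) for a diagonal Gaussian posterior, in closed form.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # beta > 1 upweights the KL term, trading reconstruction for independence.
    return recon + beta * kl
```

Under a setup like this, the finite optimal $\beta$ claimed in the abstract can be probed by sweeping $\beta$ (e.g., over 0.5, 1, 2, 4, 8) and tracking a downstream inference or disentanglement metric, looking for the non-monotonic trend the text argues for.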
Related papers
- Application of the representative measure approach to assess the reliability of decision trees in dealing with unseen vehicle collision data [0.6571063542099526]
Representative datasets are a cornerstone in shaping the trajectory of artificial intelligence (AI) development.
We investigate the reliability of the $\varepsilon$-representativeness method to assess dataset similarity from a theoretical perspective for decision trees.
We extend the results experimentally in the context of unseen vehicle collision data for XGBoost.
arXiv Detail & Related papers (2024-04-15T08:06:54Z)
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales [54.78115855552886]
We show how to construct over-complete invariants with a Convolutional Neural Network (CNN)-like hierarchical architecture.
With the over-completeness, discriminative features w.r.t. the task can be adaptively formed in a Neural Architecture Search (NAS)-like manner.
For robust and interpretable vision tasks at larger scales, hierarchical invariant representation can be considered as an effective alternative to traditional CNN and invariants.
arXiv Detail & Related papers (2024-02-23T16:50:07Z)
- Gaussian Mixture Models for Affordance Learning using Bayesian Networks [50.18477618198277]
Affordances are fundamental descriptors of relationships between actions, objects and effects.
This paper approaches the problem of an embodied agent exploring the world and learning these affordances autonomously from its sensory experiences.
arXiv Detail & Related papers (2024-02-08T22:05:45Z)
- Object-centric architectures enable efficient causal representation learning [51.6196391784561]
We show that when the observations are of multiple objects, the generative function is no longer injective and disentanglement fails in practice.
We develop an object-centric architecture that leverages weak supervision from sparse perturbations to disentangle each object's properties.
This approach is more data-efficient in the sense that it requires significantly fewer perturbations than a comparable approach that encodes to a Euclidean space.
arXiv Detail & Related papers (2023-10-29T16:01:03Z)
- ICON$^2$: Reliably Benchmarking Predictive Inequity in Object Detection [23.419153864862174]
Concerns about social bias in computer vision systems are rising.
We introduce ICON$^2$, a framework for robustly answering this question.
We conduct an in-depth study on the performance of object detection with respect to income from the BDD100K driving dataset.
arXiv Detail & Related papers (2023-06-07T17:42:42Z)
- A simple probabilistic neural network for machine understanding [0.0]
We discuss probabilistic neural networks with a fixed internal representation as models for machine understanding.
We derive the internal representation by requiring that it satisfies the principles of maximal relevance and of maximal ignorance about how different features are combined.
We argue that learning machines with this architecture enjoy a number of interesting properties, like the continuity of the representation with respect to changes in parameters and data.
arXiv Detail & Related papers (2022-10-24T13:00:15Z)
- Functional Indirection Neural Estimator for Better Out-of-distribution Generalization [27.291114360472243]
FINE (Functional Indirection Neural Estimator) learns to compose functions that map data input to output on-the-fly.
We train FINE and competing models on IQ tasks using images from the MNIST, Omniglot and CIFAR100 datasets.
FINE not only achieves the best performance on all tasks but also is able to adapt to small-scale data scenarios.
arXiv Detail & Related papers (2022-10-23T14:43:02Z)
- $p$-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations [32.99800144249333]
We introduce $p$-DkNN, a novel inference procedure that takes a trained deep neural network and analyzes the similarity structures of its intermediate hidden representations.
We find that $p$-DkNN forces adaptive attackers crafting adversarial examples, a form of worst-case OOD inputs, to introduce semantically meaningful changes to the inputs.
arXiv Detail & Related papers (2022-07-25T21:42:08Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.