Related papers: Learning Intermediate Features of Object Affordances with a Convolutional Neural Network

URL: http://arxiv.org/abs/2002.08975v1
Date: Thu, 20 Feb 2020 19:04:40 GMT
Title: Learning Intermediate Features of Object Affordances with a Convolutional Neural Network
Authors: Aria Yuan Wang and Michael J. Tarr
Abstract summary: We train a deep convolutional neural network (CNN) to recognize affordances from images and to learn the underlying features or the dimensionality of affordances. We view this representational analysis as the first step towards a more formal account of how humans perceive and interact with the environment.
Score: 1.52292571922932
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Our ability to interact with the world around us relies on being able to infer what actions objects afford -- often referred to as affordances. The neural mechanisms of object-action associations are realized in the visuomotor pathway where information about both visual properties and actions is integrated into common representations. However, explicating these mechanisms is particularly challenging in the case of affordances because there is hardly any one-to-one mapping between visual features and inferred actions. To better understand the nature of affordances, we trained a deep convolutional neural network (CNN) to recognize affordances from images and to learn the underlying features or the dimensionality of affordances. Such features form an underlying compositional structure for the general representation of affordances which can then be tested against human neural data. We view this representational analysis as the first step towards a more formal account of how humans perceive and interact with the environment.

Related papers

Convergent transformations of visual representation in brains and models [0.0]
A fundamental question in cognitive neuroscience is what shapes visual perception: the external world's structure or the brain's internal architecture. We show a convergent computational solution for visual encoding in both human and artificial vision, driven by the structure of the external world.
arXiv Detail & Related papers (2025-07-18T14:13:54Z)
Emergent Active Perception and Dexterity of Simulated Humanoids from Visual Reinforcement Learning [69.71072181304066]
We introduce Perceptive Dexterous Control (PDC), a framework for vision-driven whole-body control with simulated humanoids. PDC operates solely on egocentric vision for task specification, enabling object search, target placement, and skill selection through visual cues. We show that training from scratch with reinforcement learning can produce emergent behaviors such as active search.
arXiv Detail & Related papers (2025-05-18T07:33:31Z)
Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities. We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities. We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
Artificial Kuramoto Oscillatory Neurons [65.16453738828672]
We introduce Artificial Kuramoto Oscillatory Neurons (AKOrN) as a dynamical alternative to threshold units. We show that this idea provides performance improvements across a wide spectrum of tasks. We believe that these empirical results show the importance of our assumptions at the most basic neuronal level of neural representation.
arXiv Detail & Related papers (2024-10-17T17:47:54Z)
Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues. Our method outperforms the representative models regarding objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
Binding Dynamics in Rotating Features [72.80071820194273]
We propose an alternative "cosine binding" mechanism, which explicitly computes the alignment between features and adjusts weights accordingly. This allows us to draw direct connections to self-attention and biological neural processes, and to shed light on the fundamental dynamics for object-centric representations to emerge in Rotating Features.
arXiv Detail & Related papers (2024-02-08T12:31:08Z)
Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction. This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity [33.06823702945747]
We introduce a novel unsupervised approach for learning disentangled representations of neural activity called Swap-VAE. Our approach combines a generative modeling framework with an instance-specific alignment loss. We show that it is possible to build representations that disentangle neural datasets along relevant latent dimensions linked to behavior.
arXiv Detail & Related papers (2021-11-03T16:39:43Z)
Capturing the objects of vision with neural networks [0.0]
Human visual perception carves a scene at its physical joints, decomposing the world into objects. Deep neural network (DNN) models of visual object recognition, by contrast, remain largely tethered to the sensory input. We review related work in both fields and examine how these fields can help each other.
arXiv Detail & Related papers (2021-09-07T21:49:53Z)
On the Binding Problem in Artificial Neural Networks [12.04468744445707]
We argue that the underlying cause of neural networks' shortfall in human-level generalization is their inability to dynamically and flexibly bind information. We propose a unifying framework that revolves around forming meaningful entities from unstructured sensory inputs. We believe that a compositional approach to AI, in terms of grounded symbol-like representations, is of fundamental importance for realizing human-level generalization.
arXiv Detail & Related papers (2020-12-09T18:02:49Z)
Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts. We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z)
Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization. Experiments on the well-known benchmark SCAN demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
Visualizing and Understanding Vision System [0.6510507449705342]
We use a vision recognition-reconstruction network (RRN) to investigate the development, recognition, learning and forgetting mechanisms. In the digit recognition study, we observe that the RRN can maintain object-invariant representations under various viewing conditions. In the learning and forgetting study, novel structure recognition is implemented by adjusting all synapses by small magnitudes while the pattern specificities of the original synaptic connectivity are preserved.
arXiv Detail & Related papers (2020-06-11T07:08:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.