Neuromorphic Visual Scene Understanding with Resonator Networks
- URL: http://arxiv.org/abs/2208.12880v4
- Date: Wed, 26 Jun 2024 10:16:08 GMT
- Title: Neuromorphic Visual Scene Understanding with Resonator Networks
- Authors: Alpha Renner, Lazar Supic, Andreea Danielescu, Giacomo Indiveri, Bruno A. Olshausen, Yulia Sandamirskaya, Friedrich T. Sommer, E. Paxon Frady
- Abstract summary: We propose a neuromorphic solution exploiting three key concepts.
The framework is based on Vector Symbolic Architectures (VSA) with complex-valued vectors.
Hierarchical Resonator Networks (HRN) are designed to factorize the non-commutative transforms translation and rotation in visual scenes.
A companion paper demonstrates the same approach in real-world application scenarios for machine vision and robotics.
- Score: 11.701553530610973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing a visual scene by inferring the configuration of a generative model is widely considered the most flexible and generalizable approach to scene understanding. Yet, one major problem is the computational challenge of the inference procedure, involving a combinatorial search across object identities and poses. Here we propose a neuromorphic solution exploiting three key concepts: (1) a computational framework based on Vector Symbolic Architectures (VSA) with complex-valued vectors; (2) the design of Hierarchical Resonator Networks (HRN) to factorize the non-commutative transforms translation and rotation in visual scenes; (3) the design of a multi-compartment spiking phasor neuron model for implementing complex-valued resonator networks on neuromorphic hardware. The VSA framework uses vector binding operations to form a generative image model in which binding acts as the equivariant operation for geometric transformations. A scene can, therefore, be described as a sum of vector products, which can then be efficiently factorized by a resonator network to infer objects and their poses. The HRN features a partitioned architecture in which vector binding is equivariant for horizontal and vertical translation within one partition and for rotation and scaling within the other partition. The spiking neuron model allows mapping the resonator network onto efficient and low-power neuromorphic hardware. Our approach is demonstrated on synthetic scenes composed of simple 2D shapes undergoing rigid geometric transformations and color changes. A companion paper demonstrates the same approach in real-world application scenarios for machine vision and robotics.
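The factorization idea in the abstract (a scene as a product of bound vectors that a resonator network pulls apart) can be illustrated with a minimal sketch. This is not the paper's Hierarchical Resonator Network or its spiking implementation: it is a flat, two-factor resonator over random complex phasor codebooks, and the names `shapes`, `poses`, `N`, and `K` as well as all sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1024   # vector dimension (assumed for illustration)
K = 10     # entries per codebook (assumed for illustration)

def random_phasors(k, n):
    """Return k random complex vectors of dimension n with unit-magnitude components."""
    return np.exp(1j * rng.uniform(-np.pi, np.pi, size=(k, n)))

def normalize(z):
    """Project every component back onto the unit circle (phasor nonlinearity)."""
    return z / np.maximum(np.abs(z), 1e-12)

# Hypothetical codebooks: one for object identity, one for pose.
shapes = random_phasors(K, N)
poses = random_phasors(K, N)

# Generative step: binding is the component-wise complex product.
true_shape, true_pose = 3, 7
scene = shapes[true_shape] * poses[true_pose]

# Initialize each factor estimate as the superposition of its whole codebook.
shape_est = normalize(shapes.sum(axis=0))
pose_est = normalize(poses.sum(axis=0))

for _ in range(50):
    # Unbind the current estimate of the other factor (multiply by its conjugate),
    # then clean up by projecting onto the codebook and back.
    shape_est = normalize(shapes.T @ (shapes.conj() @ (scene * pose_est.conj())))
    pose_est = normalize(poses.T @ (poses.conj() @ (scene * shape_est.conj())))

# Decode each factor as the codebook entry with the largest similarity magnitude.
decoded_shape = int(np.argmax(np.abs(shapes.conj() @ shape_est)))
decoded_pose = int(np.argmax(np.abs(poses.conj() @ pose_est)))
print(decoded_shape, decoded_pose)
```

The loop searches the 10 x 10 combinations in superposition rather than enumerating them, which is the property the abstract relies on to avoid a combinatorial search over identities and poses. A scene with several objects would be a sum of such products, which this sketch does not cover.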
Related papers
- Geometric Algebra Planes: Convex Implicit Neural Volumes [70.12234371845445]
We show that GA-Planes is equivalent to a sparse low-rank factor plus low-resolution matrix.
We also show that GA-Planes can be adapted for many existing representations.
arXiv Detail & Related papers (2024-11-20T18:21:58Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S^2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- On the Transition from Neural Representation to Symbolic Knowledge [2.2528422603742304]
We propose a Neural-Symbolic Transitional Dictionary Learning (TDL) framework that employs an EM algorithm to learn a transitional representation of data.
We implement the framework with a diffusion model by regarding the decomposition of input as a cooperative game.
We additionally use RL, enabled by the Markovian property of diffusion models, to further tune the learned prototypes.
arXiv Detail & Related papers (2023-08-03T19:29:35Z)
- Visual Odometry with Neuromorphic Resonator Networks [9.903137966539898]
Visual Odometry (VO) is a method to estimate self-motion of a mobile robot using visual sensors.
Neuromorphic hardware offers low-power solutions to many vision and AI problems.
We present a modular neuromorphic algorithm that achieves state-of-the-art performance on two-dimensional VO tasks.
arXiv Detail & Related papers (2022-09-05T14:57:03Z)
- VNT-Net: Rotational Invariant Vector Neuron Transformers [3.04585143845864]
We introduce a rotational invariant neural network by combining recently introduced vector neurons with self-attention layers.
Experiments demonstrate that our network efficiently handles 3D point cloud objects in arbitrary poses.
arXiv Detail & Related papers (2022-05-19T16:51:56Z)
- Vector Neurons: A General Framework for SO(3)-Equivariant Networks [32.81671803104126]
In this paper, we introduce a general framework built on top of what we call Vector Neuron representations.
Our vector neurons enable a simple mapping of SO(3) actions to latent spaces.
We also show for the first time a rotation equivariant reconstruction network.
arXiv Detail & Related papers (2021-04-25T18:48:15Z)
- Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks [118.20778308823779]
We present a novel 3D primitive representation that defines primitives using an Invertible Neural Network (INN).
Our model learns to parse 3D objects into semantically consistent part arrangements without any part-level supervision.
arXiv Detail & Related papers (2021-03-18T17:59:31Z)
- Resonator networks for factoring distributed representations of data structures [3.46969645559477]
We show how data structures are encoded by combining high-dimensional vectors with operations that together form an algebra on the space of distributed representations.
Our proposed algorithm, called a resonator network, is a new type of recurrent neural network that interleaves VSA multiplication operations and pattern completion.
Resonator networks open the possibility to apply VSAs to myriad artificial intelligence problems in real-world domains.
arXiv Detail & Related papers (2020-07-07T19:24:27Z)
- Convolutional Occupancy Networks [88.48287716452002]
We propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes.
By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space.
We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.
arXiv Detail & Related papers (2020-03-10T10:17:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.