A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an
FPGA Implementation
- URL: http://arxiv.org/abs/2002.11898v3
- Date: Sun, 12 Apr 2020 02:04:47 GMT
- Title: A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an
FPGA Implementation
- Authors: Jamal Lottier Molin, Chetan Singh Thakur, Ralph Etienne-Cummings,
Ernst Niebur
- Abstract summary: We present a neuromorphic, bottom-up, dynamic visual saliency model based on the notion of proto-objects.
This model outperforms state-of-the-art dynamic visual saliency models in predicting human eye fixations on a commonly used video dataset.
We introduce a Field-Programmable Gate Array implementation of the model on an Opal Kelly 7350 Kintex-7 board.
- Score: 1.2387676601792899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to attend to salient regions of a visual scene is an innate and
necessary preprocessing step for both biological and engineered systems
performing high-level visual tasks (e.g. object detection, tracking, and
classification). Computational efficiency, in regard to processing bandwidth
and speed, is improved by only devoting computational resources to salient
regions of the visual stimuli. In this paper, we first present a neuromorphic,
bottom-up, dynamic visual saliency model based on the notion of proto-objects.
This is achieved by incorporating the temporal characteristics of the visual
stimulus into the model, similarly to the manner in which early stages of the
human visual system extract temporal information. This neuromorphic model
outperforms state-of-the-art dynamic visual saliency models in predicting human
eye fixations on a commonly used video dataset with associated eye tracking
data. Secondly, for this model to have practical applications, it must be
capable of performing its computations in real-time under low-power,
small-size, and lightweight constraints. To address this, we introduce a
Field-Programmable Gate Array implementation of the model on an Opal Kelly 7350
Kintex-7 board. This novel hardware implementation allows for processing of up
to 23.35 frames per second running on a 100 MHz clock, a better than 26x speedup
over the software implementation.
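As a rough sanity check on the figures quoted in the abstract, the sketch below derives the per-frame cycle budget and the implied software frame rate. It assumes that the "26x speedup" refers to the ratio of frame rates and that 23.35 frames per second is the peak rate at a 100 MHz clock. The function name and dictionary keys are hypothetical and do not come from the paper.

```python
def fpga_throughput_budget(fps: float, clock_hz: float, speedup: float) -> dict:
    """Derive simple timing figures from a frame rate, clock rate, and speedup.

    Assumption: `speedup` is the ratio of hardware to software frame rates.
    """
    return {
        # Clock cycles available to process one frame at the stated rate.
        "cycles_per_frame": clock_hz / fps,
        # Time per frame in milliseconds.
        "frame_period_ms": 1000.0 / fps,
        # "Better than 26x" implies the software ran at or below this rate.
        "software_fps_upper_bound": fps / speedup,
    }


# Figures taken from the abstract: 23.35 fps, 100 MHz clock, 26x speedup.
budget = fpga_throughput_budget(fps=23.35, clock_hz=100e6, speedup=26.0)
for name, value in budget.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the design has roughly 4.28 million clock cycles (about 42.8 ms) per frame, and the software baseline would run at no more than about 0.9 frames per second.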
Related papers
- D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video [53.83936023443193]
This paper introduces a new method for dynamic novel view synthesis from monocular video, such as smartphone captures.
Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point cloud that encodes local geometry and appearance in separate hash-encoded neural feature grids.
arXiv Detail & Related papers (2024-06-14T14:35:44Z)
- PerSival: Neural-network-based visualisation for pervasive continuum-mechanical simulations in musculoskeletal biomechanics [1.4272256806865107]
This paper presents a novel neural network architecture for pervasive visualisation of a 3D human upper limb musculoskeletal system model.
We use a sparse grid surrogate to capture the surface deformation of the m. biceps brachii in order to train a deep learning model, used for real-time visualisation of the same muscle.
arXiv Detail & Related papers (2023-12-07T00:07:35Z)
- Modelling Human Visual Motion Processing with Trainable Motion Energy Sensing and a Self-attention Network [1.9458156037869137]
We propose an image-computable model of human motion perception by bridging the gap between biological and computer vision models.
This model architecture aims to capture the computations in V1-MT, the core structure for motion perception in the biological visual system.
In silico neurophysiology reveals that our model's unit responses are similar to mammalian neural recordings regarding motion pooling and speed tuning.
arXiv Detail & Related papers (2023-05-16T04:16:07Z)
- Real-time volumetric rendering of dynamic humans [83.08068677139822]
We present a method for fast 3D reconstruction and real-time rendering of dynamic humans from monocular videos.
Our method can reconstruct a dynamic human in less than 3h using a single GPU, compared to recent state-of-the-art alternatives that take up to 72h.
A novel local ray marching rendering allows visualizing the neural human on a mobile VR device at 40 frames per second with minimal loss of visual quality.
arXiv Detail & Related papers (2023-03-21T14:41:25Z)
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- Fast Dynamic Radiance Fields with Time-Aware Neural Voxels [106.69049089979433]
We propose a radiance field framework by representing scenes with time-aware voxel features, named as TiNeuVox.
Our framework accelerates the optimization of dynamic radiance fields while maintaining high rendering quality.
Our TiNeuVox completes training with only 8 minutes and 8-MB storage cost while showing similar or even better rendering performance than previous dynamic NeRF methods.
arXiv Detail & Related papers (2022-05-30T17:47:31Z)
- Activity Detection in Long Surgical Videos using Spatio-Temporal Models [1.2400116527089995]
In this paper, we investigate both the state-of-the-art activity recognition and temporal models.
We benchmark these models on a large-scale activity recognition dataset in the operating room with over 800 full-length surgical videos.
We show that even in the case of limited labeled data, we can outperform the existing work by benefiting from models pre-trained on other tasks.
arXiv Detail & Related papers (2022-05-05T17:34:33Z)
- Real-time Neural-MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline.
We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z)
- STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters and runs at high speed in both training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
- Emergent Properties of Foveated Perceptual Systems [3.3504365823045044]
This work is inspired by the foveated human visual system, which has higher acuity at the center of gaze and texture-like encoding in the periphery.
We introduce models consisting of a first-stage fixed image transform followed by a second-stage learnable convolutional neural network.
We find that foveation with peripheral texture-based computations yields an efficient, distinct, and robust representational format of scene information.
arXiv Detail & Related papers (2020-06-14T19:34:44Z)