Human action recognition with a large-scale brain-inspired photonic
computer
- URL: http://arxiv.org/abs/2004.02545v1
- Date: Mon, 6 Apr 2020 10:39:10 GMT
- Title: Human action recognition with a large-scale brain-inspired photonic
computer
- Authors: Piotr Antonik, Nicolas Marsal, Daniel Brunner, Damien Rontani
- Abstract summary: Recognition of human actions in video streams is a challenging task in computer vision.
Deep learning has recently shown remarkable results, but can be hard to use in practice.
We propose a scalable photonic neuro-inspired architecture, capable of recognising video-based human actions with state-of-the-art accuracy.
- Score: 0.774229787612056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recognition of human actions in video streams is a challenging task in
computer vision, with cardinal applications in, e.g., brain-computer interfaces
and surveillance. Deep learning has recently shown remarkable results, but can
be hard to use in practice, as its training requires large datasets and
special-purpose, energy-consuming hardware. In this work, we propose a scalable
photonic neuro-inspired architecture based on the reservoir computing paradigm,
capable of recognising video-based human actions with state-of-the-art
accuracy. Our experimental optical setup comprises off-the-shelf components,
and implements a large parallel recurrent neural network that is easy to train
and can be scaled up to hundreds of thousands of nodes. This work paves the way
towards simply reconfigurable and energy-efficient photonic information
processing systems for real-time video processing.
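The reservoir computing paradigm keeps the recurrent network fixed and random, and trains only a linear readout — which is why the architecture is described as easy to train and scalable. A minimal software sketch (an echo state network with hypothetical sizes and toy data, not the authors' optical implementation) illustrates the idea:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, n_out = 8, 200, 3   # hypothetical sizes

# Fixed random input and recurrent weights: never trained.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.normal(0.0, 1.0, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # keep spectral radius < 1

def run_reservoir(inputs):
    """Drive the fixed recurrent network and record its states."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W @ x + W_in @ u)
        states.append(x.copy())
    return np.array(states)

# Training reduces to ridge regression on the linear readout alone.
T = 500
U = rng.normal(size=(T, n_in))    # toy input sequence
Y = rng.normal(size=(T, n_out))   # toy target sequence
X = run_reservoir(U)
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)
pred = X @ W_out                  # readout predictions, shape (T, n_out)
```

Only `W_out` is learned; in the photonic setup, the simulated reservoir states would be replaced by measured responses of the optical nodes.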
Related papers
- Optical training of large-scale Transformers and deep neural networks with direct feedback alignment [48.90869997343841]
We experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform.
An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOps.
We study the compute scaling of our hybrid optical approach, and demonstrate a potential advantage for ultra-deep and wide neural networks.
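Direct feedback alignment replaces backpropagation's transposed weight matrices with fixed random feedback matrices — exactly the kind of large random matrix multiplication the optical processing unit accelerates. A minimal numerical sketch (hypothetical layer sizes, toy regression data) shows the update rule:

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_h, d_out = 16, 64, 4            # hypothetical layer sizes
W1 = rng.normal(0.0, 0.1, (d_h, d_in))  # trained forward weights
W2 = rng.normal(0.0, 0.1, (d_out, d_h))
B = rng.normal(0.0, 0.1, (d_h, d_out))  # fixed random feedback matrix

# Toy regression task: learn a random linear map.
X = rng.normal(size=(256, d_in))
Y = 0.1 * (X @ rng.normal(size=(d_in, d_out)))

def mse():
    return float(np.mean((np.tanh(X @ W1.T) @ W2.T - Y) ** 2))

mse_before = mse()
lr = 0.02
for _ in range(100):                    # epochs
    for x, y in zip(X, Y):
        h = np.tanh(W1 @ x)             # forward pass
        e = W2 @ h - y                  # output error
        # DFA: the error reaches the hidden layer through the fixed
        # random matrix B, not through W2.T as backpropagation would.
        dh = (B @ e) * (1.0 - h ** 2)
        W2 -= lr * np.outer(e, h)
        W1 -= lr * np.outer(dh, x)
mse_after = mse()
```

Because `B` never changes, the error projection is a single dense random matrix multiply per layer, which maps naturally onto optical hardware.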
arXiv Detail & Related papers (2024-09-01T12:48:47Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM)
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- EasyVolcap: Accelerating Neural Volumetric Video Research [69.59671164891725]
Volumetric video is a technology that digitally records dynamic events such as artistic performances, sporting events, and remote conversations.
EasyVolcap is a Python & Pytorch library for unifying the process of multi-view data processing, 4D scene reconstruction, and efficient dynamic volumetric video rendering.
arXiv Detail & Related papers (2023-12-11T17:59:46Z)
- Deep Photonic Reservoir Computer for Speech Recognition [49.1574468325115]
Speech recognition is a critical task in the field of artificial intelligence and has witnessed remarkable advancements.
Deep reservoir computing is energy efficient but exhibits limitations in performance when compared to more resource-intensive machine learning algorithms.
We propose a photonic-based deep reservoir computer and evaluate its effectiveness on different speech recognition tasks.
arXiv Detail & Related papers (2023-12-11T17:43:58Z)
- Deep Neural Networks in Video Human Action Recognition: A Review [21.00217656391331]
Video behavior recognition is one of the foundational tasks of computer vision.
Deep neural networks are built for recognizing pixel-level information such as images in RGB, RGB-D, or optical flow formats.
In our review, deep neural networks surpass most other techniques in feature learning and extraction tasks.
arXiv Detail & Related papers (2023-05-25T03:54:41Z)
- Design of Convolutional Extreme Learning Machines for Vision-Based Navigation Around Small Bodies [0.0]
Deep learning architectures such as convolutional neural networks are the standard in computer vision for image processing tasks.
Their accuracy however often comes at the cost of long and computationally expensive training.
A different method known as convolutional extreme learning machine has shown the potential to perform equally with a dramatic decrease in training time.
arXiv Detail & Related papers (2022-10-28T16:24:21Z)
- Computational imaging with the human brain [1.614301262383079]
Brain-computer interfaces (BCIs) are enabling a range of new possibilities and routes for augmenting human capability.
We demonstrate ghost imaging of a hidden scene using the human visual system that is combined with an adaptive computational imaging scheme.
This brain-computer connectivity demonstrates a form of augmented human computation that could in the future extend the sensing range of human vision.
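In computational ghost imaging, the hidden scene is reconstructed by correlating known illumination patterns with a single-valued "bucket" detector signal. A minimal simulation (synthetic scene, hypothetical pattern count; the paper uses the human visual system as part of the loop) sketches the principle:

```python
import numpy as np

rng = np.random.default_rng(3)
H, W = 16, 16
n_patterns = 4000                 # hypothetical number of projections

# A hidden scene (synthetic square) seen only through the bucket signal.
scene = np.zeros((H, W))
scene[5:11, 5:11] = 1.0

# Random binary illumination patterns; the single-pixel bucket detector
# records only the total transmitted intensity per pattern.
patterns = rng.integers(0, 2, (n_patterns, H, W)).astype(float)
bucket = np.einsum('nij,ij->n', patterns, scene)

# Correlation reconstruction: G = <(B - <B>) P>, averaged over patterns.
recon = np.tensordot(bucket - bucket.mean(), patterns, axes=1) / n_patterns
corr = float(np.corrcoef(recon.ravel(), scene.ravel())[0, 1])
```

The reconstruction sharpens as the number of patterns grows; adaptive schemes like the one in the paper choose the next patterns based on what has been recovered so far.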
arXiv Detail & Related papers (2022-10-07T08:40:18Z)
- 11 TeraFLOPs per second photonic convolutional accelerator for deep learning optical neural networks [0.0]
We demonstrate a universal optical vector convolutional accelerator operating beyond 10 TeraFLOPS (floating point operations per second)
We then use the same hardware to sequentially form a deep optical CNN with ten output neurons, achieving recognition of all 10 digits from 900-pixel handwritten digit images with 88% accuracy.
This approach is scalable and trainable to much more complex networks for demanding applications such as unmanned vehicle and real-time video recognition.
arXiv Detail & Related papers (2020-11-14T21:24:01Z)
- Distilled Semantics for Comprehensive Scene Understanding from Videos [53.49501208503774]
In this paper, we take an additional step toward holistic scene understanding with monocular cameras by learning depth and motion alongside with semantics.
We address the three tasks jointly by a novel training protocol based on knowledge distillation and self-supervision.
We show that it yields state-of-the-art results for monocular depth estimation, optical flow and motion segmentation.
arXiv Detail & Related papers (2020-03-31T08:52:13Z)
- Learning Depth With Very Sparse Supervision [57.911425589947314]
This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
Experiments on several datasets show that, when ground truth is available even for just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-02T10:44:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.