Efficient and Low-Footprint Object Classification using Spatial Contrast
- URL: http://arxiv.org/abs/2311.03422v1
- Date: Mon, 6 Nov 2023 15:24:29 GMT
- Title: Efficient and Low-Footprint Object Classification using Spatial Contrast
- Authors: Matthew Belding, Daniel C. Stumpp, Rajkumar Kubendran
- Abstract summary: Event-based vision sensors traditionally compute temporal contrast that offers potential for low-power and low-latency sensing and computing.
In this research, an alternative paradigm for event-based sensors using localized spatial contrast (SC) is investigated.
- Score: 0.07589017023705934
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event-based vision sensors traditionally compute temporal contrast that
offers potential for low-power and low-latency sensing and computing. In this
research, an alternative paradigm for event-based sensors using localized
spatial contrast (SC) under two different thresholding techniques, relative and
absolute, is investigated. Given the slow maturity of spatial contrast in
comparison to temporal-based sensors, a theoretical simulated output of such a
hardware sensor is explored. Furthermore, we evaluate traffic sign
classification using the German Traffic Sign dataset (GTSRB) with well-known
Deep Neural Networks (DNNs). This study shows that spatial contrast can
effectively capture salient image features needed for classification using a
Binarized DNN, with a significant reduction in input data usage (at least 12X) and
memory resources (17.5X) compared to high-precision RGB images and DNNs, at
only a small loss (~2%) in macro F1-score. Binarized MicronNet achieves an
F1-score of 94.4% using spatial contrast, compared to only 56.3% when using RGB
input images. Thus, SC offers great promise for deployment in power- and
resource-constrained edge computing environments.
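Below is a minimal sketch of how such a simulated SC sensor output might be produced. The 3x3 neighborhood, the mean-based contrast definition, and the threshold values are illustrative assumptions, not the paper's exact hardware model:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def spatial_contrast_events(gray, mode="absolute", thresh=0.1, eps=1e-6):
    """Binarize an image by thresholding localized spatial contrast.

    gray: float image in [0, 1].
    mode: "absolute" thresholds |pixel - local mean| directly;
          "relative" thresholds the contrast normalized by the local mean.
    """
    local_mean = uniform_filter(gray, size=3)      # 3x3 neighborhood (assumed)
    contrast = gray - local_mean
    if mode == "absolute":
        return (np.abs(contrast) > thresh).astype(np.uint8)
    return (np.abs(contrast) / (local_mean + eps) > thresh).astype(np.uint8)

# Example: a 1-bit SC frame of the kind a binarized DNN would consume.
img = np.random.rand(32, 32)                       # stand-in for a GTSRB image
events = spatial_contrast_events(img, mode="relative", thresh=0.2)
```

Replacing multi-bit RGB input with a single bit per pixel is the kind of reduction behind the reported savings in input data and memory.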
Related papers
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets against cluttered backgrounds.
With the development of Transformers, the scale of SIRST models keeps increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation [1.7758299835471887]
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 x 32) thermal sensor.
Compared to standard RGB cameras, the proposed system is insensitive to lighting variations.
This paper shows that the innovative use of the recently proposed Monostable Multivibrator (MMV) neural networks as a new class of SNNs achieves more than an order of magnitude lower memory and compute complexity.
arXiv Detail & Related papers (2024-01-12T13:20:01Z)
- EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
Event cameras offer new opportunities compared to standard action recognition in RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
To better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
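The summary does not give the exact form of $\mathcal{L}_{EC}$; as a rough illustration, a standard NT-Xent-style contrastive loss over embeddings of two augmented views of the same event clip might look like this:

```python
import torch
import torch.nn.functional as F

def event_contrastive_loss(z1, z2, temperature=0.5):
    """z1, z2: (batch, dim) embeddings of two event-specific augmentations."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2B, dim)
    sim = z @ z.t() / temperature                  # cosine similarity logits
    sim.fill_diagonal_(float("-inf"))              # exclude self-pairs
    b = z1.shape[0]
    # Positives: view i pairs with view i + B, and vice versa.
    targets = torch.cat([torch.arange(b) + b, torch.arange(b)])
    return F.cross_entropy(sim, targets)
```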
arXiv Detail & Related papers (2023-08-25T23:51:07Z)
- A Simulation-Augmented Benchmarking Framework for Automatic RSO Streak Detection in Single-Frame Space Images [7.457841062817294]
Deep convolutional neural networks (DCNNs) have shown superior performance in object detection when large-scale datasets are available.
We introduce a novel simulation-augmented benchmarking framework for RSO detection (SAB-RSOD).
In our framework, by making the best use of the hardware parameters of the sensor that captures real-world space images, we first develop a high-fidelity RSO simulator.
Then, we use this simulator to generate images that contain diversified RSOs in space and annotate them automatically.
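As a toy illustration of simulation-based auto-annotation (the real SAB-RSOD simulator models the actual sensor's hardware parameters), one could render a synthetic streak into a noisy background and emit its bounding box as the label:

```python
import numpy as np

def render_streak(h=256, w=256, length=40, noise_sigma=0.02, rng=None):
    """Toy RSO streak renderer; returns the image and its auto-generated box."""
    rng = rng or np.random.default_rng()
    img = rng.normal(0.1, noise_sigma, (h, w))       # noise floor / star field
    x0 = rng.integers(0, w - length)
    y0 = rng.integers(0, h - length)
    angle = rng.uniform(0, np.pi)
    xs = np.clip((x0 + np.cos(angle) * np.arange(length)).astype(int), 0, w - 1)
    ys = np.clip((y0 + np.sin(angle) * np.arange(length)).astype(int), 0, h - 1)
    img[ys, xs] += 0.8                               # bright streak pixels
    bbox = (xs.min(), ys.min(), xs.max(), ys.max())  # automatic annotation
    return img.clip(0, 1), bbox
```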
arXiv Detail & Related papers (2023-04-30T07:00:16Z)
- Learning Neural Light Fields with Ray-Space Embedding Networks [51.88457861982689]
We propose a novel neural light field representation that is compact and directly predicts integrated radiance along rays.
Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset.
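The core idea, a single network query per ray instead of many point samples along it, can be sketched with an assumed two-plane (u, v, s, t) ray parameterization; the paper's actual ray-space embedding network is considerably more involved:

```python
import torch
import torch.nn as nn

class TinyLightField(nn.Module):
    """Maps a 4D ray coordinate directly to integrated radiance (RGB)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),   # RGB in [0, 1]
        )

    def forward(self, rays_uvst):                 # (N, 4) two-plane coords
        return self.net(rays_uvst)                # (N, 3); no ray marching
```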
arXiv Detail & Related papers (2021-12-02T18:59:51Z)
- NeighCNN: A CNN based SAR Speckle Reduction using Feature preserving Loss Function [1.7188280334580193]
NeighCNN is a deep learning-based speckle reduction algorithm that handles multiplicative noise.
Various synthetic as well as real SAR images are used to test the NeighCNN architecture.
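For background, synthetic speckled test images are typically produced by multiplying a clean image with unit-mean gamma noise, since speckle is multiplicative; the number of looks here is an illustrative parameter:

```python
import numpy as np

def add_speckle(clean, looks=4, rng=None):
    """Multiply a clean image by unit-mean gamma noise (L-look speckle)."""
    rng = rng or np.random.default_rng()
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=clean.shape)
    return clean * noise
```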
arXiv Detail & Related papers (2021-08-26T04:20:07Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing our network's results against the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
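A rough sketch of the input representation, assuming a simple sum-of-sinusoids signal model and a uniform quantizer (SignalNet's exact setup may differ):

```python
import numpy as np

def quantized_iq(num_sinusoids, n_samples=64, bits=3, rng=None):
    """Return b-bit quantized in-phase and quadrature samples."""
    rng = rng or np.random.default_rng()
    t = np.arange(n_samples)
    freqs = rng.uniform(-0.5, 0.5, num_sinusoids)   # normalized frequencies
    amps = rng.uniform(0.5, 1.0, num_sinusoids)
    x = (amps[:, None] * np.exp(2j * np.pi * freqs[:, None] * t)).sum(axis=0)
    x /= np.abs(x).max()                            # scale into [-1, 1]

    def quantize(v, levels=2 ** bits):              # uniform quantizer (assumed)
        return np.round((v + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1

    return quantize(x.real), quantize(x.imag)
```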
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- DCT-SNN: Using DCT to Distribute Spatial Information over Time for Learning Low-Latency Spiking Neural Networks [7.876001630578417]
Spiking Neural Networks (SNNs) offer a promising alternative to traditional deep learning frameworks.
However, SNNs suffer from high inference latency, which is a major bottleneck to their deployment.
We propose a scalable time-based encoding scheme that utilizes the Discrete Cosine Transform (DCT) to reduce the number of timesteps required for inference.
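A hedged sketch of that encoding: transform the image once, then present one frequency band per timestep so spatial detail is spread over time. How DCT-SNN actually orders and weights coefficients is not stated here, so the banding below is an assumption:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_timesteps(img, timesteps=8):
    """Yield one network input per timestep: successive DCT frequency bands."""
    coeffs = dctn(img, norm="ortho")
    h, w = img.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    radius = np.maximum(yy, xx)                     # crude frequency index
    edges = np.linspace(0, radius.max() + 1, timesteps + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.where((radius >= lo) & (radius < hi), coeffs, 0.0)
        yield idctn(band, norm="ortho")             # low-to-high frequency
```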
arXiv Detail & Related papers (2020-10-05T05:55:34Z)
- EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Dynamic Vision Sensors [5.674895233111088]
This paper presents a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic sensor.
To exploit the background removal property of a static DVS, we propose creating event-based binary images that signal the presence or absence of events within each frame duration.
This is the first time a stationary-DVS-based traffic monitoring solution has been extensively compared against simultaneously recorded RGB frame-based methods.
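The binary image creation described above can be sketched directly; event tuples are assumed to be (timestamp_us, x, y, polarity):

```python
import numpy as np

def binary_event_frames(events, h, w, frame_us=10_000):
    """Mark a pixel 1 if it saw at least one event in the frame window."""
    t0, t_end = events[0][0], events[-1][0]
    n_frames = int((t_end - t0) // frame_us) + 1
    frames = np.zeros((n_frames, h, w), dtype=np.uint8)
    for t, x, y, _polarity in events:               # polarity is discarded
        frames[int((t - t0) // frame_us), y, x] = 1 # presence, not count
    return frames
```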
arXiv Detail & Related papers (2020-05-31T03:01:35Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
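A block in the spirit of dense combinations of dilated convolutions might look like the following PyTorch sketch; the paper's exact block design may differ:

```python
import torch
import torch.nn as nn

class DilatedFusionBlock(nn.Module):
    """Parallel dilated branches fused by a 1x1 conv for a larger receptive field."""
    def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        # Each branch covers a different receptive field; concatenation keeps
        # both fine and coarse context before fusion.
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))
```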
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.