Efficient and Low-Footprint Object Classification using Spatial Contrast
- URL: http://arxiv.org/abs/2311.03422v1
- Date: Mon, 6 Nov 2023 15:24:29 GMT
- Title: Efficient and Low-Footprint Object Classification using Spatial Contrast
- Authors: Matthew Belding, Daniel C. Stumpp, Rajkumar Kubendran
- Abstract summary: Event-based vision sensors traditionally compute temporal contrast that offers potential for low-power and low-latency sensing and computing.
In this research, an alternative paradigm for event-based sensors using localized spatial contrast (SC) is investigated.
- Score: 0.07589017023705934
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event-based vision sensors traditionally compute temporal contrast that
offers potential for low-power and low-latency sensing and computing. In this
research, an alternative paradigm for event-based sensors using localized
spatial contrast (SC) under two different thresholding techniques, relative and
absolute, is investigated. Given the slow maturity of spatial contrast in
comparison to temporal-based sensors, a theoretical simulated output of such a
hardware sensor is explored. Furthermore, we evaluate traffic sign
classification using the German Traffic Sign dataset (GTSRB) with well-known
Deep Neural Networks (DNNs). This study shows that spatial contrast can
effectively capture salient image features needed for classification using a
Binarized DNN, with a significant reduction in input data usage (at least 12X) and
memory resources (17.5X) compared to high-precision RGB images and DNNs, at
only a small loss (~2%) in macro F1-score. Binarized MicronNet achieves an
F1-score of 94.4% using spatial contrast, compared to only 56.3% when using RGB
input images. Thus, SC offers great promise for deployment in power- and
resource-constrained edge computing environments.
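Below is a minimal sketch of how such a simulated SC sensor output might be produced. The 3x3 neighborhood, the mean-based contrast definition, and the threshold values are illustrative assumptions, not the paper's exact hardware model:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def spatial_contrast_events(gray, mode="absolute", thresh=0.1, eps=1e-6):
    """Binarize an image by thresholding localized spatial contrast.

    gray: float image in [0, 1].
    mode: "absolute" thresholds |pixel - local mean| directly;
          "relative" thresholds the contrast normalized by the local mean.
    """
    local_mean = uniform_filter(gray, size=3)      # 3x3 neighborhood (assumed)
    contrast = gray - local_mean
    if mode == "absolute":
        return (np.abs(contrast) > thresh).astype(np.uint8)
    return (np.abs(contrast) / (local_mean + eps) > thresh).astype(np.uint8)

# Example: a 1-bit SC frame of the kind a binarized DNN would consume.
img = np.random.rand(32, 32)                       # stand-in for a GTSRB image
events = spatial_contrast_events(img, mode="relative", thresh=0.2)
```

Replacing multi-bit RGB input with a single bit per pixel is the kind of reduction behind the reported savings in input data and memory.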
Related papers
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets against cluttered backgrounds.
With the development of Transformers, the scale of SIRST models keeps increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation [1.7758299835471887]
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 x 32) thermal sensor.
Compared to standard RGB cameras, the proposed system is insensitive to lighting variations.
This paper shows that the innovative use of the recently proposed Monostable Multivibrator (MMV) neural networks as a new class of SNNs achieves more than an order of magnitude lower memory and compute complexity.
arXiv Detail & Related papers (2024-01-12T13:20:01Z)
- EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
Event cameras offer new opportunities compared to standard action recognition in RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
To better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
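The summary does not give the exact form of $\mathcal{L}_{EC}$; as a rough illustration, a standard NT-Xent-style contrastive loss over embeddings of two augmented views of the same event clip might look like this:

```python
import torch
import torch.nn.functional as F

def event_contrastive_loss(z1, z2, temperature=0.5):
    """z1, z2: (batch, dim) embeddings of two event-specific augmentations."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2B, dim)
    sim = z @ z.t() / temperature                  # cosine similarity logits
    sim.fill_diagonal_(float("-inf"))              # exclude self-pairs
    b = z1.shape[0]
    # Positives: view i pairs with view i + B, and vice versa.
    targets = torch.cat([torch.arange(b) + b, torch.arange(b)])
    return F.cross_entropy(sim, targets)
```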
arXiv Detail & Related papers (2023-08-25T23:51:07Z)
- A Simulation-Augmented Benchmarking Framework for Automatic RSO Streak Detection in Single-Frame Space Images [7.457841062817294]
Deep convolutional neural networks (DCNNs) have shown superior performance in object detection when large-scale datasets are available.
We introduce a novel simulation-augmented benchmarking framework for RSO detection (SAB-RSOD).
In our framework, by making the best use of the hardware parameters of the sensor that captures real-world space images, we first develop a high-fidelity RSO simulator.
Then, we use this simulator to generate images that contain diversified RSOs in space and annotate them automatically.
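As a toy illustration of simulation-based auto-annotation (the real SAB-RSOD simulator models the actual sensor's hardware parameters), one could render a synthetic streak into a noisy background and emit its bounding box as the label:

```python
import numpy as np

def render_streak(h=256, w=256, length=40, noise_sigma=0.02, rng=None):
    """Toy RSO streak renderer; returns the image and its auto-generated box."""
    rng = rng or np.random.default_rng()
    img = rng.normal(0.1, noise_sigma, (h, w))       # noise floor / star field
    x0 = rng.integers(0, w - length)
    y0 = rng.integers(0, h - length)
    angle = rng.uniform(0, np.pi)
    xs = np.clip((x0 + np.cos(angle) * np.arange(length)).astype(int), 0, w - 1)
    ys = np.clip((y0 + np.sin(angle) * np.arange(length)).astype(int), 0, h - 1)
    img[ys, xs] += 0.8                               # bright streak pixels
    bbox = (xs.min(), ys.min(), xs.max(), ys.max())  # automatic annotation
    return img.clip(0, 1), bbox
```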
arXiv Detail & Related papers (2023-04-30T07:00:16Z)
- Learning Neural Light Fields with Ray-Space Embedding Networks [51.88457861982689]
We propose a novel neural light field representation that is compact and directly predicts integrated radiance along rays.
Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset.
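The core idea, a single network query per ray instead of many point samples along it, can be sketched with an assumed two-plane (u, v, s, t) ray parameterization; the paper's actual ray-space embedding network is considerably more involved:

```python
import torch
import torch.nn as nn

class TinyLightField(nn.Module):
    """Maps a 4D ray coordinate directly to integrated radiance (RGB)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),   # RGB in [0, 1]
        )

    def forward(self, rays_uvst):                 # (N, 4) two-plane coords
        return self.net(rays_uvst)                # (N, 3); no ray marching
```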
arXiv Detail & Related papers (2021-12-02T18:59:51Z)
- NeighCNN: A CNN based SAR Speckle Reduction using Feature preserving Loss Function [1.7188280334580193]
NeighCNN is a deep learning-based speckle reduction algorithm that handles multiplicative noise.
Various synthetic as well as real SAR images are used to test the NeighCNN architecture.
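For background, synthetic speckled test images are typically produced by multiplying a clean image with unit-mean gamma noise, since speckle is multiplicative; the number of looks here is an illustrative parameter:

```python
import numpy as np

def add_speckle(clean, looks=4, rng=None):
    """Multiply a clean image by unit-mean gamma noise (L-look speckle)."""
    rng = rng or np.random.default_rng()
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=clean.shape)
    return clean * noise
```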
arXiv Detail & Related papers (2021-08-26T04:20:07Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing our network's results against the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
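A rough sketch of the input representation, assuming a simple sum-of-sinusoids signal model and a uniform quantizer (SignalNet's exact setup may differ):

```python
import numpy as np

def quantized_iq(num_sinusoids, n_samples=64, bits=3, rng=None):
    """Return b-bit quantized in-phase and quadrature samples."""
    rng = rng or np.random.default_rng()
    t = np.arange(n_samples)
    freqs = rng.uniform(-0.5, 0.5, num_sinusoids)   # normalized frequencies
    amps = rng.uniform(0.5, 1.0, num_sinusoids)
    x = (amps[:, None] * np.exp(2j * np.pi * freqs[:, None] * t)).sum(axis=0)
    x /= np.abs(x).max()                            # scale into [-1, 1]

    def quantize(v, levels=2 ** bits):              # uniform quantizer (assumed)
        return np.round((v + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1

    return quantize(x.real), quantize(x.imag)
```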
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- DCT-SNN: Using DCT to Distribute Spatial Information over Time for Learning Low-Latency Spiking Neural Networks [7.876001630578417]
Spiking Neural Networks (SNNs) offer a promising alternative to traditional deep learning frameworks.
However, SNNs suffer from high inference latency, which is a major bottleneck to their deployment.
We propose a scalable time-based encoding scheme that utilizes the Discrete Cosine Transform (DCT) to reduce the number of timesteps required for inference.
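A hedged sketch of that encoding: transform the image once, then present one frequency band per timestep so spatial detail is spread over time. How DCT-SNN actually orders and weights coefficients is not stated here, so the banding below is an assumption:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_timesteps(img, timesteps=8):
    """Yield one network input per timestep: successive DCT frequency bands."""
    coeffs = dctn(img, norm="ortho")
    h, w = img.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    radius = np.maximum(yy, xx)                     # crude frequency index
    edges = np.linspace(0, radius.max() + 1, timesteps + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.where((radius >= lo) & (radius < hi), coeffs, 0.0)
        yield idctn(band, norm="ortho")             # low-to-high frequency
```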
arXiv Detail & Related papers (2020-10-05T05:55:34Z)
- EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Dynamic Vision Sensors [5.674895233111088]
This paper presents a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic sensor.
To exploit the background removal property of a static DVS, we propose creating event-based binary images that signal the presence or absence of events within each frame duration.
This is the first time a stationary-DVS-based traffic monitoring solution has been extensively compared against simultaneously recorded RGB frame-based methods.
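The binary image creation described above can be sketched directly; event tuples are assumed to be (timestamp_us, x, y, polarity):

```python
import numpy as np

def binary_event_frames(events, h, w, frame_us=10_000):
    """Mark a pixel 1 if it saw at least one event in the frame window."""
    t0, t_end = events[0][0], events[-1][0]
    n_frames = int((t_end - t0) // frame_us) + 1
    frames = np.zeros((n_frames, h, w), dtype=np.uint8)
    for t, x, y, _polarity in events:               # polarity is discarded
        frames[int((t - t0) // frame_us), y, x] = 1 # presence, not count
    return frames
```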
arXiv Detail & Related papers (2020-05-31T03:01:35Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
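A block in the spirit of dense combinations of dilated convolutions might look like the following PyTorch sketch; the paper's exact block design may differ:

```python
import torch
import torch.nn as nn

class DilatedFusionBlock(nn.Module):
    """Parallel dilated branches fused by a 1x1 conv for a larger receptive field."""
    def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        # Each branch covers a different receptive field; concatenation keeps
        # both fine and coarse context before fusion.
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))
```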
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.