A 96pJ/Frame/Pixel and 61pJ/Event Anti-UAV System with Hybrid Object Tracking Modes
- URL: http://arxiv.org/abs/2512.17939v1
- Date: Fri, 12 Dec 2025 13:53:38 GMT
- Title: A 96pJ/Frame/Pixel and 61pJ/Event Anti-UAV System with Hybrid Object Tracking Modes
- Authors: Yuncheng Lu, Yucen Shi, Aobo Li, Zehao Li, Junying Li, Bo Wang, Tony Tae-Hyoung Kim
- Abstract summary: We present an energy-efficient anti-UAV system that integrates frame-based and event-driven object tracking. The 2 mm^2 chip achieves 96 pJ per frame per pixel and 61 pJ per event at 0.8 V, and reaches 98.2 percent recognition accuracy on public UAV datasets.
- Score: 5.593237736175593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an energy-efficient anti-UAV system that integrates frame-based and event-driven object tracking to enable reliable detection of small and fast-moving drones. The system reconstructs binary event frames using run-length encoding, generates region proposals, and adaptively switches between frame mode and event mode based on object size and velocity. A Fast Object Tracking Unit improves robustness for high-speed targets through adaptive thresholding and trajectory-based classification. The neural processing unit supports both grayscale-patch and trajectory inference with a custom instruction set and a zero-skipping MAC architecture, reducing redundant neural computations by more than 97 percent. Implemented in 40 nm CMOS technology, the 2 mm^2 chip achieves 96 pJ per frame per pixel and 61 pJ per event at 0.8 V, and reaches 98.2 percent recognition accuracy on public UAV datasets at ranges of 50 to 400 m and target speeds of 5 to 80 pixels per second. The results demonstrate state-of-the-art end-to-end energy efficiency for anti-UAV systems.
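The pipeline above is implemented in dedicated hardware, but its logic can be illustrated in software. The sketch below shows a run-length decode of one binary event-frame row, a size/velocity mode switch, and a zero-skipping multiply-accumulate; the RLE convention, threshold values, and function names are illustrative assumptions, not the chip's documented design.

```python
import numpy as np

def decode_rle_row(runs, width):
    """Expand one run-length-encoded row into a binary event row.
    Assumes alternating zero/one run lengths starting with zeros
    (a common RLE convention; the chip's actual format may differ)."""
    row = np.zeros(width, dtype=np.uint8)
    pos, value = 0, 0
    for length in runs:
        if value:
            row[pos:pos + length] = 1
        pos += length
        value ^= 1
    return row

def select_mode(bbox_wh, speed_px_s, size_thresh=24, speed_thresh=40):
    """Frame mode for large, slow targets; event mode for small, fast
    ones. Threshold values are placeholders, not the paper's."""
    w, h = bbox_wh
    if max(w, h) >= size_thresh and speed_px_s <= speed_thresh:
        return "frame"
    return "event"

def zero_skip_mac(activations, weights):
    """Accumulate products only where the activation is nonzero,
    a software analogue of the zero-skipping MAC."""
    nz = np.flatnonzero(activations)
    return float(activations[nz] @ weights[nz])

row = decode_rle_row([5, 3, 120, 2, 126], width=256)   # 5+3+120+2+126 = 256
mode = select_mode(bbox_wh=(8, 6), speed_px_s=65.0)    # small and fast -> "event"
```

On sparse event frames most activations are zero, which is what makes the claimed reduction of more than 97 percent in redundant MAC operations plausible.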
Related papers
- Commercial Vehicle Braking Optimization: A Robust SIFT-Trajectory Approach [6.751326589596112]
A vision-based trajectory analysis solution is proposed to address the "zero-speed braking" issue caused by inaccurate Controller Area Network (CAN) signals.
The algorithm runs on the NVIDIA Jetson AGX Xavier platform and processes sequential video frames from a blind-spot camera.
On-site deployment shows an 89% reduction in false braking events, a 100% success rate in emergency braking, and a fault rate below 5%.
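The blurb does not spell out the trajectory analysis, but the title's SIFT-trajectory idea can be sketched: match SIFT keypoints between consecutive frames and use their median displacement as an ego-motion proxy to cross-check a zero-speed CAN reading. Everything here (the OpenCV pipeline, ratio test, and threshold) is an illustrative reconstruction, not the paper's implementation.

```python
import cv2
import numpy as np

def median_pixel_motion(prev_gray, curr_gray, ratio=0.75):
    """Median displacement of ratio-test-filtered SIFT matches between
    two grayscale frames, a rough proxy for vehicle ego-motion."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(prev_gray, None)
    kp2, des2 = sift.detectAndCompute(curr_gray, None)
    if des1 is None or des2 is None:
        return None
    disp = []
    for pair in cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            p1 = np.asarray(kp1[pair[0].queryIdx].pt)
            p2 = np.asarray(kp2[pair[0].trainIdx].pt)
            disp.append(np.linalg.norm(p2 - p1))
    return float(np.median(disp)) if disp else None

def zero_speed_is_plausible(can_speed, motion_px, still_thresh=0.5):
    """Distrust a zero CAN speed if the camera still sees clear motion
    (threshold in pixels per frame is illustrative)."""
    return not (can_speed == 0 and motion_px is not None and motion_px > still_thresh)
```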
arXiv Detail & Related papers (2025-12-21T05:06:16Z)
- Event-Based Visual Teach-and-Repeat via Fast Fourier-Domain Cross-Correlation [52.46888249268445]
We present the first event-camera-based visual teach-and-repeat system.
We develop a frequency-domain cross-correlation framework that turns the event-stream matching problem into computationally efficient Fourier-space multiplications.
Experiments using a Prophesee EVK4 HD event camera mounted on an AgileX Scout Mini robot demonstrate successful autonomous navigation.
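The core trick, correlating teach-pass and repeat-pass event frames by multiplying in the Fourier domain, is standard signal processing and easy to sketch. The snippet below assumes event streams already reconstructed into equally sized 2-D frames; it is not the authors' code.

```python
import numpy as np

def fft_cross_correlate(a, b):
    """2-D cross-correlation via the convolution theorem: a dense
    correlation over all shifts costs one FFT pair plus an element-wise
    product, O(N log N) instead of O(N^2)."""
    assert a.shape == b.shape
    return np.fft.irfft2(np.fft.rfft2(a) * np.conj(np.fft.rfft2(b)), s=a.shape)

def best_shift(teach_frame, repeat_frame):
    """Shift (dy, dx) maximizing the correlation; the circular FFT wraps
    indices, so map them back to signed shifts."""
    corr = fft_cross_correlate(teach_frame, repeat_frame)
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    return (dy - h if dy > h // 2 else dy, dx - w if dx > w // 2 else dx)
```

The recovered lateral shift is the kind of signal a teach-and-repeat controller would feed into its steering correction.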
arXiv Detail & Related papers (2025-09-21T23:53:31Z)
- Exploiting Lightweight Hierarchical ViT and Dynamic Framework for Efficient Visual Tracking [49.07982079554859]
Transformer-based visual trackers have demonstrated significant advancements due to their powerful modeling capabilities.
However, their practicality on resource-constrained devices is limited by slow processing speeds.
We present HiT, a novel family of efficient tracking models that achieve high performance while maintaining fast operation across various devices.
arXiv Detail & Related papers (2025-06-25T12:46:46Z)
- Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving [3.617580194719686]
This paper introduces Fast-COS, a novel single-stage object detection framework crafted specifically for driving scenes.
Its RAViT backbone achieves 81.4% Top-1 accuracy on the ImageNet-1K dataset.
Fast-COS surpasses leading models in efficiency, delivering up to 75.9% faster GPU inference and 1.38x higher throughput on edge devices.
arXiv Detail & Related papers (2025-02-11T09:54:09Z)
- Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers [56.37495946212932]
Vision transformers (ViTs) have demonstrated superior accuracy on computer vision tasks compared to convolutional neural networks (CNNs).
This work proposes Quasar-ViT, a hardware-oriented quantization-aware architecture search framework for ViTs.
arXiv Detail & Related papers (2024-07-25T16:35:46Z)
- Gesture Recognition for FMCW Radar on the Edge [0.0]
We show that gestures can be characterized efficiently by a set of five features.
A recurrent neural network (RNN) based architecture exploits these features to jointly detect and classify five different gestures.
The proposed system recognizes gestures with an F1 score of 98.4% on our hold-out test dataset.
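A joint detect-and-classify RNN over a handful of per-frame features is compact enough to sketch. Below, the five input features, hidden size, and the extra "no gesture" class are illustrative assumptions expressed in PyTorch, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GestureRNN(nn.Module):
    """GRU over per-frame radar features (e.g. range, Doppler, azimuth,
    elevation, magnitude; placeholder names), with one extra output
    class for 'no gesture' so detection and classification are joint."""
    def __init__(self, n_features=5, hidden=32, n_gestures=5):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_gestures + 1)

    def forward(self, x):             # x: (batch, time, n_features)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])  # logits: 5 gestures + background

logits = GestureRNN()(torch.randn(2, 64, 5))  # 2 sequences of 64 frames
```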
arXiv Detail & Related papers (2023-10-13T06:03:07Z)
- ColibriUAV: An Ultra-Fast, Energy-Efficient Neuromorphic Edge Processing UAV-Platform with Event-Based and Frame-Based Cameras [14.24529561007139]
ColibriUAV is a UAV platform with both frame-based and event-based camera interfaces.
Kraken is capable of efficiently processing both event data from a DVS camera and frame data from an RGB camera.
This paper benchmarks the end-to-end latency and power efficiency of the neuromorphic and event-based UAV subsystem.
arXiv Detail & Related papers (2023-05-27T23:08:22Z)
- Ultra-low Power Deep Learning-based Monocular Relative Localization Onboard Nano-quadrotors [64.68349896377629]
This work presents a novel autonomous end-to-end system that addresses monocular relative localization of two peer nano-drones through deep neural networks (DNNs).
To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated framework including dataset augmentation, quantization, and system optimizations.
Experimental results show that our DNN can precisely localize a 10 cm target nano-drone at distances of up to 2 m using only low-resolution monochrome images.
arXiv Detail & Related papers (2023-03-03T14:14:08Z)
- Fast Motion Understanding with Spatiotemporal Neural Networks and Dynamic Vision Sensors [99.94079901071163]
This paper presents a Dynamic Vision Sensor (DVS) based system for reasoning about high speed motion.
We consider the case of a robot at rest reacting to a small, fast-approaching object at speeds higher than 15 m/s.
For a toy dart moving at 23.4 m/s, our system achieves a 24.73° error in $\theta$, an 18.4 mm average discretized-radius prediction error, and a 25.03% median time-to-collision prediction error.
arXiv Detail & Related papers (2020-11-18T17:55:07Z)
- CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs [41.43273142203345]
We harness the flexibility of FPGAs to develop a novel object detection pipeline with deformable convolutions.
With our high-efficiency implementation, our solution reaches 26.9 frames per second with a tiny model size of 0.76 MB.
Our model achieves 67.1 AP50 on Pascal VOC with only 2.9 MB of parameters, 20.9x smaller yet 10% more accurate than Tiny-YOLO.
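The deformable convolutions at the heart of this pipeline can be exercised with torchvision's stock operator. CoDeNet modifies the operation for FPGA friendliness, so this is only the baseline building block, with illustrative channel sizes.

```python
import torch
from torchvision.ops import deform_conv2d

# A 3x3 deformable convolution: a small side branch predicts per-location
# (dy, dx) sampling offsets for each of the 9 kernel taps.
x = torch.randn(1, 16, 32, 32)
offset_branch = torch.nn.Conv2d(16, 2 * 3 * 3, kernel_size=3, padding=1)
weight = torch.randn(32, 16, 3, 3)            # out_ch, in_ch, kh, kw

offset = offset_branch(x)                     # (1, 18, 32, 32)
y = deform_conv2d(x, offset, weight, padding=1)
print(y.shape)                                # torch.Size([1, 32, 32, 32])
```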
arXiv Detail & Related papers (2020-06-12T17:56:47Z)
- Near-chip Dynamic Vision Filtering for Low-Bandwidth Pedestrian Detection [99.94079901071163]
This paper presents a novel end-to-end system for pedestrian detection using Dynamic Vision Sensors (DVSs).
We target applications where multiple sensors transmit data to a local processing unit, which executes a detection algorithm.
Our detector is able to perform a detection every 450 ms, with an overall testing F1 score of 83%.
arXiv Detail & Related papers (2020-04-03T17:36:26Z)