Pushing the Limits of Asynchronous Graph-based Object Detection with
Event Cameras
- URL: http://arxiv.org/abs/2211.12324v1
- Date: Tue, 22 Nov 2022 15:14:20 GMT
- Title: Pushing the Limits of Asynchronous Graph-based Object Detection with
Event Cameras
- Authors: Daniel Gehrig and Davide Scaramuzza
- Abstract summary: We introduce several architecture choices which allow us to scale the depth and complexity of such models while maintaining low computation.
Our method runs 3.7 times faster than a dense graph neural network, taking only 8.4 ms per forward pass.
- Score: 62.70541164894224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art machine-learning methods for event cameras treat events as
dense representations and process them with conventional deep neural networks.
Thus, they fail to maintain the sparsity and asynchronous nature of event data,
thereby imposing significant computation and latency constraints on downstream
systems. A recent line of work tackles this issue by modeling events as
spatiotemporally evolving graphs that can be efficiently and asynchronously
processed using graph neural networks. These works showed impressive
computation reductions, yet their accuracy is still limited by the small scale
and shallow depth of their networks, both of which are required to reduce
computation. In this work, we break this glass ceiling by introducing several
architecture choices which allow us to scale the depth and complexity of such
models while maintaining low computation. On object detection tasks, our
smallest model shows up to 3.7 times lower computation, while outperforming
state-of-the-art asynchronous methods by 7.4 mAP. Even when scaling to larger
model sizes, we are 13% more efficient than state-of-the-art while
outperforming it by 11.5 mAP. As a result, our method runs 3.7 times faster
than a dense graph neural network, taking only 8.4 ms per forward pass. This
opens the door to efficient and accurate object detection in edge-case
scenarios.
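To make the spatiotemporal-graph formulation concrete, here is a minimal sketch of building such a graph from raw events: each event becomes a node, and edges connect events that fall within a radius in normalized (x, y, t) space. The function name, normalization, radius, and neighbor cap are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_event_graph(events, radius=0.05, max_neighbors=16):
    """Illustrative sketch: events (N, 4) of [x, y, t, polarity] -> graph.

    The normalization and hyperparameters below are assumptions for the
    sketch, not the paper's actual settings.
    """
    xy = events[:, :2] / np.maximum(events[:, :2].max(axis=0), 1.0)
    t = events[:, 2:3] - events[0, 2]
    t = t / max(t.max(), 1e-9)                  # rescale time to [0, 1]
    pos = np.hstack([xy, t])                    # node positions in R^3

    tree = cKDTree(pos)                         # radius search in (x, y, t)
    edges = []
    for i, nbrs in enumerate(tree.query_ball_point(pos, r=radius)):
        for j in nbrs[:max_neighbors]:
            if i != j:
                edges.append((i, j))            # directed edge i -> j
    return pos, np.asarray(edges)
```

Because a new event only adds one node and a handful of local edges, such graphs can be updated, and hence processed, asynchronously as events arrive.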
Related papers
- Memory-Efficient Graph Convolutional Networks for Object Classification and Detection with Event Cameras [2.3311605203774395]
Graph convolutional networks (GCNs) are a promising approach for analyzing event data.
In this paper, we consider both factors together in order to achieve satisfying results and relatively low model complexity.
Our results show a 450-fold reduction in the number of parameters for the feature extraction module and a 4.5-fold reduction in the size of the data representation.
arXiv Detail & Related papers (2023-07-26T11:44:44Z)
- AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving" spatio-temporal graphs.
AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z)
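One way to picture the synchronous-to-asynchronous conversion described above: when a new event node is inserted, only nodes whose receptive field contains it need their activations recomputed. The bookkeeping below is a minimal stand-in, not the AEGNN code; `update_fn` plays the role of one trained message-passing layer, and a deeper network would grow the affected set by one hop per layer.

```python
import numpy as np

def async_node_update(features, adjacency, new_idx, update_fn):
    """Recompute only the nodes affected by a newly inserted event node.

    features: dict node -> feature vector; adjacency: dict node -> set of
    neighbors, with edges to the new node already inserted. Illustrative
    stand-in, not the paper's implementation.
    """
    affected = {new_idx} | set(adjacency[new_idx])   # 1-hop receptive field
    for i in affected:
        nbrs = adjacency[i]
        if nbrs:
            msg = np.mean([features[j] for j in nbrs], axis=0)
            features[i] = update_fn(features[i], msg)
    return features

# Illustrative stand-in for a message-passing layer (weights omitted).
relu_update = lambda h, msg: np.maximum(h + msg, 0.0)
```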
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
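As a rough picture of one grow-and-prune cycle: pruning zeroes the smallest-magnitude weights down to a target sparsity, and growing reactivates some pruned positions so they can be retrained. The two rules below are deliberately simple toy versions, not the scheduled GaP method itself; the 80% default only echoes the sparsity level quoted above.

```python
import numpy as np

def prune_step(weights, sparsity=0.8):
    """Zero the smallest-magnitude weights (toy magnitude criterion)."""
    k = int(weights.size * sparsity)
    thresh = np.partition(np.abs(weights), k, axis=None)[k]
    mask = np.abs(weights) >= thresh
    return weights * mask, mask

def grow_step(mask, n_new, seed=0):
    """Randomly reactivate n_new pruned positions (one simple grow rule)."""
    rng = np.random.default_rng(seed)
    pruned = np.flatnonzero(~mask.ravel())
    revive = rng.choice(pruned, size=min(n_new, pruned.size), replace=False)
    out = mask.ravel().copy()
    out[revive] = True
    return out.reshape(mask.shape)
```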
- FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often have a large number of parameters and incur heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
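Coarse-to-fine estimation predicts flow at the coarsest level, then at each finer level upsamples the estimate, warps the second image's features toward the first, and predicts only a residual correction. The loop below is a schematic with a hypothetical `predict_residual` decoder, not FastFlowNet's actual blocks, and it assumes each pyramid level doubles the resolution.

```python
import torch
import torch.nn.functional as F

def warp(x, flow):
    """Backward-warp features x (B, C, H, W) by flow (B, 2, H, W)."""
    b, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=x.dtype),
                            torch.arange(w, dtype=x.dtype), indexing="ij")
    gx = 2.0 * (xs + flow[:, 0]) / max(w - 1, 1) - 1.0   # to [-1, 1]
    gy = 2.0 * (ys + flow[:, 1]) / max(h - 1, 1) - 1.0
    grid = torch.stack([gx, gy], dim=-1)                  # (B, H, W, 2)
    return F.grid_sample(x, grid, align_corners=True)

def coarse_to_fine(feats1, feats2, predict_residual):
    """feats1/feats2: per-level feature maps, coarsest first."""
    b, _, h, w = feats1[0].shape
    flow = feats1[0].new_zeros(b, 2, h, w)
    for f1, f2 in zip(feats1, feats2):
        flow = 2.0 * F.interpolate(flow, size=f1.shape[2:],   # x2 per level
                                   mode="bilinear", align_corners=False)
        flow = flow + predict_residual(f1, warp(f2, flow), flow)
    return flow
```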
- EDNet: Efficient Disparity Estimation with Cost Volume Combination and Attention-based Spatial Residual [17.638034176859932]
Existing disparity estimation works mostly leverage the 4D concatenation volume and construct a very deep 3D convolution neural network (CNN) for disparity regression.
In this paper, we propose a network named EDNet for efficient disparity estimation.
Experiments on the Scene Flow and KITTI datasets show that EDNet outperforms the previous 3D CNN based works.
arXiv Detail & Related papers (2020-10-26T04:49:44Z)
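For context on the two ingredients: a 4D concatenation volume stacks left features against right features shifted by every candidate disparity, while a correlation volume collapses channels into a dot-product score and is far cheaper. Both sketches below are standard textbook constructions, not EDNet's code.

```python
import torch

def concat_volume(fl, fr, max_disp):
    """(B, C, H, W) left/right features -> (B, 2C, D, H, W) volume."""
    b, c, h, w = fl.shape
    vol = fl.new_zeros(b, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        vol[:, :c, d, :, d:] = fl[:, :, :, d:]
        vol[:, c:, d, :, d:] = fr[:, :, :, :w - d]
    return vol

def correlation_volume(fl, fr, max_disp):
    """Dot-product similarity instead of concatenation: (B, D, H, W)."""
    b, c, h, w = fl.shape
    vol = fl.new_zeros(b, max_disp, h, w)
    for d in range(max_disp):
        vol[:, d, :, d:] = (fl[:, :, :, d:] * fr[:, :, :, :w - d]).mean(dim=1)
    return vol
```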
- Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
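A temporal attention module scores each frame in the sequence so that computation concentrates on the most informative skeletons. The toy module below weights frames by a learned score over globally pooled features; shapes and design are illustrative, not the paper's TAM.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Reweight T frames of skeleton features (B, C, T, V); a simplified
    stand-in for the paper's temporal attention module."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Linear(channels, 1)      # one score per frame

    def forward(self, x):                        # x: (B, C, T, V)
        pooled = x.mean(dim=3)                   # pool over V joints -> (B, C, T)
        attn = torch.softmax(self.score(pooled.transpose(1, 2)), dim=1)  # (B, T, 1)
        return x * attn.transpose(1, 2).unsqueeze(-1)   # broadcast over C, V
```

In a selection-based reading, the same scores could instead pick a top-k subset of frames to process, which is where an efficiency gain would come from.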
- Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition [21.720764076798904]
We propose a highly efficient graph convolutional network for action recognition.
Our network requires 86%-93% less parameters and reduces the floating point operations by 89%-96%.
It provides a much better trade-off between accuracy, memory footprint and processing time, which makes it suitable for robotics applications.
arXiv Detail & Related papers (2020-10-14T19:06:23Z)
- Event-based Asynchronous Sparse Convolutional Networks [54.094244806123235]
Event cameras are bio-inspired sensors that respond to per-pixel brightness changes in the form of asynchronous and sparse "events".
We present a general framework for converting models trained on synchronous image-like event representations into asynchronous models with identical output.
We show both theoretically and experimentally that this drastically reduces the computational complexity and latency of high-capacity, synchronous neural networks.
arXiv Detail & Related papers (2020-03-20T08:39:49Z)
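For convolutional models, the conversion rests on the fact that a single changed input pixel only affects a small window of each layer's output, so cached activations can be patched rather than recomputed. A toy single-channel, single-layer illustration of that bookkeeping (not the paper's general framework):

```python
import numpy as np
from scipy.signal import correlate2d

def dense_conv(x, kernel):
    """Reference dense pass: zero-padded 'same' correlation."""
    return correlate2d(x, kernel, mode="same")

def patch_output(y, x_new, kernel, changed):
    """Update cached y = dense_conv(x_old, kernel) after sparse changes.

    Assumes an odd (2k+1) x (2k+1) kernel; only the output window around
    each changed pixel is recomputed, which is where the savings for
    sparse event input come from.
    """
    k = kernel.shape[0] // 2
    h, w = x_new.shape
    xp = np.pad(x_new, k)                        # zero padding, as in 'same'
    for r, c in changed:                         # changed: list of (row, col)
        for i in range(max(r - k, 0), min(r + k + 1, h)):
            for j in range(max(c - k, 0), min(c + k + 1, w)):
                y[i, j] = np.sum(xp[i:i + 2*k + 1, j:j + 2*k + 1] * kernel)
    return y
```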
- Compression of descriptor models for mobile applications [26.498907514590165]
We evaluate the computational cost, model size, and matching accuracy tradeoffs for deep neural networks.
We observe a significant redundancy in the learned weights, which we exploit through the use of depthwise separable layers.
We propose the Convolution-Depthwise-Pointwise (CDP) layer, which provides a means of interpolating between the standard and depthwise separable convolutions.
arXiv Detail & Related papers (2020-01-09T17:00:21Z)
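For context on that design space: a depthwise separable layer factors a standard convolution into a per-channel spatial convolution plus a 1x1 pointwise mix, cutting the spatial-filter parameters by roughly the channel count. One way to picture interpolating between the two extremes, in the spirit of the CDP layer, is through the group count of a grouped convolution; the PyTorch blocks below are illustrative, not the paper's layer definition.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, k=3, groups=1):
    """Spatial conv with a variable group count, then pointwise mixing.

    groups=1 recovers a dense spatial convolution; groups=in_ch gives the
    depthwise separable extreme; intermediate values sit in between
    (one reading of the CDP idea, not the paper's exact layer).
    """
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=groups),
        nn.Conv2d(in_ch, out_ch, 1),     # 1x1 pointwise channel mixing
    )

standard  = conv_block(64, 128, groups=1)    # full spatial-channel coupling
depthwise = conv_block(64, 128, groups=64)   # depthwise separable extreme
middle    = conv_block(64, 128, groups=8)    # an intermediate trade-off
```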