EV-VGCNN: A Voxel Graph CNN for Event-based Object Classification
- URL: http://arxiv.org/abs/2106.00216v1
- Date: Tue, 1 Jun 2021 04:07:03 GMT
- Title: EV-VGCNN: A Voxel Graph CNN for Event-based Object Classification
- Authors: Yongjian Deng, Hao Chen, Huiying Chen, Youfu Li
- Abstract summary: Event cameras report sparse intensity changes and hold noticeable advantages of low power consumption, high dynamic range, and high response speed for visual perception and understanding on portable devices.
Event-based learning methods have recently achieved massive success on object recognition by integrating events into dense frame-based representations to apply traditional 2D learning algorithms.
These approaches introduce much redundant information during the sparse-to-dense conversion and necessitate models with heavy-weight and large capacities, limiting the potential of event cameras on real-life applications.
- Score: 18.154951807178943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event cameras report sparse intensity changes and hold noticeable advantages
of low power consumption, high dynamic range, and high response speed for
visual perception and understanding on portable devices. Event-based learning
methods have recently achieved massive success on object recognition by
integrating events into dense frame-based representations to apply traditional
2D learning algorithms. However, these approaches introduce much redundant
information during the sparse-to-dense conversion and necessitate models with
heavy-weight and large capacities, limiting the potential of event cameras on
real-life applications. To address the core problem of balancing accuracy and
model complexity for event-based classification models, we (1) construct graph
representations for event data to utilize their sparsity nature better and
design a lightweight end-to-end graph neural network (EV-VGCNN) for
classification; (2) use voxel-wise vertices rather than traditional point-wise
methods to incorporate the information from more points; (3) introduce a
multi-scale feature relational layer (MFRL) to extract semantic and motion cues
from each vertex adaptively concerning its distances to neighbors.
Comprehensive experiments show that our approach advances state-of-the-art
classification accuracy while achieving nearly 20 times parameter reduction
(merely 0.84M parameters).
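The abstract's idea of voxel-wise vertices can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `events_to_voxel_graph`, the voxel size, the number of vertices kept, the value of k, and the two per-vertex features (event count and summed polarity) are all assumptions chosen for illustration. The sketch bins events of the form (x, y, t, polarity) into space-time voxels, keeps the most populated voxels as graph vertices, and links each vertex to its k nearest neighbors.

```python
import numpy as np


def events_to_voxel_graph(events, voxel_size=(4.0, 4.0, 0.01), num_vertices=8, k=3):
    """Hypothetical helper: build a k-NN graph over the densest event voxels.

    events: (N, 4) array with columns (x, y, t, polarity).
    Returns vertex positions (in voxel units), vertex features, neighbor indices.
    """
    events = np.asarray(events, dtype=np.float64)

    # Integer voxel index of every event along x, y and t.
    idx = np.floor(events[:, :3] / np.asarray(voxel_size)).astype(np.int64)

    # Group events by voxel; `inverse` maps each event to its voxel row.
    voxels, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # guard against shape differences across NumPy versions

    # Assumed per-voxel feature: summed polarity of the events it contains.
    polarity_sum = np.bincount(inverse, weights=events[:, 3], minlength=len(voxels))

    # Keep the voxels holding the most events as graph vertices.
    keep = np.argsort(-counts, kind="stable")[:num_vertices]
    positions = voxels[keep].astype(np.float64)
    features = np.stack([counts[keep].astype(np.float64), polarity_sum[keep]], axis=1)

    # k-nearest-neighbor edges from pairwise distances, excluding self-loops.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)
    k = min(k, len(positions) - 1)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return positions, features, neighbors


# Synthetic events, for demonstration only.
rng = np.random.default_rng(0)
events = np.column_stack([
    rng.uniform(0, 32, 200),       # x
    rng.uniform(0, 32, 200),       # y
    rng.uniform(0, 0.1, 200),      # t in seconds
    rng.choice([-1.0, 1.0], 200),  # polarity
])
positions, features, neighbors = events_to_voxel_graph(events)
print(positions.shape, features.shape, neighbors.shape)
```

Distances are measured in voxel units so that the spatial and temporal axes are on a comparable scale. Because each vertex summarizes every event in its voxel, it carries information from more points than a graph built on individually sampled events would.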
Related papers
- Leveraging Neural Radiance Field in Descriptor Synthesis for Keypoints Scene Coordinate Regression [1.2974519529978974]
This paper introduces a pipeline for keypoint descriptor synthesis using Neural Radiance Field (NeRF).
By generating novel poses and feeding them into a trained NeRF model to create new views, our approach enhances the KSCR's capabilities in data-scarce environments.
The proposed system could improve localization accuracy by up to 50% while costing only a fraction of the time for data synthesis.
arXiv Detail & Related papers (2024-03-15T13:40:37Z)
- Event Voxel Set Transformer for Spatiotemporal Representation Learning on Event Streams [19.957857885844838]
Event cameras are neuromorphic vision sensors that record a scene as sparse and asynchronous event streams.
We propose an attention-aware model named Event Voxel Set Transformer (EVSTr) for efficient representation learning on event streams.
Experiments show that EVSTr achieves state-of-the-art performance while maintaining low model complexity.
arXiv Detail & Related papers (2023-03-07T12:48:02Z)
- A Dynamic Graph CNN with Cross-Representation Distillation for Event-Based Recognition [21.225945234873745]
We present a new event-based graph learning framework called graph cross-representation distillation (CRD).
CRD provides additional supervision and prior knowledge for the event graph.
Our model and learning framework are effective and generalize well across multiple vision tasks.
arXiv Detail & Related papers (2023-02-08T16:35:39Z)
- MultiScale MeshGraphNets [65.26373813797409]
We propose two complementary approaches to improve the framework from MeshGraphNets.
First, we demonstrate that it is possible to learn accurate surrogate dynamics of a high-resolution system on a much coarser mesh.
Second, we introduce a hierarchical approach (MultiScale MeshGraphNets) which passes messages on two different resolutions.
arXiv Detail & Related papers (2022-10-02T20:16:20Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel style named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- Tackling Oversmoothing of GNNs with Contrastive Learning [35.88575306925201]
Graph neural networks (GNNs) integrate the comprehensive relation of graph data and representation learning capability.
Oversmoothing makes the final representations of nodes indiscriminative, thus deteriorating the node classification and link prediction performance.
We propose the Topology-guided Graph Contrastive Layer, named TGCL, which is the first de-oversmoothing method maintaining all three mentioned metrics.
arXiv Detail & Related papers (2021-10-26T15:56:16Z)
- CDN-MEDAL: Two-stage Density and Difference Approximation Framework for Motion Analysis [3.337126420148156]
We propose a novel, two-stage method of change detection with two convolutional neural networks.
Our two-stage framework contains approximately 3.5K parameters in total but still maintains rapid convergence to intricate motion patterns.
arXiv Detail & Related papers (2021-06-07T16:39:42Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.