Memory-Efficient Graph Convolutional Networks for Object Classification
and Detection with Event Cameras
- URL: http://arxiv.org/abs/2307.14124v1
- Date: Wed, 26 Jul 2023 11:44:44 GMT
- Title: Memory-Efficient Graph Convolutional Networks for Object Classification
and Detection with Event Cameras
- Authors: Kamil Jeziorek, Andrea Pinna, Tomasz Kryjak
- Abstract summary: Graph convolutional networks (GCNs) are a promising approach for analyzing event data.
In this paper, we consider both factors together in order to achieve satisfying results and relatively low model complexity.
Our results show a 450-fold reduction in the number of parameters for the feature extraction module and a 4.5-fold reduction in the size of the data representation.
- Score: 2.3311605203774395
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in event camera research emphasize processing data in its
original sparse form, which allows the use of its unique features such as high
temporal resolution, high dynamic range, low latency, and resistance to image
blur. One promising approach for analyzing event data is through graph
convolutional networks (GCNs). However, current research in this domain
primarily focuses on optimizing computational costs, neglecting the associated
memory costs. In this paper, we consider both factors together in order to
achieve satisfying results and relatively low model complexity. For this
purpose, we performed a comparative analysis of different graph convolution
operations, considering factors such as execution time, the number of trainable
model parameters, data format requirements, and training outcomes. Our results
show a 450-fold reduction in the number of parameters for the feature
extraction module and a 4.5-fold reduction in the size of the data
representation while maintaining a classification accuracy of 52.3%, which is
6.3% higher compared to the operation used in state-of-the-art approaches. To
further evaluate performance, we implemented the object detection architecture
and evaluated its performance on the N-Caltech101 dataset. The results showed
53.7% mAP@0.5 and an execution rate of 82 graphs per
second.
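As a rough illustration of the pipeline the abstract describes — building a graph from sparse events and applying a graph convolution — here is a minimal NumPy sketch. The function names, radius threshold, and mean aggregation are illustrative assumptions, not the paper's actual operations.

```python
import numpy as np

def events_to_graph(events, radius=3.0):
    """Connect events (x, y, t) that lie within `radius` of each other
    in space-time. Returns edges as (src, dst) index pairs.
    Illustrative only: real pipelines use k-NN/radius search structures."""
    diffs = events[:, None, :] - events[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    n = len(events)
    adj = (dist < radius) & ~np.eye(n, dtype=bool)
    return np.argwhere(adj)  # shape (num_edges, 2)

def graph_conv(feats, edges, weight):
    """One GCN-style layer: mean-aggregate neighbor features (plus a
    self-loop), then apply a shared linear transform."""
    n = feats.shape[0]
    agg = feats.copy()            # self contribution
    deg = np.ones(n)              # self-loop counts as one neighbor
    for src, dst in edges:
        agg[dst] += feats[src]
        deg[dst] += 1.0
    return (agg / deg[:, None]) @ weight

# Toy usage: 5 events with (x, y, t) coordinates, 1-D input features.
events = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 1.],
                   [10., 10., 5.], [11., 10., 5.]])
edges = events_to_graph(events, radius=2.0)
feats = np.ones((5, 1))
out = graph_conv(feats, edges, weight=np.array([[2.0]]))
print(out.shape)  # (5, 1)
```

Different choices for the aggregation and transform in `graph_conv` (max vs. mean, per-edge MLPs, etc.) are exactly the kind of operation the paper compares for execution time, parameter count, and accuracy.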
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- A Re-Parameterized Vision Transformer (ReVT) for Domain-Generalized Semantic Segmentation [24.8695123473653]
We present a new augmentation-driven approach to domain generalization for semantic segmentation.
We achieve state-of-the-art mIoU performance of 47.3% (prior art: 46.3%) for small models and of 50.1% (prior art: 47.8%) for midsized models on commonly used benchmark datasets.
arXiv Detail & Related papers (2023-08-25T12:06:00Z)
- Data-Side Efficiencies for Lightweight Convolutional Neural Networks [4.5853328688992905]
We show how four data attributes - number of classes, object color, image resolution, and object scale - affect neural network model size and efficiency.
We provide an example, applying the metrics and methods to choose a lightweight model for a robot path planning application.
arXiv Detail & Related papers (2023-08-24T19:50:25Z)
- CNN-transformer mixed model for object detection [3.5897534810405403]
In this paper, I propose a convolutional module with a transformer.
It aims to improve the recognition accuracy of the model by fusing the detailed features extracted by CNN with the global features extracted by a transformer.
After 100 rounds of training on the Pascal VOC dataset, the accuracy reached 81%, which is 4.6 percentage points better than Faster R-CNN [4] using ResNet-101 [5] as the backbone.
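The fusion of local CNN features with a global transformer feature can be sketched in a few lines. This is a hypothetical illustration of the general idea — broadcast-and-concatenate — not the paper's actual module; all names here are made up.

```python
import numpy as np

def fuse_features(cnn_map, global_token):
    """Fuse a local CNN feature map (C1, H, W) with a transformer's
    global feature vector (C2,) by broadcasting the global vector over
    the spatial grid and concatenating along the channel axis."""
    c2 = global_token.shape[0]
    _, h, w = cnn_map.shape
    tiled = np.broadcast_to(global_token[:, None, None], (c2, h, w))
    return np.concatenate([cnn_map, tiled], axis=0)

# Toy usage: 64-channel 8x8 conv features + a 32-dim global token.
fused = fuse_features(np.zeros((64, 8, 8)), np.ones(32))
print(fused.shape)  # (96, 8, 8)
```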
arXiv Detail & Related papers (2022-12-13T16:35:35Z)
- Pushing the Limits of Asynchronous Graph-based Object Detection with Event Cameras [62.70541164894224]
We introduce several architecture choices which allow us to scale the depth and complexity of such models while maintaining low computation.
Our method runs 3.7 times faster than a dense graph neural network, taking only 8.4 ms per forward pass.
arXiv Detail & Related papers (2022-11-22T15:14:20Z)
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
- Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z)
- Inception Convolution with Efficient Dilation Search [121.41030859447487]
Dilated convolution is a critical variant of the standard convolutional neural network for controlling effective receptive fields and handling large scale variance of objects.
We propose a new variant of dilated convolution, namely inception (dilated) convolution, in which the convolutions have independent dilations along different axes, channels, and layers.
To fit the complex inception convolution to the data, we develop a simple yet effective dilation search algorithm (EDO) based on statistical optimization.
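The idea of independent dilations per axis can be sketched as a single-channel 2D convolution whose dilation differs between height and width. This is an illustrative NumPy reimplementation of per-axis dilation, not the EDO search itself.

```python
import numpy as np

def dilated_conv2d(x, kernel, dil_h, dil_w):
    """Valid 2D cross-correlation with independent dilation per spatial
    axis, illustrating 'inception (dilated) convolution'. Frameworks
    typically expose this as a dilation tuple, e.g. (dil_h, dil_w)."""
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dil_h + 1   # effective receptive field height
    eff_w = (kw - 1) * dil_w + 1   # effective receptive field width
    H, W = x.shape
    out = np.zeros((H - eff_h + 1, W - eff_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Strided slice picks the dilated taps of the kernel window.
            patch = x[i:i + eff_h:dil_h, j:j + eff_w:dil_w]
            out[i, j] = np.sum(patch * kernel)
    return out

# Dilation 2 along height, 1 along width: a 3x3 kernel covers a 5x3 field.
out = dilated_conv2d(np.ones((8, 8)), np.ones((3, 3)), dil_h=2, dil_w=1)
print(out.shape)  # (4, 6)
```

Decoupling the two dilation factors is what enables anisotropic receptive fields, which the search algorithm then selects per axis, channel, and layer.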
arXiv Detail & Related papers (2020-12-25T14:58:35Z)
- se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains [12.71983073907091]
This work proposes a data-driven optimization approach for long-term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The proposed approach achieves consistently robust estimates and outperforms alternatives, even though the alternatives were trained with real images.
arXiv Detail & Related papers (2020-07-27T21:09:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.