Patch-DrosoNet: Classifying Image Partitions With Fly-Inspired Models For Lightweight Visual Place Recognition
- URL: http://arxiv.org/abs/2305.05256v1
- Date: Tue, 9 May 2023 08:25:49 GMT
- Title: Patch-DrosoNet: Classifying Image Partitions With Fly-Inspired Models For Lightweight Visual Place Recognition
- Authors: Bruno Arcanjo, Bruno Ferrarini, Michael Milford, Klaus D. McDonald-Maier, Shoaib Ehsan
- Abstract summary: We present a novel training approach for DrosoNet, wherein separate models are trained on distinct regions of a reference image.
We also introduce a convolutional-like prediction method, in which each DrosoNet unit generates a set of place predictions for each portion of the query image.
Our approach significantly improves upon the VPR performance of previous work while maintaining an extremely compact and lightweight algorithm.
- Score: 22.58641358408613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual place recognition (VPR) enables autonomous systems to localize
themselves within an environment using image information. While Convolutional
Neural Networks (CNNs) currently dominate state-of-the-art VPR performance,
their high computational requirements make them unsuitable for platforms with
budget or size constraints. This has spurred the development of lightweight
algorithms, such as DrosoNet, which employs a voting system based on multiple
bio-inspired units. In this paper, we present a novel training approach for
DrosoNet, wherein separate models are trained on distinct regions of a
reference image, allowing them to specialize in the visual features of that
specific section. Additionally, we introduce a convolutional-like prediction
method, in which each DrosoNet unit generates a set of place predictions for
each portion of the query image. These predictions are then combined using the
previously introduced voting system. Our approach significantly improves upon
the VPR performance of previous work while maintaining an extremely compact and
lightweight algorithm, making it suitable for resource-constrained platforms.
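To make the patch-and-vote idea above concrete, the sketch below partitions every reference image into a fixed grid, fits one small model per region, and at query time lets each regional model score every query region before tallying votes over places. It is only an illustration under stated assumptions: the names `PatchUnit`, `PatchVotingVPR`, and `patch_grid` are hypothetical, and a plain nearest-neighbour scorer stands in for the actual DrosoNet units used in the paper.

```python
# Minimal sketch of region-specialised units with convolutional-like,
# vote-based prediction. NOT the authors' implementation: the scorer below
# is a stand-in for a DrosoNet unit, and all names are hypothetical.
import numpy as np


def patch_grid(image, rows, cols):
    """Split a 2-D image into a rows x cols grid of equally sized patches."""
    h, w = image.shape
    ph, pw = h // rows, w // cols
    return [image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
            for r in range(rows) for c in range(cols)]


class PatchUnit:
    """Stand-in for one DrosoNet unit, specialised on a single grid region."""

    def __init__(self):
        self.templates = None  # one flattened reference patch per place

    def fit(self, patches):
        self.templates = np.stack([p.ravel().astype(np.float32) for p in patches])

    def scores(self, patch):
        """Similarity of a query patch to every reference place (higher = better)."""
        q = patch.ravel().astype(np.float32)
        dists = np.linalg.norm(self.templates - q, axis=1)
        return -dists  # negative distance as a score


class PatchVotingVPR:
    """One unit per reference region; per-patch predictions combined by voting."""

    def __init__(self, rows=3, cols=3):
        self.rows, self.cols = rows, cols
        self.units = [PatchUnit() for _ in range(rows * cols)]

    def fit(self, reference_images):
        # Each unit only ever sees its own region of every reference image,
        # so it specialises in the visual features of that section.
        for idx, unit in enumerate(self.units):
            unit.fit([patch_grid(img, self.rows, self.cols)[idx]
                      for img in reference_images])

    def predict(self, query_image):
        # Convolutional-like prediction: every unit scores every query region,
        # and each (unit, patch) pair casts one vote for its best place.
        votes = np.zeros(len(self.units[0].templates))
        for patch in patch_grid(query_image, self.rows, self.cols):
            for unit in self.units:
                votes[np.argmax(unit.scores(patch))] += 1
        return int(np.argmax(votes))


# Toy usage: 5 random reference "places" and a lightly perturbed query of place 2.
rng = np.random.default_rng(0)
refs = [rng.random((60, 90)) for _ in range(5)]
query = refs[2] + 0.05 * rng.standard_normal((60, 90))
vpr = PatchVotingVPR(rows=3, cols=3)
vpr.fit(refs)
print("predicted place:", vpr.predict(query))  # expected: 2
```

In the paper each unit is a full DrosoNet and the votes are combined with the voting system introduced in the authors' earlier work; the grid size and the nearest-neighbour scoring rule here are illustrative choices only.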
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the demands of real-time visual inference in IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework that jointly optimizes the neural network architecture and its edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- Aggregating Multiple Bio-Inspired Image Region Classifiers For Effective And Lightweight Visual Place Recognition [22.09628302234166]
We propose a novel multi-DrosoNet localization system, dubbed RegionDrosoNet, with significantly improved VPR performance.
Our approach relies on specializing distinct groups of DrosoNets on differently sliced partitions of the original image.
We introduce a novel voting module to combine the outputs of all DrosoNets into the final place prediction.
arXiv Detail & Related papers (2023-12-20T12:57:01Z)
- ClusVPR: Efficient Visual Place Recognition with Clustering-based Weighted Transformer [13.0858576267115]
We present ClusVPR, a novel approach that tackles the specific issues of redundant information in duplicate regions and representations of small objects.
ClusVPR introduces a unique paradigm called Clustering-based weighted Transformer Network (CWTNet).
We also introduce the optimized-VLAD layer that significantly reduces the number of parameters and enhances model efficiency.
arXiv Detail & Related papers (2023-10-06T09:01:15Z)
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
- Glance and Focus Networks for Dynamic Visual Recognition [36.26856080976052]
We formulate the image recognition problem as a sequential coarse-to-fine feature learning process, mimicking the human visual system.
The proposed Glance and Focus Network (GFNet) first extracts a quick global representation of the input image at a low resolution scale, and then strategically attends to a series of salient (small) regions to learn finer features.
It reduces the average latency of the highly efficient MobileNet-V3 on an iPhone XS Max by 1.3x without sacrificing accuracy.
arXiv Detail & Related papers (2022-01-09T14:00:56Z)
- Optimising for Interpretability: Convolutional Dynamic Alignment Networks [108.83345790813445]
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA Nets).
Their core building blocks are Dynamic Alignment Units (DAUs), which are optimised to transform their inputs with dynamically computed weight vectors that align with task-relevant patterns.
CoDA Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions.
arXiv Detail & Related papers (2021-09-27T12:39:46Z)
- An Efficient and Scalable Collection of Fly-inspired Voting Units for Visual Place Recognition in Changing Environments [20.485491385050615]
Low-overhead VPR techniques would enable place recognition on platforms equipped with low-end, cheap hardware.
Our goal is to provide an algorithm of extreme compactness and efficiency while achieving state-of-the-art robustness to appearance changes and small point-of-view variations.
arXiv Detail & Related papers (2021-09-22T19:01:20Z)
- A Deep-Unfolded Reference-Based RPCA Network For Video Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
- Learning to Learn Parameterized Classification Networks for Scalable Input Images [76.44375136492827]
Convolutional Neural Networks (CNNs) do not exhibit predictable recognition behavior with respect to changes in input resolution.
We employ meta learners to generate convolutional weights of main networks for various input scales.
We further utilize knowledge distillation on the fly over model predictions based on different input resolutions.
arXiv Detail & Related papers (2020-07-13T04:27:25Z)
- Contextual Encoder-Decoder Network for Visual Saliency Prediction [42.047816176307066]
We propose an approach based on a convolutional neural network pre-trained on a large-scale image classification task.
We combine the resulting representations with global scene information for accurately predicting visual saliency.
Compared to state of the art approaches, the network is based on a lightweight image classification backbone.
arXiv Detail & Related papers (2019-02-18T16:15:25Z)