An Efficient and Scalable Collection of Fly-inspired Voting Units for
  Visual Place Recognition in Changing Environments
- URL: http://arxiv.org/abs/2109.10986v1
- Date: Wed, 22 Sep 2021 19:01:20 GMT
- Title: An Efficient and Scalable Collection of Fly-inspired Voting Units for
  Visual Place Recognition in Changing Environments
- Authors: Bruno Arcanjo, Bruno Ferrarini, Michael Milford, Klaus D.
  McDonald-Maier and Shoaib Ehsan
- Abstract summary: Low-overhead VPR techniques would enable deployment on platforms equipped with low-end, cheap hardware.
Our goal is to provide an algorithm of extreme compactness and efficiency while achieving state-of-the-art robustness to appearance changes and small point-of-view variations.
- Score: 20.485491385050615
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   State-of-the-art visual place recognition performance is currently being
achieved utilizing deep learning based approaches. Despite the recent efforts
in designing lightweight convolutional neural network based models, these can
still be too expensive for the most hardware restricted robot applications.
Low-overhead VPR techniques would not only enable platforms equipped with
low-end, cheap hardware but also reduce computation on more powerful systems,
allowing these resources to be allocated for other navigation tasks. In this
work, our goal is to provide an algorithm of extreme compactness and efficiency
while achieving state-of-the-art robustness to appearance changes and small
point-of-view variations. Our first contribution is DrosoNet, an exceptionally
compact model inspired by the odor processing abilities of the fruit fly,
Drosophila melanogaster. Our second and main contribution is a voting mechanism
that leverages multiple small and efficient classifiers to achieve more robust
and consistent VPR compared to a single one. We use DrosoNet as the baseline
classifier for the voting mechanism and evaluate our models on five benchmark
datasets, assessing moderate to extreme appearance changes and small to
moderate viewpoint variations. We then compare the proposed algorithms to
state-of-the-art methods, both in terms of precision-recall AUC results and
computational efficiency.
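
The voting idea in the abstract, combining many small classifiers into one place prediction, can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so plain plurality voting is assumed here. The names `Classifier` and `vote` are hypothetical, and each voting unit is modelled as any callable that maps an image to a place index; none of this is taken from the paper's implementation.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical type: a voting unit maps an image (any object) to a place index.
Classifier = Callable[[object], int]


def vote(units: Sequence[Classifier], image: object) -> Tuple[int, float]:
    """Return the place index with the most votes and its share of all votes.

    Assumption: simple plurality voting; the paper's actual scheme may differ.
    """
    if not units:
        raise ValueError("at least one voting unit is required")
    # Each unit casts one vote for the place it predicts.
    votes = Counter(unit(image) for unit in units)
    place, count = votes.most_common(1)[0]
    return place, count / len(units)


# Usage with three stand-in units; two of them agree on place 3.
units = [lambda img: 3, lambda img: 3, lambda img: 7]
place, share = vote(units, None)
```

The vote share gives a rough confidence value that could be thresholded when sweeping a precision-recall curve, although that use is also an assumption rather than something stated in the abstract.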
 
      
        Related papers
        - DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model [2.163881720692685]
 Learning-based monocular visual odometry (VO) poses robustness, generalization, and efficiency challenges in robotics.
Recent advances in visual foundation models, such as DINOv2, have improved robustness and generalization in various vision tasks.
We present DINO-VO, a feature-based VO system leveraging DINOv2 visual foundation model for its sparse feature matching.
 arXiv  Detail & Related papers  (2025-07-17T14:09:34Z)
- SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution [55.14432034345353]
 We study key design principles for latter cascaded video super-resolution models, which are underexplored currently.
First, we propose two strategies to generate training pairs that better mimic the output characteristics of the base model, ensuring alignment between the VSR model and its upstream generator.
Second, we provide critical insights into VSR model behavior through systematic analysis of (1) timestep sampling strategies, (2) noise augmentation effects on low-resolution (LR) inputs.
 arXiv  Detail & Related papers  (2025-06-24T17:57:26Z)
- Numerical Pruning for Efficient Autoregressive Models [87.56342118369123]
 This paper focuses on compressing decoder-only transformer-based autoregressive models through structural weight pruning.
Specifically, we propose a training-free pruning method that calculates a numerical score with Newton's method for the Attention and MLP modules, respectively.
To verify the effectiveness of our method, we provide both theoretical support and extensive experiments.
 arXiv  Detail & Related papers  (2024-12-17T01:09:23Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
 Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
 arXiv  Detail & Related papers  (2024-10-29T19:02:54Z)
- Aggregating Multiple Bio-Inspired Image Region Classifiers For Effective
  And Lightweight Visual Place Recognition [22.09628302234166]
 We propose a novel multi-DrosoNet localization system, dubbed RegionDrosoNet, with significantly improved VPR performance.
Our approach relies on specializing distinct groups of DrosoNets on differently sliced partitions of the original image.
We introduce a novel voting module to combine the outputs of all DrosoNets into the final place prediction.
 arXiv  Detail & Related papers  (2023-12-20T12:57:01Z)
- Sample Less, Learn More: Efficient Action Recognition via Frame Feature
  Restoration [59.6021678234829]
 We propose a novel method to restore the intermediate features for two sparsely sampled and adjacent video frames.
With the integration of our method, the efficiency of three commonly used baselines has been improved by over 50%, with a mere 0.5% reduction in recognition accuracy.
 arXiv  Detail & Related papers  (2023-07-27T13:52:42Z)
- Patch-DrosoNet: Classifying Image Partitions With Fly-Inspired Models
  For Lightweight Visual Place Recognition [22.58641358408613]
 We present a novel training approach for DrosoNet, wherein separate models are trained on distinct regions of a reference image.
We also introduce a convolutional-like prediction method, in which each DrosoNet unit generates a set of place predictions for each portion of the query image.
Our approach significantly improves upon the VPR performance of previous work while maintaining an extremely compact and lightweight algorithm.
 arXiv  Detail & Related papers  (2023-05-09T08:25:49Z)
- Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
 We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification.
We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information.
SimPFs can achieve a reduction in more than half of the number of floating point operations for off-the-shelf audio neural networks.
 arXiv  Detail & Related papers  (2022-10-03T14:00:41Z)
- A Brain-Inspired Low-Dimensional Computing Classifier for Inference on
  Tiny Devices [17.976792694929063]
 We propose a low-dimensional computing (LDC) alternative to hyperdimensional computing (HDC).
We map our LDC classifier into a neural equivalent network and optimize our model using a principled training approach.
Our LDC classifier offers an overwhelming advantage over the existing brain-inspired HDC models and is particularly suitable for inference on tiny devices.
 arXiv  Detail & Related papers  (2022-03-09T17:20:12Z)
- Enhancing Object Detection for Autonomous Driving by Optimizing Anchor
  Generation and Addressing Class Imbalance [0.0]
 This study presents an enhanced 2D object detector based on Faster R-CNN that is better suited for the context of autonomous vehicles.
The proposed modifications over the Faster R-CNN do not increase computational cost and can easily be extended to optimize other anchor-based detection frameworks.
 arXiv  Detail & Related papers  (2021-04-08T16:58:31Z)
- Adversarial Feature Augmentation and Normalization for Visual
  Recognition [109.6834687220478]
 Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
 arXiv  Detail & Related papers  (2021-03-22T20:36:34Z)
- Binary Neural Networks for Memory-Efficient and Effective Visual Place
  Recognition in Changing Environments [24.674034243725455]
 Visual place recognition (VPR) is a robot's ability to determine whether a place was visited before using visual data.
CNN-based approaches are unsuitable for resource-constrained platforms, such as small robots and drones.
We propose a new class of highly compact models that drastically reduces the memory requirements and computational effort.
 arXiv  Detail & Related papers  (2020-10-01T22:59:34Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
 We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
 arXiv  Detail & Related papers  (2020-08-19T13:13:01Z)
- An Image Enhancing Pattern-based Sparsity for Real-time Inference on
  Mobile Devices [58.62801151916888]
 We introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
 arXiv  Detail & Related papers  (2020-01-20T16:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     