Hyperdimensional computing as a framework for systematic aggregation of
image descriptors
- URL: http://arxiv.org/abs/2101.07720v1
- Date: Tue, 19 Jan 2021 16:49:58 GMT
- Title: Hyperdimensional computing as a framework for systematic aggregation of
image descriptors
- Authors: Peer Neubert and Stefan Schubert
- Abstract summary: We use hyperdimensional computing (HDC) as an approach to combine information from a set of vectors into a single vector of the same dimensionality.
We present an HDC implementation that is suitable for processing the output of existing and future (deep-learning-based) image descriptors.
- Score: 4.56877715768796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image and video descriptors are an omnipresent tool in computer vision and
its application fields like mobile robotics. Many hand-crafted and in
particular learned image descriptors are numerical vectors with a potentially
(very) large number of dimensions. Practical considerations like memory
consumption or time for comparisons call for the creation of compact
representations. In this paper, we use hyperdimensional computing (HDC) as an
approach to systematically combine information from a set of vectors in a
single vector of the same dimensionality. HDC is a known technique to perform
symbolic processing with distributed representation in numerical vectors with
thousands of dimensions. We present a HDC implementation that is suitable for
processing the output of existing and future (deep-learning based) image
descriptors. We discuss how this can be used as a framework to process
descriptors together with additional knowledge by simple and fast vector
operations. A concrete outcome is a novel HDC-based approach to aggregate a set
of local image descriptors together with their image positions in a single
holistic descriptor. The comparison to available holistic descriptors and
aggregation methods on a series of standard mobile robotics place recognition
experiments shows a 20% improvement in average performance compared to the
runner-up and 3.6x better worst-case performance.
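The aggregation described in the abstract can be sketched with two standard HDC operations: binding (elementwise multiplication with a random "role" hypervector) and bundling (elementwise summation). The sketch below is a minimal illustration under assumed choices, not the authors' exact scheme: the dimensionality, the coarse position grid, and the random projection of descriptors into bipolar space are all illustrative assumptions.

```python
import numpy as np

D = 4096          # hypervector dimensionality (illustrative choice)
GRID = 4          # coarse spatial grid; one role vector per cell (assumption)
rng = np.random.default_rng(42)

# Random bipolar "role" hypervectors encoding image positions, one per cell.
pos_hv = rng.choice([-1.0, 1.0], size=(GRID, GRID, D))

def encode_descriptor(desc, proj):
    """Project a real-valued local descriptor into bipolar HD space."""
    return np.sign(proj @ desc)

def aggregate(descriptors, positions, proj):
    """Bind each local descriptor to its position hypervector, then bundle
    (sum) the bound vectors into a single holistic descriptor."""
    acc = np.zeros(D)
    for desc, (x, y) in zip(descriptors, positions):
        gx = min(int(x * GRID), GRID - 1)   # normalized coords -> grid cell
        gy = min(int(y * GRID), GRID - 1)
        acc += pos_hv[gx, gy] * encode_descriptor(desc, proj)  # binding
    return np.sign(acc)                      # bipolarize the bundle

# Usage: 10 local descriptors of length 128 at random normalized positions.
proj = rng.standard_normal((D, 128))         # fixed random projection
descs = rng.standard_normal((10, 128))
pts = rng.random((10, 2))
holistic = aggregate(descs, pts, proj)
```

The holistic vector has the same dimensionality as each encoded input, so two images can be compared with a single dot product regardless of how many local descriptors each contained.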
Related papers
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classical approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- Residual Learning for Image Point Descriptors [56.917951170421894]
We propose a very simple and effective approach to learning local image descriptors by using a hand-crafted detector and descriptor.
We optimize the final descriptor by leveraging the knowledge already present in the handcrafted descriptor.
Our approach has potential applications in ensemble learning and learning with non-differentiable functions.
arXiv Detail & Related papers (2023-12-24T12:51:30Z)
- Fast and Efficient Scene Categorization for Autonomous Driving using VAEs [2.694218293356451]
Scene categorization is a useful precursor task that provides prior knowledge for advanced computer vision tasks.
We propose to generate a global descriptor that captures coarse features from the image and use a classification head to map the descriptors to 3 scene categories: Rural, Urban and Suburban.
The proposed global descriptor is very compact with an embedding length of 128, significantly faster to compute, and robust to seasonal and illumination changes.
arXiv Detail & Related papers (2022-10-26T18:50:15Z)
- Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors [101.62384271200169]
A distinctive representation of image patches in form of features is a key component of many computer vision and robotics tasks.
We investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors.
We consider different applications, including visual localization, patch verification, image matching and retrieval.
arXiv Detail & Related papers (2022-09-27T17:59:04Z)
- Efficient Multiscale Object-based Superpixel Framework [62.48475585798724]
We propose a novel superpixel framework, named Superpixels through Iterative CLEarcutting (SICLE).
SICLE exploits object information and is able to generate a multiscale segmentation on the fly.
It generalizes recent superpixel methods, surpassing them and other state-of-the-art approaches in efficiency and effectiveness according to multiple delineation metrics.
arXiv Detail & Related papers (2022-04-07T15:59:38Z)
- ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at the cost of a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z)
- Efficient Scene Compression for Visual-based Localization [5.575448433529451]
Estimating the pose of a camera with respect to a 3D reconstruction or scene representation is a crucial step for many mixed reality and robotics applications.
This work introduces a novel approach that compresses a scene representation by means of a constrained quadratic program (QP).
Our experiments on publicly available datasets show that our approach compresses a scene representation quickly while delivering accurate pose estimates.
arXiv Detail & Related papers (2020-11-27T18:36:06Z)
- Learning Feature Descriptors using Camera Pose Supervision [101.56783569070221]
We propose a novel weakly-supervised framework that can learn feature descriptors solely from relative camera poses between images.
Because we no longer need pixel-level ground-truth correspondences, our framework opens up the possibility of training on much larger and more diverse datasets for better and unbiased descriptors.
arXiv Detail & Related papers (2020-04-28T06:35:27Z)
- Texture Classification using Block Intensity and Gradient Difference (BIGD) Descriptor [18.51387789714017]
We present an efficient and distinctive local descriptor, namely block intensity and gradient difference (BIGD).
In an image patch, we randomly sample multi-scale block pairs and utilize the intensity and gradient differences of pairwise blocks to construct the local BIGD descriptor.
Experimental results show that the proposed BIGD descriptor with stronger discriminative power yields 0.12% to 6.43% higher classification accuracy than the state-of-the-art texture descriptor.
arXiv Detail & Related papers (2020-02-04T07:03:51Z)
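The BIGD construction summarized above can be sketched as follows. This is a minimal illustration based only on the summary: it assumes each sampled block pair contributes its mean-intensity difference and mean-gradient-magnitude difference, and the pair count and block sizes are illustrative, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def bigd_descriptor(patch, n_pairs=64, max_block=4):
    """Sketch of a BIGD-style descriptor: for randomly sampled multi-scale
    block pairs inside a patch, record the mean-intensity difference and the
    mean gradient-magnitude difference between the two blocks."""
    h, w = patch.shape
    gy, gx = np.gradient(patch.astype(float))   # image gradients
    grad = np.hypot(gx, gy)                     # gradient magnitude
    feats = []
    for _ in range(n_pairs):
        s = int(rng.integers(1, max_block + 1))          # block size (scale)
        y1, x1 = rng.integers(0, h - s), rng.integers(0, w - s)
        y2, x2 = rng.integers(0, h - s), rng.integers(0, w - s)
        b1 = patch[y1:y1 + s, x1:x1 + s]
        b2 = patch[y2:y2 + s, x2:x2 + s]
        g1 = grad[y1:y1 + s, x1:x1 + s]
        g2 = grad[y2:y2 + s, x2:x2 + s]
        feats.append(b1.mean() - b2.mean())     # intensity difference
        feats.append(g1.mean() - g2.mean())     # gradient difference
    return np.asarray(feats)

# Usage: describe a 32x32 patch with 64 block pairs (128-dim descriptor).
desc = bigd_descriptor(rng.random((32, 32)))
```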
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.