Dynamic Texture Recognition using PDV Hashing and Dictionary Learning on
Multi-scale Volume Local Binary Pattern
- URL: http://arxiv.org/abs/2111.12315v1
- Date: Wed, 24 Nov 2021 07:57:14 GMT
- Title: Dynamic Texture Recognition using PDV Hashing and Dictionary Learning on
Multi-scale Volume Local Binary Pattern
- Authors: Ruxin Ding, Jianfeng Ren, Heng Yu, Jiawei Li
- Abstract summary: We propose a method for dynamic texture recognition using PDV hashing and dictionary learning on multi-scale volume local binary pattern (PHD-MVLBP).
Instead of forming very high-dimensional LBP histogram features, it first uses hash functions to map the pixel difference vectors (PDVs) to binary vectors, then forms a dictionary from the derived binary vectors, and encodes them using that dictionary.
- Score: 11.497810572868396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatial-temporal local binary pattern (STLBP) has been widely used in dynamic
texture recognition. STLBP often encounters the high-dimension problem, as its
dimension increases exponentially with the neighborhood size, so STLBP can only utilize a small
neighborhood. To tackle this problem, we propose a method for dynamic texture
recognition using PDV hashing and dictionary learning on multi-scale volume
local binary pattern (PHD-MVLBP). Instead of forming very high-dimensional LBP
histogram features, it first uses hash functions to map the pixel difference
vectors (PDVs) to binary vectors, then forms a dictionary from the derived
binary vectors, and encodes them using that dictionary. In this way,
the PDVs are mapped to feature vectors whose size equals that of the dictionary, instead of
LBP histograms of very high dimension. Such an encoding scheme can effectively extract
the discriminant information from videos in a much larger neighborhood.
Experimental results on two widely used dynamic texture
datasets, DynTex++ and UCLA, show the superior performance of the proposed
approach over state-of-the-art methods.
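The pipeline described in the abstract (PDVs, hashing to binary vectors, a dictionary, then encoding) can be illustrated with a minimal sketch. The abstract does not say how the hash functions or the dictionary are obtained, so everything specific here is an assumption made for illustration: random sign projections stand in for the hash functions, the most frequent binary codes stand in for the dictionary, and all function names, the 3x3x3 neighborhood, and the parameter values are hypothetical. This is not the authors' implementation.

```python
import random
from collections import Counter


def pixel_difference_vectors(volume, radius=1):
    """Return one PDV per interior voxel: neighbour intensities minus the centre."""
    frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    steps = (-radius, 0, radius)
    offsets = [(dt, dy, dx) for dt in steps for dy in steps for dx in steps
               if (dt, dy, dx) != (0, 0, 0)]  # 26 spatio-temporal neighbours
    pdvs = []
    for t in range(radius, frames - radius):
        for y in range(radius, height - radius):
            for x in range(radius, width - radius):
                centre = volume[t][y][x]
                pdvs.append([volume[t + dt][y + dy][x + dx] - centre
                             for dt, dy, dx in offsets])
    return pdvs


def make_projections(dim, n_bits, rng):
    """Assumption: random Gaussian projections stand in for the hash functions."""
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def hash_pdv(pdv, projections):
    """Map a PDV to a binary vector: one sign bit per projection."""
    return tuple(1 if sum(w * v for w, v in zip(row, pdv)) > 0 else 0
                 for row in projections)


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def build_dictionary(codes, size):
    """Assumption: the `size` most frequent codes serve as dictionary atoms."""
    return [code for code, _ in Counter(codes).most_common(size)]


def encode(codes, dictionary):
    """Assign each code to its nearest atom and return a normalised histogram."""
    hist = [0] * len(dictionary)
    for code in codes:
        nearest = min(range(len(dictionary)),
                      key=lambda k: hamming(code, dictionary[k]))
        hist[nearest] += 1
    total = sum(hist)
    return [h / total for h in hist]


# Toy "video": 4 frames of 6x6 random grey levels.
rng = random.Random(0)
volume = [[[rng.randint(0, 255) for _ in range(6)] for _ in range(6)]
          for _ in range(4)]

pdvs = pixel_difference_vectors(volume)
projections = make_projections(len(pdvs[0]), n_bits=8, rng=rng)
codes = [hash_pdv(p, projections) for p in pdvs]
dictionary = build_dictionary(codes, size=4)
feature = encode(codes, dictionary)
print(len(pdvs), len(pdvs[0]), len(feature))
```

The point of the sketch is the size of the final feature: it has one entry per dictionary atom, whereas a plain LBP histogram over the same 26 neighbours would need 2^26 = 67,108,864 bins.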
Related papers
- Decomposition of Neural Discrete Representations for Large-Scale 3D Mapping [15.085191496726967]
We introduce Decomposition-based Neural Mapping (DNMap).
DNMap is a storage-efficient large-scale 3D mapping method.
We learn low-resolution continuous embeddings that require tiny storage space.
arXiv Detail & Related papers (2024-07-22T11:32:33Z)
- Low Rank Multi-Dictionary Selection at Scale [5.827700856320355]
We propose a multi-dictionary atom selection technique for low-rank sparse coding named LRMDS.
We demonstrate the scalability and quality of LRMDS in both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-06-11T05:40:45Z)
- Text-Based Reasoning About Vector Graphics [76.42082386029206]
We propose the Visually Descriptive Language Model (VDLM), which performs text-based reasoning about vector graphics.
VDLM bridges with pretrained language models through a newly introduced symbolic representation, Primal Visual Description (PVD).
Our framework offers better interpretability due to its disentangled perception and reasoning processes.
arXiv Detail & Related papers (2024-04-09T17:30:18Z)
- Fast Machine Learning Method with Vector Embedding on Orthonormal Basis and Spectral Transform [0.0]
The paper provides examples of word embedding, text chunk embedding, and image embedding, implemented in Julia language with a vector database.
It also investigates unsupervised learning and supervised learning using this method, along with strategies for handling large data volumes.
arXiv Detail & Related papers (2023-10-27T18:48:54Z)
- Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature Descriptors [101.62384271200169]
A distinctive representation of image patches in form of features is a key component of many computer vision and robotics tasks.
We investigate multi-layer perceptrons (MLPs) to extract low-dimensional but high-quality descriptors.
We consider different applications, including visual localization, patch verification, image matching and retrieval.
arXiv Detail & Related papers (2022-09-27T17:59:04Z)
- Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with improvement on various architectures, and it achieves state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
- Dictionary Learning with Uniform Sparse Representations for Anomaly Detection [2.277447144331876]
We study how dictionary learning (DL) performs in detecting abnormal samples in a dataset of signals.
Numerical simulations show that the resulting subspace can be used efficiently to discriminate anomalies from regular data points.
arXiv Detail & Related papers (2022-01-11T10:22:46Z)
- ASH: A Modern Framework for Parallel Spatial Hashing in 3D Perception [91.24236600199542]
ASH is a modern and high-performance framework for parallel spatial hashing on GPU.
ASH achieves higher performance, supports richer functionality, and requires fewer lines of code.
ASH and its example applications are open sourced in Open3D.
arXiv Detail & Related papers (2021-10-01T16:25:40Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
- Appliance identification using a histogram post-processing of 2D local binary patterns for smart grid applications [2.389598109913753]
We propose a novel method to extract electrical power signatures after transforming the power signal to 2D space.
An improved local binary pattern (LBP) descriptor is proposed that improves the discriminative ability of conventional LBP.
A comprehensive performance evaluation is performed on two different datasets.
arXiv Detail & Related papers (2020-10-03T19:23:30Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.