Dynamic Texture Recognition via Nuclear Distances on Kernelized
Scattering Histogram Spaces
- URL: http://arxiv.org/abs/2102.00841v1
- Date: Mon, 1 Feb 2021 13:54:24 GMT
- Title: Dynamic Texture Recognition via Nuclear Distances on Kernelized
Scattering Histogram Spaces
- Authors: Alexander Sagel, Julian Wörmann, Hao Shen
- Abstract summary: This work proposes to describe dynamic textures as kernelized spaces of frame-wise feature vectors computed using the Scattering transform.
By combining these spaces with a basis-invariant metric, we get a framework that produces competitive results for nearest neighbor classification and state-of-the-art results for nearest class center classification.
- Score: 95.21606283608683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distance-based dynamic texture recognition is an important research field in
multimedia processing with applications ranging from retrieval to segmentation
of video data. Based on the conjecture that the most distinctive characteristic
of a dynamic texture is the appearance of its individual frames, this work
proposes to describe dynamic textures as kernelized spaces of frame-wise
feature vectors computed using the Scattering transform. By combining these
spaces with a basis-invariant metric, we get a framework that produces
competitive results for nearest neighbor classification and state-of-the-art
results for nearest class center classification.
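To make the described pipeline concrete, here is a minimal sketch, not the authors' implementation: it assumes kymatio's Scattering2D for the frame-wise scattering features and uses a generic basis-invariant projection (chordal) distance between the subspaces spanned by each video's frame features as a stand-in for the paper's nuclear distance on kernelized scattering histogram spaces (the histogram step, kernelization, and exact metric are defined in the paper itself).
```python
# Minimal sketch (assumptions noted above): frame-wise scattering features
# per video, then a basis-invariant subspace distance between videos.
# - Assumes the kymatio package for the 2D scattering transform.
# - The projection (chordal) distance depends only on the spanned subspaces,
#   not on the chosen bases; it stands in for the paper's nuclear distance.
import numpy as np
from kymatio.numpy import Scattering2D


def frame_features(video, J=2):
    """video: (T, H, W) grayscale frames in [0, 1] -> (T, D) feature matrix."""
    T, H, W = video.shape
    scattering = Scattering2D(J=J, shape=(H, W))
    feats = [scattering(frame.astype(np.float32)).reshape(-1) for frame in video]
    return np.stack(feats)                               # one row per frame


def subspace_basis(feats, k=5):
    """Orthonormal basis (D, k) of the span of the frame feature vectors."""
    U, _, _ = np.linalg.svd(feats.T, full_matrices=False)
    return U[:, :min(k, U.shape[1])]


def projection_distance(U1, U2):
    """Basis-invariant: unchanged under any orthonormal re-basing of U1, U2."""
    k = min(U1.shape[1], U2.shape[1])
    return np.sqrt(max(k - np.linalg.norm(U1.T @ U2, "fro") ** 2, 0.0))


def nearest_class_center(query_video, class_bases):
    """Nearest class center classification: pick the class whose
    representative subspace basis is closest to the query's."""
    U_q = subspace_basis(frame_features(query_video))
    return min(class_bases, key=lambda c: projection_distance(U_q, class_bases[c]))
```
In this sketch, class_bases would hold one representative basis per class (for instance, computed from pooled training-frame features); how the paper actually forms its class representatives and its kernelized distance is specified there, not here.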
Related papers
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art performance in tracking on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-30T21:34:44Z) - A geometrically aware auto-encoder for multi-texture synthesis [1.2891210250935146]
We propose an auto-encoder architecture for multi-texture synthesis.
Images are embedded in a compact and geometrically consistent latent space.
Texture synthesis and related tasks can be performed directly from these latent codes.
arXiv Detail & Related papers (2023-02-03T09:28:39Z) - Joint Learning of Deep Texture and High-Frequency Features for
Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection.
A semantic segmentation map is generated to guide the affine transformation operation.
The combination of the original image and the high-frequency components of the original and rendered images is fed into a multi-branch neural network equipped with attention mechanisms.
arXiv Detail & Related papers (2022-09-07T17:30:40Z) - Texture image analysis based on joint of multi directions GLCM and local
ternary patterns [0.0]
Texture features can be used in many different applications in computer vision and machine learning problems.
A new approach is proposed based on the combination of two texture descriptors: the co-occurrence matrix and local ternary patterns.
Experimental results show that the proposed approach provides a higher classification rate in comparison with some state-of-the-art approaches.
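For intuition only, here is a minimal sketch of such a joint descriptor; it assumes scikit-image for the multi-direction co-occurrence matrix and uses a plain 8-neighbour local ternary pattern, so the directions, threshold, and fusion used in the paper may differ.
```python
# Minimal sketch: concatenating multi-direction GLCM statistics with local
# ternary pattern (LTP) histograms.  Assumes scikit-image for the
# co-occurrence matrix; the LTP variant and parameters are illustrative.
import numpy as np
from skimage.feature import graycomatrix, graycoprops


def glcm_features(img, distances=(1,), angles=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """img: uint8 (H, W) -> contrast/homogeneity/energy/correlation per (distance, angle)."""
    glcm = graycomatrix(img, distances=distances, angles=angles,
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])


def ltp_histograms(img, t=5):
    """Upper/lower local ternary pattern histograms over the 8-neighbourhood."""
    img = img.astype(np.int16)                        # avoid overflow around +/- t
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    upper = np.zeros(center.shape, dtype=np.int32)
    lower = np.zeros(center.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        upper += (neigh >= center + t).astype(np.int32) << bit
        lower += (neigh <= center - t).astype(np.int32) << bit
    h_up, _ = np.histogram(upper, bins=256, range=(0, 256), density=True)
    h_lo, _ = np.histogram(lower, bins=256, range=(0, 256), density=True)
    return np.concatenate([h_up, h_lo])


def texture_descriptor(img):
    """Joint descriptor: multi-direction GLCM statistics + LTP histograms."""
    return np.concatenate([glcm_features(img), ltp_histograms(img)])
```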
arXiv Detail & Related papers (2022-09-05T09:53:00Z) - Multiscale Analysis for Improving Texture Classification [62.226224120400026]
This paper employs the Gaussian-Laplacian pyramid to treat different spatial frequency bands of a texture separately.
We aggregate features extracted from gray and color texture images using bio-inspired texture descriptors, information-theoretic measures, gray-level co-occurrence matrix features, and Haralick statistical features into a single feature vector.
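A minimal sketch of this per-band aggregation follows; it assumes scikit-image's Laplacian pyramid and uses plain GLCM statistics as a stand-in for the fuller descriptor set (bio-inspired, information-theoretic, Haralick) aggregated in the paper.
```python
# Minimal sketch: per-band texture statistics over a Laplacian pyramid,
# concatenated into one multiscale vector.  Assumes scikit-image; the GLCM
# statistics stand in for the richer descriptors used in the paper.
import numpy as np
from skimage.transform import pyramid_laplacian
from skimage.feature import graycomatrix, graycoprops


def quantize(band, levels=32):
    """Rescale a (possibly signed) band to integer grey levels 0..levels-1."""
    lo, hi = band.min(), band.max()
    scaled = (band - lo) / (hi - lo + 1e-12)
    return np.clip((scaled * levels).astype(np.uint8), 0, levels - 1)


def band_features(band, levels=32):
    """GLCM statistics for one frequency band."""
    glcm = graycomatrix(quantize(band, levels), distances=[1],
                        angles=[0, np.pi / 2], levels=levels,
                        symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])


def multiscale_descriptor(gray_image, max_layer=3):
    """gray_image: float (H, W) -> one feature vector across frequency bands."""
    bands = pyramid_laplacian(gray_image, max_layer=max_layer, downscale=2)
    return np.concatenate([band_features(b) for b in bands])
```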
arXiv Detail & Related papers (2022-04-21T01:32:22Z) - HighlightMe: Detecting Highlights from Human-Centric Videos [62.265410865423]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos.
We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions.
We observe a 4-12% improvement in the mean average precision of matching the human-annotated highlights over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-05T01:18:15Z) - Image Synthesis via Semantic Composition [74.68191130898805]
We present a novel approach to synthesize realistic images based on their semantic layouts.
It hypothesizes that objects with similar appearance share similar representations.
Our method establishes dependencies between regions according to their appearance correlation, yielding both spatially variant and associated representations.
arXiv Detail & Related papers (2021-09-15T02:26:07Z) - Multi-modal Visual Place Recognition in Dynamics-Invariant Perception
Space [23.43468556831308]
This letter explores the use of multi-modal fusion of semantic and visual modalities to improve place recognition in dynamic environments.
We achieve this by first designing a novel deep learning architecture to generate the static semantic segmentation.
We then leverage the spatial-pyramid-matching model to encode the static semantic segmentation into feature vectors.
In parallel, the static image is encoded using the popular Bag-of-words model.
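Below is a minimal, NumPy-only sketch of spatial-pyramid encoding for a semantic label map; the grid depth and cell weighting are generic SPM choices, not necessarily the paper's.
```python
# Minimal sketch: spatial-pyramid encoding of a semantic label map into a
# fixed-length vector.  Pyramid depth and weighting are illustrative.
import numpy as np


def spatial_pyramid_encode(label_map, num_classes, levels=3):
    """label_map: (H, W) integer class ids -> concatenated, weighted
    per-cell label histograms over a 1x1, 2x2, ... grid pyramid."""
    H, W = label_map.shape
    features = []
    for level in range(levels):
        cells = 2 ** level
        weight = 1.0 / 2 ** (levels - level)          # coarser levels weigh less
        ys = np.linspace(0, H, cells + 1, dtype=int)
        xs = np.linspace(0, W, cells + 1, dtype=int)
        for i in range(cells):
            for j in range(cells):
                patch = label_map[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                hist = np.bincount(patch.ravel(), minlength=num_classes)
                hist = hist / max(hist.sum(), 1)      # per-cell normalisation
                features.append(weight * hist)
    return np.concatenate(features)
```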
arXiv Detail & Related papers (2021-05-17T13:14:52Z) - Video Frame Interpolation via Structure-Motion based Iterative Fusion [19.499969588931414]
We propose a structure-motion based iterative fusion method for video frame interpolation.
Inspired by the observation that audiences have different visual preferences for foreground and background objects, we are the first to propose using saliency masks in the evaluation process of video frame interpolation.
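One plausible way to act on that idea is sketched below with NumPy: a PSNR split into saliency-weighted foreground and background scores. The paper's actual evaluation protocol and saliency source are not reproduced here.
```python
# Minimal sketch: saliency-weighted error split for frame-interpolation
# evaluation.  The weighting scheme is illustrative, not the paper's protocol.
import numpy as np


def saliency_split_psnr(pred, target, saliency, max_val=1.0, eps=1e-12):
    """pred, target: (H, W, C) floats in [0, max_val]; saliency: (H, W) in [0, 1].
    Returns (foreground_psnr, background_psnr) weighted by the saliency mask."""
    err = (pred - target) ** 2                        # per-pixel squared error

    def weighted_psnr(weights):
        w = weights[..., None]                        # broadcast over channels
        mse = np.sum(w * err) / max(np.sum(w) * err.shape[-1], eps)
        return 10.0 * np.log10(max_val ** 2 / max(mse, eps))

    return weighted_psnr(saliency), weighted_psnr(1.0 - saliency)
```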
arXiv Detail & Related papers (2021-05-11T22:11:17Z) - Towards Analysis-friendly Face Representation with Scalable Feature and
Texture Compression [113.30411004622508]
We show that a universal and collaborative visual information representation can be achieved in a hierarchical way.
Based on the strong generative capability of deep neural networks, the gap between the base feature layer and the enhancement layer is further filled by feature-level texture reconstruction.
To improve the efficiency of the proposed framework, the base layer neural network is trained in a multi-task manner.
arXiv Detail & Related papers (2020-04-21T14:32:49Z)