Grassmannian learning mutual subspace method for image set recognition
- URL: http://arxiv.org/abs/2111.04352v1
- Date: Mon, 8 Nov 2021 09:16:36 GMT
- Title: Grassmannian learning mutual subspace method for image set recognition
- Authors: Lincon S. Souza, Naoya Sogi, Bernardo B. Gatto, Takumi Kobayashi and
Kazuhiro Fukui
- Abstract summary: This paper addresses the problem of object recognition given a set of images as input (e.g., multiple camera sources and video frames).
We propose the Grassmannian learning mutual subspace method (G-LMSM), an NN layer embedded on top of CNNs as a classifier.
We demonstrate the effectiveness of our proposed method on hand shape recognition, face identification, and facial emotion recognition.
- Score: 43.24089871099157
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of object recognition given a set of images
as input (e.g., multiple camera sources and video frames). Convolutional neural
network (CNN)-based frameworks do not exploit these sets effectively: they
process each pattern as observed and fail to capture the underlying feature
distribution, since they do not consider the variance of the images in the set. To
address this issue, we propose the Grassmannian learning mutual subspace method
(G-LMSM), an NN layer embedded on top of CNNs as a classifier that can process
image sets more effectively and can be trained in an end-to-end manner. The
image set is represented by a low-dimensional input subspace; this input
subspace is matched against reference subspaces via the similarity of their
canonical angles, an interpretable and easy-to-compute metric. The key idea of G-LMSM is
that the reference subspaces are learned as points on the Grassmann manifold,
optimized with Riemannian stochastic gradient descent. This learning is stable,
efficient and theoretically well-grounded. We demonstrate the effectiveness of
our proposed method on hand shape recognition, face identification, and facial
emotion recognition.
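The canonical-angle matching described above can be sketched in a few lines of NumPy. This is a minimal illustration of the general technique, not the authors' implementation: an image set is summarized by an orthonormal basis of its principal subspace, and two subspaces are compared through the cosines of their canonical angles, which are the singular values of the product of the basis matrices. The function names and the subspace dimension are illustrative choices.

```python
import numpy as np

def subspace_basis(features, dim):
    """Orthonormal basis of the top-`dim` principal subspace of a
    (n_samples, n_features) data matrix, obtained via SVD."""
    u, _, _ = np.linalg.svd(features.T, full_matrices=False)
    return u[:, :dim]                      # (n_features, dim)

def canonical_angle_similarity(basis_a, basis_b):
    """Mean squared cosine of the canonical angles between two subspaces.
    The cosines are the singular values of basis_a.T @ basis_b."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.mean(cosines ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal((50, 16))          # one image set: 50 feature vectors
a = subspace_basis(x, dim=5)
print(canonical_angle_similarity(a, a))    # identical subspaces -> 1.0
```

In G-LMSM the reference bases would additionally be treated as learnable points on the Grassmann manifold and updated with Riemannian SGD; the similarity above is the quantity such a layer would output per class.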
Related papers
- SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation [11.176993272867396]
In this paper, we propose a novel Semantic and Spatial Adaptive classifier (SSA-Seg) to address the challenges of semantic segmentation.
Specifically, we employ the coarse masks obtained from the fixed prototypes as a guide to adjust the fixed prototype towards the center of the semantic and spatial domains in the test image.
Results show that the proposed SSA-Seg significantly improves the segmentation performance of the baseline models with only a minimal increase in computational cost.
arXiv Detail & Related papers (2024-05-10T15:14:23Z) - Deep Gaussian mixture model for unsupervised image segmentation [1.3654846342364308]
In many tasks, sufficient pixel-level labels are very difficult to obtain.
We propose a method which combines a Gaussian mixture model (GMM) with unsupervised deep learning techniques.
We demonstrate the advantages of our method in various experiments on the example of infarct segmentation on multi-sequence MRI images.
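The basic idea behind this entry can be sketched with a plain GMM (not the paper's deep variant): fit a Gaussian mixture to per-pixel features and use the component assignments as an unsupervised segmentation. The synthetic one-channel "image" below is an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic image: two regions with different intensity distributions,
# flattened to a (n_pixels, 1) feature matrix.
pixels = np.concatenate([rng.normal(0.2, 0.05, 500),
                         rng.normal(0.8, 0.05, 500)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(pixels)
labels = gmm.predict(pixels)              # per-pixel segment labels
print(np.bincount(labels))                # roughly 500 pixels per segment
```

The paper's contribution is to learn the pixel features with a deep network rather than using raw intensities as here.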
arXiv Detail & Related papers (2024-04-18T15:20:59Z) - Pre-training with Random Orthogonal Projection Image Modeling [32.667183132025094]
Masked Image Modeling (MIM) is a powerful self-supervised strategy for visual pre-training without the use of labels.
We propose an image modeling framework based on Random Orthogonal Projection Image Modeling (ROPIM).
ROPIM reduces spatial token information under a guaranteed bound on the noise variance, and can be viewed as masking the entire spatial image area with locally varying masking degrees.
arXiv Detail & Related papers (2023-10-28T15:42:07Z) - Feature Activation Map: Visual Explanation of Deep Learning Models for
Image Classification [17.373054348176932]
In this work, a post-hoc interpretation tool named feature activation map (FAM) is proposed.
FAM can interpret deep learning models that do not use fully connected (FC) layers as the classifier.
Experiments conducted on ten deep learning models for few-shot image classification, contrastive learning image classification and image retrieval tasks demonstrate the effectiveness of the proposed FAM algorithm.
arXiv Detail & Related papers (2023-07-11T05:33:46Z) - Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z) - Subspace Nonnegative Matrix Factorization for Feature Representation [14.251799988700558]
Nonnegative matrix factorization (NMF) learns a new feature representation on the whole data space, which means treating all features equally.
This paper proposes a new NMF method by introducing adaptive weights to identify key features in the original space so that only a subspace involves generating the new representation.
Experimental results on several real-world datasets demonstrated that the proposed methods can generate a more accurate feature representation than existing methods.
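For contrast with the weighted variant this entry proposes, standard NMF can be sketched with scikit-learn: it factorizes a nonnegative data matrix as X ≈ W @ H, treating all features equally. The adaptive per-feature weights that restrict the factorization to a key subspace are the paper's contribution and are not shown here; the matrix sizes below are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((100, 20))                 # nonnegative data: 100 samples, 20 features

model = NMF(n_components=5, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(X)                # (100, 5) new feature representation
H = model.components_                     # (5, 20) basis in the original feature space
print(W.shape, H.shape)
```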
arXiv Detail & Related papers (2022-04-18T16:07:06Z) - A singular Riemannian geometry approach to Deep Neural Networks II.
Reconstruction of 1-D equivalence classes [78.120734120667]
We build the preimage of a point in the output manifold in the input space.
We focus for simplicity on the case of neural network maps from n-dimensional real spaces to (n - 1)-dimensional real spaces.
arXiv Detail & Related papers (2021-12-17T11:47:45Z) - Semantic Distribution-aware Contrastive Adaptation for Semantic
Segmentation [50.621269117524925]
Domain adaptive semantic segmentation refers to making predictions on a certain target domain with only annotations of a specific source domain.
We present a semantic distribution-aware contrastive adaptation algorithm that enables pixel-wise representation alignment.
We evaluate SDCA on multiple benchmarks, achieving considerable improvements over existing algorithms.
arXiv Detail & Related papers (2021-05-11T13:21:25Z) - Seed the Views: Hierarchical Semantic Alignment for Contrastive
Representation Learning [116.91819311885166]
We propose a hierarchical semantic alignment strategy that expands the views generated by a single image to cross-samples and multi-level representations.
Our method, termed as CsMl, has the ability to integrate multi-level visual representations across samples in a robust way.
arXiv Detail & Related papers (2020-12-04T17:26:24Z) - DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning [122.51237307910878]
We develop methods for few-shot image classification from a new perspective of optimal matching between image regions.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations.
To generate the important weights of elements in the formulation, we design a cross-reference mechanism.
arXiv Detail & Related papers (2020-03-15T08:13:16Z)
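The Earth Mover's Distance used in the DeepEMD entry above can be sketched as a small linear program: given two sets of region features with uniform weights (DeepEMD instead learns weights via its cross-reference mechanism, which is not shown here), the optimal flow minimizes total transport cost under marginal constraints.

```python
import numpy as np
from scipy.optimize import linprog

def emd(feats_a, feats_b):
    """Earth Mover's Distance between two sets of feature vectors with
    uniform weights, solved as a linear program over the flow matrix."""
    m, n = len(feats_a), len(feats_b)
    # Ground cost: pairwise Euclidean distances between region features.
    cost = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=-1)
    a = np.full(m, 1.0 / m)               # uniform source weights
    b = np.full(n, 1.0 / n)               # uniform target weights
    # Flow f[i, j] flattened row-major; rows sum to a, columns sum to b.
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0
    for j in range(n):
        A_eq[m + j, j::n] = 1.0
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun

x = np.array([[0.0, 0.0], [1.0, 0.0]])
print(emd(x, x))                          # identical sets -> 0.0
```

An exact LP like this is fine for a handful of regions; dedicated optimal-transport solvers scale better for dense image representations.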
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.