Orthonormal Product Quantization Network for Scalable Face Image
Retrieval
- URL: http://arxiv.org/abs/2107.00327v4
- Date: Fri, 12 May 2023 11:56:11 GMT
- Authors: Ming Zhang, Xuefei Zhe, Hong Yan
- Abstract summary: This paper integrates product quantization with orthonormal constraints into an end-to-end deep learning framework to retrieve face images.
A novel scheme that uses predefined orthonormal vectors as codewords is proposed to enhance the quantization informativeness and reduce codewords' redundancy.
Experiments are conducted on four commonly-used face datasets under both seen and unseen identities retrieval settings.
- Score: 14.583846619121427
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Existing deep quantization methods provide an efficient solution for
large-scale image retrieval. However, the significant intra-class variations in
face images, such as pose, illumination, and expression, still pose a challenge
for face image retrieval. In light of this, face image retrieval requires
sufficiently powerful learning metrics, which are absent in current deep
quantization works. Moreover, to handle the growing number of unseen identities
at the query stage, face image retrieval places greater demands on model
generalization and system scalability than general image retrieval tasks. This
paper integrates product quantization with orthonormal constraints into an
end-to-end deep learning framework to effectively retrieve face images.
Specifically, a novel scheme that uses predefined orthonormal vectors as
codewords is proposed to enhance the quantization informativeness and reduce
codewords' redundancy. A tailored loss function maximizes discriminability
among identities in each quantization subspace for both the quantized and
original features. An entropy-based regularization term is imposed to reduce
the quantization error. Experiments are conducted on four commonly-used face
datasets under both seen- and unseen-identity retrieval settings. Our method
outperforms all the compared state-of-the-art deep hashing/quantization methods
under both settings. Results validate the effectiveness of the proposed orthonormal
codewords in improving models' standard retrieval performance and
generalization ability. Combined with further experiments on two general image
datasets, these results demonstrate the broad superiority of our method for
scalable image retrieval.
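The quantization idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `orthonormal_codebook` and `soft_quantize`, the QR-based construction of the codewords, the softmax temperature, and the plain Shannon entropy of the soft assignments are all assumptions made for illustration. The sketch only shows how a feature vector is split into subspaces, softly assigned to fixed orthonormal codewords in each subspace, and how an entropy value over those assignments could be computed.

```python
import numpy as np


def orthonormal_codebook(num_codewords, sub_dim, seed=0):
    """Build `num_codewords` mutually orthonormal codewords of length `sub_dim`.

    Assumption: codewords are taken from the QR decomposition of a random
    matrix; the paper's own predefined codewords may be constructed differently.
    """
    if num_codewords > sub_dim:
        raise ValueError("orthonormal rows require num_codewords <= sub_dim")
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((sub_dim, num_codewords)))
    return q.T  # shape (num_codewords, sub_dim); rows are orthonormal


def soft_quantize(features, codebooks, temperature=0.1):
    """Softly assign each sub-vector to the fixed codewords of its subspace.

    features:  (n, m * sub_dim) array of feature vectors
    codebooks: (m, k, sub_dim) array, one codebook per subspace
    Returns the soft-quantized features, hard codes, and mean assignment entropy.
    """
    n, dim = features.shape
    m, k, sub_dim = codebooks.shape
    if dim != m * sub_dim:
        raise ValueError("feature dimension must equal m * sub_dim")
    subs = features.reshape(n, m, sub_dim)
    # Inner-product similarity between each sub-vector and each codeword.
    logits = np.einsum("nmd,mkd->nmk", subs, codebooks) / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Soft quantization: probability-weighted sum of codewords per subspace.
    quantized = np.einsum("nmk,mkd->nmd", probs, codebooks).reshape(n, dim)
    codes = probs.argmax(axis=-1)  # hard codeword index per subspace
    # Mean Shannon entropy of the assignments; lower means more peaked.
    entropy = float(-(probs * np.log(probs + 1e-12)).sum(axis=-1).mean())
    return quantized, codes, entropy


# Hypothetical sizes: 4 subspaces, 8 codewords each, 16 dimensions per subspace.
codebooks = np.stack([orthonormal_codebook(8, 16, seed=s) for s in range(4)])
features = np.random.default_rng(0).standard_normal((5, 64))
quantized, codes, entropy = soft_quantize(features, codebooks)
print(codes.shape, quantized.shape, entropy)
```

In this sketch the codebooks stay fixed, so only the feature extractor would be trained; driving the entropy term toward zero pushes each soft assignment toward a single codeword, which is one way an entropy regularizer can reduce quantization error.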
Related papers
- Guided Deep Generative Model-based Spatial Regularization for Multiband
Imaging Inverse Problems [14.908906329456842]
We propose a generic framework able to capitalize on an auxiliary acquisition of high spatial resolution to derive tailored data-driven spatial regularizations.
More precisely, the regularization is conceived as a deep generative network able to encode spatial semantic features contained in this auxiliary image of high spatial resolution.
arXiv Detail & Related papers (2023-06-29T03:48:50Z)
- Explainable bilevel optimization: an application to the Helsinki deblur
challenge [1.1470070927586016]
We present a bilevel optimization scheme for the solution of a general image deblurring problem.
A parametric variational-like approach is encapsulated within a machine learning scheme to provide a high quality reconstructed image.
arXiv Detail & Related papers (2022-10-18T11:36:37Z)
- Retrieval-based Spatially Adaptive Normalization for Semantic Image
Synthesis [68.1281982092765]
We propose a novel normalization module, termed REtrieval-based Spatially AdaptIve normaLization (RESAIL).
RESAIL provides pixel-level fine-grained guidance to the normalization architecture.
Experiments on several challenging datasets show that RESAIL performs favorably against state-of-the-art methods in terms of quantitative metrics, visual quality, and subjective evaluation.
arXiv Detail & Related papers (2022-04-06T14:21:39Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- Image De-Quantization Using Generative Models as Priors [4.467248776406006]
De-quantization is the task of reversing the quantization effect and recovering the original multi-chromatic level image.
We develop a de-quantization mechanism through a rigorous mathematical analysis which is based on the classical statistical estimation theory.
arXiv Detail & Related papers (2020-07-15T18:09:00Z)
- A Flexible Framework for Designing Trainable Priors with Adaptive
Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z)
- The Power of Triply Complementary Priors for Image Compressive Sensing [89.14144796591685]
We propose a joint low-rank deep (LRD) image model, which contains a pair of triply complementary priors.
We then propose a novel hybrid plug-and-play framework based on the LRD model for image CS.
To make the optimization tractable, a simple yet effective algorithm is proposed to solve the proposed H-based image CS problem.
arXiv Detail & Related papers (2020-05-16T08:17:44Z)
- Invertible Image Rescaling [118.2653765756915]
We develop an Invertible Rescaling Net (IRN) to produce visually-pleasing low-resolution images.
We capture the distribution of the lost information using a latent variable following a specified distribution in the downscaling process.
arXiv Detail & Related papers (2020-05-12T09:55:53Z)
- Deep Attentive Generative Adversarial Network for Photo-Realistic Image
De-Quantization [25.805568996596783]
De-quantization can improve the visual quality of low bit-depth images for display on high bit-depth screens.
This paper proposes the DAGAN algorithm to perform super-resolution on image intensity resolution.
The DenseResAtt module consists of dense residual blocks equipped with a self-attention mechanism.
arXiv Detail & Related papers (2020-04-07T06:45:01Z)
- Generalized Product Quantization Network for Semi-supervised Image
Retrieval [16.500174965126238]
We propose the first quantization-based semi-supervised image retrieval scheme: Generalized Product Quantization (GPQ) network.
We design a novel metric learning strategy that preserves semantic similarity between labeled data, and employ an entropy regularization term to fully exploit the inherent potential of unlabeled data.
Our solution increases the generalization capacity of the quantization network, which allows overcoming previous limitations in the retrieval community.
arXiv Detail & Related papers (2020-02-26T03:36:32Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.