Use of triplet loss for facial restoration in low-resolution images
- URL: http://arxiv.org/abs/2409.03530v1
- Date: Thu, 5 Sep 2024 13:42:20 GMT
- Title: Use of triplet loss for facial restoration in low-resolution images
- Authors: Sebastian Pulgar, Domingo Mery
- Abstract summary: We propose FTLGAN, a novel SR model that focuses on generating high-resolution images that preserve individual identities.
The results are compelling, demonstrating a mean d' value 21% above the best current state-of-the-art models.
- Score: 5.448070998907116
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, facial recognition (FR) models have become the most widely used biometric tool, achieving impressive results on numerous datasets. However, inherent hardware challenges or shooting distances often result in low-resolution images, which significantly impact the performance of FR models. To address this issue, several solutions have been proposed, including super-resolution (SR) models that generate highly realistic faces. Despite these efforts, significant improvements in FR algorithms have not been achieved. We propose FTLGAN, a novel SR model that focuses on generating high-resolution images that preserve individual identities rather than merely improving image quality, thereby maximizing the performance of FR models. The results are compelling, demonstrating a mean d' value 21% above the best current state-of-the-art models, specifically d' = 1.099 and AUC = 0.78 for 14x14 pixels, d' = 2.112 and AUC = 0.92 for 28x28 pixels, and d' = 3.049 and AUC = 0.98 for 56x56 pixels. The contributions of this study are significant in several key areas. First, a notable improvement in facial recognition performance has been achieved in low-resolution images, specifically at resolutions of 14x14, 28x28, and 56x56 pixels. Second, the enhancements delivered by FTLGAN are consistent across all resolutions, providing uniformly strong performance unlike other comparative models. Third, an innovative approach has been implemented using triplet-loss logic, enabling the super-resolution model to be trained solely with real images, in contrast with current models, and expanding potential real-world applications. Lastly, this study introduces a novel model that specifically addresses the challenge of improving classification performance in facial recognition systems by integrating facial recognition quality as a loss during model training.
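The triplet-loss logic and the d' metric the abstract reports can be sketched in plain Python. The embedding vectors, margin, and score distributions below are toy assumptions for illustration, not FTLGAN's actual configuration.

```python
import math

def l2(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss: the positive (same identity) must sit
    closer to the anchor than the negative (other identity) by `margin`."""
    return max(0.0, l2(anchor, positive) - l2(anchor, negative) + margin)

def d_prime(genuine, impostor):
    """Decidability index d' between genuine and impostor score
    distributions; larger d' means better identity separation."""
    mg = sum(genuine) / len(genuine)
    mi = sum(impostor) / len(impostor)
    vg = sum((s - mg) ** 2 for s in genuine) / len(genuine)
    vi = sum((s - mi) ** 2 for s in impostor) / len(impostor)
    return abs(mg - mi) / math.sqrt((vg + vi) / 2.0)

# Toy example: a well-separated triplet yields zero loss.
a, p, n = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
loss = triplet_loss(a, p, n)
# Toy genuine/impostor similarity scores for the d' metric.
d = d_prime([0.9, 1.1], [-0.1, 0.1])
```

Using face-recognition embeddings inside such a loss is what lets the SR model be trained on real images only, since no synthetic HR/LR pairing is required to compare identities.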
Related papers
- Native-Resolution Image Synthesis [79.73854557930089]
We introduce native-resolution image synthesis, a novel generative modeling paradigm that enables the synthesis of images at arbitrary resolutions and aspect ratios.
A single NiT model simultaneously achieves state-of-the-art performance on both the ImageNet-256x256 and 512x512 benchmarks.
Surprisingly, akin to the robust zero-shot capabilities seen in advanced large language models, NiT, trained solely on ImageNet, demonstrates excellent zero-shot generalization performance.
arXiv Detail & Related papers (2025-06-03T17:57:33Z)
- PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition [54.642714288448744]
PETALface is the first work to leverage the power of PEFT (parameter-efficient fine-tuning) for low-resolution face recognition.
We introduce two low-rank adaptation modules to the backbone, with weights adjusted based on the input image quality to account for the difference in quality for the gallery and probe images.
Experiments demonstrate that the proposed method outperforms full fine-tuning on low-resolution datasets while preserving performance on high-resolution and mixed-quality datasets.
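The low-rank adaptation modules PETALface adds can be sketched as follows. The matrix shapes, rank-1 factors, and `alpha` scale are illustrative assumptions, and the image-quality-based weighting the summary describes is omitted.

```python
def matmul(X, Y):
    """Naive matrix product X @ Y for lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_forward(x, W, A, B, alpha=1.0):
    """y = (W + alpha * B @ A) x: W stays frozen; only the low-rank
    factors A (r x d_in) and B (d_out x r) would be trained."""
    delta = matmul(B, A)  # rank-r update to the frozen weight
    W_adapted = [[w + alpha * d for w, d in zip(w_row, d_row)]
                 for w_row, d_row in zip(W, delta)]
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W_adapted]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 identity weight
A = [[1.0, 0.0]]               # rank-1 factor, shape 1x2
B = [[0.0], [1.0]]             # rank-1 factor, shape 2x1
y_off = lora_forward([1.0, 0.0], W, A, B, alpha=0.0)  # adapter disabled
y_on = lora_forward([1.0, 0.0], W, A, B, alpha=1.0)   # adapter active
```

Because only `A` and `B` carry gradients, the adapter adds r x (d_in + d_out) trainable parameters instead of d_in x d_out, which is what makes this form of fine-tuning parameter-efficient.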
arXiv Detail & Related papers (2024-12-10T18:59:45Z)
- Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation [53.95204595640208]
Data-Free Knowledge Distillation (DFKD) is an advanced technique that enables knowledge transfer from a teacher model to a student model without relying on original training data.
Previous approaches have generated synthetic images at high resolutions without leveraging information from real images.
MUSE generates images at lower resolutions while using Class Activation Maps (CAMs) to ensure that the generated images retain critical, class-specific features.
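The CAM guidance MUSE relies on can be sketched with the standard class-activation-map computation; the tiny feature maps and classifier weights below are toy assumptions for illustration.

```python
def class_activation_map(feature_maps, class_weights):
    """Standard CAM: weight each spatial feature map by the classifier
    weight of the target class and sum, so high values mark regions
    the classifier considers class-specific."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, wgt in zip(feature_maps, class_weights):
        for y in range(h):
            for x in range(w):
                cam[y][x] += wgt * fmap[y][x]
    return cam

# Two 1x2 feature maps and their target-class classifier weights.
cam = class_activation_map([[[1.0, 0.0]], [[0.0, 1.0]]], [2.0, 3.0])
```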
arXiv Detail & Related papers (2024-11-26T02:23:31Z)
- Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis [62.06970466554273]
We present Meissonic, which elevates non-autoregressive masked image modeling (MIM) text-to-image generation to a level comparable with state-of-the-art diffusion models like SDXL.
We leverage high-quality training data, integrate micro-conditions informed by human preference scores, and employ feature compression layers to further enhance image fidelity and resolution.
Our model not only matches but often exceeds the performance of existing models like SDXL in generating high-quality, high-resolution images.
arXiv Detail & Related papers (2024-10-10T17:59:17Z) - DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance [11.44012694656102]
Large-scale generative models, such as text-to-image diffusion models, have garnered widespread attention across diverse domains.
Existing large-scale diffusion models are confined to generating images of up to 1K resolution.
We propose a novel progressive approach that fully utilizes generated low-resolution images to guide the generation of higher-resolution images.
arXiv Detail & Related papers (2024-06-26T16:10:31Z) - Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation [52.509092010267665]
We introduce LlamaGen, a new family of image generation models that apply the original "next-token prediction" paradigm of large language models to the visual generation domain.
It is an affirmative answer to whether vanilla autoregressive models, e.g., Llama, without inductive biases on visual signals can achieve state-of-the-art image generation performance if scaling properly.
arXiv Detail & Related papers (2024-06-10T17:59:52Z) - ACDMSR: Accelerated Conditional Diffusion Models for Single Image
Super-Resolution [84.73658185158222]
We propose a diffusion model-based super-resolution method called ACDMSR.
Our method adapts the standard diffusion model to perform super-resolution through a deterministic iterative denoising process.
Our approach generates more visually realistic counterparts for low-resolution images, emphasizing its effectiveness in practical scenarios.
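The deterministic iterative denoising ACDMSR adapts can be caricatured in a few lines. The fixed step size and the stand-in noise predictor below are assumptions for illustration, not the paper's learned network or noise schedule.

```python
def deterministic_denoise(x, predict_noise, steps=10):
    """Toy deterministic refinement: repeatedly subtract a fraction of
    the predicted noise from the current estimate. In ACDMSR this role
    is played by a learned, low-resolution-conditioned denoiser."""
    for t in range(steps, 0, -1):
        eps = predict_noise(x, t)
        x = [xi - (1.0 / steps) * ei for xi, ei in zip(x, eps)]
    return x

# Stand-in predictor that always reports a constant noise of 1.0,
# so ten steps remove exactly one unit of "noise" in total.
restored = deterministic_denoise([5.0], lambda x, t: [1.0], steps=10)
```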
arXiv Detail & Related papers (2023-07-03T06:49:04Z) - Octuplet Loss: Make Face Recognition Robust to Image Resolution [5.257115841810258]
We propose the octuplet loss, a novel combination of the popular triplet loss, to improve robustness against varying image resolution.
We leverage the relationship between high-resolution images and their synthetically down-sampled variants jointly with their identity labels.
Fine-tuning several state-of-the-art approaches with our method proves that we can significantly boost performance for cross-resolution (high-to-low resolution) face verification.
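The synthetic down-sampling relationship these lines describe can be sketched as follows; the 2x average pooling and the toy image are illustrative assumptions, not the paper's exact augmentation pipeline.

```python
def downsample2x(img):
    """Average-pool a grayscale image by a factor of 2, producing the
    synthetically down-sampled variant that is paired with its
    high-resolution original under the same identity label."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)] for y in range(0, h, 2)]

hr = [[0.0, 2.0], [4.0, 6.0]]
lr = downsample2x(hr)  # HR/LR pair sharing one identity label
```

Triplets drawn across the two resolutions (HR anchor, LR positive of the same identity, negative of another identity) are what make the learned embeddings resolution-robust.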
arXiv Detail & Related papers (2022-07-14T08:22:58Z) - Uncovering the Over-smoothing Challenge in Image Super-Resolution: Entropy-based Quantification and Contrastive Optimization [67.99082021804145]
We propose an explicit solution to the COO problem, called Detail Enhanced Contrastive Loss (DECLoss).
DECLoss utilizes the clustering property of contrastive learning to directly reduce the variance of the potential high-resolution distribution.
We evaluate DECLoss on multiple super-resolution benchmarks and demonstrate that it improves the perceptual quality of PSNR-oriented models.
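The variance-reduction effect attributed to DECLoss can be illustrated with a toy clustering step. The mean-pull update below is a stand-in for the actual contrastive optimization, chosen only to show the shrinking spread.

```python
def total_variance(vectors):
    """Sum over dimensions of the per-dimension variance of a set of
    candidate high-resolution embeddings."""
    n, dims = len(vectors), len(vectors[0])
    total = 0.0
    for d in range(dims):
        mu = sum(v[d] for v in vectors) / n
        total += sum((v[d] - mu) ** 2 for v in vectors) / n
    return total

def pull_to_mean(vectors, step=0.5):
    """One toy 'clustering' update: move every vector toward the group
    mean, shrinking the spread of the candidate distribution."""
    n, dims = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    return [[v[d] + step * (mean[d] - v[d]) for d in range(dims)]
            for v in vectors]

before = total_variance([[0.0, 0.0], [2.0, 2.0]])
after = total_variance(pull_to_mean([[0.0, 0.0], [2.0, 2.0]]))
```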
arXiv Detail & Related papers (2022-01-04T08:30:09Z) - Cross-Quality LFW: A Database for Analyzing Cross-Resolution Image Face
Recognition in Unconstrained Environments [8.368543987898732]
Real-world face recognition applications often deal with suboptimal image quality or resolution due to different capturing conditions.
Recent cross-resolution face recognition approaches used simple, arbitrary, and unrealistic down- and up-scaling techniques to measure distances against real-world edge-cases in image quality.
We propose a new standardized benchmark dataset and evaluation protocol derived from the famous Labeled Faces in the Wild.
arXiv Detail & Related papers (2021-08-23T17:04:32Z) - Efficient texture-aware multi-GAN for image inpainting [5.33024001730262]
Recent GAN-based (generative adversarial network) inpainting methods show remarkable improvements.
We propose a multi-GAN architecture improving both the performance and rendering efficiency.
arXiv Detail & Related papers (2020-09-30T14:58:03Z)
- Improved Techniques for Training Score-Based Generative Models [104.20217659157701]
We provide a new theoretical analysis of learning and sampling from score models in high dimensional spaces.
We can effortlessly scale score-based generative models to images with unprecedented resolutions.
Our score-based models can generate high-fidelity samples that rival best-in-class GANs on various image datasets.
arXiv Detail & Related papers (2020-06-16T09:17:17Z)
- Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Faces [7.634398926381845]
Super-resolution methods are often used to enhance low-resolution images, but their performance on the facial expression recognition (FER) task is limited on images of very low resolution.
In this work, inspired by feature super-resolution methods for object detection, we propose a novel generative adversarial network-based super-resolution method for robust facial expression recognition.
arXiv Detail & Related papers (2020-04-05T15:38:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.