Distillation Improves Visual Place Recognition for Low-Quality Queries
- URL: http://arxiv.org/abs/2310.06906v1
- Date: Tue, 10 Oct 2023 18:03:29 GMT
- Title: Distillation Improves Visual Place Recognition for Low-Quality Queries
- Authors: Anbang Yang, Yao Wang, John-Ross Rizzo, Chen Feng
- Abstract summary: Streaming query images/videos to a server for visual place recognition can result in reduced resolution or increased quantization.
We present a method that uses high-quality queries only during training to distill better feature representations for deep-learning-based VPR.
We achieve notable VPR recall-rate improvements over low-quality queries, as demonstrated in our experimental results.
- Score: 11.383202263053379
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The shift to online computing for real-time visual localization often
requires streaming query images/videos to a server for visual place recognition
(VPR), where fast video transmission may result in reduced resolution or
increased quantization. This compromises the quality of global image
descriptors, leading to decreased VPR performance. To improve the low recall
rate for low-quality query images, we present a simple yet effective method
that uses high-quality queries only during training to distill better feature
representations for deep-learning-based VPR, such as NetVLAD. Specifically, we
use mean squared error (MSE) loss between the global descriptors of queries
with different qualities, and inter-channel correlation knowledge distillation
(ICKD) loss over their corresponding intermediate features. We validate our
approach using both the Pittsburgh 250k dataset and our own indoor dataset with
varying quantization levels. By fine-tuning NetVLAD parameters with our
distillation-augmented losses, we achieve notable VPR recall-rate improvements
over low-quality queries, as demonstrated in our extensive experimental
results. We believe this work not only pushes forward the VPR research but also
provides valuable insights for applications needing dependable place
recognition under resource-limited conditions.
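As a concrete illustration, below is a minimal PyTorch sketch of the two distillation terms named in the abstract: an MSE loss between the global descriptors of the high- and low-quality branches, plus an ICKD-style loss that matches the inter-channel correlation matrices of intermediate features. The layer choice and the weighting `lambda_ickd` are illustrative assumptions, not the paper's released configuration.

```python
import torch
import torch.nn.functional as F

def descriptor_mse_loss(student_desc, teacher_desc):
    """MSE between the global descriptors of the low-quality (student)
    and high-quality (teacher) queries."""
    return F.mse_loss(student_desc, teacher_desc)

def inter_channel_correlation(feat):
    """Channel-wise correlation (Gram) matrix of an intermediate
    feature map, as used by ICKD-style losses.
    feat: (B, C, H, W) -> (B, C, C)"""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    f = F.normalize(f, dim=2)               # normalize each channel vector
    return torch.bmm(f, f.transpose(1, 2))  # (B, C, C)

def ickd_loss(student_feat, teacher_feat):
    """Match the inter-channel correlation matrices of the two branches."""
    return F.mse_loss(inter_channel_correlation(student_feat),
                      inter_channel_correlation(teacher_feat))

def distillation_loss(student_desc, teacher_desc,
                      student_feat, teacher_feat, lambda_ickd=1.0):
    # lambda_ickd is a hypothetical weighting; the paper's value may differ.
    return (descriptor_mse_loss(student_desc, teacher_desc)
            + lambda_ickd * ickd_loss(student_feat, teacher_feat))
```

In a training loop of this kind, the teacher branch would run frozen NetVLAD weights on the high-quality query while the student branch, fed the compressed query, is fine-tuned against this combined loss.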
Related papers
- DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
We propose a novel IQA method called diffusion priors-based IQA (DP-IQA).
We use pre-trained stable diffusion as the backbone, extract multi-level features from the denoising U-Net, and decode them to estimate the image quality score.
We distill the knowledge in the above model into a CNN-based student model, significantly reducing the parameter count to enhance applicability.
arXiv Detail & Related papers (2024-05-30T12:32:35Z) - EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition [6.996304653818122]
We propose a simple yet powerful approach to better exploit the potential of a foundation model for Visual Place Recognition.
We first demonstrate that features extracted from self-attention layers can serve as a powerful re-ranker for VPR.
We then demonstrate that a single-stage method leveraging internal ViT layers for pooling can generate global features that achieve state-of-the-art results.
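As a rough sketch of the pooling idea in the EffoVPR summary above, the snippet below hooks an internal ViT encoder block and mean-pools its patch tokens into an L2-normalized global descriptor. The backbone (torchvision's vit_b_16), the layer index, and mean pooling are illustrative assumptions rather than EffoVPR's actual design.

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Illustrative: grab token embeddings from an internal encoder block
# via a forward hook, then pool them into one global descriptor.
model = vit_b_16(weights=ViT_B_16_Weights.DEFAULT).eval()

captured = {}
def hook(module, inputs, output):
    captured["tokens"] = output  # (B, 1 + num_patches, hidden_dim)

# The layer index is an assumption; which layers work best is an
# empirical question studied by methods like EffoVPR.
model.encoder.layers[-2].register_forward_hook(hook)

with torch.no_grad():
    _ = model(torch.randn(1, 3, 224, 224))

patch_tokens = captured["tokens"][:, 1:, :]   # drop the class token
global_desc = torch.nn.functional.normalize(
    patch_tokens.mean(dim=1), dim=-1)         # simple mean pooling
print(global_desc.shape)                      # torch.Size([1, 768])
```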
arXiv Detail & Related papers (2024-05-28T11:24:41Z) - Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose a novel contrastive pre-training framework tailored for PCQA (CoPA).
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z) - Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering [7.640416680391081]
In this paper, we explore effective prompting techniques to enhance zero- and few-shot Visual Question Answering (VQA) performance.
We identify that specific templates significantly influence VQA outcomes, underscoring the need for strategic template selection.
To mitigate the challenges associated with evaluating free-form open-ended VQA responses, we introduce a straightforward LLM-guided pre-processing technique.
arXiv Detail & Related papers (2023-06-16T17:47:57Z) - Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment [60.57703721744873]
The increased resolution of real-world videos presents a dilemma between efficiency and accuracy for deep Video Quality Assessment (VQA).
In this work, we propose a unified scheme, spatial-temporal grid mini-cube sampling (St-GMS) to get a novel type of sample, named fragments.
With fragments and FANet, the proposed efficient end-to-end FAST-VQA and FasterVQA achieve significantly better performance than existing approaches on all VQA benchmarks.
arXiv Detail & Related papers (2022-10-11T11:38:07Z) - FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling [54.31355080688127]
Current deep video quality assessment (VQA) methods are usually with high computational costs when evaluating high-resolution videos.
We propose Grid Mini-patch Sampling (GMS), which allows consideration of local quality by sampling patches at their raw resolution.
We build the Fragment Attention Network (FANet) specially designed to accommodate fragments as inputs.
FAST-VQA improves state-of-the-art accuracy by around 10% while reducing FLOPs by 99.5% on 1080P high-resolution videos.
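To make the fragment idea concrete, here is a hedged sketch of grid mini-patch sampling: one raw-resolution mini-patch is cropped from each cell of a uniform spatial grid and the patches are stitched into a small fragment frame. The grid and patch sizes below are illustrative, not the paper's settings.

```python
import torch

def grid_mini_patch_sample(frame, grid=7, patch=32):
    """Crop one raw-resolution mini-patch from each cell of a
    grid x grid partition and stitch them into a fragment.
    frame: (C, H, W) -> (C, grid*patch, grid*patch)"""
    c, h, w = frame.shape
    cell_h, cell_w = h // grid, w // grid
    rows = []
    for i in range(grid):
        cols = []
        for j in range(grid):
            # random top-left corner inside the (i, j) grid cell
            y = i * cell_h + torch.randint(0, max(cell_h - patch, 1), (1,)).item()
            x = j * cell_w + torch.randint(0, max(cell_w - patch, 1), (1,)).item()
            cols.append(frame[:, y:y + patch, x:x + patch])
        rows.append(torch.cat(cols, dim=2))  # stitch along width
    return torch.cat(rows, dim=1)            # stitch along height

fragment = grid_mini_patch_sample(torch.randn(3, 1080, 1920))
print(fragment.shape)  # torch.Size([3, 224, 224])
```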
arXiv Detail & Related papers (2022-07-06T11:11:43Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
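For context, below is a generic NT-Xent contrastive loss of the kind self-supervised representation learners such as CONVIQT build on; the temperature and pairing scheme are standard assumptions, not CONVIQT's exact formulation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent contrastive loss: z1[i] and z2[i] are embeddings of two
    views of the same clip; all other pairs act as negatives.
    z1, z2: (N, D)"""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, D)
    sim = z @ z.t() / temperature                       # (2N, 2N) similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-pairs
    # the positive for row i is its counterpart in the other view
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```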
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes a CNN backbone and a transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
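A hedged sketch of the stated design follows: a CNN backbone produces spatial tokens that a transformer encoder refines before a regression head predicts the quality score. The specific backbone, dimensions, and pooling are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

class CNNTransformerIQA(nn.Module):
    """CNN backbone -> spatial tokens -> transformer encoder -> score."""
    def __init__(self, d_model=256, nhead=8, num_layers=2):
        super().__init__()
        backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
        # drop the average-pool and FC head; keep the conv feature map
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)  # scalar quality score

    def forward(self, x):
        f = self.proj(self.features(x))        # (B, d, h, w)
        tokens = f.flatten(2).transpose(1, 2)  # (B, h*w, d) spatial tokens
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1))   # pool tokens -> score

score = CNNTransformerIQA()(torch.randn(1, 3, 224, 224))
```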
arXiv Detail & Related papers (2021-12-01T13:23:00Z)