The Unreasonable Effectiveness of Linear Prediction as a Perceptual
Metric
- URL: http://arxiv.org/abs/2310.05986v1
- Date: Fri, 6 Oct 2023 19:02:00 GMT
- Title: The Unreasonable Effectiveness of Linear Prediction as a Perceptual
Metric
- Authors: Daniel Severo, Lucas Theis, Johannes Ball\'e
- Abstract summary: We show how perceptual embeddings of the visual system can be constructed at inference-time with no training data or deep neural network features.
Our perceptual embeddings are solutions to a weighted least squares (WLS) problem, defined at the pixel-level, and solved at inference-time.
- Score: 6.1693649058046764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show how perceptual embeddings of the visual system can be constructed at
inference-time with no training data or deep neural network features. Our
perceptual embeddings are solutions to a weighted least squares (WLS) problem,
defined at the pixel-level, and solved at inference-time, that can capture
global and local image characteristics. The distance in embedding space is used
to define a perceptual similarity metric which we call LASI: Linear
Autoregressive Similarity Index. Experiments on full-reference image quality
assessment datasets show LASI performs competitively with learned deep feature
based methods like LPIPS (Zhang et al., 2018) and PIM (Bhardwaj et al., 2020),
at a similar computational cost to hand-crafted methods such as MS-SSIM (Wang
et al., 2003). We found that increasing the dimensionality of the embedding
space consistently reduces the WLS loss while increasing performance on
perceptual tasks, at the cost of increasing the computational complexity. LASI
is fully differentiable, scales cubically with the number of embedding
dimensions, and can be parallelized at the pixel-level. A Maximum
Differentiation (MAD) competition (Wang & Simoncelli, 2008) between LASI and
LPIPS shows that both methods are capable of finding failure points for the
other, suggesting these metrics can be combined.
Related papers
- Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images [3.1296300934639327]
Perceptual similarity scores that align with human vision are critical for both training and evaluating computer vision models.<n>We introduce the Structured Uncertainty Similarity Score (SUSS); it models each image through a set of perceptual components.<n>The final score is a weighted sum of component log-probabilities with weights learned from human perceptual datasets.
arXiv Detail & Related papers (2025-12-03T11:48:59Z) - IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution [21.982964666527646]
Look-up table (LUT)-based approaches have attracted interest due to their efficiency and performance.<n>Existing ASISR techniques often employ implicit neural representations, which come with considerable computational cost and memory demands.<n>We propose Interpolation Mixing LUT (IM-LUT), a novel framework that operates ASISR by learning to blend multiple functions to maximize their capacity.
arXiv Detail & Related papers (2025-07-14T05:02:57Z) - NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics.<n>We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method.<n> Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z) - Uncertainty-aware retinal layer segmentation in OCT through probabilistic signed distance functions [6.765624289092461]
We present a new approach for uncertainty-aware retinal layer segmentation in Optical Coherence Tomography ( OCT) scans.
Our methodology refines the segmentation by predicting a signed distance function (SDF) that effectively parameterizes the retinal layer shape via level set.
This ensures a robust representation of the retinal layer even in the presence of ambiguous input, imaging noise, and unreliable segmentations.
arXiv Detail & Related papers (2024-12-06T10:44:11Z) - SimO Loss: Anchor-Free Contrastive Loss for Fine-Grained Supervised Contrastive Learning [0.0]
We introduce a novel anchor-free contrastive learning (L) method leveraging our proposed Similarity-Orthogonality (SimO) loss.
Our approach minimizes a semi-metric discriminative loss function that simultaneously optimize two key objectives.
We provide visualizations that demonstrate the impact of SimO loss on the embedding space.
arXiv Detail & Related papers (2024-10-07T17:41:10Z) - Generalizable Non-Line-of-Sight Imaging with Learnable Physical Priors [52.195637608631955]
Non-line-of-sight (NLOS) imaging has attracted increasing attention due to its potential applications.
Existing NLOS reconstruction approaches are constrained by the reliance on empirical physical priors.
We introduce a novel learning-based solution, comprising two key designs: Learnable Path Compensation (LPC) and Adaptive Phasor Field (APF)
arXiv Detail & Related papers (2024-09-21T04:39:45Z) - Efficient Learnable Collaborative Attention for Single Image Super-Resolution [18.955369476815136]
Non-Local Attention (NLA) is a powerful technique for capturing long-range feature correlations in deep single image super-resolution (SR)
We propose a novel Learnable Collaborative Attention (LCoA) that introduces inductive bias into non-local modeling.
Our LCoA can reduce the non-local modeling time by about 83% in the inference stage.
arXiv Detail & Related papers (2024-04-07T11:25:04Z) - Towards Better Gradient Consistency for Neural Signed Distance Functions
via Level Set Alignment [50.892158511845466]
We show that gradient consistency in the field, indicated by the parallelism of level sets, is the key factor affecting the inference accuracy.
We propose a level set alignment loss to evaluate the parallelism of level sets, which can be minimized to achieve better gradient consistency.
arXiv Detail & Related papers (2023-05-19T11:28:05Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the
Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - Transductive Few-Shot Classification on the Oblique Manifold [5.115651633703363]
Few-shot learning attempts to learn with limited data.
In this work, we perform the feature extraction in the Euclidean space.
We also propose a non-parametric Region Self-attention with Spatial Pyramid Pooling.
arXiv Detail & Related papers (2021-08-09T13:01:03Z) - Hierarchical Deep CNN Feature Set-Based Representation Learning for
Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z) - Leveraging Spatial and Photometric Context for Calibrated Non-Lambertian
Photometric Stereo [61.6260594326246]
We introduce an efficient fully-convolutional architecture that can leverage both spatial and photometric context simultaneously.
Using separable 4D convolutions and 2D heat-maps reduces the size and makes more efficient.
arXiv Detail & Related papers (2021-03-22T18:06:58Z) - Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework calledInter-class DiscrepancyAlignment(IDA)
IDA-DAO is used to align the similarity scores considering the discrepancy between the images and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z) - Deep Probabilistic Feature-metric Tracking [27.137827823264942]
We propose a new framework to learn a pixel-wise deep feature map and a deep feature-metric uncertainty map.
CNN predicts a deep initial pose for faster and more reliable convergence.
Experimental results demonstrate state-of-the-art performances on the TUM RGB-D dataset and the 3D rigid object tracking dataset.
arXiv Detail & Related papers (2020-08-31T11:47:59Z) - Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental
Study [2.6205925938720833]
State of the art methods use deep neural networks to predict semantic classes for each point in a LiDAR scan.
A powerful and efficient way to process LiDAR measurements is to use two-dimensional, image-like projections.
We demonstrate various techniques to boost the performance and to improve runtime as well as memory constraints.
arXiv Detail & Related papers (2020-04-06T11:08:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.