Low-Pass Filtering Improves Behavioral Alignment of Vision Models
- URL: http://arxiv.org/abs/2602.13859v1
- Date: Sat, 14 Feb 2026 19:42:57 GMT
- Title: Low-Pass Filtering Improves Behavioral Alignment of Vision Models
- Authors: Max Wolff, Thomas Klein, Evgenia Rusak, Felix Wichmann, Wieland Brendel
- Abstract summary: We show that the increased alignment of generative models can be largely explained by a seemingly innocuous resizing operation in the generative model which effectively acts as a low-pass filter. We show that removing high-frequency spatial information from discriminative models like CLIP drastically increases their behavioral alignment. Low-pass filters are likely optimal, which we demonstrate by directly optimizing filters for alignment.
- Score: 24.72922224210244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their impressive performance on computer vision benchmarks, Deep Neural Networks (DNNs) still fall short of adequately modeling human visual behavior, as measured by error consistency and shape bias. Recent work hypothesized that behavioral alignment can be drastically improved through generative, rather than discriminative, classifiers, with far-reaching implications for models of human vision. Here, we instead show that the increased alignment of generative models can be largely explained by a seemingly innocuous resizing operation in the generative model which effectively acts as a low-pass filter. In a series of controlled experiments, we show that removing high-frequency spatial information from discriminative models like CLIP drastically increases their behavioral alignment. Simply blurring images at test time, rather than training on blurred images, achieves a new state-of-the-art score on the model-vs-human benchmark, halving the current alignment gap between DNNs and human observers. Furthermore, low-pass filters are likely optimal, which we demonstrate by directly optimizing filters for alignment. To contextualize the performance of optimal filters, we compute the frontier of all possible Pareto-optimal solutions to the benchmark, which was formerly unknown. We explain our findings by observing that the frequency spectrum of optimal Gaussian filters roughly matches the spectrum of band-pass filters implemented by the human visual system. We show that the contrast sensitivity function, describing the inverse of the contrast threshold required for humans to detect a sinusoidal grating as a function of spatiotemporal frequency, is approximated well by Gaussian filters of the specific width that also maximizes error consistency.
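As a rough illustration of the two ingredients named in the abstract, the sketch below blurs an image with a Gaussian low-pass filter and computes error consistency as Cohen's kappa over per-trial correctness. It is not the authors' code: the function names, the pure-Python list-of-rows image format, the border clamping, and the value sigma=2.0 are assumptions chosen for illustration, and the kappa formula follows the commonly used definition of error consistency rather than anything stated on this page.

```python
import math


def gaussian_kernel(sigma, truncate=3.0):
    """Return a normalized 1-D Gaussian kernel (weights sum to 1)."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(k * values[min(max(i + j - radius, 0), last)]
            for j, k in enumerate(kernel))
        for i in range(len(values))
    ]


def gaussian_blur(image, sigma):
    """Separable Gaussian low-pass filter for a grayscale image (list of rows)."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]            # blur along x
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]  # blur along y
    return [list(row) for row in zip(*cols)]                   # transpose back


def error_consistency(correct_a, correct_b):
    """Cohen's kappa on the per-trial correctness of two observers."""
    n = len(correct_a)
    p_a = sum(correct_a) / n
    p_b = sum(correct_b) / n
    observed = sum(a == b for a, b in zip(correct_a, correct_b)) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 0.0  # kappa is undefined here; returning 0 is an assumption
    return (observed - expected) / (1 - expected)


# A checkerboard carries the highest spatial frequency a pixel grid can hold,
# so a low-pass filter should flatten it toward its mean value.
checker = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
blurred = gaussian_blur(checker, sigma=2.0)
```

In a test-time setup of the kind the abstract describes, one would apply something like `gaussian_blur` to each input before passing it to an otherwise unchanged classifier, then compare model and human correctness on the same trials with `error_consistency`.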
Related papers
- From Filters to VLMs: Benchmarking Defogging Methods through Object Detection and Segmentation Performance [2.0524609401792397]
We present a structured empirical study that benchmarks a comprehensive set of pipelines. We assess both image quality and downstream performance on object detection (mAP) and segmentation (PQ, RQ, SQ). Our analysis reveals when defogging helps, when chaining yields synergy or degradation, and how VLM-based editors compare to dedicated approaches.
arXiv Detail & Related papers (2025-10-04T19:05:04Z)
- Solving Inverse Problems with FLAIR [68.87167940623318]
We present FLAIR, a training-free variational framework that leverages flow-based generative models as prior for inverse problems. Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z)
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
- Dual-Frequency Filtering Self-aware Graph Neural Networks for Homophilic and Heterophilic Graphs [60.82508765185161]
We propose Dual-Frequency Filtering Self-aware Graph Neural Networks (DFGNN).
DFGNN integrates low-pass and high-pass filters to extract smooth and detailed topological features.
It dynamically adjusts filtering ratios to accommodate both homophilic and heterophilic graphs.
arXiv Detail & Related papers (2024-11-18T04:57:05Z)
- Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z)
- Frequency Compensated Diffusion Model for Real-scene Dehazing [6.105813272271171]
We consider a dehazing framework based on conditional diffusion models for improved generalization to real haze.
The proposed dehazing diffusion model significantly outperforms state-of-the-art methods on real-world images.
arXiv Detail & Related papers (2023-08-21T06:50:44Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Computational Doob's h-transforms for Online Filtering of Discretely Observed Diffusions [65.74069050283998]
We propose a computational framework to approximate Doob's $h$-transforms.
The proposed approach can be orders of magnitude more efficient than state-of-the-art particle filters.
arXiv Detail & Related papers (2022-06-07T15:03:05Z)
- Can we integrate spatial verification methods into neural-network loss functions for atmospheric science? [0.030458514384586396]
Neural networks (NNs) in atmospheric science are almost always trained to optimize pixelwise loss functions.
This establishes a disconnect between model verification during vs. after training.
We develop spatially enhanced loss functions (SELF) and demonstrate their use for a real-world problem: predicting the occurrence of thunderstorms.
arXiv Detail & Related papers (2022-03-21T17:18:43Z)
- Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape [15.362190838843915]
We show that LPF-SGD converges to a better optimal point with smaller generalization error than SGD.
We show that our algorithm achieves superior generalization performance compared to the common DL training strategies.
arXiv Detail & Related papers (2022-01-20T07:13:04Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.