Related papers: Value bounds and Convergence Analysis for Averages of LRP attributions

URL: http://arxiv.org/abs/2509.08963v1
Date: Wed, 10 Sep 2025 19:50:00 GMT
Title: Value bounds and Convergence Analysis for Averages of LRP attributions
Authors: Alexander Binder, Nastaran Takmil-Homayouni, Urun Dogan
Abstract summary: We analyze numerical properties of Layer-wise relevance propagation (LRP)-type attribution methods by representing them as a product of modified gradient matrices. In particular, our analysis reveals that the constants for LRP-beta remain independent of weight norms, a significant distinction from both gradient-based methods and LRP-epsilon.
Score: 44.992386137813014
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We analyze numerical properties of Layer-wise relevance propagation (LRP)-type attribution methods by representing them as a product of modified gradient matrices. This representation creates an analogy to the products of Jacobian matrices that arise from the chain rule of differentiation. In order to shed light on the distribution of attribution values, we derive upper bounds for singular values. Furthermore, we derive component-wise bounds for attribution map values. As a main result, we apply these component-wise bounds to obtain multiplicative constants. These constants govern the convergence of empirical means of attributions to expectations of attribution maps. This finding has important implications for scenarios where multiple non-geometric data augmentations are applied to individual test samples, as well as for SmoothGrad-type attribution methods. In particular, our analysis reveals that the constants for LRP-beta remain independent of weight norms, a significant distinction from both gradient-based methods and LRP-epsilon.

Related papers

A Random Matrix Theory Perspective on the Consistency of Diffusion Models [31.63433424187031]
Diffusion models trained on different subsets of a dataset often produce strikingly similar outputs when given the same noise seed. We develop a random matrix theory (RMT) framework that quantifies how finite datasets shape the expectation and variance of the learned denoiser and sampling map. We validate its predictions on UNet and DiT architectures in their non-memorization regime.
arXiv Detail & Related papers (2026-02-02T23:30:28Z)
Towards Spectral Convergence of Locally Linear Embedding on Manifolds with Boundary [0.0]
We study the eigenvalues and eigenfunctions of a differential operator that governs the behavior of the unsupervised learning algorithm known as Locally Linear Embedding. We show that a natural regularity condition on the eigenfunctions imposes a consistent boundary condition and use the Frobenius method to estimate pointwise behavior.
arXiv Detail & Related papers (2025-01-16T14:45:53Z)
Asymptotics of Linear Regression with Linearly Dependent Data [28.005935031887038]
We study the computations of linear regression in settings with non-Gaussian covariates. We show how dependencies influence estimation error and the choice of regularization parameters.
arXiv Detail & Related papers (2024-12-04T20:31:47Z)
High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization [83.06112052443233]
This paper studies kernel ridge regression in high dimensions under covariate shifts. By a bias-variance decomposition, we theoretically demonstrate that the re-weighting strategy allows for decreasing the variance. For bias, we analyze the regularization of the arbitrary or well-chosen scale, showing that the bias can behave very differently under different regularization scales.
arXiv Detail & Related papers (2024-06-05T12:03:27Z)
Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition. A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues. We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model. Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z)
Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction [55.57072563835959]
We propose an eigenvalue correction strategy that can free filters from the constraints of repeated eigenvalue inputs. Concretely, the proposed eigenvalue correction strategy enhances the uniform distribution of eigenvalues, and improves the fitting capacity and expressive power of filters.
arXiv Detail & Related papers (2024-01-28T08:12:00Z)
Convergence and concentration properties of constant step-size SGD through Markov chains [0.0]
We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD). We show that, for unbiased gradient estimates with mildly controlled variance, the iteration converges to an invariant distribution in total variation distance. All our results are non-asymptotic and their consequences are discussed through a few applications.
arXiv Detail & Related papers (2023-06-20T12:36:28Z)
Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for data-efficient representation learning. We establish relationships between logical definitions and quantitative metrics to derive theoretically grounded disentanglement metrics. We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z)
Quantitative deterministic equivalent of sample covariance matrices with a general dependence structure [0.0]
We prove quantitative bounds involving both the dimensions and the spectral parameter, in particular allowing it to get closer to the real positive semi-line. As applications, we obtain a new bound for the convergence in Kolmogorov distance of the empirical spectral distributions of these general models.
arXiv Detail & Related papers (2022-11-23T15:50:31Z)
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging. We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.