Unified theory for joint covariance properties under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields
- URL: http://arxiv.org/abs/2311.10543v7
- Date: Thu, 11 Jul 2024 14:19:52 GMT
- Title: Unified theory for joint covariance properties under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields
- Authors: Tony Lindeberg
- Abstract summary: We show how the parameters of the receptive fields need to be transformed in order to match the output from spatio-temporal receptive fields under composed spatio-temporal image transformations.
We conclude with a geometric analysis, showing how the derived joint covariance properties make it possible to relate or match spatio-temporal receptive field responses.
- Score: 0.5439020425819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The influence of natural image transformations on receptive field responses is crucial for modelling visual operations in computer vision and biological vision. In this regard, covariance properties with respect to geometric image transformations in the earliest layers of the visual hierarchy are essential for expressing robust image operations, and for formulating invariant visual operations at higher levels. This paper defines and proves a set of joint covariance properties for spatio-temporal receptive fields in terms of spatio-temporal derivative operators applied to spatio-temporally smoothed image data under compositions of spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. Specifically, the derived relations show how the parameters of the receptive fields need to be transformed, in order to match the output from spatio-temporal receptive fields under composed spatio-temporal image transformations. For this purpose, we also fundamentally extend the notion of scale-normalized derivatives to affine-normalized derivatives, that are computed based on spatial smoothing with affine Gaussian kernels, and analyze the covariance properties of the resulting affine-normalized derivatives for the affine group as well as for important subgroups thereof. We conclude with a geometric analysis, showing how the derived joint covariance properties make it possible to relate or match spatio-temporal receptive field responses, when observing, possibly moving, local surface patches from different views, under locally linearized perspective or projective transformations, as well as when observing different instances of spatio-temporal events, that may occur either faster or slower between different views of similar spatio-temporal events.
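To make the simplest of these covariance properties concrete, the following is a minimal numerical sketch, not code from the paper; the test pattern, the helper names and the choice of scale-normalization power gamma = 1 are assumptions of this illustration. Under a spatial scaling x' = c x, the scale-normalized first-order Gaussian derivative of the magnified image, computed at the matched scale sigma' = c sigma, should reproduce the response of the original image at scale sigma.

```python
# Minimal numerical sketch (not code from the paper): spatial scale covariance of
# scale-normalized Gaussian derivatives for the simplest case of a pure spatial
# scaling.  The test pattern, helper names and the choice gamma = 1 are assumptions
# of this illustration.

import numpy as np
from scipy import ndimage


def test_image(x, y):
    """Smooth synthetic test pattern (any sufficiently smooth function would do)."""
    return np.sin(0.10 * x) * np.cos(0.07 * y) + 0.3 * np.sin(0.05 * (x + y))


def scale_normalized_dx(image, sigma):
    """First-order Gaussian derivative along x, scale-normalized with gamma = 1."""
    # order=(0, 1) differentiates along the second (x) axis of the array.
    return sigma * ndimage.gaussian_filter(image, sigma=sigma, order=(0, 1))


c = 2                                   # integer scaling factor (exact grid alignment)
N = 128
y, x = np.mgrid[0:N, 0:N]
f = test_image(x, y)                    # original image, unit pixel spacing

yc, xc = np.mgrid[0:c * N, 0:c * N]
f_scaled = test_image(xc / c, yc / c)   # the same scene magnified by c: f'(x') = f(x'/c)

sigma = 4.0
L = scale_normalized_dx(f, sigma)                # response of the original image at scale sigma
Lc = scale_normalized_dx(f_scaled, c * sigma)    # response of the magnified image at the matched scale

# Corresponding points: pixel (i, j) in f maps to pixel (c*i, c*j) in f_scaled,
# so the subsampled response of the magnified image should equal L.
interior = slice(int(5 * sigma), N - int(5 * sigma))   # ignore boundary effects
err = np.max(np.abs(Lc[::c, ::c] - L)[interior, interior]) / np.max(np.abs(L))
print(f"relative covariance error: {err:.2e}")   # small; limited only by discretization
```

The joint covariance properties derived in the paper generalize this equality, with correspondingly transformed receptive-field parameters, to compositions of spatial affine transformations, Galilean transformations and temporal scaling transformations.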
Related papers
- Relationships between the degrees of freedom in the affine Gaussian derivative model for visual receptive fields and 2-D affine image transformations, with application to covariance properties of simple cells in the primary visual cortex [0.5439020425819]
This paper presents a theoretical analysis of relationships between the degrees of freedom in 2-D spatial affine image transformations and the degrees of freedom in the affine Gaussian derivative model for visual receptive fields.
We consider whether we could regard the biological receptive fields in the primary visual cortex of higher mammals as being able to span the degrees of freedom of 2-D spatial affine transformations.
arXiv Detail & Related papers (2024-11-08T16:24:20Z) - Relative Representations: Topological and Geometric Perspectives [53.88896255693922]
Relative representations are an established approach to zero-shot model stitching; a minimal sketch of the basic relative transformation appears after this list.
We introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations.
Second, we propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes.
arXiv Detail & Related papers (2024-09-17T08:09:22Z) - Varying Manifolds in Diffusion: From Time-varying Geometries to Visual Saliency [25.632973225129728]
We study the geometric properties of the diffusion model, whose forward diffusion process and reverse generation process construct a series of distributions on a manifold.
We show that the generation rate is highly correlated with intuitive visual properties, such as visual saliency, of the image component.
We propose an efficient and differentiable scheme to estimate the generation rate for a given image component over time, giving rise to a generation curve.
arXiv Detail & Related papers (2024-06-07T07:32:41Z) - A Theory of Topological Derivatives for Inverse Rendering of Geometry [87.49881303178061]
We introduce a theoretical framework for differentiable surface evolution that allows discrete topology changes through the use of topological derivatives.
We validate the proposed theory with optimization of closed curves in 2D and surfaces in 3D to lend insights into limitations of current methods.
arXiv Detail & Related papers (2023-08-19T00:55:55Z) - Tensor Component Analysis for Interpreting the Latent Space of GANs [41.020230946351816]
This paper addresses the problem of finding interpretable directions in the latent space of pre-trained Generative Adversarial Networks (GANs).
Our scheme allows for both linear edits corresponding to the individual modes of the tensor, and non-linear ones that model the multiplicative interactions between them.
We show experimentally that we can utilise the former to better separate style- from geometry-based transformations, and the latter to generate an extended set of possible transformations.
arXiv Detail & Related papers (2021-11-23T09:14:39Z) - Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z) - Self-Supervised Graph Representation Learning via Topology
Transformations [61.870882736758624]
We present the Topology Transformation Equivariant Representation learning, a general paradigm of self-supervised learning for node representations of graph data.
In experiments, we apply the proposed model to the downstream node and graph classification tasks, and results show that the proposed method outperforms the state-of-the-art unsupervised approaches.
arXiv Detail & Related papers (2021-05-25T06:11:03Z) - Image Morphing with Perceptual Constraints and STN Alignment [70.38273150435928]
We propose a conditional GAN morphing framework operating on a pair of input images.
A special training protocol produces sequences of frames that, combined with a perceptual similarity loss, promote smooth transformations over time.
We provide comparisons to classic as well as latent space morphing techniques, and demonstrate that, given a set of images for self-supervision, our network learns to generate visually pleasing morphing effects.
arXiv Detail & Related papers (2020-04-29T10:49:10Z) - Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data [52.78581260260455]
We propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group.
We apply the same model architecture to images, ball-and-stick molecular data, and Hamiltonian dynamical systems.
arXiv Detail & Related papers (2020-02-25T17:40:38Z)
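As referenced in the "Relative Representations" entry above, here is a minimal illustration of the basic (unnormalized) relative transformation that the cited paper builds on, in the common cosine-similarity-to-anchors form, which is invariant to orthogonal transformations and isotropic rescalings of the latent space. The normalization procedure that the paper introduces for invariance to non-isotropic rescalings and permutations is not reproduced here, and all names and dimensions below are illustrative.

```python
# Minimal sketch (not code from the cited paper): the basic relative representation,
# i.e. each embedding expressed through its cosine similarities to a fixed set of anchors.

import numpy as np


def relative_representation(Z, anchors):
    """Cosine similarity of every embedding in Z to every anchor embedding."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Zn @ An.T


rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 64))    # embeddings from some encoder (illustrative data)
anchors = Z[:10]                      # a fixed set of anchor samples

# A second latent space that differs by an orthogonal transformation and a global
# rescaling (e.g. two independently trained encoders of the same data).
Q, _ = np.linalg.qr(rng.standard_normal((64, 64)))
Z2, anchors2 = 1.7 * Z @ Q, 1.7 * anchors @ Q

R1 = relative_representation(Z, anchors)
R2 = relative_representation(Z2, anchors2)
print(np.max(np.abs(R1 - R2)))        # ~1e-15: the relative representation is unchanged
```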
This list is automatically generated from the titles and abstracts of the papers in this site.