Scale Equivariance Improves Siamese Tracking
- URL: http://arxiv.org/abs/2007.09115v2
- Date: Fri, 6 Nov 2020 12:29:44 GMT
- Title: Scale Equivariance Improves Siamese Tracking
- Authors: Ivan Sosnovik, Artem Moskalev, Arnold Smeulders
- Abstract summary: Siamese trackers turn tracking into similarity estimation between a template and the candidate regions in the frame.
Non-translation-equivariant architectures induce a positional bias during training.
We present SE-SiamFC, a scale-equivariant variant of SiamFC built according to the recipe.
- Score: 1.7188280334580197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Siamese trackers turn tracking into similarity estimation between a template
and the candidate regions in the frame. Mathematically, one of the key
ingredients of success of the similarity function is translation equivariance.
Non-translation-equivariant architectures induce a positional bias during
training, making the location of the target hard to recover from the
feature space. In real-life scenarios, objects undergo various transformations
other than translation, such as rotation or scaling. Unless the model has an
internal mechanism to handle them, the similarity may degrade. In this paper,
we focus on scaling and we aim to equip the Siamese network with additional
built-in scale equivariance to capture the natural variations of the target a
priori. We develop the theory for scale-equivariant Siamese trackers, and
provide a simple recipe for how to make a wide range of existing trackers
scale-equivariant. We present SE-SiamFC, a scale-equivariant variant of SiamFC
built according to the recipe. We conduct experiments on OTB and VOT benchmarks
and on the synthetically generated T-MNIST and S-MNIST datasets. We demonstrate
that a built-in additional scale equivariance is useful for visual object
tracking.
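The abstract's core idea, adding a scale axis so the similarity response localizes the target jointly in position and scale, can be sketched in a toy 1-D form. This is an illustrative sketch only, not the SE-SiamFC architecture: the integer nearest-neighbour rescaling and all names here are our assumptions.

```python
import numpy as np

def multiscale_response(signal, template, scales=(1, 2, 3)):
    """Correlate integer-rescaled copies of the template with the signal.

    Returns a (scale, position) response map: an extra scale axis lets the
    tracker pick the target's position and scale jointly, instead of relying
    on a single-scale similarity that degrades when the target is rescaled.
    """
    responses = []
    for s in scales:
        t = np.repeat(template, s)            # nearest-neighbour rescale by factor s
        t = t / np.linalg.norm(t)             # normalise so scales compare fairly
        n = len(signal) - len(t) + 1
        row = np.array([signal[i:i + len(t)] @ t for i in range(n)])
        # pad with -inf so all rows align and padding never wins the argmax
        row = np.pad(row, (0, len(signal) - n), constant_values=-np.inf)
        responses.append(row)
    return np.stack(responses)                # shape: (n_scales, len(signal))

# A target that is the template stretched by factor 2, placed at offset 5.
template = np.array([1.0, -1.0, 2.0])
signal = np.zeros(20)
signal[5:5 + 6] = np.repeat(template, 2)

resp = multiscale_response(signal, template)
scale_idx, pos = np.unravel_index(np.argmax(resp), resp.shape)
print(scale_idx, pos)  # scale index 1 (factor 2), position 5
```

The argmax over the scale axis recovers the scale factor of the target, which is exactly the information a single fixed-scale correlation would lose.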
Related papers
- NormNet: Scale Normalization for 6D Pose Estimation in Stacked Scenarios [4.515332570030772]
This paper proposes a new 6DoF OPE network (NormNet) for objects of different scales in stacked scenarios.
All objects in the stacked scenario are normalized into the same scale through semantic segmentation and affine transformation.
Finally, they are fed into a shared pose estimator to recover their 6D poses.
arXiv Detail & Related papers (2023-11-15T12:02:57Z)
- Equivariant Similarity for Vision-Language Foundation Models [134.77524524140168]
This study focuses on the multimodal similarity function that is not only the major training objective but also the core delivery to support downstream tasks.
We propose EqSim, a regularization loss that can be efficiently calculated from any two matched training pairs.
We also present EqBen; compared to existing evaluation sets, it is the first to focus on "visual-minimal change".
arXiv Detail & Related papers (2023-03-25T13:22:56Z)
- Scale Equivariant U-Net [0.0]
This paper introduces the Scale Equivariant U-Net (SEU-Net), a U-Net that is made approximately equivariant to a semigroup of scales and translations.
The proposed SEU-Net is trained for semantic segmentation on the Oxford-IIIT Pet dataset and for cell segmentation on the DIC-C2DH-HeLa dataset.
The generalization metric to unseen scales is dramatically improved in comparison to the U-Net, even when the U-Net is trained with scale jittering.
arXiv Detail & Related papers (2022-10-10T09:19:40Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
Notably, transformers can be more equivariant than convolutional neural networks after training.
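The blurb's idea of measuring learned equivariance can be illustrated with a discrete stand-in for the Lie derivative: compare shifting then applying a layer against applying the layer then shifting. This is a toy sketch under our own assumptions, not the paper's continuous Lie-derivative estimator.

```python
import numpy as np

def shift_equivariance_error(f, x, shift=1):
    """Empirical shift-equivariance error of a layer f (toy sketch).

    Zero means f commutes exactly with circular shifts; layers that alias
    the signal (e.g. subsampling) give a nonzero error.
    """
    a = f(np.roll(x, shift))       # shift, then apply the layer
    b = np.roll(f(x), shift)       # apply the layer, then shift
    return float(np.abs(a - b).max())

x = np.sin(np.linspace(0, 2 * np.pi, 16, endpoint=False))

relu = lambda z: np.maximum(z, 0.0)        # pointwise: exactly shift equivariant
stride2 = lambda z: np.repeat(z[::2], 2)   # subsample + upsample: aliases, breaks equivariance

print(shift_equivariance_error(relu, x))     # 0.0
print(shift_equivariance_error(stride2, x))  # > 0
```

The pointwise nonlinearity commutes with shifts exactly, while the strided layer does not, matching the blurb's point that aliasing in common layers is a main source of equivariance violations.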
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Equivariance versus Augmentation for Spherical Images [0.7388859384645262]
We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images.
We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation.
arXiv Detail & Related papers (2022-02-08T16:49:30Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
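Frame Averaging can be illustrated with its simplest instance: for the translation group on a point cloud, the centroid defines a one-element frame, so the average reduces to centring the input before an arbitrary backbone. The backbone `phi` below is a hypothetical placeholder of our own, not from the paper.

```python
import numpy as np

def frame_average_translation(phi, points):
    """Frame Averaging for the translation group on a point cloud (sketch).

    The centroid gives a one-element frame, so the 'average over the frame'
    is just centring; the composite is exactly translation invariant for
    any backbone phi.
    """
    centroid = points.mean(axis=0)
    return phi(points - centroid)  # average over the single frame element

# Hypothetical backbone: sum of squared norms (not translation invariant by itself).
phi = lambda x: float((x ** 2).sum())

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
shift = np.array([5.0, -3.0])
a = frame_average_translation(phi, pts)
b = frame_average_translation(phi, pts + shift)
print(np.isclose(a, b))  # True: output unchanged under translation
```

The same recipe with larger frames (e.g. over rotations) yields the Euclidean-motion-invariant networks the blurb mentions, at the cost of averaging over more group elements.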
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- Group Equivariant Subsampling [60.53371517247382]
Subsampling is used in convolutional neural networks (CNNs) in the form of pooling or strided convolutions.
We first introduce translation equivariant subsampling/upsampling layers that can be used to construct exact translation equivariant CNNs.
We then generalise these layers beyond translations to general groups, thus proposing group equivariant subsampling/upsampling.
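The failure of plain strided subsampling, and the content-adaptive fix, can be shown on a periodic 1-D toy. Choosing the sampling phase from the signal itself (here, from the position of its maximum) is our simplified stand-in for the paper's coset-selection construction, not the actual layer.

```python
import numpy as np

def naive_subsample(x, stride=2):
    """Plain strided subsampling with a fixed phase: not shift equivariant."""
    return x[::stride]

def equivariant_subsample(x, stride=2):
    """Shift-equivariant subsampling of a periodic 1-D signal (toy sketch).

    The sampling phase is chosen from the signal content (position of the
    maximum, modulo the stride), so shifted inputs keep the same samples.
    """
    phase = int(np.argmax(x)) % stride
    return x[phase::stride]

x = np.array([0, 1, 5, 2, 0, 0, 3, 0])
y = equivariant_subsample(x)

# A unit shift changes which values the fixed-phase subsampler keeps (the peak 5 is lost):
print(naive_subsample(np.roll(x, 1)))                            # [0 1 2 0]
# ...while content-adaptive phase selection keeps the same samples:
print(np.array_equal(equivariant_subsample(np.roll(x, 1)), y))   # True
```

Shifting the input by a full stride then shifts the equivariant output by one coarse sample, which is exactly the equivariance property a fixed sampling grid cannot provide.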
arXiv Detail & Related papers (2021-06-10T16:14:00Z)
- Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.