Soft Task-Aware Routing of Experts for Equivariant Representation Learning
- URL: http://arxiv.org/abs/2510.27222v1
- Date: Fri, 31 Oct 2025 06:34:30 GMT
- Title: Soft Task-Aware Routing of Experts for Equivariant Representation Learning
- Authors: Jaebyeong Jeon, Hyeonseo Jang, Jy-yong Sohn, Kibok Lee,
- Abstract summary: Equivariant representation learning aims to capture variations induced by input transformations in the representation space.<n>Invariant representation learning encodes semantic information by disregarding such transformations.<n>We introduce Soft Task-Aware Routing (STAR), a routing strategy for projection heads that models them as experts.
- Score: 9.358653432067298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Equivariant representation learning aims to capture variations induced by input transformations in the representation space, whereas invariant representation learning encodes semantic information by disregarding such transformations. Recent studies have shown that jointly learning both types of representations is often beneficial for downstream tasks, typically by employing separate projection heads. However, this design overlooks information shared between invariant and equivariant learning, which leads to redundant feature learning and inefficient use of model capacity. To address this, we introduce Soft Task-Aware Routing (STAR), a routing strategy for projection heads that models them as experts. STAR induces the experts to specialize in capturing either shared or task-specific information, thereby reducing redundant feature learning. We validate this effect by observing lower canonical correlations between invariant and equivariant embeddings. Experimental results show consistent improvements across diverse transfer learning tasks. The code is available at https://github.com/YonseiML/star.
Related papers
- Understanding the Role of Equivariance in Self-supervised Learning [51.56331245499712]
equivariant self-supervised learning (E-SSL) learns features to be augmentation-aware.
We identify a critical explaining-away effect in E-SSL that creates a synergy between the equivariant and classification tasks.
We reveal several principles for practical designs of E-SSL.
arXiv Detail & Related papers (2024-11-10T16:09:47Z) - Interpreting Equivariant Representations [4.738231680800414]
In this paper, we demonstrate that the inductive bias imposed on the by an equivariant model must also be taken into account when using latent representations.<n>We show how not accounting for the inductive biases leads to decreased performance on downstream tasks, and vice versa.
arXiv Detail & Related papers (2024-01-23T09:43:30Z) - Self-Supervised Learning for Group Equivariant Neural Networks [75.62232699377877]
Group equivariant neural networks are the models whose structure is restricted to commute with the transformations on the input.
We propose two concepts for self-supervised tasks: equivariant pretext labels and invariant contrastive loss.
Experiments on standard image recognition benchmarks demonstrate that the equivariant neural networks exploit the proposed self-supervised tasks.
arXiv Detail & Related papers (2023-03-08T08:11:26Z) - CIPER: Combining Invariant and Equivariant Representations Using
Contrastive and Predictive Learning [6.117084972237769]
We introduce Contrastive Invariant and Predictive Equivariant Representation learning (CIPER)
CIPER comprises both invariant and equivariant learning objectives using one shared encoder and two different output heads on top of the encoder.
We evaluate our method on static image tasks and time-augmented image datasets.
arXiv Detail & Related papers (2023-02-05T07:50:46Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefineds.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Unsupervised Learning of Group Invariant and Equivariant Representations [10.252723257176566]
We extend group invariant and equivariant representation learning to the field of unsupervised deep learning.
We propose a general learning strategy based on an encoder-decoder framework in which the latent representation is separated in an invariant term and an equivariant group action component.
The key idea is that the network learns to encode and decode data to and from a group-invariant representation by additionally learning to predict the appropriate group action to align input and output pose to solve the reconstruction task.
arXiv Detail & Related papers (2022-02-15T16:44:21Z) - Why Do Self-Supervised Models Transfer? Investigating the Impact of
Invariance on Downstream Tasks [79.13089902898848]
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images.
We show that different tasks in computer vision require features to encode different (in)variances.
arXiv Detail & Related papers (2021-11-22T18:16:35Z) - Exploring Complementary Strengths of Invariant and Equivariant
Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
arXiv Detail & Related papers (2021-03-01T21:14:33Z) - Improving Transformation Invariance in Contrastive Representation
Learning [31.223892428863238]
We introduce a training objective for contrastive learning that uses a novel regularizer to control how the representation changes under transformation.
Second, we propose a change to how test time representations are generated by introducing a feature averaging approach that combines encodings from multiple transformations of the original input.
Third, we introduce the novel Spirograph dataset to explore our ideas in the context of a differentiable generative process with multiple downstream tasks.
arXiv Detail & Related papers (2020-10-19T13:49:29Z) - What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances.
Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces.
We use a multi-head network with a shared backbone which captures information across each augmentation and alone outperforms all baselines on downstream tasks.
arXiv Detail & Related papers (2020-08-13T03:02:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.