Fugu-MT Paper Translation (Abstract): Soft Equivariance Regularization for Invariant Self-Supervised Learning

Paper abstract: Soft Equivariance Regularization for Invariant Self-Supervised Learning

arxiv url: http://arxiv.org/abs/2603.06693v1
Date: Wed, 04 Mar 2026 13:36:17 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:12.884087
Title: Soft Equivariance Regularization for Invariant Self-Supervised Learning
Title (reference translation): Soft Equivariance Regularization for Invariant Self-Supervised Learning
Authors: Joohyung Lee, Changhun Kim, Hyunsu Kim, Kwanhyung Lee, Juho Lee
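Abstract summary: Self-supervised learning (SSL) typically learns representations invariant to semantic-preserving augmentations. This work proposes Soft Equivariance Regularization (SER), a plug-in regularizer that decouples where invariance and equivariance are enforced. SER learns/predicts no per-sample transformation codes or labels and requires no auxiliary transformation-prediction head.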
Reference score (independently computed attention score): 23.047550451521662
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Self-supervised learning (SSL) typically learns representations invariant to semantic-preserving augmentations. While effective for recognition, enforcing strong invariance can suppress transformation-dependent structure that is useful for robustness to geometric perturbations and spatially sensitive transfer. A growing body of work, therefore, augments invariance-based SSL with equivariance objectives, but these objectives are often imposed on the same final representation. We empirically observe a trade-off in this coupled setting: pushing equivariance regularization toward deeper layers improves equivariance scores but degrades ImageNet-1k linear evaluation, motivating a layer-decoupled design. Motivated by this trade-off, we propose Soft Equivariance Regularization (SER), a plug-in regularizer that decouples where invariance and equivariance are enforced: we keep the base SSL objective unchanged on the final embedding, while softly encouraging equivariance on an intermediate spatial token map via analytically specified group actions $ρ_g$ applied directly in feature space. SER learns/predicts no per-sample transformation codes/labels, requires no auxiliary transformation-prediction head, and adds only 1.008x training FLOPs. On ImageNet-1k ViT-S/16 pretraining, SER improves MoCo-v3 by +0.84 Top-1 in linear evaluation under a strictly matched 2-view setting and consistently improves DINO and Barlow Twins; under matched view counts, SER achieves the best ImageNet-1k linear-eval Top-1 among the compared invariance+equivariance add-ons. SER further improves ImageNet-C/P by +1.11/+1.22 Top-1 and frozen-backbone COCO detection by +1.7 mAP. Finally, applying the same layer-decoupling recipe to existing invariance+equivariance baselines improves their accuracy, suggesting layer decoupling as a general design principle for combining invariance and equivariance.
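Abstract (reference translation): Self-supervised learning (SSL) typically learns representations that are invariant to semantic-preserving augmentations. While effective for recognition, enforcing strong invariance can suppress transformation-dependent structure that is useful for robustness to geometric perturbations and for spatially sensitive transfer. A growing body of work therefore augments invariance-based SSL with equivariance objectives, but these objectives are often imposed on the same final representation. Pushing equivariance regularization toward deeper layers improves equivariance scores but degrades ImageNet-1k linear evaluation, which motivates a layer-decoupled design. Motivated by this trade-off, SER (Soft Equivariance Regularization) is a plug-in regularizer that decouples where invariance and equivariance are enforced. It keeps the base SSL objective unchanged on the final embedding while softly encouraging equivariance on an intermediate spatial token map by applying analytically specified group actions $ρ_g$ directly in feature space. SER learns/predicts no per-sample transformation codes or labels and requires no auxiliary transformation-prediction head. On ImageNet-1k ViT-S/16 pretraining, SER improves MoCo-v3 by +0.84 Top-1 in linear evaluation under a strictly matched 2-view setting and consistently improves DINO and Barlow Twins. SER further improves ImageNet-C/P by +1.11/+1.22 Top-1 and frozen-backbone COCO detection by +1.7 mAP. Finally, applying the same layer-decoupling recipe to existing invariance+equivariance baselines improves their accuracy, suggesting layer decoupling as a general design principle for combining invariance and equivariance.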
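
Illustrative sketch (not from the paper): the abstract describes encouraging equivariance on an intermediate spatial token map through a group action $ρ_g$ applied in feature space. The toy NumPy example below shows what such a penalty could look like for a horizontal flip. The squared-error form, the function names (`rho_flip`, `soft_equivariance_loss`), the average-pooling "encoder", and the positional bias are all assumptions made for illustration; the abstract does not specify the actual loss or architecture.

```python
import numpy as np

def flip_image(image):
    """Input-space transformation g: horizontal flip of an (H, W, C) image."""
    return image[:, ::-1, :]

def rho_flip(tokens):
    """Hypothetical feature-space action rho_g for the same flip:
    reverse the width axis of an (h, w, c) token map."""
    return tokens[:, ::-1, :]

def pooled_tokens(image, patch=2):
    """Toy 'intermediate token map': average-pool non-overlapping patches."""
    H, W, C = image.shape
    grid = image.reshape(H // patch, patch, W // patch, patch, C)
    return grid.mean(axis=(1, 3))

def tokens_with_position(image, patch=2):
    """Same toy encoder plus an absolute positional bias along the width,
    which makes the token map depend on position and breaks flip equivariance."""
    tokens = pooled_tokens(image, patch)
    width_index = np.arange(tokens.shape[1], dtype=float)
    return tokens + 0.1 * width_index[None, :, None]

def soft_equivariance_loss(encoder, image, g, rho_g):
    """Assumed penalty: mean squared gap between encoder(g(x)) and rho_g(encoder(x))."""
    gap = encoder(g(image)) - rho_g(encoder(image))
    return float(np.mean(gap ** 2))

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8, 3))

# Pooling commutes with the flip, so the penalty is (numerically) zero.
loss_pooled = soft_equivariance_loss(pooled_tokens, image, flip_image, rho_flip)
# The positional bias does not commute with the flip, so the penalty is positive.
loss_with_pos = soft_equivariance_loss(tokens_with_position, image, flip_image, rho_flip)
print(loss_pooled, loss_with_pos)
```

In a training loop, a penalty of this kind would presumably be added to the unchanged base SSL loss with a small weight (for example `total = ssl_loss + lam * eq_loss`, where `lam` is a hypothetical hyperparameter), which is one reading of "softly encouraging" equivariance at an intermediate layer while the final embedding keeps its invariance objective.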


Related papers