Fugu-MT Paper Translation (Summary): Dual Perspectives on Non-Contrastive Self-Supervised Learning

Paper summary: Dual Perspectives on Non-Contrastive Self-Supervised Learning

arxiv url: http://arxiv.org/abs/2507.01028v1
Date: Wed, 18 Jun 2025 07:46:51 GMT
Status: translation complete
Last updated in system: 2025-07-07 02:47:44.4214
Title: Dual Perspectives on Non-Contrastive Self-Supervised Learning
Authors: Jean Ponce, Martial Hebert, Basile Terver
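Abstract summary: The stop gradient and exponential moving average iterative procedures are commonly used to avoid representation collapse. This presentation investigates these procedures from the dual theoretical viewpoints of optimization and dynamical systems.
Reference score (attention score computed by this site): 40.79287810164605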
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The objective of non-contrastive approaches to self-supervised learning is to train on pairs of different views of the data an encoder and a predictor that minimize the mean discrepancy between the code predicted from the embedding of the first view and the embedding of the second one. In this setting, the stop gradient and exponential moving average iterative procedures are commonly used to avoid representation collapse, with excellent performance in downstream supervised applications. This presentation investigates these procedures from the dual theoretical viewpoints of optimization and dynamical systems. We first show that, in general, although they do not optimize the original objective, or for that matter, any other smooth function, they do avoid collapse. Following Tian et al. [2021], but without any of the extra assumptions used in their proofs, we then show using a dynamical system perspective that, in the linear case, minimizing the original objective function without the use of a stop gradient or exponential moving average always leads to collapse. Conversely, we finally show that the limit points of the dynamical systems associated with these two procedures are, in general, asymptotically stable equilibria, with no risk of degenerating to trivial solutions.
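The linear case mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the linear encoder `W`, linear predictor `P`, the function names, and the EMA rate `tau` are all assumptions chosen for illustration. The sketch contrasts the gradient of the original objective with the update direction used when the target embedding is held constant (stop gradient) or produced by a slowly moving copy of the encoder (exponential moving average).

```python
import numpy as np

def loss(W, P, x1, x2):
    """Mean discrepancy 0.5 * mean ||P W x1 - W x2||^2 over pairs of views.

    x1, x2 hold one view per column (shape: input_dim x num_pairs).
    """
    r = P @ W @ x1 - W @ x2
    return 0.5 * np.sum(r * r) / x1.shape[1]

def target_grads(W, P, x1, x2, W_target):
    """Update direction when the target W_target @ x2 is treated as a constant.

    Passing W_target = W gives the stop-gradient direction; passing an
    EMA copy of W gives the exponential-moving-average variant.
    """
    n = x1.shape[1]
    z1 = W @ x1
    r = P @ z1 - W_target @ x2      # residual between prediction and target
    grad_W = P.T @ r @ x1.T / n     # only the first-view branch contributes
    grad_P = r @ z1.T / n
    return grad_W, grad_P

def full_grads(W, P, x1, x2):
    """True gradient of `loss`: also differentiates through the target W x2."""
    grad_W, grad_P = target_grads(W, P, x1, x2, W)
    r = P @ W @ x1 - W @ x2
    return grad_W - r @ x2.T / x1.shape[1], grad_P

def ema_update(W_target, W_online, tau=0.99):
    """Exponential moving average of the online encoder (tau is hypothetical)."""
    return tau * W_target + (1.0 - tau) * W_online
```

A stop-gradient step would then be `W, P = W - lr * gW, P - lr * gP` with `gW, gP = target_grads(W, P, x1, x2, W)`, while the EMA variant passes a separate `W_target` and refreshes it with `ema_update` after each step. Because `target_grads` drops the term through the second view, its direction is in general not the gradient of `loss`, in line with the abstract's remark that these procedures do not optimize the original objective.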

Related papers

On exploration of an interior mirror descent flow for stochastic nonconvex constrained problem [3.4376560669160394]
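We show that the Hessian barrier method and the mirror descent method can be interpreted as discrete approximations of a continuous flow. We provide two sufficient conditions under which these spurious stationary points can be avoided, provided that the strict complementarity condition holds.
Reference translation (metadata) (2025-07-21T05:58:52Z)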
Contrastive Matrix Completion with Denoising and Augmented Graph Views for Robust Recommendation [1.0128808054306186]
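Matrix completion is a widely adopted framework in recommender systems. We propose a matrix completion method using contrastive learning (MCCL). The proposed method not only improves the numerical accuracy of the predicted scores but also produces better rankings, with improvements of up to 36% in ranking metrics.
Reference translation (metadata) (2025-06-12T12:47:35Z)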
Long-Sequence Recommendation Models Need Decoupled Embeddings [49.410906935283585]
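We identify and characterize a neglected deficiency in existing long-sequence recommendation models. A single set of embeddings struggles to learn both attention and representation, leading to interference between these two processes. We propose the Decoupled Attention and Representation Embeddings (DARE) model, in which two distinct embedding tables are learned separately to fully decouple attention and representation.
Reference translation (metadata) (2024-10-03T15:45:15Z)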
Visual Prompt Tuning in Null Space for Continual Learning [51.96411454304625]
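Existing prompt-tuning methods have demonstrated impressive performance in continual learning (CL). This paper aims to learn each task by tuning the prompts in the direction orthogonal to the subspace spanned by the features of previous tasks. In practice, an effective null-space-based approximation solution is proposed to implement the prompt gradient projection.
Reference translation (metadata) (2024-06-09T05:57:40Z)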
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning [132.7040981721302]
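We study the constrained convex Markov decision process (MDP), where the goal is to minimize a convex functional of the visitation measure. Designing algorithms for a constrained convex MDP faces several challenges, including handling a large state space.
Reference translation (metadata) (2024-02-16T16:35:18Z)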
Private (Stochastic) Non-Convex Optimization Revisited: Second-Order Stationary Points and Excess Risks [34.79650838578354]
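We introduce a new framework that utilizes two different kinds of oracles. We show that the exponential mechanism can achieve a good population risk bound, and we provide a nearly matching lower bound.
Reference translation (metadata) (2023-02-20T00:11:19Z)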
Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
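Gradient Descent (GD) is a powerful workhorse of modern machine learning. The ability of GD to find local minimizers is guaranteed only for losses with Lipschitz gradients. This work focuses on simple, yet representative, learning problems through an analysis of two-step gradient updates.
Reference translation (metadata) (2022-06-08T21:32:50Z)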
Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations [8.163697683448811]
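We introduce EfficientMORL, an efficient framework for the unsupervised learning of object-centric representations. We show that the optimization challenges caused by requiring both symmetry and disentanglement can be addressed by high-cost iterative amortized inference. On standard multi-object benchmarks, it achieves nearly an order of magnitude faster training and test-time inference while showing strong object decomposition and disentanglement.
Reference translation (metadata) (2021-06-07T14:02:49Z)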
On dissipative symplectic integration with applications to gradient-based optimization [77.34726150561087]
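We propose a geometric framework in which such discretizations can be realized systematically. We show that a generalization of symplectic integrators to nonconservative, and in particular dissipative, Hamiltonian systems can preserve rates of convergence up to a controlled error.
Reference translation (metadata) (2020-04-15T00:36:49Z)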

The related papers list is generated automatically from the titles and abstracts of papers on this site.
