Fugu-MT Paper Translation (Abstract): ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval

Paper overview: ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval

arxiv url: http://arxiv.org/abs/2604.20358v1
Date: Wed, 22 Apr 2026 08:59:21 GMT
Status: Translation complete
Last updated in system: 2026-04-23 15:36:11.054508
Title: ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval
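Title (reference translation): ConeSep: Cone-Based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval
Authors: Zixu Li, Yupeng Hu, Zhiwei Chen, Mingyu Zhang, Zhiheng Fu, Liqiang Nie
Abstract summary: This paper systematically investigates the Noisy Triplet Correspondence (NTC) problem introduced by annotations. To address these challenges, we propose a Cone-based robuSt noisE-unlearning comPositional network (ConeSep).
Reference score (independently calculated attention level): 60.051600134831226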
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Composed Image Retrieval (CIR) task provides a flexible retrieval paradigm via a reference image and modification text, but it heavily relies on expensive and error-prone triplet annotations. This paper systematically investigates the Noisy Triplet Correspondence (NTC) problem introduced by annotations. We find that NTC noise, particularly "hard noise" (i.e., the reference and target images are highly similar but the modification text is incorrect), poses a unique challenge to existing Noise Correspondence Learning (NCL) methods because it breaks the traditional "small loss hypothesis". We identify and elucidate three key, yet overlooked, challenges in the NTC task, namely (C1) Modality Suppression, (C2) Negative Anchor Deficiency, and (C3) Unlearning Backlash. To address these challenges, we propose a Cone-based robuSt noisE-unlearning comPositional network (ConeSep). Specifically, we first propose Geometric Fidelity Quantization, theoretically establishing and practically estimating a noise boundary to precisely locate noisy correspondence. Next, we introduce Negative Boundary Learning, which learns a "diagonal negative combination" for each query as its explicit semantic opposite-anchor in the embedding space. Finally, we design Boundary-based Targeted Unlearning, which models the noisy correction process as an optimal transport problem, elegantly avoiding Unlearning Backlash. Extensive experiments on benchmark datasets (FashionIQ and CIRR) demonstrate that ConeSep significantly outperforms current state-of-the-art methods, which fully demonstrates the effectiveness and robustness of our method.
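Abstract (reference translation): The Composed Image Retrieval (CIR) task offers a flexible retrieval paradigm through a reference image and modification text, but it depends heavily on expensive and error-prone triplet annotations. This paper systematically investigates the Noisy Triplet Correspondence (NTC) problem introduced by annotations. NTC noise, and in particular "hard noise" (the reference and target images are highly similar but the modification text is incorrect), breaks the traditional "small loss hypothesis" and poses a unique challenge to existing Noise Correspondence Learning (NCL) methods. We identify and elucidate three key challenges in the NTC task: (C1) Modality Suppression, (C2) Negative Anchor Deficiency, and (C3) Unlearning Backlash. To address these challenges, we propose a Cone-based robuSt noisE-unlearning comPositional network (ConeSep). Specifically, we first propose Geometric Fidelity Quantization, which theoretically establishes and practically estimates a noise boundary to precisely locate noisy correspondence. Next, we introduce Negative Boundary Learning, which learns a "diagonal negative combination" for each query as its explicit semantic opposite-anchor in the embedding space. Finally, we design Boundary-based Targeted Unlearning, which models the noise correction process as an optimal transport problem and elegantly avoids Unlearning Backlash. Extensive experiments on benchmark datasets (FashionIQ and CIRR) show that ConeSep significantly outperforms current state-of-the-art methods, fully demonstrating the effectiveness and robustness of the method.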
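
Illustrative note: the abstract states that hard noise breaks the small loss hypothesis, i.e. the common assumption that mislabeled samples incur larger training loss and can therefore be filtered by keeping the smallest-loss samples. The toy sketch below shows how such a filter could let a hard-noise triplet through. It is not the ConeSep method; the function name `small_loss_select` and all loss values are invented for illustration.

```python
# Toy illustration of small-loss sample selection (NOT the ConeSep method).
# All numbers and names here are hypothetical.

def small_loss_select(losses, n_keep):
    """Return the (sorted) indices of the n_keep samples with the smallest loss."""
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])

# Made-up per-triplet losses:
#   indices 0-2: clean triplets
#   index 3:     "easy" noise (unrelated target image, so the loss is large)
#   index 4:     "hard" noise (reference and target nearly identical, wrong text),
#                assumed here to yield a small loss because the images match well
losses = [0.20, 0.25, 0.30, 2.10, 0.28]
is_noisy = [False, False, False, True, True]

kept = small_loss_select(losses, n_keep=4)
leaked = [i for i in kept if is_noisy[i]]

print("kept indices:", kept)
print("noisy samples that passed the filter:", leaked)
```

In this toy setting the filter discards the easy-noise triplet but keeps the hard-noise one, which is the failure mode the abstract attributes to small-loss-based NCL methods.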
