Fugu-MT Paper Translation (Abstract): Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects

Paper overview: Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects

arxiv url: http://arxiv.org/abs/2306.05963v2
Date: Fri, 27 Oct 2023 19:35:32 GMT
Status: Translation complete
Last updated in system: 2023-10-31 21:24:44.172540
Title: Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects
Title (reference translation): Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects
Authors: Zhuofan Ying, Peter Hase, Mohit Bansal
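Abstract summary: This study investigates how vision models adaptively use context for out-of-distribution generalization. Models that excel in one setting tend to struggle in the other. To replicate the generalization abilities of biological vision, computer vision models must have factorized object vs. background representations.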
Reference score (attention score computed by this site): 75.15563723169234
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Biological vision systems make adaptive use of context to recognize objects in new settings with novel contexts as well as occluded or blurry objects in familiar settings. In this paper, we investigate how vision models adaptively use context for out-of-distribution (OOD) generalization and leverage our analysis results to improve model OOD generalization. First, we formulate two distinct OOD settings where the contexts are either irrelevant (Background-Invariance) or beneficial (Object-Disambiguation), reflecting the diverse contextual challenges faced in biological vision. We then analyze model performance in these two different OOD settings and demonstrate that models that excel in one setting tend to struggle in the other. Notably, prior works on learning causal features improve on one setting but hurt in the other. This underscores the importance of generalizing across both OOD settings, as this ability is crucial for both human cognition and robust AI systems. Next, to better understand the model properties contributing to OOD generalization, we use representational geometry analysis and our own probing methods to examine a population of models, and we discover that those with more factorized representations and appropriate feature weighting are more successful in handling Background-Invariance and Object-Disambiguation tests. We further validate these findings through causal intervention on representation factorization and feature weighting to demonstrate their causal effect on performance. Lastly, we propose new augmentation methods to enhance model generalization. These methods outperform strong baselines, yielding improvements in both in-distribution and OOD tests. In conclusion, to replicate the generalization abilities of biological vision, computer vision models must have factorized object vs. background representations and appropriately weight both kinds of features.
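Abstract (reference translation): Biological vision systems make adaptive use of context to recognize objects in new settings with novel contexts. In this paper, we investigate how vision models adaptively use context for out-of-distribution (OOD) generalization. First, we formulate two distinct OOD settings in which the context is either irrelevant (Background-Invariance) or beneficial (Object-Disambiguation), reflecting the diverse contextual challenges faced in biological vision. We then analyze model performance in these two OOD settings and show that models that excel in one tend to struggle in the other. Notably, prior work on learning causal features improves performance in one setting but hurts it in the other. This underscores the importance of generalizing across both OOD settings, since this ability is crucial for both human cognition and robust AI systems. Next, to better understand the model properties that contribute to OOD generalization, we examine a population of models using representational geometry analysis and our own probing methods, and find that models with more factorized representations and appropriate feature weighting are more successful on the Background-Invariance and Object-Disambiguation tests. We further validate these findings through causal interventions on representation factorization and feature weighting, demonstrating their causal effect on performance. Finally, we propose new augmentation methods to enhance model generalization. These methods outperform strong baselines, yielding improvements in both in-distribution and OOD tests. In conclusion, to replicate the generalization abilities of biological vision, computer vision models must have factorized object vs. background representations and appropriately weight both kinds of features.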

Related papers

Cross-modal Associations in Vision and Language Models: Revisiting the bouba-kiki effect [0.10923877073891446]
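We re-evaluate the bouba-kiki effect, in which pseudowords such as "bouba" and "kiki" are reliably associated with round and jagged shapes, respectively. We show that vision-and-language models (VLMs) do not consistently exhibit the bouba-kiki effect.
Paper reference translation (metadata) (2025-07-14T07:48:54Z)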
From Efficiency to Equity: Measuring Fairness in Preference Learning [3.2132738637761027]
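We evaluate the fairness of preference learning models, inspired by economic theories of inequality and Rawlsian justice. We propose metrics for quantifying the fairness of these models using the Gini Coefficient, Atkinson Index, and Kuznets Ratio.
Paper reference translation (metadata) (2024-10-24T15:25:56Z)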
Benchmarking the Attribution Quality of Vision Models [13.255247017616687]
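We propose a new evaluation protocol that overcomes two fundamental limitations of the widely used incremental-deletion protocol. This makes it possible to evaluate 23 attribution methods and how the design choices of eight different vision models affect attribution quality. We find that intrinsically explainable models outperform standard models and that raw attribution values exhibit higher attribution quality than previously known.
Paper reference translation (metadata) (2024-07-16T17:02:20Z)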
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms [91.19304518033144]
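We aim to align vision models with human aesthetic standards in retrieval systems. We propose a preference-based reinforcement learning method that fine-tunes vision models to better align them with human aesthetics.
Paper reference translation (metadata) (2024-06-13T17:59:20Z)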
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation [87.50120181861362]
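VisionPrefer is a high-quality, fine-grained preference dataset that captures multiple preference aspects. We train a reward model, VP-Score, on VisionPrefer to guide the training of text-to-image generative models; the preference prediction accuracy of VP-Score is comparable to that of human annotators.
Paper reference translation (metadata) (2024-04-23T14:53:15Z)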
Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
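The commonly used user-token model consistently outperforms more complex models. Our results reveal a relationship between corpus statistics and annotator modeling performance.
Paper reference translation (metadata) (2024-04-02T22:27:24Z)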
Generalization properties of contrastive world models [10.806958747213976]
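We conduct an extensive study of the generalization properties of contrastive world models. Our experiments show that contrastive world models fail to generalize under different OOD tests. Our work highlights the importance of object-centric representations for generalization and shows that current models are limited in their capacity to learn the representations needed for human-level generalization.
Paper reference translation (metadata) (2023-12-29T19:25:34Z)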
Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models [58.46926334842161]
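This work sheds light on the root causes of such misalignment, pointing to issues related to low attention activation scores and mask overlap. We propose two new objectives, the Separate loss and the Enhance loss, which reduce object mask overlap and maximize attention scores. Unlike conventional test-time adaptation techniques, the proposed method focuses on fine-tuning critical parameters, which improves scalability and generalizability.
Paper reference translation (metadata) (2023-12-10T22:07:42Z)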
Spurious Feature Diversification Improves Out-of-distribution Generalization [43.84284578270031]
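Generalization to out-of-distribution (OOD) data is a critical challenge in machine learning. We study WiSE-FT, a popular weight-space ensemble method that interpolates between a pre-trained model and a fine-tuned model. We observe an unexpected "FalseFalseTrue" phenomenon, in which WiSE-FT successfully corrects many cases where each individual model makes an incorrect prediction.
Paper reference translation (metadata) (2023-09-29T13:29:22Z)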
On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training [109.9218185711916]
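Aspect-based sentiment analysis (ABSA) aims to automatically infer the sentiment polarity toward specific aspects of products or services expressed in social media texts and reviews. We propose to improve the robustness of ABSA by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
Paper reference translation (metadata) (2023-04-19T11:07:43Z)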
Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
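Using the NER task as a testbed, we analyze the generalization behavior of existing models from different perspectives. Experiments with in-depth analyses diagnose the bottlenecks of existing neural NER models. As a by-product of this paper, we have open-sourced a project that includes a comprehensive summary of recent NER papers.
Paper reference translation (metadata) (2020-01-12T04:33:53Z)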

The related papers list is generated automatically from the titles and abstracts of papers on this site.
