Fugu-MT Paper Translation (Summary): GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction

arxiv url: http://arxiv.org/abs/2203.03079v1
Date: Mon, 7 Mar 2022 00:32:37 GMT
Title: GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction
Authors: Kareem Metwaly, Aerin Kim, Elliot Branson and Vishal Monga
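Abstract summary: We propose a novel attribute prediction architecture named GlideNet. GlideNet contains three distinct feature extractors. It achieves compelling results on two recent and challenging datasets.
Reference score (attention score computed independently by the site): 27.561424604521026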
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Attaching attributes (such as color, shape, state, action) to object categories is an important computer vision problem. Attribute prediction has seen exciting recent progress and is often formulated as a multi-label classification problem. Yet significant challenges remain in: 1) predicting diverse attributes over multiple categories, 2) modeling attribute-category dependencies, 3) capturing both global and local scene context, and 4) predicting attributes of objects with low pixel counts. To address these issues, we propose a novel multi-category attribute prediction deep architecture named GlideNet, which contains three distinct feature extractors. A global feature extractor recognizes what objects are present in a scene, whereas a local one focuses on the area surrounding the object of interest. Meanwhile, an intrinsic feature extractor uses an extension of standard convolution, dubbed Informed Convolution, to retrieve features of objects with low pixel counts. GlideNet uses gating mechanisms with binary masks and its self-learned category embedding to combine the dense embeddings. Collectively, the Global-Local-Intrinsic blocks comprehend the scene's global context while attending to the characteristics of the local object of interest. Finally, using the combined features, an interpreter predicts the attributes, and the length of the output is determined by the category, thereby removing unnecessary attributes. GlideNet achieves compelling results on two recent and challenging datasets for large-scale attribute prediction, VAW and CAR. For instance, it obtains a gain of more than 5% over the state of the art in the mean recall (mR) metric. GlideNet's advantages are especially apparent when predicting attributes of objects with low pixel counts as well as attributes that demand global context understanding. Finally, we show that GlideNet excels in training-starved real-world scenarios.
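The abstract states that the interpreter's output length is determined by the object category and that results are reported in mean recall (mR). The following is a minimal sketch of those two ideas only, not the authors' implementation. The taxonomy `ATTRIBUTES`, the random linear heads in `HEADS`, the feature size `FEATURE_DIM`, and the functions `interpret` and `mean_recall` are all hypothetical names introduced for illustration, and the recall averaging shown is one common reading of mR that may differ from the protocol used in the paper.

```python
import numpy as np

# Hypothetical taxonomy (an assumption): each category owns its own attribute list.
ATTRIBUTES = {
    "car": ["red", "parked", "moving"],
    "person": ["walking", "sitting"],
}

FEATURE_DIM = 8  # assumed size of the combined feature vector
_rng = np.random.default_rng(0)

# One randomly initialised linear head per category (stand-in for a trained
# interpreter). Each head has as many outputs as the category has attributes.
HEADS = {
    category: _rng.normal(size=(FEATURE_DIM, len(names)))
    for category, names in ATTRIBUTES.items()
}


def interpret(features, category):
    """Map a feature vector to per-attribute probabilities for one category."""
    features = np.asarray(features, dtype=float)
    logits = features @ HEADS[category]
    probs = 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per attribute
    return dict(zip(ATTRIBUTES[category], probs.tolist()))


def mean_recall(y_true, y_pred):
    """Average per-attribute recall over attributes with at least one positive."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    true_pos = (y_true & y_pred).sum(axis=0)
    positives = y_true.sum(axis=0)
    valid = positives > 0  # skip attributes that never occur in y_true
    return float((true_pos[valid] / positives[valid]).mean())
```

Calling `interpret(np.ones(FEATURE_DIM), "person")` returns two probabilities, whereas the same features with `"car"` return three, so attributes that do not apply to a category never appear in the output.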

Related papers

Dual Feature Augmentation Network for Generalized Zero-shot Learning [14.410978100610489]
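Zero-shot learning (ZSL) aims to infer novel classes without training samples by transferring knowledge from seen classes. Existing embedding-based approaches to ZSL typically employ attention mechanisms to locate attributes on an image. This paper proposes a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules.
Reference translation of paper (metadata) (2023-09-25T02:37:52Z)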
Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
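This paper proposes complex-valued autoencoders as a distributed approach to object-centric representations. We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets. It also achieves object discovery performance competitive with a SlotAttention model on two datasets, and disentangles objects in a third dataset where SlotAttention fails.
Reference translation of paper (metadata) (2022-04-05T09:25:28Z)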
Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
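We argue that image representations that integrate attribute localization ability are beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks. We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
Reference translation of paper (metadata) (2022-04-04T02:25:40Z)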
CAR -- Cityscapes Attributes Recognition A Multi-category Attributes Dataset for Autonomous Vehicles [30.024877502540665]
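We propose a new dataset for attribute recognition, Cityscapes Attributes Recognition (CAR). The new dataset extends the well-known Cityscapes dataset by adding an annotation layer of object attributes to each image. The dataset has a structured and tailored taxonomy in which each category has its own set of attributes.
Reference translation of paper (metadata) (2021-11-16T06:00:43Z)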
Improving Object Detection and Attribute Recognition by Feature Entanglement Reduction [26.20319853343761]
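We show that object detection should be attribute-independent and that attributes should be largely object-independent. We disentangle them with a two-stream model in which category and attribute features are computed independently, while the classification heads share regions of interest (RoIs). Compared with a traditional single-stream model, the model shows significant improvements on VG-20, a subset of Visual Genome, in both supervised and attribute-transfer tasks.
Reference translation of paper (metadata) (2021-08-25T22:27:06Z)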
Learning to Predict Visual Attributes in the Wild [43.91237738107603]
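We introduce a large-scale in-the-wild attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances. We propose techniques that systematically address the challenges it poses, including a base model that uses both low-level and high-level CNN features. With these techniques, we achieve improvements of 3.7 mAP and 5.7 F1 points over the current state of the art.
Reference translation of paper (metadata) (2021-06-17T17:58:02Z)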
Attribute Prototype Network for Zero-Shot Learning [113.50220968583353]
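We propose a zero-shot representation learning framework that jointly learns discriminative global and local features. The model points to the visual evidence for attributes in an image, confirming the improved attribute localization ability of the image representation.
Reference translation of paper (metadata) (2020-08-19T06:46:35Z)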
CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning [78.3857991931479]
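This paper proposes GROLLA, an evaluation framework for grounded language learning with attributes. It also proposes a new dataset, CompGuessWhat?!, as an instance of this framework for evaluating the quality of learned neural representations.
Reference translation of paper (metadata) (2020-06-03T11:21:42Z)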
Learning to Predict Context-adaptive Convolution for Semantic Segmentation [66.27139797427147]
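Long-range contextual information is essential for high-performance semantic segmentation. We propose a Context-adaptive Convolution Network (CaC-Net) that predicts a spatially varying feature weighting vector. CaC-Net achieves strong segmentation performance on three public datasets.
Reference translation of paper (metadata) (2020-04-17T13:09:17Z)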

The related papers list is generated automatically from the titles and abstracts of the papers on this site.