Fugu-MT Paper Translation (Summary): Compact Deep Aggregation for Set Retrieval

Paper summary: Compact Deep Aggregation for Set Retrieval

arxiv url: http://arxiv.org/abs/2003.11794v1
Date: Thu, 26 Mar 2020 08:43:15 GMT
Status: Translation complete
Last updated in system: 2022-12-19 21:49:56.489076
Title: Compact Deep Aggregation for Set Retrieval
Title (reference translation): Compact Deep Aggregation for Set Retrieval
Authors: Yujie Zhong, Relja Arandjelović, Andrew Zisserman
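Abstract summary: We focus on retrieving images containing multiple faces from a large-scale dataset of images. Here the set consists of the face descriptors in each image, and given a query for multiple identities, the goal is to retrieve images that contain all of those identities. We show that this compact descriptor has minimal loss of discriminability up to two faces per image, and degrades gradually after that.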
Reference score (independently computed attention score): 87.52470995031997
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The objective of this work is to learn a compact embedding of a set of descriptors that is suitable for efficient retrieval and ranking, whilst maintaining discriminability of the individual descriptors. We focus on a specific example of this general problem: retrieving images containing multiple faces from a large-scale dataset of images. Here the set consists of the face descriptors in each image, and given a query for multiple identities, the goal is then to retrieve, in order, images which contain all the identities, all but one, etc. To this end, we make the following contributions: first, we propose a CNN architecture, SetNet, to achieve the objective: it learns face descriptors and their aggregation over a set to produce a compact fixed-length descriptor designed for set retrieval, and the score of an image is a count of the number of identities that match the query; second, we show that this compact descriptor has minimal loss of discriminability up to two faces per image, and degrades slowly after that, far exceeding a number of baselines; third, we explore the speed vs. retrieval quality trade-off for set retrieval using this compact descriptor; and, finally, we collect and annotate a large dataset of images containing varying numbers of celebrities, which we use for evaluation and release publicly.
Abstract (reference translation): The objective of this work is to learn a compact embedding of a set of descriptors that is suitable for efficient retrieval and ranking, while maintaining the discriminability of the individual descriptors. We focus on a specific example of this general problem: retrieving images containing multiple faces from a large-scale dataset of images. Here the set consists of the face descriptors in each image, and given a query for multiple identities, the goal is then to retrieve, in order, images which contain all the identities, all but one, etc. To this end, we make the following contributions: first, we propose a CNN architecture, SetNet, to achieve the objective: it learns face descriptors and their aggregation over a set to produce a compact fixed-length descriptor designed for set retrieval, and the score of an image is a count of the number of identities that match the query; second, we show that this compact descriptor has minimal loss of discriminability up to two faces per image, and degrades slowly after that, far exceeding a number of baselines; third, we explore the speed vs. retrieval quality trade-off for set retrieval using this compact descriptor; and, finally, we collect and annotate a large dataset of images containing various celebrities, evaluate on it, and release it publicly.
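The retrieval setting described in the abstract can be illustrated with a minimal sketch. This is not SetNet: the learned aggregation is replaced here by a plain sum-and-normalize baseline, and the summed cosine similarity is only a soft proxy for the "count of matching identities" score. All function names (`l2_normalize`, `aggregate_set`, `score_image`, `rank_images`) and the toy one-hot descriptors are assumptions made for illustration.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


def aggregate_set(face_descriptors):
    """Collapse a set of face descriptors into one fixed-length vector.

    Assumption: simple sum aggregation followed by L2 normalization,
    standing in for the learned aggregation described in the abstract.
    """
    unit = [l2_normalize(d) for d in face_descriptors]
    summed = [sum(column) for column in zip(*unit)]
    return l2_normalize(summed)


def score_image(set_descriptor, query_descriptors):
    """Sum the similarity of each query identity to the image's set descriptor."""
    return sum(
        sum(q * s for q, s in zip(l2_normalize(query), set_descriptor))
        for query in query_descriptors
    )


def rank_images(image_sets, query_descriptors):
    """Return image indices sorted by descending score, plus the raw scores."""
    scores = [
        score_image(aggregate_set(faces), query_descriptors)
        for faces in image_sets
    ]
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order, scores


# Toy data: four identities with orthogonal (one-hot) descriptors.
a, b, c, d = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
image_sets = [[a, c], [c, d], [a, b]]  # faces present in images 0, 1, 2
query = [a, b]                         # query for identities a and b

order, scores = rank_images(image_sets, query)
print(order)  # [2, 0, 1]: both identities, then one, then none
```

With orthogonal descriptors the ranking follows the number of matching identities exactly. With real descriptors, cross-talk between faces aggregated into one vector is what reduces discriminability as the number of faces per image grows.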

Related papers

Composed Object Retrieval: Object-level Retrieval via Composed Expressions [71.47650333199628]
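Composed Object Retrieval (COR) is a new task that goes beyond image-level retrieval to achieve object-level precision. COR127K is the first large-scale benchmark for COR, containing 127,166 retrieval triplets with various semantic transformations across 408 categories. We also propose CORE, a unified end-to-end model that integrates reference region encoding, adaptive visual-textual interaction, and region-level contrastive learning.
Paper reference translation (metadata) (2025-08-06T13:11:40Z)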
ReSeDis: A Dataset for Referring-based Object Search across Large-Scale Image Collections [14.076781094343362]
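Referring Search and Discovery (ReSeDis) is the first task to unify corpus-level retrieval with pixel-level grounding. To enable rigorous study, we curate a benchmark in which every description maps uniquely to object instances scattered across a large, diverse corpus. ReSeDis offers a realistic, end-to-end testbed for building the next generation of robust and scalable multimodal search systems.
Paper reference translation (metadata) (2025-06-18T06:52:10Z)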
Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization [5.2337753974570616]
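This work addresses the challenge of Small Object Image Retrieval (SoIR), where the goal is to retrieve images containing a specific small object in a cluttered scene. The key challenge is constructing a single image descriptor, for scalable and efficient search, that effectively represents all objects in the image. We introduce Multi-object Attention Optimization (MaO), a novel retrieval framework that incorporates a dedicated multi-object pre-training phase.
Paper reference translation (metadata) (2025-03-10T08:27:02Z)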
Advancing Image Retrieval with Few-Shot Learning and Relevance Feedback [5.770351255180495]
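Image Retrieval with Relevance Feedback (IRRF) involves iterative human interaction during the retrieval process. This paper proposes a new scheme based on a hyper-network that is tailored to the task and facilitates fast adjustment to user feedback. We show that the method achieves SoTA results in few-shot one-class classification and comparable results on the binary classification task of few-shot open-set recognition.
Paper reference translation (metadata) (2023-12-18T10:20:28Z)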
RAFIC: Retrieval-Augmented Few-shot Image Classification [0.0]
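Few-shot image classification is the task of classifying unseen images into mutually exclusive classes. We develop a method that augments the set of K with an additional set of retrieved images. We demonstrate that RAFIC significantly improves few-shot image classification performance across two challenging datasets.
Paper reference translation (metadata) (2023-12-11T22:28:51Z)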
Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features [12.14013374452918]
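This paper proposes a simple yet effective approach to object-centric open-vocabulary image retrieval. The approach aggregates dense embeddings extracted from CLIP into a compact representation. We show its effectiveness on the task by achieving significantly better results than global-feature approaches on three datasets.
Paper reference translation (metadata) (2023-09-26T15:13:09Z)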
Collaborative Group: Composed Image Retrieval via Consensus Learning from Noisy Annotations [67.92679668612858]
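We propose the Consensus Network (Css-Net), inspired by the psychological concept that groups outperform individuals. Css-Net consists of two core components: (1) a consensus module with four compositors, and (2) a Kullback-Leibler divergence loss that encourages learning of interactions between the compositors. On benchmark datasets, particularly FashionIQ, Css-Net shows significant improvements. Notably, it increases recall substantially, with a 2.77% increase in R@10 and a 6.67% increase in R@50.
Paper reference translation (metadata) (2023-06-03T11:50:44Z)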
Self-supervised Multi-view Disentanglement for Expansion of Visual Collections [6.944742823561]
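We consider a setting where the query for similar images is derived from a collection of images. In visual search, similarity can be measured along multiple axes, or views, such as style and color. The goal of this work is to design a retrieval algorithm that effectively combines similarities computed over representations from multiple views.
Paper reference translation (metadata) (2023-02-04T22:09:17Z)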
Compositional Sketch Search [91.84489055347585]
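We present an algorithm for searching image collections using free-hand sketches. It exploits drawings as a concise and intuitive representation for specifying entire scene compositions.
Paper reference translation (metadata) (2021-06-15T09:38:09Z)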
SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes [56.387647750094466]
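Few-shot semantic segmentation aims to segment novel-class objects in a query image using only a few annotated examples. Most advanced solutions exploit a metric learning framework that performs segmentation by matching each pixel to a learned foreground prototype. This framework suffers from biased classification because sample pairs are constructed incompletely, using the foreground prototype only.
Paper reference translation (metadata) (2021-04-19T11:21:47Z)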
Dense Relational Image Captioning via Multi-task Triple-Stream Networks [95.0476489266988]
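Dense relational captioning is a new task that aims to generate captions about the relational information between objects in a visual scene. The framework is advantageous in both diversity and amount of information, leading to comprehensive image understanding.
Paper reference translation (metadata) (2020-10-08T09:17:55Z)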
Tasks Integrated Networks: Joint Detection and Retrieval for Image Search [99.49021025124405]
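In many real-world search scenarios (e.g., video surveillance), objects are seldom accurately detected or annotated. We first introduce an end-to-end Integrated Net (I-Net). We further propose an improved I-Net, called DC-I-Net, which makes two new contributions.
Paper reference translation (metadata) (2020-09-03T03:57:50Z)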

The related papers list is generated automatically from the titles and abstracts of papers on this site.
