The success of fully supervised saliency detection models depends on a large amount of pixel-wise labeling. In this paper, we work on bounding-box based weakly-supervised saliency detection to relieve the labeling effort. Given the bounding box annotation, we observe that pixels inside the bounding box may contain extensive labeling noise. However, as a large amount of background is excluded, the foreground bounding box region contains a less complex background, making it possible to perform handcrafted feature based saliency detection with only the cropped foreground region. As the conventional handcrafted features are not representative enough, leading to noisy saliency maps, we further introduce a structure-aware self-supervised loss to regularize the structure of the prediction. Further, we claim that pixels outside the bounding box should be background, so that a partial cross-entropy loss function can be used to accurately localize the background region. Experimental results on six benchmark RGB saliency datasets illustrate the effectiveness of our model.
Index Terms— Weakly supervised learning, Bounding-box annotation, Structure-aware self-supervised loss
1. INTRODUCTION
Salient object detection aims to localize the full scope of the salient foreground, which is usually defined as a binary segmentation task.
Most of the conventional techniques are fully-supervised [1, 2, 3], where the pixel-wise annotations are needed as supervision to train a mapping from input image space to output saliency space.
We find that the strong dependency on pixel-wise labeling poses both efficiency and budget challenges for existing fully-supervised saliency detection techniques.
Fig. 1. Given an image (“Image”) and its bounding box annotation (“bndBox”), GrabCut produces a pseudo segmentation result (“GrabCut”), serving as the pseudo label for the proposed weakly supervised saliency detection model.
Although there still exists a background region within the bounding-box foreground annotation, the significantly reduced background region makes it possible to apply conventional handcrafted feature based saliency detection methods to the cropped foreground region.
In Fig. 1, we show the pseudo label generated by combining the bounding-box annotation with a conventional handcrafted feature based method, which clearly shows its superiority.
As a deep neural network can fit any type of noise [15], directly using the generated pseudo label as ground truth will lead to a biased model that over-fits the noisy pseudo label.
To further constrain the structure of the prediction, we present a structure-aware self-supervised loss function.
The main goal of this loss function is to constrain the prediction to have structure well aligned with the input image.
In this way, we aim to obtain structure-accurate predictions (see Fig. 3).
2. RELATED WORK
Fully-supervised Saliency models: The main focus of fully-supervised saliency detection models is to achieve effective feature aggregation [16, 17, 18].
Due to the use of stride operations, the resulting saliency maps are usually of low resolution.
To produce structure-accurate saliency predictions, some methods use edge supervision to learn more features about the object boundary and refine the saliency predictions with better object structure [19, 20, 21].
Due to the labeling noise, we introduce a foreground structure-aware loss Lfore, namely the smoothness loss [22], to further constrain the structure of the salient foreground (regions within the bounding box area).
As the background of the pseudo saliency map is accurate, we introduce a background loss Lback using the partial cross-entropy loss to constrain the model prediction in the area outside the bounding box.
Let us define the training dataset as D = {xi, yi}, i = 1, . . . , N, of size N, where xi and yi are the input RGB image and its corresponding bounding box supervision, and i indexes the images.
To generate yi, each non-overlapping salient instance is annotated with a separate bounding box, and we generate a single bounding box for overlapping salient instances.
In this way, the foreground of yi contains different levels of noise depending on the position and shape of the bounding box, while the background of yi is accurate.
Four main steps are included in our method: (1) a pseudo saliency map generator which uses GrabCut to generate pseudo saliency maps given the bounding box supervision; (2) a saliency prediction network (SPN) to produce a saliency map, which is supervised with the above pseudo saliency maps; (3) a structure-aware loss to optimize the foreground predictions; and (4) a partial cross-entropy based background loss to optimize the background prediction.
Given the bounding box supervision, we first generate pseudo saliency maps with GrabCut (see Fig. 1).
Compared with the direct bounding box supervision yi, the generated pseudo saliency map gi is more accurate in structure, making it suitable to serve as a pseudo label for saliency prediction.
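To make this step concrete, the following is a minimal sketch of generating such a pseudo label with OpenCV's GrabCut from a rectangular annotation; the helper name, the (x, y, w, h) box format and the iteration count are illustrative assumptions rather than the exact implementation details of our pipeline.

```python
# Minimal sketch: generating a pseudo saliency label g_i from a bounding box
# with OpenCV's GrabCut (illustrative helper, not the exact implementation).
import cv2
import numpy as np

def grabcut_pseudo_label(image_bgr, box, iters=5):
    """image_bgr: HxWx3 uint8 image; box: (x, y, w, h) bounding box."""
    mask = np.zeros(image_bgr.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)  # internal GMM state required by GrabCut
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, box, bgd_model, fgd_model, iters, cv2.GC_INIT_WITH_RECT)
    # Pixels marked as (probable) foreground become 1, everything else 0.
    pseudo = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
    return pseudo.astype(np.float32)  # binary pseudo saliency map g_i
```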
Given the backbone features fθ1(x) = {sk}, k = 1, . . . , 4 (θ1 denotes the parameters of the encoder, i.e., the backbone model), the saliency prediction network aims to generate a saliency map s = fθ(x), where θ = {θ1, θ2}, with θ2 the parameters of the decoder.
s and g are the model prediction and the pseudo saliency map from GrabCut, respectively.
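The sketch below shows one possible instantiation of this encoder-decoder layout; the ResNet-50 backbone, the top-down fusion scheme and the sigmoid output head are assumptions made only for illustration, while the 3 × 3 channel reduction to C = 64 follows the description in the ablation study.

```python
# Simplified sketch of the saliency prediction network (SPN): four backbone
# features are reduced to C channels with 3x3 convolutions and fused by a
# lightweight top-down decoder. Backbone and decoder details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class SPN(nn.Module):
    def __init__(self, C=64):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4])
        in_chs = [256, 512, 1024, 2048]
        self.reduce = nn.ModuleList([nn.Conv2d(c, C, 3, padding=1) for c in in_chs])  # {s_k} -> {s'_k}
        self.fuse = nn.ModuleList([nn.Conv2d(C, C, 3, padding=1) for _ in in_chs])
        self.head = nn.Conv2d(C, 1, 3, padding=1)  # single-channel saliency logits

    def forward(self, x):
        feats, h = [], self.stem(x)
        for stage in self.stages:
            h = stage(h)
            feats.append(h)
        s = [red(f) for red, f in zip(self.reduce, feats)]  # new backbone features of channel size C
        d = s[-1]
        for k in range(len(s) - 2, -1, -1):  # top-down fusion of the reduced features
            up = F.interpolate(d, size=s[k].shape[-2:], mode='bilinear', align_corners=False)
            d = self.fuse[k](s[k] + up)
        logits = F.interpolate(self.head(d), size=x.shape[-2:], mode='bilinear', align_corners=False)
        return torch.sigmoid(logits)  # saliency map s in (0, 1)
```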
3.4. Foreground Structure Constraint
Although GrabCut can generate a relatively better pseudo label compared with the original bounding box supervision, the complex salient foreground still leads to noisy supervision after GrabCut.
where Ψ is defined as Ψ(s) = √(s² + 1e−6) to avoid calculating the square root of zero, Iu,v is the image intensity value at pixel (u, v), and d indicates the partial derivatives along the x and y directions.
Different from the conventional smoothness loss in [22], we introduce the gate y into the calculation of the smoothness loss to pay attention to the bounding box foreground region, inspired by [4].
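A hedged sketch of one common form of such a gated, edge-aware smoothness loss is given below; applying Ψ to the prediction gradients, the exponential image-gradient weighting and the constant alpha are illustrative assumptions and may differ from the exact formulation.

```python
# Sketch of a gated, edge-aware smoothness loss in the spirit of [22, 4]:
# prediction gradients are penalized, down-weighted at image edges, and gated
# by the bounding box y so that only the foreground region is constrained.
import torch

def psi(s, eps=1e-6):
    return torch.sqrt(s * s + eps)  # Psi(s) = sqrt(s^2 + 1e-6)

def gated_smoothness_loss(pred, image, box_mask, alpha=10.0):
    """pred: Bx1xHxW saliency; image: Bx1xHxW intensity; box_mask: Bx1xHxW gate y."""
    loss = 0.0
    for d in (2, 3):  # partial derivatives along the y and x directions
        ds = pred.diff(dim=d).abs()                          # |∂d s|
        di = image.diff(dim=d).abs()                         # |∂d I|
        gate = box_mask.narrow(d, 0, box_mask.size(d) - 1)   # gate y restricted to valid positions
        loss = loss + (psi(ds) * torch.exp(-alpha * di) * gate).mean()
    return loss
```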
3.5. Background Accuracy Constraint
As the background of the bounding box supervision is accurate background for saliency prediction, we adopt the partial cross-entropy loss to constrain the accuracy of the prediction within the background region.
Fig. 3. Visual comparison with benchmark saliency detection models, where F3Net [1], MSFNet [36] and CTDNet [37] are fully-supervised saliency detection models, and SSAL [4] is a scribble-based weakly-supervised saliency detection model.
Specifically, given the bounding box annotation y and the model prediction s from the saliency prediction network, we define the background loss as:
Lback = γ Lce(s ∗ (1 − y), 0),   (3)
where γ = (H ∗ W) / (H ∗ W − z) (H and W represent the image size, z is the number of pixels covered by the foreground bounding box), and 0 is an all-zero matrix of the same size as s.
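The following sketch implements Eq. (3) under the assumption that Lce is the binary cross-entropy; the helper name and the clamping for numerical stability are illustrative.

```python
# Sketch of the background loss in Eq. (3): a partial cross-entropy computed
# only outside the bounding box (where the label is reliably background),
# scaled by gamma = H*W / (H*W - z).
import torch
import torch.nn.functional as F

def background_loss(pred, box_mask):
    """pred: Bx1xHxW saliency in (0,1); box_mask: Bx1xHxW binary y (1 inside the box)."""
    H, W = pred.shape[-2:]
    z = box_mask.sum(dim=(1, 2, 3))                        # pixels covered by the foreground box
    gamma = (H * W) / (H * W - z).clamp(min=1.0)           # per-image weight gamma
    outside = (pred * (1.0 - box_mask)).clamp(max=1.0 - 1e-6)  # s * (1 - y)
    ce = F.binary_cross_entropy(outside, torch.zeros_like(outside), reduction='none')
    return (gamma.view(-1, 1, 1, 1) * ce).mean()
```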
3.6. Training the Model
With the saliency regression loss Lspn, the foreground structure loss Lfore and the background accuracy loss Lback, our final loss function is defined as the combination of these three terms.
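A hedged sketch of the combined objective is given below, reusing the two loss sketches above and assuming a binary cross-entropy form for Lspn; the weights lambda_fore and lambda_back are illustrative placeholders, not the values used in our experiments.

```python
# Sketch of the overall training objective: pseudo-label regression (Lspn),
# gated foreground smoothness (Lfore) and background accuracy (Lback).
import torch.nn.functional as F

def total_loss(pred, pseudo_label, image_gray, box_mask,
               lambda_fore=0.3, lambda_back=1.0):          # illustrative weights
    l_spn = F.binary_cross_entropy(pred, pseudo_label)     # supervise with GrabCut pseudo label g
    l_fore = gated_smoothness_loss(pred, image_gray, box_mask)  # foreground structure constraint
    l_back = background_loss(pred, box_mask)                    # background accuracy constraint
    return l_spn + lambda_fore * l_fore + lambda_back * l_back
```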
4.1. Setup
Data set: We train our models using the DUTS training dataset [5], D = {xi, yi}, i = 1, . . . , N, of size N = 10,553, and test on six other widely used datasets: the DUTS testing dataset, ECSSD [27], DUT [28], HKU-IS [29], PASCAL-S [30] and the SOD testing dataset [31].
The supervision yi in our case is the bounding box annotation.
Evaluation Metrics: Four evaluation metrics are used, including Mean Absolute Error (MAE, M), mean F-measure (Fβ), mean E-measure (Eξ) and S-measure (Sα) [41].
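As a reference for the first two metrics, the sketch below computes MAE and the F-measure for a single prediction/ground-truth pair; the adaptive threshold (twice the mean saliency) and β² = 0.3 are common choices in the saliency literature and are assumptions here, while the E-measure and S-measure are omitted for brevity.

```python
# Sketch of MAE and F-measure for one prediction/ground-truth pair.
import numpy as np

def mae(pred, gt):
    """pred, gt: HxW arrays with values in [0, 1]."""
    return np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean()

def f_measure(pred, gt, beta2=0.3):
    binary = pred >= (2.0 * pred.mean())        # adaptive threshold (an assumption)
    tp = np.logical_and(binary, gt > 0.5).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max((gt > 0.5).sum(), 1)
    return (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-8)
```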
4.2. Performance Comparison
Quantitative comparison: We show the performance of our model in Table 1, where models in the top two blocks are fully supervised models (models in the middle block are transformer [42] based), and models in the last block (other than “Ours”) are weakly supervised models.
The performance comparison shows the competitive performance of our model, making it an effective alternative weakly supervised saliency detection model.
Qualitative comparison: We compare predictions of our model with four benchmark models and show the results in Fig. 3, which further illustrates that, with both the foreground and background constraints, our weakly supervised model can obtain relatively structure-accurate predictions.
Table 2. Performance of the ablation study related experiments.
Fig. 4. Visual comparison of the ablation study related experiments (columns: Image, GT, BndBox, GCut, FGCut, Ours), where each model (“BndBox”, “GCut” and “FGCut”) is introduced in the ablation study section.
At test time, we can produce saliency maps with the saliency prediction network, leading to an average inference time of 0.02 s/image, which is comparable with existing techniques.
We conducted further experiments to explain the contribution of each component of the proposed model, and show the performance of the related experiments in Table 2.
Training directly with bounding box supervision yi: Given the bounding box supervision, a straightforward solution is training directly with the binary bounding box as supervision.
Training directly with the pseudo label from GrabCut: With the refined pseudo label gi from GrabCut, we can train another model directly with gi as the pseudo label.
Contribution of foreground structure loss: We further add the foreground structure loss to “GCut”, leading to “FGCut”.
Analysis: As shown in Table 2, directly training with bounding box supervision yields unsatisfactory results, where the model learns to regress the bounding box region (see “BndBox” in Fig. 4).
Although the pseudo saliency map from GrabCut is noisy, the model based on it can still generate reasonable predictions (see “GCut” in both Table 2 and Fig. 4).
We further performed experiments with both longer and shorter training epochs, and observed similar conclusions.
We will investigate the optimal training epochs for better performance.
Impact of the feature channel C for new backbone feature generation in the “saliency prediction network”: For dimension reduction, we feed the backbone features {sk}, k = 1, . . . , 4, to four different 3 × 3 convolutional layers to generate the new backbone features {s′k}, k = 1, . . . , 4, of channel size C = 64.
We find that model performance is influenced by C: a larger C leads to better overall performance.
However, the size of the model is also significantly enlarged.
To achieve a trade-off between model performance and training/testing time, we set C = 64.
We will investigate the optimal C in the future.
Edge detection as an auxiliary task for prediction structure recovery: [4] introduced an auxiliary edge detection module into their weakly supervised learning framework for structure recovery.
We have tried the same strategy and observed no significant performance improvement in our setting.
In a multi-task learning framework, the convergence rate of each task is especially important for the final performance, and a more sophisticated multi-task learning solution is needed to fully explore the contribution of auxiliary edge detection for weakly supervised learning.
In this paper, we introduce a bounding box based weakly supervised saliency detection model.
Due to the different accuracies of the foreground and background, we introduce two sets of loss functions to constrain the predictions within the foreground and background regions of the bounding box.
[1] Jun Wei, Shuhui Wang, and Qingming Huang, “F3Net: Fusion, feedback and focus for salient object detection,” in AAAI, 2020.
[2] Bo Wang, Quan Chen, Min Zhou, Zhiqiang Zhang, Xiaogang Jin, and Kun Gai, “Progressive feature polishing network for salient object detection,” in AAAI, 2020, pp. 12128–12135.
[3] Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, and Qi Tian, “Label decoupling framework for salient object detection,” in CVPR, June 2020.
[4] Jing Zhang, Xin Yu, Aixuan Li, Peipei Song, Bowen Liu, and Yuchao Dai, “Weakly-supervised salient object detection via scribble annotations,” in CVPR, 2020.
[5] Lijun Wang, Huchuan Lu, Yifan Wang, Mengyang Feng, Dong Wang, Baocai Yin, and Xiang Ruan, “Learning to detect salient objects with image-level supervision,” in CVPR, 2017, pp. 136–145.
[6] Duc Tam Nguyen, Maximilian Dax, Chaithanya Kumar Mummadi, Thi-Phuong-Nhung Ngo, Thi Hoai Phuong Nguyen, Zhongyu Lou, and Thomas Brox, “Deepusps: Deep robust unsupervised saliency prediction via self-supervision,” in NeurIPS, 2019, pp. 204–214.
[7] Jing Zhang, Tong Zhang, Yuchao Dai, Mehrtash Harandi, and Richard Hartley, “Deep unsupervised saliency detection: A multiple noisy labeling perspective,” in CVPR, 2018, pp. 9029–9038.
[8] Jing Zhang, Jianwen Xie, and Nick Barnes, “Learning noise-aware encoder-decoder from noisy labels by alternating back-propagation for saliency detection,” in ECCV, 2020.
“Supervision by fusion: Towards unsupervised learning of deep salient object detector,” in ICCV, 2017, pp. 4048–4056.
[12] Yuxuan Liu, Pengjie Wang, Ying Cao, Zijian Liang, and Rynson W. H. Lau, “Weakly-supervised salient object detection with saliency bounding boxes,” TIP, pp. 4423–4435, 2021.
[13] Siyue Yu, Bingfeng Zhang, Jimin Xiao, and Eng Gee Lim, “Structureconsistent weakly supervised salient object detection with local saliency coherence,” in AAAI, 2021, pp. 3234–3242.
[14] Jing Zhang, Yuchao Dai, Tong Zhang, Mehrtash Harandi, Nick Barnes, and Richard Hartley, “Learning saliency from single noisy labelling: A robust model fitting perspective,” TPAMI, vol. 43, no. 8, pp. 2866–2873, 2020.
[15] Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger, “On calibration of modern neural networks,” in ICML, 2017, pp. 1321– 1330.
[16] Zuyao Chen, Qianqian Xu, Runmin Cong, and Qingming Huang, “Global context-aware progressive aggregation network for salient object detection,” in AAAI, 2020, pp. 10599–10606.
[17] Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, and Gang Wang, “Progressive attention guided recurrent network for salient object detection,” in CVPR, 2018, pp. 714–722.
[18] Youwei Pang, Xiaoqi Zhao, Lihe Zhang, and Huchuan Lu, “Multiscale interactive network for salient object detection,” in CVPR, 2020, pp. 9413–9422.
[19] Siyue Yu, Bingfeng Zhang, Jimin Xiao, and Eng Gee Lim, “Structureconsistent weakly supervised salient object detection with local saliency coherence,” in AAAI, 2021.
[20] Wenguan Wang, Shuyang Zhao, Jianbing Shen, Steven CH Hoi, and Ali Borji, “Salient object detection with pyramid attention and salient edges,” in CVPR, 2019, pp. 1448–1457.
[21] Jia-Xing Zhao, Jiang-Jiang Liu, Deng-Ping Fan, Yang Cao, Jufeng Yang, and Ming-Ming Cheng, “Egnet: Edge guidance network for salient object detection,” in ICCV, 2019, pp. 8779–8788.
[22] Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, and Wei Xu, “Occlusion aware unsupervised learning of optical flow,” in CVPR, 2018, pp. 4884–4893.
[23] Carsten Rother, Vladimir Kolmogorov, and Andrew Blake, ““GrabCut”: Interactive foreground extraction using iterated graph cuts,” ACM Transactions on Graphics (TOG), vol. 23, no. 3, pp. 309–314, 2004.
[24] Cheng-Chun Hsu, Kuang-Jui Hsu, Chung-Chi Tsai, Yen-Yu Lin, and Yung-Yu Chuang, “Weakly supervised instance segmentation using the bounding box tightness prior,” in NeurIPS, 2019, vol.
[27] Qiong Yan, Li Xu, Jianping Shi, and Jiaya Jia, “Hierarchical saliency detection,” in CVPR, 2013, pp. 1155–1162.
[28] C. Yang, L. Zhang, H. Lu, X. Ruan, and M. Yang, “Saliency detection via graph-based manifold ranking,” in CVPR, 2013, pp. 3166–3173.
[29] Guanbin Li and Yizhou Yu, “Visual saliency based on multiscale deep features,” in CVPR, 2015, pp. 5455–5463.
[30] Yin Li, Xiaodi Hou, Christof Koch, James M Rehg, and Alan L Yuille, “The secrets of salient object segmentation,” in CVPR, 2014, pp. 280– 287.
[31] Vida Movahedi and James H Elder, “Design and perceptual validation of performance measures for salient object segmentation,” in CVPR Workshop, 2010, pp. 49–56.
[32] Zhe Wu, Li Su, and Qingming Huang, “Stacked cross refinement network for edge-aware salient object detection,” in ICCV, 2019, pp. 7264–7273.
[33] Jun Wei, Shuhui Wang, and Qingming Huang, “F3net: Fusion, feedback and focus for salient object detection,” in AAAI, 2020, pp. 12321– 12328.
[34] Huajun Zhou, Xiaohua Xie, Jian-Huang Lai, Zixuan Chen, and Lingxiao Yang, “Interactive two-stream decoder for accurate and fast saliency detection,” in CVPR, 2020, pp. 9141–9150.
[35] Binwei Xu, Haoran Liang, Ronghua Liang, and Peng Chen, “Locate globally, segment locally: A progressive architecture with knowledge review network for salient object detection,” in AAAI, 2021, pp. 3004–3012.
sual saliency transformer,” in ICCV, 2021, pp. 4722–4732.
[39] Jing Zhang, Jianwen Xie, Nick Barnes, and Ping Li, “Learning generative vision transformer with energy-based latent space for saliency prediction,” in NeurIPS, 2021, vol. 34.
[40] Xin Li, Fan Yang, Hong Cheng, Wei Liu, and Dinggang Shen, “Contour knowledge transfer for salient object detection,” in ECCV, 2018, pp. 355–370.
[41] Deng-Ping Fan, Ge-Peng Ji, Xuebin Qin, and Ming-Ming Cheng, “Cognitive vision inspired object segmentation metric and loss function,” SCIENTIA SINICA Informationis, 2021.
[42] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Ł ukasz Kaiser, and Illia Polosukhin, “Attention is all you need,” in NeurIPS, 2017.