PPGN: Phrase-Guided Proposal Generation Network For Referring Expression Comprehension
- URL: http://arxiv.org/abs/2012.10890v1
- Date: Sun, 20 Dec 2020 11:21:06 GMT
- Title: PPGN: Phrase-Guided Proposal Generation Network For Referring Expression Comprehension
- Authors: Chao Yang, Guoqing Wang, Dongsheng Li, Huawei Shen, Su Feng, Bin Jiang
- Abstract summary: We propose a novel phrase-guided proposal generation network (PPGN).
The main implementation principle of PPGN is refining visual features with text and generating proposals through regression.
Experiments show that our method is effective and achieves SOTA performance on benchmark datasets.
- Score: 31.39505099600821
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Referring expression comprehension (REC) aims to find the location that the
phrase refers to in a given image. Proposal generation and proposal
representation are two effective techniques in many two-stage REC methods.
However, most of the existing works only focus on proposal representation and
neglect the importance of proposal generation. As a result, the low-quality
proposals generated by these methods become the performance bottleneck in REC
tasks. In this paper, we reconsider the problem of proposal generation, and
propose a novel phrase-guided proposal generation network (PPGN). The main
implementation principle of PPGN is refining visual features with text and
generating proposals through regression. Experiments show that our method is
effective and achieves SOTA performance on benchmark datasets.
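A minimal sketch of this principle, assuming a PyTorch-style setup: the phrase embedding gates the visual feature channels (one plausible form of text-guided refinement), and anchor-wise heads regress proposal scores and box offsets. The module name, dimensions, and the gating-based fusion below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PhraseGuidedProposalHead(nn.Module):
    """Hypothetical head: refine visual features with the phrase, then regress proposals."""

    def __init__(self, vis_dim: int = 256, txt_dim: int = 768, num_anchors: int = 9):
        super().__init__()
        # Map the phrase embedding to a per-channel gate over the visual features.
        self.txt_to_gate = nn.Sequential(nn.Linear(txt_dim, vis_dim), nn.Sigmoid())
        # Predict a proposal score and 4 box-regression offsets per anchor.
        self.cls_head = nn.Conv2d(vis_dim, num_anchors, kernel_size=1)
        self.reg_head = nn.Conv2d(vis_dim, num_anchors * 4, kernel_size=1)

    def forward(self, vis_feat: torch.Tensor, phrase_emb: torch.Tensor):
        # vis_feat: (B, C, H, W) backbone features; phrase_emb: (B, txt_dim) phrase encoding.
        gate = self.txt_to_gate(phrase_emb)            # (B, C)
        refined = vis_feat * gate[:, :, None, None]    # text-conditioned feature refinement
        scores = self.cls_head(refined)                # (B, A, H, W) proposal scores
        deltas = self.reg_head(refined)                # (B, 4*A, H, W) regression offsets
        return scores, deltas

# Example usage with random inputs.
head = PhraseGuidedProposalHead()
scores, deltas = head(torch.randn(2, 256, 32, 32), torch.randn(2, 768))
```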
Related papers
- Referring Expression Generation in Visually Grounded Dialogue with Discourse-aware Comprehension Guiding [3.8673630752805446]
We propose an approach to referring expression generation (REG) that is meant to produce referring expressions (REs) that are both discriminative and discourse-appropriate.
Results from our human evaluation indicate that our proposed two-stage approach is effective in producing discriminative REs.
arXiv Detail & Related papers (2024-09-09T15:33:07Z) - Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised
Referring Expression Grounding [214.8003571700285]
Weakly supervised Referring Expression Grounding (REG) aims to ground a particular target in an image described by a language expression.
We design an entity-enhanced adaptive reconstruction network (EARN).
EARN includes three modules: entity enhancement, adaptive grounding, and collaborative reconstruction.
arXiv Detail & Related papers (2022-07-18T05:30:45Z) - ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via
Exploiting CLIP Cues [49.88590455664064]
ProposalCLIP is able to predict proposals for a large variety of object categories without annotations.
ProposalCLIP also shows benefits for downstream tasks, such as unsupervised object detection.
arXiv Detail & Related papers (2022-01-18T01:51:35Z) - Contrastive Proposal Extension with LSTM Network for Weakly Supervised
Object Detection [52.86681130880647]
Weakly supervised object detection (WSOD) has attracted increasing attention since it uses only image-level labels and can greatly reduce annotation costs.
We propose a new method that compares the initial proposals with the extended ones to optimize the initial proposals.
Experiments on the PASCAL VOC 2007, VOC 2012, and MS-COCO datasets show that our method achieves state-of-the-art results.
arXiv Detail & Related papers (2021-10-14T16:31:57Z) - Natural Language Video Localization with Learnable Moment Proposals [40.91060659795612]
We propose a novel model termed LPNet (Learnable Proposal Network for NLVL) with a fixed set of learnable moment proposals.
In this paper, we demonstrate the effectiveness of LPNet over existing state-of-the-art methods.
arXiv Detail & Related papers (2021-09-22T12:18:58Z) - Adaptive Proposal Generation Network for Temporal Sentence Localization
in Videos [58.83440885457272]
We address the problem of temporal sentence localization in videos (TSLV).
Traditional methods follow a top-down framework which localizes the target segment with pre-defined segment proposals.
We propose an Adaptive Proposal Generation Network (APGN) to maintain segment-level interaction while improving efficiency.
arXiv Detail & Related papers (2021-09-14T02:02:36Z) - Online Active Proposal Set Generation for Weakly Supervised Object
Detection [41.385545249520696]
Weakly supervised object detection methods require only image-level annotations.
Online proposal sampling is an intuitive solution to these issues.
Our proposed OPG algorithm shows consistent and significant improvement on both the PASCAL VOC 2007 and 2012 datasets.
arXiv Detail & Related papers (2021-01-20T02:20:48Z) - RRPN++: Guidance Towards More Accurate Scene Text Detection [0.30458514384586394]
We propose RRPN++ to exploit the potential of the RRPN-based model through several improvements.
Based on RRPN, we propose the Anchor-free Pyramid Proposal Networks (APPN) to generate first-stage proposals.
In our second stage, both the detection branch and the recognition branch are incorporated to perform multi-task learning.
arXiv Detail & Related papers (2020-09-28T08:00:35Z) - BSN++: Complementary Boundary Regressor with Scale-Balanced Relation
Modeling for Temporal Action Proposal Generation [85.13713217986738]
We present BSN++, a new framework which exploits complementary boundary regressor and relation modeling for temporal proposal generation.
Not surprisingly, the proposed BSN++ ranked 1st place on the CVPR19 ActivityNet challenge leaderboard for the temporal action localization task.
arXiv Detail & Related papers (2020-09-15T07:08:59Z) - Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression
Grounding [80.46288064284084]
Ref-NMS is the first method to yield expression-aware proposals at the first stage.
Ref-NMS regards all nouns in the expression as critical objects, and introduces a lightweight module to predict a score for aligning each box with a critical object.
Since Ref-NMS is agnostic to the grounding step, it can be easily integrated into any state-of-the-art two-stage method (see the sketch below).
arXiv Detail & Related papers (2020-09-03T05:04:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.