Towards Open-vocabulary Scene Graph Generation with Prompt-based
Finetuning
- URL: http://arxiv.org/abs/2208.08165v1
- Date: Wed, 17 Aug 2022 09:05:38 GMT
- Title: Towards Open-vocabulary Scene Graph Generation with Prompt-based
Finetuning
- Authors: Tao He, Lianli Gao, Jingkuan Song, Yuan-Fang Li
- Abstract summary: Scene graph generation (SGG) is a fundamental task aimed at detecting visual relations between objects in an image.
We introduce open-vocabulary scene graph generation, a novel, realistic and challenging setting in which a model is trained on a set of base object classes but is required to infer relations for unseen target object classes.
Our method can support inference over completely unseen object classes, which existing methods are incapable of handling.
- Score: 84.39787427288525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene graph generation (SGG) is a fundamental task aimed at detecting visual
relations between objects in an image. The prevailing SGG methods require all
object classes to be given in the training set. Such a closed setting limits
the practical application of SGG. In this paper, we introduce open-vocabulary
scene graph generation, a novel, realistic and challenging setting in which a
model is trained on a set of base object classes but is required to infer
relations for unseen target object classes. To this end, we propose a two-step
method that first pre-trains on large amounts of coarse-grained
region-caption data and then leverages two prompt-based techniques to finetune
the pre-trained model without updating its parameters. Moreover, our method can
support inference over completely unseen object classes, which existing methods
are incapable of handling. In extensive experiments on three benchmark
datasets, Visual Genome, GQA, and Open-Image, our method significantly
outperforms recent, strong SGG methods in the Ov-SGG setting, as well as in
the conventional closed SGG setting.
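The abstract does not spell out the two prompt-based techniques, so the following is only a generic toy sketch of the underlying idea of tuning a prompt while the pre-trained parameters stay frozen. It is not the authors' method: the linear "model", the names `FROZEN_WEIGHTS`, `frozen_model`, and `tune_prompt`, and all numbers are hypothetical and chosen purely for illustration.

```python
# Toy sketch of prompt tuning: the "pre-trained" weights stay frozen and
# only a small prompt vector is trained. All names and numbers are
# hypothetical; this is not the method from the paper.

FROZEN_WEIGHTS = (0.5, -1.0, 2.0, 1.0)  # stands in for a pre-trained model
PROMPT_LEN = 2                           # the first two inputs are prompt slots


def frozen_model(prompt, features):
    """Score = dot(frozen weights, [prompt ; features])."""
    inputs = list(prompt) + list(features)
    return sum(w * v for w, v in zip(FROZEN_WEIGHTS, inputs))


def mean_squared_error(prompt, data):
    """Average squared gap between the model score and the target."""
    return sum((frozen_model(prompt, f) - t) ** 2 for f, t in data) / len(data)


def tune_prompt(data, steps=200, lr=0.05):
    """Fit only the prompt by gradient descent on the mean squared error."""
    prompt = [0.0] * PROMPT_LEN
    for _ in range(steps):
        grads = [0.0] * PROMPT_LEN
        for features, target in data:
            err = frozen_model(prompt, features) - target
            for i in range(PROMPT_LEN):
                # d(err^2)/d(prompt[i]) = 2 * err * (frozen weight on slot i)
                grads[i] += 2.0 * err * FROZEN_WEIGHTS[i] / len(data)
        prompt = [p - lr * g for p, g in zip(prompt, grads)]
    return prompt


# Hypothetical downstream task: each target is the frozen score of the
# features shifted by a constant, which the prompt slots can absorb.
DATA = [((1.0, 0.0), 3.5), ((0.0, 1.0), 2.5), ((1.0, 1.0), 4.5)]

weights_before = FROZEN_WEIGHTS
learned_prompt = tune_prompt(DATA)
```

In this sketch the task is adapted entirely through `learned_prompt`; `FROZEN_WEIGHTS` is never written to, which mirrors the "without updating its parameters" constraint in spirit only.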