ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via
Exploiting CLIP Cues
- URL: http://arxiv.org/abs/2201.06696v1
- Date: Tue, 18 Jan 2022 01:51:35 GMT
- Title: ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via
Exploiting CLIP Cues
- Authors: Hengcan Shi, Munawar Hayat, Yicheng Wu, Jianfei Cai
- Abstract summary: ProposalCLIP is able to predict proposals for a large variety of object categories without annotations.
ProposalCLIP also shows benefits for downstream tasks, such as unsupervised object detection.
- Score: 49.88590455664064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object proposal generation is an important and fundamental task in computer
vision. In this paper, we propose ProposalCLIP, a method towards unsupervised
open-category object proposal generation. Unlike previous works which require a
large number of bounding box annotations and/or can only generate proposals for
limited object categories, our ProposalCLIP is able to predict proposals for a
large variety of object categories without annotations, by exploiting CLIP
(contrastive language-image pre-training) cues. Firstly, we analyze CLIP for
unsupervised open-category proposal generation and design an objectness score
based on our empirical analysis on proposal selection. Secondly, a graph-based
merging module is proposed to solve the limitations of CLIP cues and merge
fragmented proposals. Finally, we present a proposal regression module that
extracts pseudo labels based on CLIP cues and trains a lightweight network to
further refine proposals. Extensive experiments on PASCAL VOC, COCO and Visual
Genome datasets show that our ProposalCLIP can better generate proposals than
previous state-of-the-art methods. Our ProposalCLIP also shows benefits for
downstream tasks, such as unsupervised object detection.
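The abstract mentions a graph-based module that merges fragmented proposals but does not say how the graph is built. The following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that two proposals are linked when their boxes overlap and their feature vectors (for example, CLIP image embeddings of the cropped regions) are similar, and that each connected group is replaced by its enclosing box. The function names, the thresholds `iou_thr` and `sim_thr`, and the enclosing-box rule are all hypothetical.

```python
from math import sqrt


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def merge_proposals(boxes, feats, iou_thr=0.1, sim_thr=0.9):
    """Merge proposals that are linked in an overlap-and-similarity graph.

    Assumption (not from the paper): an edge joins proposals i and j when
    IoU >= iou_thr and cosine similarity >= sim_thr. Each connected
    component is replaced by the smallest box enclosing its members.
    """
    n = len(boxes)
    parent = list(range(n))  # union-find forest over proposal indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            linked = (iou(boxes[i], boxes[j]) >= iou_thr
                      and cosine(feats[i], feats[j]) >= sim_thr)
            if linked:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(boxes[i])

    merged = [
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups.values()
    ]
    return sorted(merged)


# Toy example: two overlapping fragments with similar features, plus one
# distant box with a different feature vector.
boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
feats = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0)]
print(merge_proposals(boxes, feats))
```

In this toy input the first two boxes overlap and have nearly parallel feature vectors, so they collapse into one enclosing box, while the third stays separate. In practice the thresholds would need tuning per dataset.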
Related papers
- Towards Completeness: A Generalizable Action Proposal Generator for Zero-Shot Temporal Action Localization [31.82121743586165]
The Generalizable Action Proposal generator (GAP) is built on a query-based architecture and trained with a proposal-level objective.
Based on this architecture, we propose an Action-aware Discrimination loss to enhance the category-agnostic dynamic information of actions.
Our experiments show that our GAP achieves state-of-the-art performance on two challenging ZSTAL benchmarks.
arXiv Detail & Related papers (2024-08-25T09:07:06Z)
- Global Knowledge Calibration for Fast Open-Vocabulary Segmentation [124.74256749281625]
We introduce a text diversification strategy that generates a set of synonyms for each training category.
We also employ a text-guided knowledge distillation method to preserve the generalizable knowledge of CLIP.
Our proposed model achieves robust generalization performance across various datasets.
arXiv Detail & Related papers (2023-03-16T09:51:41Z)
- ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection [114.54835359657707]
ProposalContrast is an unsupervised point cloud pre-training framework.
It learns robust 3D representations by contrasting region proposals.
ProposalContrast is verified on various 3D detectors.
arXiv Detail & Related papers (2022-07-26T04:45:49Z)
- Contrastive Proposal Extension with LSTM Network for Weakly Supervised Object Detection [52.86681130880647]
Weakly supervised object detection (WSOD) has attracted growing attention because it uses only image-level labels and greatly reduces annotation costs.
We propose a new method by comparing the initial proposals and the extension ones to optimize those initial proposals.
Experiments on the PASCAL VOC 2007, VOC 2012 and MS-COCO datasets show that our method achieves state-of-the-art results.
arXiv Detail & Related papers (2021-10-14T16:31:57Z)
- Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos [58.83440885457272]
We address the problem of temporal sentence localization in videos (TSLV).
Traditional methods follow a top-down framework which localizes the target segment with pre-defined segment proposals.
We propose an Adaptive Proposal Generation Network (APGN) to maintain the segment-level interaction while improving efficiency.
arXiv Detail & Related papers (2021-09-14T02:02:36Z)
- Online Active Proposal Set Generation for Weakly Supervised Object Detection [41.385545249520696]
Weakly supervised object detection methods only require image-level annotations.
Online proposal sampling is an intuitive solution to these issues.
Our proposed OPG algorithm shows consistent and significant improvement on both the PASCAL VOC 2007 and 2012 datasets.
arXiv Detail & Related papers (2021-01-20T02:20:48Z)
- Panoster: End-to-end Panoptic Segmentation of LiDAR Point Clouds [81.12016263972298]
We present Panoster, a novel proposal-free panoptic segmentation method for LiDAR point clouds.
Unlike previous approaches, Panoster proposes a simplified framework incorporating a learning-based clustering solution to identify instances.
At inference time, this acts as a class-agnostic segmentation, allowing Panoster to be fast, while outperforming prior methods in terms of accuracy.
arXiv Detail & Related papers (2020-10-28T18:10:20Z)