PartImageNet: A Large, High-Quality Dataset of Parts
- URL: http://arxiv.org/abs/2112.00933v1
- Date: Thu, 2 Dec 2021 02:12:03 GMT
- Title: PartImageNet: A Large, High-Quality Dataset of Parts
- Authors: Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan,
Jie-Neng Chen, Shuai Liu, Cheng Yang, Alan Yuille
- Abstract summary: We propose PartImageNet, a high-quality dataset with part segmentation annotations.
PartImageNet is unique because it offers part-level annotations on a general set of classes with non-rigid, articulated objects.
It can be utilized in multiple vision tasks including but not limited to: Part Discovery, Few-shot Learning.
- Score: 16.730418538593703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A part-based object understanding facilitates efficient compositional
learning and knowledge transfer, robustness to occlusion, and has the potential
to increase the performance on general recognition and localization tasks.
However, research on part-based models is hindered due to the lack of datasets
with part annotations, which is caused by the extreme difficulty and high cost
of annotating object parts in images. In this paper, we propose PartImageNet, a
large, high-quality dataset with part segmentation annotations. It consists of
158 classes from ImageNet with approximately 24000 images. PartImageNet is
unique because it offers part-level annotations on a general set of classes
with non-rigid, articulated objects, while having an order of magnitude larger
size compared to existing datasets. It can be utilized in multiple vision tasks
including but not limited to: Part Discovery, Semantic Segmentation, Few-shot
Learning. Comprehensive experiments are conducted to establish a set of
baselines on PartImageNet, and we find that existing works on part discovery
cannot always produce satisfactory results under complex variations. The
exploitation of parts in downstream tasks also remains insufficient. We believe that our
PartImageNet will greatly facilitate the research on part-based models and
their applications. The dataset and scripts will soon be released at
https://github.com/TACJu/PartImageNet.
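The abstract describes part segmentation annotations over 158 ImageNet classes. As a minimal sketch of how such annotations might be consumed, assuming a COCO-style JSON layout (an assumption; the released format may differ, see the linked repository), the following groups part annotations by their parent image:

```python
from collections import defaultdict

# Toy stand-in for a COCO-style part-annotation file; the real
# PartImageNet layout is an assumption here, not confirmed by the abstract.
toy_annotations = {
    "categories": [
        {"id": 1, "name": "quadruped head"},
        {"id": 2, "name": "quadruped body"},
    ],
    "images": [{"id": 10, "file_name": "n02084071_1.jpg"}],
    "annotations": [
        {"id": 100, "image_id": 10, "category_id": 1,
         "segmentation": [[0, 0, 5, 0, 5, 5]]},
        {"id": 101, "image_id": 10, "category_id": 2,
         "segmentation": [[5, 5, 9, 5, 9, 9]]},
    ],
}

def parts_per_image(coco):
    """Map each image file name to the list of part-category names it contains."""
    cat_name = {c["id"]: c["name"] for c in coco["categories"]}
    img_name = {i["id"]: i["file_name"] for i in coco["images"]}
    out = defaultdict(list)
    for ann in coco["annotations"]:
        out[img_name[ann["image_id"]]].append(cat_name[ann["category_id"]])
    return dict(out)

print(parts_per_image(toy_annotations))
```

With a real annotation file, the dict literal would be replaced by `json.load` on the released JSON; the grouping logic itself is format-agnostic boilerplate.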
Related papers
- 1st Place Solution for MOSE Track in CVPR 2024 PVUW Workshop: Complex Video Object Segmentation [72.54357831350762]
We propose a semantic embedding video object segmentation model and use the salient features of objects as query representations.
We trained our model on a large-scale video object segmentation dataset.
Our model achieves first place (84.45%) in the test set of the Complex Video Object Segmentation Challenge.
arXiv Detail & Related papers (2024-06-07T03:13:46Z)
- Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis [64.05076727277431]
This paper proposes to infer Parts from Object ShapE (iPOSE) and leverage it for improving semantic image synthesis.
We learn a PartNet for predicting the object part map with the guidance of pre-defined support part maps.
Experiments show that our iPOSE not only generates objects with rich part details, but also enables flexible control of the image synthesis.
arXiv Detail & Related papers (2023-05-31T04:27:47Z)
- Towards Open-World Segmentation of Parts [16.056921233445784]
We propose to explore a class-agnostic part segmentation task.
We argue that models trained without part classes can better localize parts and segment them on objects unseen in training.
We show notable and consistent gains with our approach, a critical step towards open-world part segmentation.
arXiv Detail & Related papers (2023-05-26T10:34:58Z)
- Leveraging GAN Priors for Few-Shot Part Segmentation [43.35150430895919]
Few-shot part segmentation aims to separate different parts of an object given only a few samples.
We propose to learn task-specific features in a "pre-training"-"fine-tuning" paradigm.
arXiv Detail & Related papers (2022-07-27T10:17:07Z)
- VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments [74.72656607288185]
We introduce a few-shot localization dataset originating from photographers who were authentically trying to learn about the visual content in the images they took.
It includes nearly 10,000 segmentations of 100 categories in over 4,500 images that were taken by people with visual impairments.
Compared to existing few-shot object detection and instance segmentation datasets, our dataset is the first to locate holes in objects.
arXiv Detail & Related papers (2022-07-24T20:44:51Z)
- 3D Compositional Zero-shot Learning with DeCompositional Consensus [102.7571947144639]
We argue that part knowledge should be composable beyond the observed object classes.
We present 3D Compositional Zero-shot Learning as a problem of part generalization from seen to unseen object classes.
arXiv Detail & Related papers (2021-11-29T16:34:53Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort [117.41383937100751]
Current deep networks are extremely data-hungry, benefiting from training on large-scale datasets.
We show how the GAN latent code can be decoded to produce a semantic segmentation of the image.
These generated datasets can then be used for training any computer vision architecture just as real datasets are.
arXiv Detail & Related papers (2021-04-13T20:08:29Z)
- Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [9.536947328412198]
We propose a deep network for multi-object instance segmentation that is robust to occlusion.
Our work builds on Compositional Networks, which learn a generative model of neural feature activations to locate occluders.
In particular, we obtain feed-forward predictions of the object classes and their instance and occluder segmentations.
arXiv Detail & Related papers (2020-12-03T17:41:55Z)
- Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization [6.415792312027131]
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is one of the most authoritative academic competitions in the field of Computer Vision (CV) in recent years.
Directly applying ILSVRC's annual champion models to fine-grained visual categorization (FGVC) tasks does not achieve good performance.
Our approach can be trained end-to-end, while providing short inference time.
arXiv Detail & Related papers (2020-03-20T08:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.