Abstract: Powered by the ImageNet dataset, unsupervised learning on large-scale data
has made significant advances for classification tasks. Two major challenges
stand in the way of extending this attractive learning modality to segmentation
tasks: i) a large-scale benchmark for assessing algorithms is missing; ii)
unsupervised shape representation learning is difficult. We propose a new
problem of large-scale unsupervised semantic segmentation (LUSS) with a newly
created benchmark dataset to track the research progress. Based on the ImageNet
dataset, we propose the ImageNet-S dataset with 1.2 million training images and
40k high-quality semantic segmentation annotations for evaluation. Our
benchmark has high data diversity and a clear task objective. We also present
a simple yet effective baseline method that works surprisingly well for LUSS.
In addition, we benchmark related unsupervised and weakly supervised methods,
identifying the challenges and possible directions of LUSS.