Abstract: This paper studies unsupervised/self-supervised whole-graph representation
learning, which is critical in many tasks such as molecular property
prediction in drug and materials discovery. Existing methods mainly focus on
preserving the local similarity structure between different graph instances but
fail to discover the global semantic structure of the entire data set. In this
paper, we propose a unified framework called Local-instance and Global-semantic
Learning (GraphLoG) for self-supervised whole-graph representation learning.
Specifically, besides preserving the local similarities, GraphLoG introduces
hierarchical prototypes to capture the global semantic clusters. An
efficient online expectation-maximization (EM) algorithm is further developed
for learning the model. We evaluate GraphLoG by pre-training it on massive
unlabeled graphs followed by fine-tuning on downstream tasks. Extensive
experiments on both chemical and biological benchmark data sets demonstrate the
effectiveness of the proposed approach.
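
To make the idea of prototype learning with online EM more concrete, the
following is a minimal sketch and not the GraphLoG implementation. It assumes a
single flat layer of prototypes (rather than a hierarchy), hard assignments by
cosine similarity in the E-step, and an exponential-moving-average update in
the M-step. The function names `cosine` and `online_em_step` and the `momentum`
parameter are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def online_em_step(prototypes, batch, momentum=0.9):
    """One online EM-style update on a mini-batch of graph embeddings.

    Illustrative assumption, not the paper's exact algorithm:
    E-step: hard-assign each embedding to its most similar prototype.
    M-step: move each used prototype toward the mean of its assigned
    embeddings with an exponential moving average.
    Returns the updated prototypes and the per-embedding assignments.
    """
    # E-step: index of the closest prototype for every embedding.
    assignments = [
        max(range(len(prototypes)), key=lambda k: cosine(x, prototypes[k]))
        for x in batch
    ]

    # M-step: update only the prototypes that received at least one embedding.
    updated = [list(p) for p in prototypes]
    for k in range(len(prototypes)):
        members = [x for x, a in zip(batch, assignments) if a == k]
        if not members:
            continue  # prototypes with no assigned embedding stay unchanged
        dim = len(prototypes[k])
        mean = [sum(x[d] for x in members) / len(members) for d in range(dim)]
        updated[k] = [
            momentum * p + (1.0 - momentum) * m
            for p, m in zip(prototypes[k], mean)
        ]
    return updated, assignments
```

In this sketch, calling `online_em_step` once per mini-batch lets the
prototypes track the cluster structure of the embeddings without a full pass
over the data set, which is the general motivation for an online variant of
EM. In a real pre-training setup the embeddings would come from a graph
encoder that is trained jointly with the prototypes.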