A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised
Learning
- URL: http://arxiv.org/abs/2311.03524v1
- Date: Mon, 6 Nov 2023 21:15:09 GMT
- Title: A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning
- Authors: Yiyou Sun and Zhenmei Shi and Yixuan Li
- Abstract summary: Open-world semi-supervised learning aims at inferring both known and novel classes in unlabeled data.
This paper formalizes a graph-theoretic framework tailored for the open-world setting.
Our graph-theoretic framework illuminates practical algorithms and provides guarantees.
- Score: 33.05104609131764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-world semi-supervised learning aims at inferring both known and novel
classes in unlabeled data, by harnessing prior knowledge from a labeled set
with known classes. Despite its importance, there is a lack of theoretical
foundations for this problem. This paper bridges the gap by formalizing a
graph-theoretic framework tailored for the open-world setting, where the
clustering can be theoretically characterized by graph factorization. Our
graph-theoretic framework illuminates practical algorithms and provides
guarantees. In particular, based on our graph formulation, we present an
algorithm called Spectral Open-world Representation Learning (SORL), and show
that minimizing our loss is equivalent to performing spectral decomposition on
the graph. Such equivalence allows us to derive a provable error bound on the
clustering performance for both known and novel classes, and analyze rigorously
when labeled data helps. Empirically, SORL can match or outperform several
strong baselines on common benchmark datasets, making it appealing for
practical use while enjoying theoretical guarantees.
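The core spectral idea behind the abstract can be illustrated with a small sketch. This is not the authors' SORL implementation; the toy data, the RBF similarity, and the two-cluster setup are illustrative assumptions. It builds an affinity graph over points from two well-separated groups, forms the symmetrically normalized adjacency, and performs spectral decomposition: the leading eigenvectors act as cluster embeddings, and the sign of the second-leading eigenvector recovers the two groups.

```python
import numpy as np

# Six 1-D points forming two well-separated groups (illustrative data,
# standing in for samples of "known" and "novel" classes).
x = np.array([0.0, 0.1, 0.2, 3.0, 3.1, 3.2])

# Affinity (adjacency) matrix from an RBF similarity -- an assumed choice.
A = np.exp(-(x[:, None] - x[None, :]) ** 2)

# Symmetrically normalized adjacency: D^{-1/2} A D^{-1/2}.
d = A.sum(axis=1)
A_norm = A / np.sqrt(np.outer(d, d))

# Spectral decomposition: eigh returns eigenvalues in ascending order,
# so the last columns hold the leading eigenvectors.
eigvals, eigvecs = np.linalg.eigh(A_norm)

# For two nearly disconnected groups, the second-leading eigenvector
# separates them by sign, giving cluster assignments directly.
labels = (eigvecs[:, -2] > 0).astype(int)
print(labels)  # one group receives 0s, the other 1s
```

In this toy setting, minimizing a clustering loss over low-rank embeddings of the normalized affinity matrix reduces to computing exactly these eigenvectors, which is the kind of equivalence the abstract refers to (here shown only for a hand-built graph, not the paper's formulation).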
Related papers
- Graph Classification via Reference Distribution Learning: Theory and Practice [24.74871206083017]
This work introduces Graph Reference Distribution Learning (GRDL), an efficient and accurate graph classification method.
GRDL treats each graph's latent node embeddings given by GNN layers as a discrete distribution, enabling direct classification without global pooling.
Experiments on moderate-scale and large-scale graph datasets show the superiority of GRDL over the state-of-the-art.
arXiv Detail & Related papers (2024-08-21T06:42:22Z)
- On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method [59.52204415829695]
Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications.
This paper investigates the generalization ability of different TGL algorithms.
We propose a simplified TGL network, which enjoys a small generalization error, improved overall performance, and lower model complexity.
arXiv Detail & Related papers (2024-02-26T08:22:22Z)
- Deep Contrastive Graph Learning with Clustering-Oriented Guidance [61.103996105756394]
Graph Convolutional Network (GCN) has exhibited remarkable potential in improving graph-based clustering.
To apply GCN to general data, existing models estimate an initial graph beforehand.
A Deep Contrastive Graph Learning (DCGL) model is proposed for general data clustering.
arXiv Detail & Related papers (2024-02-25T07:03:37Z)
- Graph-level Protein Representation Learning by Structure Knowledge Refinement [50.775264276189695]
This paper focuses on learning representation on the whole graph level in an unsupervised manner.
We propose a novel framework called Structure Knowledge Refinement (SKR) which uses data structure to determine the probability that a pair is positive or negative.
arXiv Detail & Related papers (2024-01-05T09:05:33Z)
- Redundancy-Free Self-Supervised Relational Learning for Graph Clustering [13.176413653235311]
We propose a novel self-supervised deep graph clustering method named Redundancy-Free Graph Clustering (R$^2$FGC).
It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder.
Our experiments are performed on widely used benchmark datasets to validate the superiority of our R$^2$FGC over state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-09T06:18:50Z)
- Learning subtree pattern importance for Weisfeiler-Lehman-based graph kernels [15.139294083028782]
We propose a method to learn the weights of subtree patterns in the framework of WWL kernels.
We present an efficient learning algorithm and also derive a generalization gap bound to show its convergence.
arXiv Detail & Related papers (2021-06-08T23:47:44Z)
- Self-supervised Graph-level Representation Learning with Local and Global Structure [71.45196938842608]
We propose a unified framework called Local-instance and Global-semantic Learning (GraphLoG) for self-supervised whole-graph representation learning.
Besides preserving the local similarities, GraphLoG introduces the hierarchical prototypes to capture the global semantic clusters.
An efficient online expectation-maximization (EM) algorithm is further developed for learning the model.
arXiv Detail & Related papers (2021-06-08T05:25:38Z)
- Higher-Order Spectral Clustering of Directed Graphs [8.997952791113232]
Clustering is an important topic in algorithms, and has a number of applications in machine learning, computer vision, statistics, and several other research disciplines.
We present a nearly-linear time algorithm for digraph clustering, and show that our proposed algorithm can be implemented in sublinear time under reasonable assumptions.
arXiv Detail & Related papers (2020-11-10T13:06:37Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework that preserves both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive-neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- Embedding Graph Auto-Encoder for Graph Clustering [90.8576971748142]
Graph auto-encoder (GAE) models are based on semi-supervised graph convolution networks (GCN).
We design a specific GAE-based model for graph clustering that is consistent with the theory, namely Embedding Graph Auto-Encoder (EGAE).
EGAE consists of one encoder and dual decoders.
arXiv Detail & Related papers (2020-02-20T09:53:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.