Related papers: Parameterized Correlation Clustering in Hypergraphs and Bipartite Graphs

URL: http://arxiv.org/abs/2002.09460v2
Date: Fri, 19 Jun 2020 15:10:01 GMT
Title: Parameterized Correlation Clustering in Hypergraphs and Bipartite Graphs
Authors: Nate Veldt and Anthony Wirth and David F. Gleich
Abstract summary: We consider new clustering objectives in hypergraphs and bipartite graphs. These objectives are parameterized by one or more resolution parameters in order to enable diverse knowledge discovery in complex data.
Score: 15.36202554903105
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Motivated by applications in community detection and dense subgraph discovery, we consider new clustering objectives in hypergraphs and bipartite graphs. These objectives are parameterized by one or more resolution parameters in order to enable diverse knowledge discovery in complex data. For both hypergraph and bipartite objectives, we identify parameter regimes that are equivalent to existing objectives and share their (polynomial-time) approximation algorithms. We first show that our parameterized hypergraph correlation clustering objective is related to higher-order notions of normalized cut and modularity in hypergraphs. It is further amenable to approximation algorithms via hyperedge expansion techniques. Our parameterized bipartite correlation clustering objective generalizes standard unweighted bipartite correlation clustering, as well as bicluster deletion. For a certain choice of parameters it is also related to our hypergraph objective. Although in general it is NP-hard, we highlight a parameter regime for the bipartite objective where the problem reduces to the bipartite matching problem and thus can be solved in polynomial time. For other parameter settings, we present approximation algorithms using linear program rounding techniques. These results allow us to introduce the first constant-factor approximation for bicluster deletion, the task of removing a minimum number of edges to partition a bipartite graph into disjoint bi-cliques. In several experimental results, we highlight the flexibility of our framework and the diversity of results that can be obtained in different parameter settings. This includes clustering bipartite graphs across a range of parameters, detecting motif-rich clusters in an email network and a food web, and forming clusters of retail products in a product review hypergraph, that are highly correlated with known product categories.
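
As a small illustration of the bicluster deletion task defined in the abstract (delete as few edges as possible so that the bipartite graph splits into disjoint bi-cliques), the following Python sketch evaluates one candidate clustering. The function name `bicluster_deletion_cost`, the input layout, and the toy graph are assumptions made for this example and do not come from the paper. The sketch only scores a given clustering and does not implement the paper's LP-rounding approximation algorithm.

```python
from itertools import product


def bicluster_deletion_cost(left, right, edges, cluster_of):
    """Score a clustering for bicluster deletion (hypothetical helper).

    left, right : node lists for the two sides of the bipartite graph
    edges       : iterable of (left_node, right_node) pairs
    cluster_of  : dict mapping every node to a cluster label

    Returns the number of edges that must be deleted, or None if some
    cluster is not a bi-clique (a same-cluster left-right pair has no edge).
    """
    edge_set = set(edges)

    # Feasibility: inside a cluster, every left-right pair must be an edge.
    for u, v in product(left, right):
        if cluster_of[u] == cluster_of[v] and (u, v) not in edge_set:
            return None

    # Cost: edges whose endpoints land in different clusters are deleted.
    return sum(1 for u, v in edge_set if cluster_of[u] != cluster_of[v])


# Toy instance (made up for illustration).
left = ["a", "b", "c"]
right = ["x", "y"]
edges = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "y")]

# {a, b, x, y} form a bi-clique; c is left on its own, so edge (c, y) is cut.
split = {"a": 0, "b": 0, "x": 0, "y": 0, "c": 1}
# Putting everything in one cluster fails because (c, x) is not an edge.
merged = {node: 0 for node in left + right}

cost_split = bicluster_deletion_cost(left, right, edges, split)
cost_merged = bicluster_deletion_cost(left, right, edges, merged)
```

In this toy instance the first clustering deletes a single edge, while the second is rejected as infeasible. Finding the clustering of minimum cost is the NP-hard part that the paper's approximation algorithm addresses.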

Related papers

Divide-Then-Rule: A Cluster-Driven Hierarchical Interpolator for Attribute-Missing Graphs [51.13363550716544]
Deep graph clustering is an unsupervised task aimed at partitioning nodes with incomplete attributes into distinct clusters. Existing imputation methods for attribute-missing graphs often fail to account for the varying amounts of information available across node neighborhoods. We propose Divide-Then-Rule Graph Completion (DTRGC) to address this issue.
arXiv Detail & Related papers (2025-07-12T03:33:19Z)
Provably Extending PageRank-based Local Clustering Algorithm to Weighted Directed Graphs with Self-Loops and to Hypergraphs [40.215737469808026]
This work focuses on graph local clustering, which has broad applications beyond graphs because of the internal connectivities within various modalities. We extend the non-approximating Andersen-Chung-Lang ("ACL") algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of graphs. We theoretically prove that, under two mild conditions, both algorithms can identify a quadratically optimal local cluster in terms of conductance with at least 1/2 probability.
arXiv Detail & Related papers (2024-12-04T03:56:14Z)
Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance. We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features. In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
HeNCler: Node Clustering in Heterophilous Graphs through Learned Asymmetric Similarity [55.27586970082595]
HeNCler is a novel approach for Heterophilous Node Clustering. We show that HeNCler significantly enhances performance in node clustering tasks within heterophilous graph contexts.
arXiv Detail & Related papers (2024-05-27T11:04:05Z)
Cluster-based Graph Collaborative Filtering [55.929052969825825]
Graph Convolution Networks (GCNs) have succeeded in learning user and item representations for recommendation systems. Most existing GCN-based methods overlook the multiple interests of users while performing high-order graph convolution. We propose a novel GCN-based recommendation model, termed Cluster-based Graph Collaborative Filtering (ClusterGCF).
arXiv Detail & Related papers (2024-04-16T07:05:16Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels. We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
One-step Bipartite Graph Cut: A Normalized Formulation and Its Application to Scalable Subspace Clustering [56.81492360414741]
We show how to enforce a one-step normalized cut for bipartite graphs, especially with linear-time complexity. In this paper, we first characterize a novel one-step bipartite graph cut criterion with normalized constraints, and theoretically prove its equivalence to a trace problem. We extend this cut criterion to a scalable subspace clustering approach, where adaptive anchor learning, bipartite graph learning, and one-step normalized bipartite graph partitioning are simultaneously modeled.
arXiv Detail & Related papers (2023-05-12T11:27:20Z)
ClusterFuG: Clustering Fully connected Graphs by Multicut [20.254912065749956]
In dense multicut, the clustering objective is given in a factorized form as inner products of node feature vectors. We show how to rewrite classical greedy algorithms for multicut in our dense setting and how to modify them for greater efficiency and solution quality.
arXiv Detail & Related papers (2023-01-28T11:10:50Z)
Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold [3.6052935394000234]
The block model is a canonical random graph model for clustering and community detection on network-structured data. No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold. We prove that, with an arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z)
Nonbacktracking spectral clustering of nonuniform hypergraphs [2.408714894793063]
We study spectral clustering for nonuniform hypergraphs based on the hypergraph nonbacktracking operator. We propose an alternating algorithm for inference in a hypergraph blockmodel via linearized belief-propagation.
arXiv Detail & Related papers (2022-04-27T01:14:06Z)
T-LoHo: A Bayesian Regularization Model for Structured Sparsity and Smoothness on Graphs [0.0]
In graph-structured data, structured sparsity and smoothness tend to cluster together. We propose a new prior for high dimensional parameters with graphical relations. We use it to detect structured sparsity and smoothness simultaneously.
arXiv Detail & Related papers (2021-07-06T10:10:03Z)
Finding Geometric Models by Clustering in the Consensus Space [61.65661010039768]
We propose a new algorithm for finding an unknown number of geometric models, e.g., homographies. We present a number of applications where the use of multiple geometric models improves accuracy. These include pose estimation from multiple generalized homographies and trajectory estimation of fast-moving objects.
arXiv Detail & Related papers (2021-03-25T14:35:07Z)
Generative hypergraph clustering: from blockmodels to modularity [26.99290024958576]
We propose an expressive generative model of clustered hypergraphs with heterogeneous node degrees and edge sizes. We show that hypergraph Louvain is highly scalable, including as an example an experiment on a synthetic hypergraph of one million nodes. We use our model to analyze different patterns of higher-order structure in school contact networks, U.S. congressional bill cosponsorship, U.S. congressional committees, product categories in co-purchasing behavior, and hotel locations.
arXiv Detail & Related papers (2021-01-24T00:25:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.