Local Correlation Clustering with Asymmetric Classification Errors
- URL: http://arxiv.org/abs/2108.05697v1
- Date: Wed, 11 Aug 2021 12:31:48 GMT
- Title: Local Correlation Clustering with Asymmetric Classification Errors
- Authors: Jafar Jafarov, Sanchit Kalhan, Konstantin Makarychev and Yury
Makarychev
- Abstract summary: In the Correlation Clustering problem, we are given a complete weighted graph $G$ with its edges labeled as "similar" and "dissimilar"
For a clustering $mathcalC$ of graph $G$, a similar edge is in disagreement with $mathcalC$, if its endpoints belong to distinct clusters; and a dissimilar edge is in disagreement with $mathcalC$ if its endpoints belong to the same cluster.
We produce a clustering that minimizes the $ell_p$ norm of the disagreements vector for $pgeq 1$
- Score: 12.277755088736864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the Correlation Clustering problem, we are given a complete weighted graph
$G$ with its edges labeled as "similar" and "dissimilar" by a noisy binary
classifier. For a clustering $\mathcal{C}$ of graph $G$, a similar edge is in
disagreement with $\mathcal{C}$, if its endpoints belong to distinct clusters;
and a dissimilar edge is in disagreement with $\mathcal{C}$ if its endpoints
belong to the same cluster. The disagreements vector, $\text{dis}$, is a vector
indexed by the vertices of $G$ such that the $v$-th coordinate $\text{dis}_v$
equals the weight of all disagreeing edges incident on $v$. The goal is to
produce a clustering that minimizes the $\ell_p$ norm of the disagreements
vector for $p\geq 1$. We study the $\ell_p$ objective in Correlation Clustering
under the following assumption: Every similar edge has weight in the range of
$[\alpha\mathbf{w},\mathbf{w}]$ and every dissimilar edge has weight at least
$\alpha\mathbf{w}$ (where $\alpha \leq 1$ and $\mathbf{w}>0$ is a scaling
parameter). We give an
$O\left((\frac{1}{\alpha})^{\frac{1}{2}-\frac{1}{2p}}\cdot
\log\frac{1}{\alpha}\right)$ approximation algorithm for this problem.
Furthermore, we show an almost matching convex programming integrality gap.
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $Omega(frac1kcdot n2log|mathbbF|)$.
As an application, we obtain an $Omega(frac1kcdot n2log|mathbbF|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - On Generalization Bounds for Projective Clustering [3.4490841399255823]
Given a set of points, clustering consists of finding a partition of a point set into $k$ clusters such that the center to which a point is assigned is as close as possible.
For center-based objectives, we show a convergence rate of $tildeOleft(sqrtfrackj2nright)$.
For subspace clustering with $j$-dimensional subspaces, we show a convergence rate of $tildeOleft(sqrtfrackj2nright)$.
arXiv Detail & Related papers (2023-10-13T14:15:54Z) - Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $Gr(ngamma, n-alpha)$ subhypergraph in a $Gr(n, n-beta) hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - Fair Representation Clustering with Several Protected Classes [13.53362222844008]
We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups.
We present an $O(log k)$-approximation algorithm that runs in time $nO(ell)$.
arXiv Detail & Related papers (2022-02-03T03:45:45Z) - Approximating Fair Clustering with Cascaded Norm Objectives [10.69111036810888]
We find a clustering which minimizes the $ell_q$-norm of the vector over $W$ of the $ell_p$-norms of the weighted distances of points in $P$ from the centers.
This generalizes various clustering problems, including Socially Fair $k$-Median and $k$-Means.
arXiv Detail & Related papers (2021-11-08T20:18:10Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc
Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Exact Matching of Random Graphs with Constant Correlation [2.578242050187029]
This paper deals with the problem of graph matching or network alignment for ErdHos--R'enyi graphs.
It can be viewed as a noisy average-case version of the graph isomorphism problem.
arXiv Detail & Related papers (2021-10-11T05:07:50Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $mathbbRd$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Correlation Clustering with Asymmetric Classification Errors [12.277755088736864]
We study the correlation clustering problem under the following assumption: Every "similar" edge $e$ has weight $mathbfw_ein[alpha mathbfw]$ and every "dissimilar" edge $e$ has weight $mathbfw_egeq alpha mathbfw.
arXiv Detail & Related papers (2021-08-11T12:30:52Z) - Linear Bandits on Uniformly Convex Sets [88.3673525964507]
Linear bandit algorithms yield $tildemathcalO(nsqrtT)$ pseudo-regret bounds on compact convex action sets.
Two types of structural assumptions lead to better pseudo-regret bounds.
arXiv Detail & Related papers (2021-03-10T07:33:03Z) - The Average-Case Time Complexity of Certifying the Restricted Isometry
Property [66.65353643599899]
In compressed sensing, the restricted isometry property (RIP) on $M times N$ sensing matrices guarantees efficient reconstruction of sparse vectors.
We investigate the exact average-case time complexity of certifying the RIP property for $Mtimes N$ matrices with i.i.d. $mathcalN(0,1/M)$ entries.
arXiv Detail & Related papers (2020-05-22T16:55:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.