Fugu-MT 論文翻訳(概要): Exact Recovery of Mangled Clusters with Same-Cluster Queries

論文の概要: Exact Recovery of Mangled Clusters with Same-Cluster Queries

arxiv url: http://arxiv.org/abs/2006.04675v3
Date: Fri, 30 Oct 2020 13:07:19 GMT
ステータス: 翻訳完了
システム内更新日: 2022-11-24 00:51:52.559919
Title: Exact Recovery of Mangled Clusters with Same-Cluster Queries
Title（参考訳）: 同一クラスタクエリを用いたマングルクラスタの完全回復
Authors: Marco Bressan, Nicol\`o Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice
Abstract要約: 半教師付きアクティブクラスタリングフレームワークにおけるクラスタリカバリ問題について検討する。我々は、$n$ポイントを$k$クラスタに分割するアルゴリズムを設計し、$O(k3 ln k ln n)$oracleクエリと$tildeO(kn + k3)$でクラスタを非分類エラーで復元する。
参考スコア（独自算出の注目度）: 20.03712152278538
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the cluster recovery problem in the semi-supervised active clustering framework. Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible. To this end, we relax the spherical $k$-means cluster assumption of Ashtiani et al.\ to allow for arbitrary ellipsoidal clusters with margin. This removes the assumption that the clustering is center-based (i.e., defined through an optimization problem), and includes all those cases where spherical clusters are individually transformed by any combination of rotations, axis scalings, and point deletions. We show that, even in this much more general setting, it is still possible to recover the latent clustering exactly using a number of queries that scales only logarithmically with the number of input points. More precisely, we design an algorithm that, given $n$ points to be partitioned into $k$ clusters, uses $O(k^3 \ln k \ln n)$ oracle queries and $\tilde{O}(kn + k^3)$ time to recover the clustering with zero misclassification error. The $O(\cdot)$ notation hides an exponential dependence on the dimensionality of the clusters, which we show to be necessary thus characterizing the query complexity of the problem. Our algorithm is simple, easy to implement, and can also learn the clusters using low-stretch separators, a class of ellipsoids with additional theoretical guarantees. Experiments on large synthetic datasets confirm that we can reconstruct clusterings exactly and efficiently.
Abstract（参考訳）: 半教師付きアクティブクラスタリングフレームワークにおけるクラスタリカバリ問題について検討する。入力ポイントの有限セットと、2つのポイントが同じクラスタにあるかどうかを明記するオラクルが与えられた場合、我々のゴールは、できるだけ少ないクエリを使って、すべてのクラスタを正確に回復することである。この目的のために、ashtiani et al の球形 $k$-means クラスタの仮定を緩和する。任意の楕円体クラスタをマージンで許容する。これはクラスタリングが中心に基づく(すなわち最適化問題によって定義される)仮定を排除し、回転、軸のスケーリング、点削除の組み合わせによって球状クラスタが個別に変換されるすべてのケースを含む。より一般的な設定であっても、入力ポイントの数に対数的にしかスケールしない多くのクエリを使用して、潜在クラスタリングを正確に復元することが可能であることを示す。より正確には、$n$のポイントを$k$クラスタに分割するアルゴリズムを設計し、$o(k^3 \ln k \ln n)$ oracleクエリと$\tilde{o}(kn + k^3)$時間を使用してクラスタリングを誤分類エラーなく回復する。 O(\cdot)$表記法はクラスタの次元性への指数関数的依存を隠蔽し、この問題の問合せ複雑性を特徴付ける必要があることを示す。我々のアルゴリズムは単純で実装が容易であり、さらに理論的な保証が加えられた楕円形のクラスであるlow-stretch separatorsを使ってクラスタを学習することもできる。大規模合成データセットの実験により、クラスタリングを正確にかつ効率的に再構築できることが確認された。

関連論文リスト

Guaranteed Recovery of Unambiguous Clusters [7.011239860967789]
クラスタリングは、しばしば難しい問題である。なぜなら、それは、"正しい"クラスタリングがどうあるべきかに固有の曖昧さのためである。本稿では,不明瞭な場合のクラスタリングを復元するアルゴリズムについて,情報理論による特徴付けと設計を提案する。
論文参考訳（メタデータ） (2025-01-22T18:51:25Z)
Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
本稿では,Reinforcement Graph Clusteringと呼ばれる新しいディープグラフクラスタリング手法を提案する。提案手法では,クラスタ数決定と教師なし表現学習を統一的なフレームワークに統合する。フィードバック動作を行うために、クラスタリング指向の報酬関数を提案し、同一クラスタの凝集を高め、異なるクラスタを分離する。
論文参考訳（メタデータ） (2023-08-13T18:12:28Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
観測されたラベルを用いてクラスタを復元する効率的なアルゴリズムを考案する。本稿では,期待値と高い確率でこれらの下位境界との性能を一致させる最初のアルゴリズムであるIACを提案する。
論文参考訳（メタデータ） (2023-06-18T08:46:06Z)
Differentially-Private Clustering of Easy Instances [67.04951703461657]
異なるプライベートクラスタリングでは、個々のデータポイントに関する情報を公開せずに、$k$のクラスタセンターを特定することが目標だ。我々は、データが"簡単"である場合にユーティリティを提供する実装可能な差分プライベートクラスタリングアルゴリズムを提供する。我々は、非プライベートクラスタリングアルゴリズムを簡単なインスタンスに適用し、結果をプライベートに組み合わせることのできるフレームワークを提案する。
論文参考訳（メタデータ） (2021-12-29T08:13:56Z)
How to Find a Good Explanation for Clustering? [7.951746797489421]
Moshkovitz氏、Dasgupta氏、Rashtchian氏、Frost氏(ICML 2020)は、説明可能な$k$-meansと$k$-medianクラスタリングのエレガントなモデルを提案した。説明可能なクラスタリングに関する2つの自然なアルゴリズム的問題について検討する。厳密なアルゴリズム分析では、入力サイズ、データの寸法、外乱数、クラスタ数、近似比といったパラメータが、説明可能なクラスタリングの計算複雑性に与える影響について光を当てています。
論文参考訳（メタデータ） (2021-12-13T11:48:38Z)
Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
我々は,自己教師型コントラストグラフクラスタリング(SCAGC)という,新たな属性グラフクラスタリングネットワークを提案する。 SCAGCでは,不正確なクラスタリングラベルを活用することで,ノード表現学習のための自己教師付きコントラスト損失を設計する。 OOSノードでは、SCAGCはクラスタリングラベルを直接計算できる。
論文参考訳（メタデータ） (2021-10-15T03:25:28Z)
On Margin-Based Cluster Recovery with Oracle Queries [22.672233769934845]
1組の$n$ポイントのオラクルと"これらの2つのポイントは同じクラスタ内にあるのか? 本稿では,任意の凸クラスタを正確な時間で復元するアルゴリズムを提案する。一般の擬計量空間では、クラスターが凸でない場合や、形状の概念が存在しない場合、$O(log n)$クエリ境界を達成し、確実に最適に近いアルゴリズムを与える。
論文参考訳（メタデータ） (2021-06-09T08:48:23Z)
Fuzzy Clustering with Similarity Queries [56.96625809888241]
ファジィ(fuzzy, soft objective)は、よく知られた$k$-means問題の一般化である。クエリを少なくすることで、問題の解決が容易になる。
論文参考訳（メタデータ） (2021-06-04T02:32:26Z)
K-expectiles clustering [0.0]
本稿では,期待値に基づく分割クラスタリングアルゴリズムを提案する。固定$tau$クラスタリングと適応$tau$クラスタリングの2つのスキームを提案します。
論文参考訳（メタデータ） (2021-03-16T21:14:56Z)
Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries [22.672233769934845]
オラクルクエリを用いたクラスタ依存の正確な問題について検討する。任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意の任意
論文参考訳（メタデータ） (2021-01-31T18:00:29Z)
Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
既存のスケーラブルな階層的クラスタリング手法は、スピードの質を犠牲にする。我々は、品質を犠牲にせず、数十億のデータポイントまでスケールする、スケーラブルで集約的な階層的クラスタリング法を提案する。
論文参考訳（メタデータ） (2020-10-22T15:58:35Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。