Information-Theoretic Active Correlation Clustering
- URL: http://arxiv.org/abs/2402.03587v2
- Date: Wed, 22 May 2024 10:39:38 GMT
- Title: Information-Theoretic Active Correlation Clustering
- Authors: Linus Aronsson, Morteza Haghir Chehreghani
- Abstract summary: We study correlation clustering where the pairwise similarities are not known in advance.
We employ active learning to query pairwise similarities in a cost-efficient way.
- Score: 3.2688425993442696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study correlation clustering where the pairwise similarities are not known in advance. For this purpose, we employ active learning to query pairwise similarities in a cost-efficient way. We propose a number of effective information-theoretic acquisition functions based on entropy and information gain. We extensively investigate the performance of our methods in different settings and demonstrate their superior performance compared to the alternatives.
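As a rough illustration of what an entropy-based acquisition function can look like in this setting, the sketch below ranks unqueried pairs by the binary entropy of a current belief that the pair is similar, and queries the most uncertain ones first. It is a minimal sketch under stated assumptions and not the acquisition functions proposed in the paper: the names `binary_entropy`, `select_queries`, and `similar_prob` are hypothetical, and how the similarity probabilities are estimated is left open.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_queries(similar_prob, already_queried, batch_size):
    """Return the batch_size unqueried pairs with the highest entropy.

    similar_prob: dict mapping a pair (i, j) to the current estimated
        probability that objects i and j belong to the same cluster
        (hypothetical input; its estimation is an assumption of this sketch).
    already_queried: set of pairs whose similarity has already been queried.
    """
    candidates = [pair for pair in similar_prob if pair not in already_queried]
    # Most uncertain pairs (probability closest to 0.5) come first.
    candidates.sort(key=lambda pair: binary_entropy(similar_prob[pair]), reverse=True)
    return candidates[:batch_size]


# Toy beliefs over four pairs; the pair (1, 3) has already been queried.
probs = {(0, 1): 0.95, (0, 2): 0.50, (1, 2): 0.60, (1, 3): 0.05}
picked = select_queries(probs, already_queried={(1, 3)}, batch_size=2)
print(picked)  # [(0, 2), (1, 2)]
```

In the toy example the pair with probability 0.50 is chosen first because its entropy is maximal (1 bit), followed by the pair with probability 0.60; the confident pair (0, 1) is left unqueried.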
Related papers
- Relation-aware Ensemble Learning for Knowledge Graph Embedding [68.94900786314666]
We propose to learn an ensemble by leveraging existing methods in a relation-aware manner.
Exploring these semantics using a relation-aware ensemble leads to a much larger search space than general ensemble methods.
We propose a divide-search-combine algorithm RelEns-DSC that searches the relation-wise ensemble weights independently.
arXiv Detail & Related papers (2023-10-13T07:40:12Z) - Correlation Clustering with Active Learning of Pairwise Similarities [3.86170450233149]
Correlation clustering is a well-known unsupervised learning setting that deals with positive and negative pairwise similarities.
In this paper, we study the case where the pairwise similarities are not given in advance and must be queried in a cost-efficient way.
We develop a generic active learning framework for this task that benefits from several advantages.
arXiv Detail & Related papers (2023-02-20T20:39:07Z) - Active Learning of Ordinal Embeddings: A User Study on Football Data [4.856635699699126]
Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function.
This work uses deep metric learning to learn these user-defined similarity functions from few annotations for a large football trajectory dataset.
arXiv Detail & Related papers (2022-07-26T07:55:23Z) - Query-augmented Active Metric Learning [3.871148938060281]
We propose an active metric learning method for clustering with pairwise constraints.
We augment the queried constraints by generating more pairwise labels to provide additional information in learning a metric.
We increase the robustness of metric learning by updating the learned metric sequentially and penalizing the irrelevant features adaptively.
arXiv Detail & Related papers (2021-11-08T23:32:13Z) - Comparing Cross Correlation-Based Similarities [1.0152838128195467]
Multiset-based correlations based on the real-valued multiset Jaccard and coincidence indices are compared.
Results have immediate implications not only in pattern recognition and deep learning, but also in scientific modeling in general.
arXiv Detail & Related papers (2021-11-08T08:50:13Z) - ACP++: Action Co-occurrence Priors for Human-Object Interaction
Detection [102.9428507180728]
A common problem in the task of human-object interaction (HOI) detection is that numerous HOI classes have only a small number of labeled examples.
We observe that there exist natural correlations and anti-correlations among human-object interactions.
We present techniques to learn these priors and leverage them for more effective training, especially on rare classes.
arXiv Detail & Related papers (2021-09-09T06:02:50Z) - PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense
Passage Retrieval [87.68667887072324]
We propose a novel approach that leverages query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval.
To implement our approach, we make three major technical contributions by introducing formal formulations of the two kinds of similarity relations.
Our approach significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions datasets.
arXiv Detail & Related papers (2021-08-13T02:07:43Z) - ReSSL: Relational Self-Supervised Learning with Weak Augmentation [68.47096022526927]
Self-supervised learning has achieved great success in learning visual representations without data annotations.
We introduce a novel relational SSL paradigm that learns representations by modeling the relationship between different instances.
Our proposed ReSSL significantly outperforms the previous state-of-the-art algorithms in terms of both performance and training efficiency.
arXiv Detail & Related papers (2021-07-20T06:53:07Z) - Near-Optimal Comparison Based Clustering [7.930242839366938]
We show that our method can recover a planted clustering using a near-optimal number of comparisons.
We empirically validate our theoretical findings and demonstrate the good behaviour of our method on real data.
arXiv Detail & Related papers (2020-10-08T12:03:13Z) - Memory-Augmented Relation Network for Few-Shot Learning [114.47866281436829]
In this work, we investigate a new metric-learning method, Memory-Augmented Relation Network (MRN).
In MRN, we choose the samples that are visually similar from the working context, and perform weighted information propagation to attentively aggregate helpful information from chosen ones to enhance its representation.
We empirically demonstrate that MRN yields significant improvement over its ancestor and achieves competitive or even better performance when compared with other few-shot learning approaches.
arXiv Detail & Related papers (2020-05-09T10:09:13Z) - Active Learning for Coreference Resolution using Discrete Annotation [76.36423696634584]
We improve upon pairwise annotation for active learning in coreference resolution.
We ask annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
In experiments with existing benchmark coreference datasets, we show that the signal from this additional question leads to significant performance gains per human-annotation hour.
arXiv Detail & Related papers (2020-04-28T17:17:11Z)