Query-augmented Active Metric Learning
- URL: http://arxiv.org/abs/2111.04871v1
- Date: Mon, 8 Nov 2021 23:32:13 GMT
- Title: Query-augmented Active Metric Learning
- Authors: Yujia Deng, Yubai Yuan, Haoda Fu, Annie Qu
- Abstract summary: We propose an active metric learning method for clustering with pairwise constraints.
We augment the queried constraints by generating more pairwise labels to provide additional information in learning a metric.
We increase the robustness of metric learning by updating the learned metric sequentially and penalizing the irrelevant features adaptively.
- Score: 3.871148938060281
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper we propose an active metric learning method for clustering with
pairwise constraints. The proposed method actively queries the label of
informative instance pairs, while estimating underlying metrics by
incorporating unlabeled instance pairs, which leads to a more accurate and
efficient clustering process. In particular, we augment the queried constraints
by generating more pairwise labels to provide additional information in
learning a metric to enhance clustering performance. Furthermore, we increase
the robustness of metric learning by updating the learned metric sequentially
and penalizing the irrelevant features adaptively. In addition, we propose a
novel active query strategy that evaluates the information gain of instance
pairs more accurately by incorporating the neighborhood structure, which
improves clustering efficiency without extra labeling cost. In theory, we
provide a tighter error bound of the proposed metric learning method utilizing
augmented queries compared with methods using existing constraints only.
Furthermore, we also investigate the improvement using the active query
strategy instead of random selection. Numerical studies on simulation settings
and real datasets indicate that the proposed method is especially advantageous
when the signal-to-noise ratio between significant features and irrelevant
features is low.
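The core idea in the abstract, learning a metric from must-link/cannot-link pairs while penalizing irrelevant features, can be illustrated with a small sketch. This is not the authors' algorithm. It assumes a diagonal metric, a logistic pairwise loss, an L1 penalty solved by proximal gradient descent, and a plain uncertainty heuristic for choosing the next query. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def pair_sq_diffs(X, pairs):
    """Per-feature squared differences for each (i, j) pair; shape (m, p)."""
    i, j = pairs[:, 0], pairs[:, 1]
    return (X[i] - X[j]) ** 2

def fit_diagonal_metric(X, pairs, labels, lam=0.05, lr=0.1, steps=500, margin=1.0):
    """Learn nonnegative feature weights w from pairwise constraints.

    labels: +1 for must-link, -1 for cannot-link (assumed encoding).
    Objective: mean log(1 + exp(label * (d_w^2 - margin))) + lam * sum(w),
    where d_w^2 = sum_k w_k * (x_ik - x_jk)^2.
    """
    D = pair_sq_diffs(X, pairs)
    w = np.ones(X.shape[1])
    for _ in range(steps):
        # Clip the logit so np.exp cannot overflow.
        z = np.clip(labels * (D @ w - margin), -30.0, 30.0)
        sig = 1.0 / (1.0 + np.exp(-z))  # derivative of log(1 + exp(z))
        grad = (D * (labels * sig)[:, None]).mean(axis=0)
        # Proximal step for the L1 penalty combined with the constraint w >= 0.
        w = np.maximum(0.0, w - lr * (grad + lam))
    return w

def most_uncertain_pair(X, candidate_pairs, w, margin=1.0):
    """Return the candidate pair whose learned distance is closest to the margin."""
    d2 = pair_sq_diffs(X, candidate_pairs) @ w
    return candidate_pairs[np.argmin(np.abs(d2 - margin))]

# Toy data (hypothetical): feature 0 separates two clusters, features 1-4 are noise.
rng = np.random.default_rng(0)
n = 60
cluster = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 5))
X[:, 0] = np.where(cluster == 1, 2.0, -2.0) + 0.3 * rng.normal(size=n)

# Simulated queried constraints: random pairs labeled from the true clusters.
pairs = rng.integers(0, n, size=(200, 2))
labels = np.where(cluster[pairs[:, 0]] == cluster[pairs[:, 1]], 1, -1)

w = fit_diagonal_metric(X, pairs, labels)
candidates = rng.integers(0, n, size=(50, 2))
query = most_uncertain_pair(X, candidates, w)
print("weights:", np.round(w, 3))
print("next pair to query:", query)
```

In this toy setting the penalty is expected to shrink the weights of the noise features toward zero while keeping a positive weight on the informative feature. The paper's query strategy additionally uses neighborhood structure and augmented pairwise labels, which this sketch does not attempt to reproduce.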
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- Learning to Rank for Active Learning via Multi-Task Bilevel Optimization [29.207101107965563]
We propose a novel approach for active learning, which aims to select batches of unlabeled instances through a learned surrogate model for data acquisition.
A key challenge in this approach is developing an acquisition function that generalizes well, as the history of data, which forms part of the utility function's input, grows over time.
arXiv Detail & Related papers (2023-10-25T22:50:09Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower upper bound on time complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Dynamic Clustering and Cluster Contrastive Learning for Unsupervised Person Re-identification [29.167783500369442]
Unsupervised Re-ID methods aim at learning robust and discriminative features from unlabeled data.
We propose a dynamic clustering and cluster contrastive learning (DCCC) method.
Experiments on several widely used public datasets validate the effectiveness of our proposed DCCC.
arXiv Detail & Related papers (2023-03-13T01:56:53Z)
- Active Learning for Regression with Aggregated Outputs [28.40183946090337]
We propose an active learning method that sequentially selects sets to be labeled to improve the predictive performance with fewer labeled sets.
In experiments on various datasets, we demonstrate that the proposed method achieves better predictive performance with fewer labeled sets than existing methods.
arXiv Detail & Related papers (2022-10-04T02:45:14Z)
- Mitigating shortage of labeled data using clustering-based active learning with diversity exploration [3.312798619476657]
We propose a clustering-based active learning framework, namely Active Learning using a Clustering-based Sampling.
A bi-cluster boundary-based sample query procedure is introduced to improve the learning performance for classifying highly overlapped classes.
arXiv Detail & Related papers (2022-07-06T20:53:28Z)
- Unsupervised feature selection via self-paced learning and low-redundant regularization [6.083524716031565]
An unsupervised feature selection method is proposed by integrating the framework of self-paced learning and subspace learning.
The convergence of the method is proved theoretically and experimentally.
The experimental results show that the proposed method can improve the performance of clustering methods and outperform other compared algorithms.
arXiv Detail & Related papers (2021-12-14T08:28:19Z)
- Adaptive Hierarchical Similarity Metric Learning with Noisy Labels [138.41576366096137]
We propose an Adaptive Hierarchical Similarity Metric Learning method.
It considers two types of noise-insensitive information, i.e., class-wise divergence and sample-wise consistency.
Our method achieves state-of-the-art performance compared with current deep metric learning approaches.
arXiv Detail & Related papers (2021-10-29T02:12:18Z)
- Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)
- Active Learning for Coreference Resolution using Discrete Annotation [76.36423696634584]
We improve upon pairwise annotation for active learning in coreference resolution.
We ask annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
In experiments with existing benchmark coreference datasets, we show that the signal from this additional question leads to significant performance gains per human-annotation hour.
arXiv Detail & Related papers (2020-04-28T17:17:11Z)
- New advances in enumerative biclustering algorithms with online partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets.
The improved algorithm is called RIn-Close_CVC3, keeps those attractive properties of RIn-Close_CVC, and is characterized by: a drastic reduction in memory usage; a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.