Exploring dual information in distance metric learning for clustering
- URL: http://arxiv.org/abs/2105.12703v1
- Date: Wed, 26 May 2021 17:33:23 GMT
- Title: Exploring dual information in distance metric learning for clustering
- Authors: Rodrigo Randel and Daniel Aloise and Alain Hertz
- Abstract summary: We propose to exploit the dual information associated with the pairwise constraints of the semi-supervised clustering problem.
Experiments clearly show that distance metric learning algorithms benefit from integrating this dual information.
- Score: 1.452875650827562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Distance metric learning algorithms aim to appropriately measure similarities
and distances between data points. In the context of clustering, metric
learning is typically applied with the assistance of side-information provided
by experts, most commonly expressed in the form of cannot-link and must-link
constraints. In this setting, distance metric learning algorithms bring pairs
of data points involved in must-link constraints closer together, while pairs
of points involved in cannot-link constraints are pushed away from each other.
For these algorithms to be effective, it is important to use a distance metric
that matches the expert knowledge, beliefs, and expectations, and the
transformations made to satisfy the side-information should preserve the
geometrical properties of the dataset. It is also desirable to filter the
constraints provided by the experts, keeping only the most useful and rejecting
those that can harm the clustering process. To address these issues, we propose
to exploit the dual information associated with the pairwise constraints of the
semi-supervised clustering problem. Experiments clearly show that distance
metric learning algorithms benefit from integrating this dual information.
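As a concrete illustration of the pull/push behavior described in the abstract (a minimal sketch only, not the authors' dual-information method), the following learns a diagonal weighted Euclidean metric from must-link and cannot-link pairs: must-link pairs shrink the weights of features on which they differ, while cannot-link pairs that fall inside a margin grow them. All function names and parameters here are illustrative.

```python
import numpy as np

def learn_diagonal_metric(X, must_link, cannot_link, margin=1.0,
                          lr=0.05, epochs=200):
    """Learn per-feature weights w >= 0 for the distance
    d(x, y) = sqrt(sum_k w_k * (x_k - y_k)^2), so that must-link pairs
    become close and cannot-link pairs are pushed beyond `margin`."""
    w = np.ones(X.shape[1])
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for i, j in must_link:        # pull together: penalize squared distance
            grad += (X[i] - X[j]) ** 2
        for i, j in cannot_link:      # push apart: hinge active inside the margin
            sq = (X[i] - X[j]) ** 2
            if np.dot(w, sq) < margin:
                grad -= sq
        w = np.maximum(w - lr * grad, 0.0)   # project onto w >= 0
    return w

def weighted_dist(w, x, y):
    return np.sqrt(np.dot(w, (x - y) ** 2))

# Toy usage: the must-link pair (0, 1) differs only in feature 1,
# so its weight is driven to zero and the pair collapses together,
# while the cannot-link pair (0, 2) stays apart along feature 0.
X = np.array([[0., 0.], [0., 1.], [1., 0.]])
w = learn_diagonal_metric(X, must_link=[(0, 1)], cannot_link=[(0, 2)])
```

Real implementations (e.g. MMC-style methods) learn a full Mahalanobis matrix rather than a diagonal one, but the pull/push structure of the objective is the same.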
Related papers
- Distribution-Based Trajectory Clustering [14.781854651899705]
Trajectory clustering enables the discovery of common patterns in trajectory data.
The distance measures employed face two key challenges: high computational cost and low fidelity.
We propose to use a recent Isolation Distributional Kernel (IDK) as the main tool to meet these challenges.
arXiv Detail & Related papers (2023-10-08T11:28:34Z)
- On Leave-One-Out Conditional Mutual Information For Generalization [122.2734338600665]
We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).
Contrary to other CMI bounds, our loo-CMI bounds can be computed easily and can be interpreted in connection to other notions such as classical leave-one-out cross-validation.
We empirically validate the quality of the bound by evaluating its predicted generalization gap in scenarios for deep learning.
arXiv Detail & Related papers (2022-07-01T17:58:29Z)
- Constrained Clustering and Multiple Kernel Learning without Pairwise Constraint Relaxation [15.232192645789485]
We introduce a new constrained clustering algorithm that jointly clusters data and learns a kernel in accordance with the available pairwise constraints.
We show that the proposed method outperforms existing approaches on a large number of diverse publicly available datasets.
arXiv Detail & Related papers (2022-03-23T17:07:53Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of semi-supervised learning (SSL) and domain adaptation (DA).
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Ranking Distance Calibration for Cross-Domain Few-Shot Learning [91.22458739205766]
Recent progress in few-shot learning promotes a more realistic cross-domain setting.
Due to the domain gap and disjoint label spaces between source and target datasets, their shared knowledge is extremely limited.
We employ a re-ranking process for calibrating a target distance matrix by discovering the reciprocal k-nearest neighbours within the task.
arXiv Detail & Related papers (2021-12-01T03:36:58Z)
- Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories.
We propose the usage of a continuous version of DTW as distance measure, which we call continuous dynamic time warping (CDTW).
We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)
- Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
We show that the enlarged margin is beneficial to the generalization ability by using the theoretical technique of algorithmic robustness.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
- Learning Flat Latent Manifolds with VAEs [16.725880610265378]
We propose an extension to the framework of variational auto-encoders, where the Euclidean metric is a proxy for the similarity between data points.
We replace the compact prior typically used in variational auto-encoders with a recently presented, more expressive hierarchical one.
We evaluate our method on a range of data-sets, including a video-tracking benchmark.
arXiv Detail & Related papers (2020-02-12T09:54:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.