Linear Distance Metric Learning with Noisy Labels
- URL: http://arxiv.org/abs/2306.03173v3
- Date: Wed, 20 Dec 2023 19:37:34 GMT
- Title: Linear Distance Metric Learning with Noisy Labels
- Authors: Meysam Alishahi, Anna Little, and Jeff M. Phillips
- Abstract summary: We show that even if the data is noisy, the ground truth linear metric can be learned with any precision.
We present an effective way to truncate the learned model to a low-rank model that provably maintains accuracy both in the loss function and in the parameters.
- Score: 7.326930455001404
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In linear distance metric learning, we are given data in one Euclidean metric
space and the goal is to find an appropriate linear map to another Euclidean
metric space which respects certain distance conditions as much as possible. In
this paper, we formalize a simple and elegant method which reduces to a general
continuous convex loss optimization problem, and for different noise models we
derive the corresponding loss functions. We show that even if the data is
noisy, the ground truth linear metric can be learned with any precision
provided access to enough samples, and we provide a corresponding sample
complexity bound. Moreover, we present an effective way to truncate the learned
model to a low-rank model that can provably maintain the accuracy in loss
function and in parameters -- the first such results of this type. Several
experimental observations on synthetic and real data sets support and inform
our theoretical results.
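As a rough illustration of this setting (an assumed instantiation for exposition, not the authors' exact loss or algorithm), the sketch below learns a positive semidefinite matrix M defining a Mahalanobis-style distance from noisily labeled similar/dissimilar pairs, using a convex logistic-type loss with projected gradient descent, and then truncates M to a low-rank model by dropping its smallest eigenvalues.
```python
# Minimal sketch (assumed instantiation, not the paper's exact formulation):
# learn a PSD matrix M so that the squared Mahalanobis distance
# d_M(x, y) = (x - y)^T M (x - y) is small for "similar" pairs (label +1) and
# large for "dissimilar" pairs (label -1), via a convex logistic-type loss and
# projected gradient descent, then truncate M to low rank.
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def fit_linear_metric(X1, X2, labels, tau=1.0, lr=0.05, n_iter=500):
    """X1, X2: (n, d) pair endpoints; labels: +1 = similar, -1 = dissimilar."""
    diffs = X1 - X2                                      # pair differences (n, d)
    n, d = diffs.shape
    M = np.eye(d)                                        # start from the identity metric
    for _ in range(n_iter):
        dists = np.einsum("ni,ij,nj->n", diffs, M, diffs)  # d_M for each pair
        margins = labels * (tau - dists)                   # > 0 means correct side of tau
        # gradient of mean log(1 + exp(-margin)); d_M is linear in M, so this is convex
        weights = labels * expit(-margins)
        grad = (diffs * weights[:, None]).T @ diffs / n
        M -= lr * grad
        # project back onto the PSD cone by clipping negative eigenvalues
        evals, evecs = np.linalg.eigh(M)
        M = (evecs * np.clip(evals, 0.0, None)) @ evecs.T
    return M

def truncate_to_rank(M, k):
    """Keep only the k largest eigen-directions of the learned metric."""
    evals, evecs = np.linalg.eigh(M)
    top = np.argsort(evals)[::-1][:k]
    return (evecs[:, top] * evals[top]) @ evecs[:, top].T

# toy usage: pairs in 5-d, "similar" means close in the first two coordinates
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(200, 5)), rng.normal(size=(200, 5))
labels = np.where(np.sum((X1[:, :2] - X2[:, :2]) ** 2, axis=1) < 2.0, 1, -1)
M = fit_linear_metric(X1, X2, labels)
M_low = truncate_to_rank(M, 2)      # low-rank model keeping the dominant directions
```
Since the squared distance is linear in M, the logistic-type loss stays convex, and projecting onto the PSD cone after each step keeps M a valid metric; the final truncation step reflects the abstract's point that the learned model can be reduced to low rank while maintaining accuracy.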
Related papers
- Sample-Efficient Geometry Reconstruction from Euclidean Distances using Non-Convex Optimization [7.114174944371803]
The problem of finding a suitable point embedding given only Euclidean distance information for point pairs arises both as a core task and as a sub-problem in a variety of machine learning applications.
In this paper, we aim to solve this problem given a minimal number of samples.
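For orientation only: when a complete distance matrix is available, classical multidimensional scaling (MDS) already recovers such a point configuration up to rotation and translation; the cited paper instead works from a minimal number of sampled distances via non-convex optimization. A minimal MDS sketch, as a baseline for the problem statement:
```python
# Classical multidimensional scaling (MDS): recover a point configuration, up to
# rotation and translation, from a complete matrix of squared Euclidean distances.
# This is only a baseline for the problem the entry describes; the cited paper
# works from few sampled distances via non-convex optimization.
import numpy as np

def classical_mds(D_sq, dim):
    """D_sq: (n, n) squared distances; returns an (n, dim) point embedding."""
    n = D_sq.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    G = -0.5 * J @ D_sq @ J                      # Gram matrix of the centered points
    evals, evecs = np.linalg.eigh(G)
    top = np.argsort(evals)[::-1][:dim]          # keep the largest eigenvalues
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))

# sanity check: embed 3-d points and confirm pairwise distances are reproduced
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
D_sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Y = classical_mds(D_sq, dim=3)
D_sq_hat = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
assert np.allclose(D_sq, D_sq_hat)
```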
arXiv Detail & Related papers (2024-10-22T13:02:12Z) - Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that, with regularization towards a flat trajectory, the weights trained on synthetic data are robust against accumulated-error perturbations.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z) - One-Way Matching of Datasets with Low Rank Signals [4.582330307986793]
We show that linear assignment with projected data achieves fast rates of convergence and sometimes even minimax rate optimality for this task.
We illustrate practical use of the matching procedure on two single-cell data examples.
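One hedged reading of "linear assignment with projected data" (an illustrative sketch, not the paper's exact estimator or guarantees): project both datasets onto an estimated low-rank signal subspace to suppress noise, then match rows by solving a linear assignment problem on the projected pairwise distances.
```python
# Illustrative sketch (a reading of the summary, not the paper's exact procedure):
# match rows of two datasets that share a low-rank signal by projecting both onto
# the top singular subspace and solving a linear assignment problem on the
# projected pairwise distances.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_rows(X, Y, rank):
    """Return perm such that row i of X is matched to row perm[i] of Y."""
    _, _, Vt = np.linalg.svd(np.vstack([X, Y]), full_matrices=False)
    P = Vt[:rank].T                                    # (d, rank) signal subspace basis
    Xp, Yp = X @ P, Y @ P                              # projected (denoised) rows
    cost = np.linalg.norm(Xp[:, None, :] - Yp[None, :, :], axis=-1)
    row_ind, col_ind = linear_sum_assignment(cost)     # Hungarian-style matching
    return col_ind[np.argsort(row_ind)]

# toy usage: Y is a shuffled, independently-noised copy of a rank-2 signal in X
rng = np.random.default_rng(0)
signal = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 20))
X = signal + 0.1 * rng.normal(size=signal.shape)
true_perm = rng.permutation(100)
Y = signal[true_perm] + 0.1 * rng.normal(size=signal.shape)
perm = match_rows(X, Y, rank=2)
print("fraction of rows matched correctly:", np.mean(true_perm[perm] == np.arange(100)))
```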
arXiv Detail & Related papers (2022-04-29T03:12:23Z) - Adaptive neighborhood Metric learning [184.95321334661898]
We propose a novel distance metric learning algorithm, named adaptive neighborhood metric learning (ANML).
ANML can be used to learn both linear and deep embeddings.
The log-exp mean function proposed in our method gives a new perspective for reviewing deep metric learning methods.
arXiv Detail & Related papers (2022-01-20T17:26:37Z) - Calibrated Simplex Mapping Classification [0.0]
We propose a novel supervised multi-class/single-label classifier that maps training data onto a linearly separable latent space with a simplex-like geometry.
For its solution we can choose suitable distance metrics in feature space and regression models predicting latent space coordinates.
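One plausible reading of this construction (illustrative only; the paper's calibration of the resulting scores is omitted): place each class at a vertex of a regular simplex in the latent space, fit a regression model from the features to those latent coordinates, and classify new points by the nearest vertex.
```python
# One reading of the simplex-mapping idea (illustrative only; the calibration
# step is omitted): put each class at a vertex of a regular simplex in latent
# space, regress from features to those latent coordinates, and classify new
# points by the nearest vertex.
import numpy as np
from sklearn.linear_model import Ridge

def simplex_vertices(n_classes):
    """Rows are the vertices of a regular simplex centered at the origin."""
    return np.eye(n_classes) - 1.0 / n_classes

def fit_simplex_classifier(X, y, n_classes, alpha=1.0):
    V = simplex_vertices(n_classes)
    targets = V[y]                               # latent target of each sample's class
    reg = Ridge(alpha=alpha).fit(X, targets)     # regression into the latent space
    return reg, V

def predict_classes(reg, V, X):
    Z = reg.predict(X)                                            # predicted latent coords
    dists = np.linalg.norm(Z[:, None, :] - V[None, :, :], axis=-1)
    return np.argmin(dists, axis=1)                               # nearest vertex = class

# toy usage on three Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(50, 2)) for c in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 50)
reg, V = fit_simplex_classifier(X, y, n_classes=3)
print("training accuracy:", np.mean(predict_classes(reg, V, X) == y))
```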
arXiv Detail & Related papers (2021-03-04T10:18:22Z) - GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z) - Evaluating representations by the complexity of learning low-loss predictors [55.94170724668857]
We consider the problem of evaluating representations of data for use in solving a downstream task.
We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on a task of interest.
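A toy instantiation of that idea (the linear probe, subset schedule, and loss threshold below are assumptions, not the paper's exact complexity measures): fit a simple probe on growing amounts of labeled data on top of a frozen representation and report how much data it takes to reach a target held-out loss.
```python
# Toy instantiation of "evaluate a representation by the complexity of learning a
# low-loss predictor on it": train a linear probe on growing subsets of labeled
# data and report the smallest subset size whose held-out log-loss drops below a
# target epsilon. The probe, schedule, and threshold are illustrative choices,
# not the paper's exact measures.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def sample_complexity_score(Z_train, y_train, Z_val, y_val, eps=0.35,
                            sizes=(50, 100, 200, 400, 800)):
    """Z_*: frozen representations; returns (smallest size reaching loss <= eps, loss)."""
    for n in sizes:
        probe = LogisticRegression(max_iter=1000).fit(Z_train[:n], y_train[:n])
        loss = log_loss(y_val, probe.predict_proba(Z_val))
        if loss <= eps:
            return n, loss
    return None, loss          # representation never reached the target loss

# toy usage: the "representation" is just the raw features of a noisy linear task
rng = np.random.default_rng(0)
Z = rng.normal(size=(2000, 10))
y = (Z[:, 0] + 0.25 * rng.normal(size=2000) > 0).astype(int)
n_needed, val_loss = sample_complexity_score(Z[:1000], y[:1000], Z[1000:], y[1000:])
print("labeled samples needed:", n_needed, "| held-out log-loss:", round(val_loss, 3))
```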
arXiv Detail & Related papers (2020-09-15T22:06:58Z) - Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z) - Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z) - Linear predictor on linearly-generated data with missing values: non consistency and solutions [0.0]
We study the seemingly-simple case where the target to predict is a linear function of the fully-observed data.
We show that, in the presence of missing values, the optimal predictor may not be linear.
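A tiny numeric illustration of that claim, under an assumed data model: although the target is linear in the fully observed features, a single linear model on mean-imputed inputs is beaten by a predictor that is allowed to change slope per missingness pattern, i.e., the optimal predictor is only piecewise linear in the inputs.
```python
# Tiny numeric illustration (assumed data model): y = x1 + x2 is linear in the
# fully observed features, x2 is correlated with x1 and missing completely at
# random. A single linear model on mean-imputed inputs cannot use the right slope
# in both missingness patterns, while a piecewise-linear predictor (mask and
# interaction features) can.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 20000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=n)        # x2 is partly predictable from x1
y = x1 + x2                                     # target is linear in (x1, x2)
miss = rng.random(n) < 0.5                      # x2 missing completely at random
x2_imp = np.where(miss, x2.mean(), x2)          # mean imputation
train, test = np.arange(n) < n // 2, np.arange(n) >= n // 2

# (a) single linear model on the imputed features
A = np.column_stack([x1, x2_imp])
lin = LinearRegression().fit(A[train], y[train])
mse_linear = np.mean((lin.predict(A[test]) - y[test]) ** 2)

# (b) let the slope on x1 differ between missingness patterns via mask features
B = np.column_stack([x1, x2_imp, miss, miss * x1])
pw = LinearRegression().fit(B[train], y[train])
mse_piecewise = np.mean((pw.predict(B[test]) - y[test]) ** 2)

print(f"single linear model MSE:    {mse_linear:.4f}")
print(f"piecewise-linear model MSE: {mse_piecewise:.4f}")   # noticeably lower
```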
arXiv Detail & Related papers (2020-02-03T11:49:35Z)