Global Ground Metric Learning with Applications to scRNA data
- URL: http://arxiv.org/abs/2506.15383v1
- Date: Wed, 18 Jun 2025 11:53:13 GMT
- Title: Global Ground Metric Learning with Applications to scRNA data
- Authors: Damin Kühn, Michael T. Schaub
- Abstract summary: We propose a novel approach for learning metrics for arbitrary distributions over a shared metric space. Our method provides a distance between individual points like a global metric, but requires only distribution-level class labels for training. We demonstrate the effectiveness and interpretability of our approach using patient-level scRNA-seq data spanning multiple diseases.
- Score: 5.70896453969985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimal transport provides a robust framework for comparing probability distributions. Its effectiveness is significantly influenced by the choice of the underlying ground metric. Traditionally, the ground metric has either been (i) predefined, e.g., as the Euclidean distance, or (ii) learned in a supervised way, by utilizing labeled data to learn a suitable ground metric for enhanced task-specific performance. Yet, predefined metrics typically cannot account for the inherent structure and varying importance of different features in the data, and existing supervised approaches to ground metric learning often do not generalize across multiple classes or are restricted to distributions with shared supports. To address these limitations, we propose a novel approach for learning metrics for arbitrary distributions over a shared metric space. Our method provides a distance between individual points like a global metric, but requires only class labels at the distribution level for training. The learned global ground metric enables more accurate optimal transport distances, leading to improved performance in embedding, clustering, and classification tasks. We demonstrate the effectiveness and interpretability of our approach using patient-level scRNA-seq data spanning multiple diseases.
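To make the setup concrete, here is a minimal sketch using the POT library, assuming a Mahalanobis-style global ground metric induced by a learned linear map L; the toy data and this particular parameterization are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch: OT distance between two cell populations under a learned global
# ground metric d_L(x, y) = ||Lx - Ly||_2 (a Mahalanobis-style metric;
# the paper's parameterization may differ).
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))  # e.g. 50 cells x 10 genes, patient A
Y = rng.normal(size=(60, 10))  # e.g. 60 cells x 10 genes, patient B
L = rng.normal(size=(10, 10))  # stand-in for the learned linear map

# Pairwise ground-metric costs between individual cells.
M = ot.dist(X @ L.T, Y @ L.T, metric="euclidean")

# Uniform weight per cell; emd2 returns the optimal transport cost.
a = np.full(len(X), 1.0 / len(X))
b = np.full(len(Y), 1.0 / len(Y))
print("OT distance under learned ground metric:", ot.emd2(a, b, M))
```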
Related papers
- GLRT-Based Metric Learning for Remote Sensing Object Retrieval [19.210692452537007]
Existing content-based remote sensing object retrieval (CBRSOR) methods neglect global statistical information during both the training and test stages.
Inspired by the Neyman-Pearson theorem, we propose a generalized likelihood ratio test-based metric learning (GLRTML) approach.
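As a loose illustration of the Neyman-Pearson flavor of such a score, the sketch below ranks gallery items by a log likelihood ratio of "same class" versus "different class" Gaussian models of embedding differences; the Gaussian models and toy covariances are assumptions for illustration, not GLRTML's exact construction.

```python
# Sketch: GLRT-style retrieval score as a log likelihood ratio between a
# "same class" and a "different class" model of embedding differences.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
d = 8
S_same = np.eye(d) * 0.5  # assumed within-class covariance
S_diff = np.eye(d) * 2.0  # assumed between-class covariance

query = rng.normal(size=d)
gallery = rng.normal(size=(20, d))

def glrt_score(q, g):
    diff = q - g
    return (multivariate_normal.logpdf(diff, mean=np.zeros(d), cov=S_same)
            - multivariate_normal.logpdf(diff, mean=np.zeros(d), cov=S_diff))

ranking = np.argsort([-glrt_score(query, g) for g in gallery])
print("gallery items ranked by GLRT score:", ranking[:5])
```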
arXiv Detail & Related papers (2024-10-08T07:53:30Z) - Revisiting Evaluation Metrics for Semantic Segmentation: Optimization
and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
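To make the single-metric pitfall concrete, here is a minimal sketch of per-class IoU with mean and worst-case aggregates; the paper's fine-grained mIoUs additionally stratify by object size, which this toy example omits.

```python
# Sketch: mIoU hides weak classes; a worst-case aggregate exposes them.
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """IoU for each class label in a pair of segmentation maps."""
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = (p | g).sum()
        ious.append((p & g).sum() / union if union else np.nan)
    return np.array(ious)

rng = np.random.default_rng(0)
pred = rng.integers(0, 3, size=(64, 64))  # toy predicted label map
gt = rng.integers(0, 3, size=(64, 64))    # toy ground-truth label map

ious = per_class_iou(pred, gt, num_classes=3)
print("per-class IoU:", np.round(ious, 3))
print("mIoU:", np.nanmean(ious), "| worst-case IoU:", np.nanmin(ious))
```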
arXiv Detail & Related papers (2023-10-30T03:45:15Z) - Joint Metrics Matter: A Better Standard for Trajectory Forecasting [67.1375677218281]
Multi-modal trajectory forecasting methods are typically evaluated using single-agent metrics (marginal metrics).
Focusing only on marginal metrics can lead to unnatural predictions, such as colliding trajectories, or diverging trajectories for people who are clearly walking together as a group.
We present the first comprehensive evaluation of state-of-the-art trajectory forecasting methods with respect to multi-agent metrics (joint metrics): JADE, JFDE, and collision rate.
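The gap between the two evaluation styles is easy to see numerically. Below is a minimal sketch contrasting marginal minADE with joint ADE (JADE) on toy trajectories; the array shapes and random data are assumptions for illustration, and JFDE would use only the final timestep.

```python
# Sketch: marginal minADE picks the best sample per agent independently,
# while joint ADE (JADE) picks one best sample shared by all agents.
import numpy as np

rng = np.random.default_rng(0)
K, N, T = 6, 3, 12                     # samples, agents, timesteps
preds = rng.normal(size=(K, N, T, 2))  # predicted 2D trajectories
gt = rng.normal(size=(N, T, 2))        # ground-truth trajectories

err = np.linalg.norm(preds - gt, axis=-1)  # (K, N, T) displacement errors
ade = err.mean(axis=-1)                    # (K, N) per-sample, per-agent ADE

min_ade = ade.min(axis=0).mean()  # marginal: best sample chosen per agent
jade = ade.mean(axis=1).min()     # joint: one best sample for all agents
print(f"marginal minADE={min_ade:.3f}  JADE={jade:.3f}  (JADE >= minADE)")
```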
arXiv Detail & Related papers (2023-05-10T16:27:55Z) - Metric Learning Improves the Ability of Combinatorial Coverage Metrics
to Anticipate Classification Error [0.0]
Many machine learning methods are sensitive to test or operational data that is dissimilar to training data.
Metric learning is a technique for learning latent spaces in which data from different classes lie further apart.
In a study of 6 open-source datasets, we find that metric learning increased the difference between set-difference coverage metrics calculated on correctly and incorrectly classified data.
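As a rough illustration of what a set-difference coverage metric measures, the sketch below counts pairwise feature-value combinations that occur in test data but never in training data; the pairwise (t = 2) choice and the toy discretized features are assumptions, not the paper's exact setup.

```python
# Sketch: set-difference combinatorial coverage = fraction of test-set
# feature-value pairs that were never seen in the training set.
import numpy as np
from itertools import combinations

def pairwise_combos(data):
    combos = set()
    for row in data:
        for (i, vi), (j, vj) in combinations(enumerate(row), 2):
            combos.add((i, int(vi), j, int(vj)))
    return combos

rng = np.random.default_rng(0)
train = rng.integers(0, 3, size=(100, 4))  # discretized features
test = rng.integers(0, 4, size=(30, 4))    # slightly shifted value range

train_c, test_c = pairwise_combos(train), pairwise_combos(test)
sdcc = len(test_c - train_c) / len(test_c)
print(f"set-difference coverage: {sdcc:.2%} of test combos unseen in training")
```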
arXiv Detail & Related papers (2023-02-28T14:55:57Z) - Transferable Deep Metric Learning for Clustering [1.2762298148425795]
Clustering in high-dimensional spaces is a difficult task; under the curse of dimensionality, the usual distance metrics may no longer be appropriate.
We show that we can learn a metric on a labelled dataset, then apply it to cluster a different dataset.
We achieve results competitive with the state-of-the-art while using only a small number of labelled training datasets and shallow networks.
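A minimal sketch of this transfer recipe follows, with scikit-learn's NeighborhoodComponentsAnalysis standing in for the paper's shallow network and the digits data standing in for two distinct datasets; both substitutions are assumptions for illustration.

```python
# Sketch: fit a supervised metric on one labelled dataset, then reuse it
# to cluster a different dataset.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.cluster import KMeans

X, y = load_digits(return_X_y=True)
src, tgt = (X[:1000], y[:1000]), (X[1000:], y[1000:])  # disjoint "datasets"

nca = NeighborhoodComponentsAnalysis(n_components=16, random_state=0)
nca.fit(*src)                  # metric learned on the labelled source

Z = nca.transform(tgt[0])      # apply the learned metric to new data
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(Z)
print("cluster sizes:", np.bincount(labels))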
arXiv Detail & Related papers (2023-02-13T17:09:59Z) - Divide and Contrast: Source-free Domain Adaptation via Adaptive
Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where each group of samples is treated with tailored learning objectives.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
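The alignment term is a standard kernel two-sample statistic. Below is a minimal numpy sketch of a biased RBF-kernel MMD^2 estimate between two sample sets; the memory bank, feature extractor, and training loop of DaC are omitted, and the bandwidth gamma is an assumed constant.

```python
# Sketch: squared Maximum Mean Discrepancy (MMD) with an RBF kernel,
# the kind of loss used to align two groups of feature vectors.
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between sample sets X and Y."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
source_like = rng.normal(0.0, 1.0, size=(128, 16))
target_specific = rng.normal(0.5, 1.0, size=(128, 16))
print(f"MMD^2 = {rbf_mmd2(source_like, target_specific):.4f}")
```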
arXiv Detail & Related papers (2022-11-12T09:21:49Z) - Multi-level Consistency Learning for Semi-supervised Domain Adaptation [85.90600060675632]
Semi-supervised domain adaptation (SSDA) aims to apply knowledge learned from a fully labeled source domain to a scarcely labeled target domain.
We propose a Multi-level Consistency Learning framework for SSDA.
arXiv Detail & Related papers (2022-05-09T06:41:18Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of semi-supervised learning (SSL) and domain adaptation (DA).
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - Self-Supervised Metric Learning in Multi-View Data: A Downstream Task
Perspective [2.01243755755303]
We study how self-supervised metric learning can benefit downstream tasks in the context of multi-view data.
We show that the target distance of metric learning satisfies several desired properties for the downstream tasks.
Our analysis characterizes the improvement by self-supervised metric learning on four commonly used downstream tasks.
arXiv Detail & Related papers (2021-06-14T02:34:33Z) - OVANet: One-vs-All Network for Universal Domain Adaptation [78.86047802107025]
Existing methods set the threshold for rejecting unknown samples manually, using a validation set or a pre-defined ratio of unknown samples.
We propose a method to learn the threshold using source samples and to adapt it to the target domain.
Our idea is that a minimum inter-class distance in the source domain should be a good threshold to decide between known or unknown in the target.
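The sketch below illustrates only this thresholding intuition with class centroids: take the minimum inter-class distance in the source as the cutoff for flagging target samples as unknown. OVANet itself realizes the idea with learned one-vs-all classifiers rather than raw distances, so treat this as a simplified assumption.

```python
# Sketch: minimum inter-class distance among source centroids as the
# known/unknown cutoff for target samples.
import numpy as np

rng = np.random.default_rng(0)
centroids = rng.normal(size=(5, 16)) * 3  # source class means
pair_d = np.linalg.norm(centroids[:, None] - centroids[None], axis=-1)
threshold = pair_d[pair_d > 0].min()      # min inter-class distance

target = rng.normal(size=(10, 16)) * 4
nearest = np.linalg.norm(target[:, None] - centroids[None], axis=-1).min(1)
print("unknown mask:", nearest > threshold)
```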
arXiv Detail & Related papers (2021-04-07T18:36:31Z) - Meta-Generating Deep Attentive Metric for Few-shot Classification [53.07108067253006]
We present a novel deep metric meta-generation method to generate a specific metric for a new few-shot learning task.
In this study, we structure the metric using a three-layer deep attentive network that is flexible enough to produce a discriminative metric for each task.
We obtain clear performance improvements over state-of-the-art competitors, especially in the challenging cases.
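A minimal PyTorch sketch of a task-conditioned attentive metric follows: a small generator network maps a support-set summary to per-feature attention weights that define a weighted distance. The three dense layers echo the "three-layer deep attentive network", but all sizes, the softmax attention, and the mean pooling are illustrative assumptions.

```python
# Sketch: generate a task-specific metric from few-shot support embeddings.
import torch
import torch.nn as nn

class AttentiveMetric(nn.Module):
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.gen = nn.Sequential(  # three dense layers, as in the paper
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, dim), nn.Softmax(dim=-1),
        )

    def forward(self, support, x, y):
        w = self.gen(support.mean(0))  # task-specific feature weights
        return ((w * (x - y) ** 2).sum(-1)).sqrt()

dim = 16
support = torch.randn(25, dim)  # few-shot support-set embeddings
metric = AttentiveMetric(dim)
d = metric(support, torch.randn(dim), torch.randn(dim))
print(f"task-adapted distance: {d.item():.3f}")
```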
arXiv Detail & Related papers (2020-12-03T02:07:43Z) - Interpretable Locally Adaptive Nearest Neighbors [8.052709336750821]
We develop a method that allows learning locally adaptive metrics.
These local metrics not only improve performance but are naturally interpretable.
We conduct a number of experiments on synthetic data sets and demonstrate the method's usefulness on real-world benchmark data sets.
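A minimal sketch of the locally adaptive idea: each query receives its own per-feature weights, estimated here from a pilot Euclidean neighbourhood, and the weights double as a local feature-importance explanation. The inverse-variance weighting rule is an assumption for illustration, not the paper's learning procedure.

```python
# Sketch: per-query local metric for nearest-neighbour search.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
query = rng.normal(size=5)

# Pilot neighbourhood under the plain Euclidean metric.
d0 = np.linalg.norm(X - query, axis=1)
pilot = X[np.argsort(d0)[:20]]

# Local feature weights: features that vary little locally count more.
w = 1.0 / (pilot.var(axis=0) + 1e-8)
w /= w.sum()

d_local = np.sqrt(((X - query) ** 2 * w).sum(axis=1))
neighbours = np.argsort(d_local)[:5]
print("interpretable local weights:", np.round(w, 3))
print("adaptive nearest neighbours:", neighbours)
```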
arXiv Detail & Related papers (2020-11-08T05:27:50Z)