Imitation Learning with Sinkhorn Distances
- URL: http://arxiv.org/abs/2008.09167v2
- Date: Sat, 2 Jul 2022 17:16:08 GMT
- Title: Imitation Learning with Sinkhorn Distances
- Authors: Georgios Papagiannis and Yunpeng Li
- Abstract summary: We present tractable solutions by formulating imitation learning as minimization of the Sinkhorn distance between occupancy measures.
We evaluate the proposed approach using both the reward metric and the Sinkhorn distance metric on a number of MuJoCo experiments.
- Score: 12.161649672131286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning algorithms have been interpreted as variants of divergence minimization problems. The ability to compare occupancy measures between experts and learners is crucial to their effectiveness in learning from demonstrations. In this paper, we present tractable solutions by formulating imitation learning as minimization of the Sinkhorn distance between occupancy measures. The formulation combines the valuable properties of optimal transport metrics in comparing non-overlapping distributions with a cosine distance cost defined in an adversarially learned feature space. This leads to a highly discriminative critic network and optimal transport plan that subsequently guide imitation learning. We evaluate the proposed approach using both the reward metric and the Sinkhorn distance metric on a number of MuJoCo experiments. For the implementation and for reproducing the results, please refer to the following repository: https://github.com/gpapagiannis/sinkhorn-imitation.
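As a rough illustration of the core computation, the sketch below estimates the Sinkhorn distance between two empirical occupancy measures under a cosine cost. It is a minimal NumPy version assuming uniform sample weights and a fixed feature space; the paper's adversarially learned critic features are omitted, and `eps` and `n_iters` are illustrative stand-ins rather than the authors' settings.

```python
import numpy as np

def cosine_cost(X, Y):
    """Pairwise cosine distance c(x, y) = 1 - <x, y> / (||x|| ||y||)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T

def sinkhorn_distance(X, Y, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between uniform empirical measures on X, Y."""
    n, m = len(X), len(Y)
    C = cosine_cost(X, Y)
    K = np.exp(-C / eps)                                 # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)      # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):                             # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                      # optimal transport plan
    return float((P * C).sum()), P

# Toy stand-ins for expert and learner state-action samples.
rng = np.random.default_rng(0)
dist, plan = sinkhorn_distance(rng.normal(size=(64, 8)), rng.normal(size=(64, 8)))
print(f"Sinkhorn distance: {dist:.4f}")
```

In the paper, the cost is computed on adversarially learned critic features, and the resulting plan and critic then jointly guide the policy update.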
Related papers
- Kolmogorov-Smirnov GAN [52.36633001046723]
We propose a novel deep generative model, the Kolmogorov-Smirnov Generative Adversarial Network (KSGAN).
Unlike existing approaches, KSGAN formulates the learning process as a minimization of the Kolmogorov-Smirnov (KS) distance.
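For intuition, here is a minimal sketch of the classical two-sample Kolmogorov-Smirnov statistic that such a model would minimize; KSGAN itself generalizes beyond the scalar two-sample setting, so treat this as background rather than the paper's method.

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample KS statistic: sup over t of |F_x(t) - F_y(t)|, where F_x
    and F_y are empirical CDFs; the supremum is attained at a sample point."""
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))

rng = np.random.default_rng(0)
print(ks_statistic(rng.normal(0.0, 1.0, 500), rng.normal(0.5, 1.0, 500)))
```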
arXiv Detail & Related papers (2024-06-28T14:30:14Z)
- Sinkhorn Distance Minimization for Knowledge Distillation [97.64216712016571]
Knowledge distillation (KD) has been widely adopted to compress large language models (LLMs).
In this paper, we show that the commonly used KL, RKL, and JS divergences suffer from issues of mode-averaging, mode-collapsing, and mode-underestimation, respectively.
We propose the Sinkhorn Knowledge Distillation (SinKD) that exploits the Sinkhorn distance to ensure a nuanced and precise assessment of the disparity between teacher and student distributions.
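A hedged PyTorch sketch of the idea: an entropy-regularized OT (Sinkhorn) cost between teacher and student softmax outputs, given a token-to-token ground cost matrix. The cost derived from random token embeddings and the per-position formulation are illustrative assumptions, not SinKD's exact recipe.

```python
import torch

def sinkhorn_kd_loss(teacher_logits, student_logits, cost, eps=0.1, n_iters=50):
    """Sinkhorn cost between teacher and student softmax distributions over
    the vocabulary, given a token-to-token ground cost matrix."""
    a = torch.softmax(teacher_logits, dim=-1)        # teacher marginal
    b = torch.softmax(student_logits, dim=-1)        # student marginal
    K = torch.exp(-cost / eps)                       # Gibbs kernel
    u, v = torch.ones_like(a), torch.ones_like(b)
    for _ in range(n_iters):                         # Sinkhorn iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u.unsqueeze(1) * K * v.unsqueeze(0)          # transport plan
    return (P * cost).sum()

# Toy example: 5-token vocabulary, ground cost from random token embeddings.
emb = torch.randn(5, 16)
loss = sinkhorn_kd_loss(torch.randn(5), torch.randn(5), torch.cdist(emb, emb))
```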
arXiv Detail & Related papers (2024-02-27T01:13:58Z)
- Histopathology Image Classification using Deep Manifold Contrastive Learning [8.590026259176806]
We propose a novel extension of contrastive learning that leverages geodesic distance between features as a similarity metric for histopathology whole slide image classification.
Results demonstrate that our method outperforms state-of-the-art cosine-distance-based contrastive learning methods.
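As a sketch of the underlying notion, geodesic distances between features are often approximated Isomap-style as shortest paths through a k-nearest-neighbor graph; the snippet below (using scikit-learn and SciPy) shows that approximation, which may differ from the paper's exact construction.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(features, n_neighbors=10):
    """Approximate geodesic distances on the feature manifold as shortest
    paths through a k-NN graph (entries are inf across disconnected parts)."""
    graph = kneighbors_graph(features, n_neighbors, mode="distance")
    return shortest_path(graph, method="D", directed=False)

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32))   # stand-in for slide patch features
D = geodesic_distances(feats)        # (100, 100) pairwise geodesic estimates
```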
arXiv Detail & Related papers (2023-06-26T07:02:07Z)
- Contrastive Bayesian Analysis for Deep Metric Learning [30.21464199249958]
We develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity.
This contrastive Bayesian analysis leads to a new loss function for deep metric learning.
Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning.
arXiv Detail & Related papers (2022-10-10T02:24:21Z)
- Neural Bregman Divergences for Distance Learning [60.375385370556145]
We propose a new approach to learning arbitrary Bregman divergences in a differentiable manner via input convex neural networks.
We show that our method more faithfully learns divergences over a set of both new and previously studied tasks.
Our tests further extend to known asymmetric but non-Bregman tasks, where our method still performs competitively despite misspecification.
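A minimal PyTorch sketch of the ingredients named above: an input-convex network phi and the induced Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>. The layer sizes and activations are illustrative; the paper's exact parameterization may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Input-convex network: phi(x) is convex in x because the z-path
    weights are clamped nonnegative and activations are convex/nondecreasing."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.Wx0 = nn.Linear(dim, hidden)
        self.Wz = nn.Linear(hidden, hidden, bias=False)
        self.Wx1 = nn.Linear(dim, hidden)
        self.out_z = nn.Linear(hidden, 1, bias=False)
        self.out_x = nn.Linear(dim, 1)

    def forward(self, x):
        z = F.softplus(self.Wx0(x))
        z = F.softplus(F.linear(z, self.Wz.weight.clamp(min=0)) + self.Wx1(x))
        return F.linear(z, self.out_z.weight.clamp(min=0)) + self.out_x(x)

def bregman_divergence(phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>; >= 0 for convex phi."""
    y = y.detach().requires_grad_(True)
    (grad_y,) = torch.autograd.grad(phi(y).sum(), y, create_graph=True)
    return phi(x).squeeze(-1) - phi(y).squeeze(-1) - ((x - y) * grad_y).sum(-1)

phi = ICNN(dim=8)
print(bregman_divergence(phi, torch.randn(4, 8), torch.randn(4, 8)))
```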
arXiv Detail & Related papers (2022-06-09T20:53:15Z)
- Embedding Transfer with Label Relaxation for Improved Metric Learning [43.94511888670419]
We present a novel method for embedding transfer, the task of transferring the knowledge of a learned embedding model to another.
Our method exploits pairwise similarities between samples in the source embedding space as the knowledge and transfers them through a loss used to train the target embedding model.
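A hedged sketch of the general idea: pairwise similarities from a frozen source embedding act as soft labels weighting attraction and repulsion on target pairwise distances. The specific weighting and margin below are assumptions, not the paper's exact relaxed-contrastive loss.

```python
import torch
import torch.nn.functional as F

def similarity_transfer_loss(source_emb, target_emb, margin=1.0):
    """Source pairwise similarities (mapped to [0, 1]) act as soft labels
    weighting attraction/repulsion on target pairwise distances."""
    s = F.normalize(source_emb, dim=1)
    w = (s @ s.T + 1) / 2                        # soft similarity labels
    d = torch.cdist(target_emb, target_emb)      # target pairwise distances
    attract = w * d.pow(2)                       # pull similar pairs together
    repel = (1 - w) * F.relu(margin - d).pow(2)  # push dissimilar pairs apart
    off_diag = ~torch.eye(len(d), dtype=torch.bool)
    return (attract + repel)[off_diag].mean()

src = torch.randn(32, 128)                     # frozen source embeddings
tgt = torch.randn(32, 64, requires_grad=True)  # target model outputs
similarity_transfer_loss(src, tgt).backward()
```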
arXiv Detail & Related papers (2021-03-27T13:35:03Z)
- Robust Imitation Learning from Noisy Demonstrations [81.67837507534001]
We show that robust imitation learning can be achieved by optimizing a classification risk with a symmetric loss.
We propose a new imitation learning method that effectively combines pseudo-labeling with co-training.
Experimental results on continuous-control benchmarks show that our method is more robust compared to state-of-the-art methods.
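To illustrate only the symmetric-loss ingredient (the pseudo-labeling and co-training components are omitted), here is a sketch using the sigmoid loss, which satisfies loss(z) + loss(-z) = 1 and is therefore tolerant to label noise:

```python
import torch

def symmetric_classification_risk(expert_scores, learner_scores):
    """Discriminator risk under the sigmoid loss l(z) = sigmoid(-z),
    which is symmetric (l(z) + l(-z) = 1) and hence noise-tolerant."""
    loss = lambda z: torch.sigmoid(-z)
    return loss(expert_scores).mean() + loss(-learner_scores).mean()

# Toy discriminator scores for expert vs. learner state-action samples.
risk = symmetric_classification_risk(torch.randn(16), torch.randn(16))
```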
arXiv Detail & Related papers (2020-10-20T10:41:37Z)
- Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
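For reference, the Mahalanobis distance form being learned is d_M(x, y) = sqrt((x - y)^T M (x - y)) with M positive semidefinite; the sketch below shows the distance itself with a random PSD matrix, not the paper's robust learning objective.

```python
import numpy as np

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)), with M symmetric PSD."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

rng = np.random.default_rng(0)
L = rng.normal(size=(5, 5))
M = L.T @ L                  # PSD by construction; learned in the paper
print(mahalanobis(rng.normal(size=5), rng.normal(size=5), M))
```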
arXiv Detail & Related papers (2020-06-12T09:17:08Z)
- Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
Using the theoretical technique of algorithmic robustness, we show that the enlarged margin is beneficial to generalization.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
- An end-to-end approach for the verification problem: learning the right distance [15.553424028461885]
We augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder.
We first show that it approximates a likelihood ratio, which can be used for hypothesis tests.
We observe that training is much simpler under the proposed approach than under metric learning with actual distances.
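A minimal PyTorch sketch of a parametric pseudo-distance trained jointly with an encoder: a small head scores encoded pairs and, unlike a true metric, need not satisfy symmetry or the triangle inequality. The architecture and input sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PseudoDistance(nn.Module):
    """Scores encoded pairs with a small head; unlike a true metric it need
    not be symmetric or satisfy the triangle inequality."""
    def __init__(self, in_dim=16, dim=32, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(),
                                     nn.Linear(dim, dim))
        self.head = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1))

    def forward(self, x1, x2):
        z1, z2 = self.encoder(x1), self.encoder(x2)
        return self.head(torch.cat([z1, z2], dim=-1)).squeeze(-1)

model = PseudoDistance()
scores = model(torch.randn(8, 16), torch.randn(8, 16))  # higher = better match
```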
arXiv Detail & Related papers (2020-02-21T18:46:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.