Imitation Learning with Sinkhorn Distances
- URL: http://arxiv.org/abs/2008.09167v2
- Date: Sat, 2 Jul 2022 17:16:08 GMT
- Title: Imitation Learning with Sinkhorn Distances
- Authors: Georgios Papagiannis and Yunpeng Li
- Abstract summary: We present tractable solutions by formulating imitation learning as minimization of the Sinkhorn distance between occupancy measures.
We evaluate the proposed approach using both the reward metric and the Sinkhorn distance metric on a number of MuJoCo experiments.
- Score: 12.161649672131286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning algorithms have been interpreted as variants of divergence minimization problems. The ability to compare occupancy measures between experts and learners is crucial to their effectiveness in learning from demonstrations. In this paper, we present tractable solutions by formulating imitation learning as minimization of the Sinkhorn distance between occupancy measures. The formulation combines the valuable properties of optimal transport metrics in comparing non-overlapping distributions with a cosine distance cost defined in an adversarially learned feature space. This leads to a highly discriminative critic network and optimal transport plan that subsequently guide imitation learning. We evaluate the proposed approach using both the reward metric and the Sinkhorn distance metric on a number of MuJoCo experiments. For the implementation and for reproducing the results, please refer to the following repository: https://github.com/gpapagiannis/sinkhorn-imitation.
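As a rough illustration of the core computation, the sketch below estimates the Sinkhorn distance between two empirical occupancy measures under a cosine cost. It is a minimal NumPy version assuming uniform sample weights and a fixed feature space; the paper's adversarially learned critic features are omitted, and `eps` and `n_iters` are illustrative stand-ins rather than the authors' settings.

```python
import numpy as np

def cosine_cost(X, Y):
    """Pairwise cosine distance c(x, y) = 1 - <x, y> / (||x|| ||y||)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T

def sinkhorn_distance(X, Y, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between uniform empirical measures on X, Y."""
    n, m = len(X), len(Y)
    C = cosine_cost(X, Y)
    K = np.exp(-C / eps)                                 # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)      # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):                             # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                      # optimal transport plan
    return float((P * C).sum()), P

# Toy stand-ins for expert and learner state-action samples.
rng = np.random.default_rng(0)
dist, plan = sinkhorn_distance(rng.normal(size=(64, 8)), rng.normal(size=(64, 8)))
print(f"Sinkhorn distance: {dist:.4f}")
```

In the paper, the cost is computed on adversarially learned critic features, and the resulting plan and critic then jointly guide the policy update.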
Related papers
- Kolmogorov-Smirnov GAN [52.36633001046723]
We propose a novel deep generative model, the Kolmogorov-Smirnov Generative Adversarial Network (KSGAN).
Unlike existing approaches, KSGAN formulates the learning process as a minimization of the Kolmogorov-Smirnov (KS) distance.
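For intuition, here is a minimal sketch of the classical two-sample Kolmogorov-Smirnov statistic that such a model would minimize; KSGAN itself generalizes beyond the scalar two-sample setting, so treat this as background rather than the paper's method.

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample KS statistic: sup over t of |F_x(t) - F_y(t)|, where F_x
    and F_y are empirical CDFs; the supremum is attained at a sample point."""
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))

rng = np.random.default_rng(0)
print(ks_statistic(rng.normal(0.0, 1.0, 500), rng.normal(0.5, 1.0, 500)))
```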
arXiv Detail & Related papers (2024-06-28T14:30:14Z)
- Sinkhorn Distance Minimization for Knowledge Distillation [97.64216712016571]
Knowledge distillation (KD) has been widely adopted to compress large language models (LLMs).
In this paper, we show that the commonly used KL, RKL, and JS divergences suffer from issues of mode-averaging, mode-collapsing, and mode-underestimation, respectively.
We propose the Sinkhorn Knowledge Distillation (SinKD) that exploits the Sinkhorn distance to ensure a nuanced and precise assessment of the disparity between teacher and student distributions.
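A hedged PyTorch sketch of the idea: an entropy-regularized OT (Sinkhorn) cost between teacher and student softmax outputs, given a token-to-token ground cost matrix. The cost derived from random token embeddings and the per-position formulation are illustrative assumptions, not SinKD's exact recipe.

```python
import torch

def sinkhorn_kd_loss(teacher_logits, student_logits, cost, eps=0.1, n_iters=50):
    """Sinkhorn cost between teacher and student softmax distributions over
    the vocabulary, given a token-to-token ground cost matrix."""
    a = torch.softmax(teacher_logits, dim=-1)        # teacher marginal
    b = torch.softmax(student_logits, dim=-1)        # student marginal
    K = torch.exp(-cost / eps)                       # Gibbs kernel
    u, v = torch.ones_like(a), torch.ones_like(b)
    for _ in range(n_iters):                         # Sinkhorn iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u.unsqueeze(1) * K * v.unsqueeze(0)          # transport plan
    return (P * cost).sum()

# Toy example: 5-token vocabulary, ground cost from random token embeddings.
emb = torch.randn(5, 16)
loss = sinkhorn_kd_loss(torch.randn(5), torch.randn(5), torch.cdist(emb, emb))
```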
arXiv Detail & Related papers (2024-02-27T01:13:58Z)
- Histopathology Image Classification using Deep Manifold Contrastive Learning [8.590026259176806]
We propose a novel extension of contrastive learning that leverages geodesic distance between features as a similarity metric for histopathology whole slide image classification.
Results demonstrate that our method outperforms state-of-the-art cosine-distance-based contrastive learning methods.
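As a sketch of the underlying notion, geodesic distances between features are often approximated Isomap-style as shortest paths through a k-nearest-neighbor graph; the snippet below (using scikit-learn and SciPy) shows that approximation, which may differ from the paper's exact construction.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(features, n_neighbors=10):
    """Approximate geodesic distances on the feature manifold as shortest
    paths through a k-NN graph (entries are inf across disconnected parts)."""
    graph = kneighbors_graph(features, n_neighbors, mode="distance")
    return shortest_path(graph, method="D", directed=False)

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32))   # stand-in for slide patch features
D = geodesic_distances(feats)        # (100, 100) pairwise geodesic estimates
```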
arXiv Detail & Related papers (2023-06-26T07:02:07Z)
- Contrastive Bayesian Analysis for Deep Metric Learning [30.21464199249958]
We develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity.
This contrastive Bayesian analysis leads to a new loss function for deep metric learning.
Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning.
arXiv Detail & Related papers (2022-10-10T02:24:21Z)
- Neural Bregman Divergences for Distance Learning [60.375385370556145]
We propose a new approach to learning arbitrary Bregman divergences in a differentiable manner via input convex neural networks.
We show that our method more faithfully learns divergences over a set of both new and previously studied tasks.
Our tests further extend to known asymmetric but non-Bregman tasks, where our method still performs competitively despite misspecification.
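A minimal PyTorch sketch of the ingredients named above: an input-convex network phi and the induced Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>. The layer sizes and activations are illustrative; the paper's exact parameterization may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Input-convex network: phi(x) is convex in x because the z-path
    weights are clamped nonnegative and activations are convex/nondecreasing."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.Wx0 = nn.Linear(dim, hidden)
        self.Wz = nn.Linear(hidden, hidden, bias=False)
        self.Wx1 = nn.Linear(dim, hidden)
        self.out_z = nn.Linear(hidden, 1, bias=False)
        self.out_x = nn.Linear(dim, 1)

    def forward(self, x):
        z = F.softplus(self.Wx0(x))
        z = F.softplus(F.linear(z, self.Wz.weight.clamp(min=0)) + self.Wx1(x))
        return F.linear(z, self.out_z.weight.clamp(min=0)) + self.out_x(x)

def bregman_divergence(phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>; >= 0 for convex phi."""
    y = y.detach().requires_grad_(True)
    (grad_y,) = torch.autograd.grad(phi(y).sum(), y, create_graph=True)
    return phi(x).squeeze(-1) - phi(y).squeeze(-1) - ((x - y) * grad_y).sum(-1)

phi = ICNN(dim=8)
print(bregman_divergence(phi, torch.randn(4, 8), torch.randn(4, 8)))
```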
arXiv Detail & Related papers (2022-06-09T20:53:15Z)
- Embedding Transfer with Label Relaxation for Improved Metric Learning [43.94511888670419]
We present a novel method for embedding transfer, the task of transferring the knowledge of a learned embedding model to another.
Our method exploits pairwise similarities between samples in the source embedding space as the knowledge and transfers them through a loss used to train the target embedding model.
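A hedged sketch of the general idea: pairwise similarities from a frozen source embedding act as soft labels weighting attraction and repulsion on target pairwise distances. The specific weighting and margin below are assumptions, not the paper's exact relaxed-contrastive loss.

```python
import torch
import torch.nn.functional as F

def similarity_transfer_loss(source_emb, target_emb, margin=1.0):
    """Source pairwise similarities (mapped to [0, 1]) act as soft labels
    weighting attraction/repulsion on target pairwise distances."""
    s = F.normalize(source_emb, dim=1)
    w = (s @ s.T + 1) / 2                        # soft similarity labels
    d = torch.cdist(target_emb, target_emb)      # target pairwise distances
    attract = w * d.pow(2)                       # pull similar pairs together
    repel = (1 - w) * F.relu(margin - d).pow(2)  # push dissimilar pairs apart
    off_diag = ~torch.eye(len(d), dtype=torch.bool)
    return (attract + repel)[off_diag].mean()

src = torch.randn(32, 128)                     # frozen source embeddings
tgt = torch.randn(32, 64, requires_grad=True)  # target model outputs
similarity_transfer_loss(src, tgt).backward()
```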
arXiv Detail & Related papers (2021-03-27T13:35:03Z)
- Robust Imitation Learning from Noisy Demonstrations [81.67837507534001]
We show that robust imitation learning can be achieved by optimizing a classification risk with a symmetric loss.
We propose a new imitation learning method that effectively combines pseudo-labeling with co-training.
Experimental results on continuous-control benchmarks show that our method is more robust compared to state-of-the-art methods.
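To illustrate only the symmetric-loss ingredient (the pseudo-labeling and co-training components are omitted), here is a sketch using the sigmoid loss, which satisfies loss(z) + loss(-z) = 1 and is therefore tolerant to label noise:

```python
import torch

def symmetric_classification_risk(expert_scores, learner_scores):
    """Discriminator risk under the sigmoid loss l(z) = sigmoid(-z),
    which is symmetric (l(z) + l(-z) = 1) and hence noise-tolerant."""
    loss = lambda z: torch.sigmoid(-z)
    return loss(expert_scores).mean() + loss(-learner_scores).mean()

# Toy discriminator scores for expert vs. learner state-action samples.
risk = symmetric_classification_risk(torch.randn(16), torch.randn(16))
```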
arXiv Detail & Related papers (2020-10-20T10:41:37Z)
- Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
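For reference, the Mahalanobis distance form being learned is d_M(x, y) = sqrt((x - y)^T M (x - y)) with M positive semidefinite; the sketch below shows the distance itself with a random PSD matrix, not the paper's robust learning objective.

```python
import numpy as np

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)), with M symmetric PSD."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

rng = np.random.default_rng(0)
L = rng.normal(size=(5, 5))
M = L.T @ L                  # PSD by construction; learned in the paper
print(mahalanobis(rng.normal(size=5), rng.normal(size=5), M))
```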
arXiv Detail & Related papers (2020-06-12T09:17:08Z)
- Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
Using the theoretical technique of algorithmic robustness, we show that the enlarged margin is beneficial to generalization.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
- An end-to-end approach for the verification problem: learning the right distance [15.553424028461885]
We augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder.
We first show that it approximates a likelihood ratio, which can be used for hypothesis tests.
We observe that training is much simpler under the proposed approach than under metric learning with actual distances.
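A minimal PyTorch sketch of a parametric pseudo-distance trained jointly with an encoder: a small head scores encoded pairs and, unlike a true metric, need not satisfy symmetry or the triangle inequality. The architecture and input sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PseudoDistance(nn.Module):
    """Scores encoded pairs with a small head; unlike a true metric it need
    not be symmetric or satisfy the triangle inequality."""
    def __init__(self, in_dim=16, dim=32, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(),
                                     nn.Linear(dim, dim))
        self.head = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1))

    def forward(self, x1, x2):
        z1, z2 = self.encoder(x1), self.encoder(x2)
        return self.head(torch.cat([z1, z2], dim=-1)).squeeze(-1)

model = PseudoDistance()
scores = model(torch.randn(8, 16), torch.randn(8, 16))  # higher = better match
```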
arXiv Detail & Related papers (2020-02-21T18:46:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.