An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
- URL: http://arxiv.org/abs/2002.05825v3
- Date: Mon, 6 Jul 2020 20:06:56 GMT
- Title: An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
- Authors: Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
- Abstract summary: Distances are pervasive in machine learning and serve as similarity measures, loss functions, and learning targets.
Deep metric learning architectures that respect the triangle inequality rely, almost exclusively, on Euclidean distance in the latent space.
We show that our architectures outperform existing metric approaches when modeling graph distances and have a better inductive bias than non-metric approaches when training data is limited.
- Score: 38.95108641775682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distances are pervasive in machine learning. They serve as similarity
measures, loss functions, and learning targets; it is said that a good distance
measure solves a task. When defining distances, the triangle inequality has
proven to be a useful constraint, both theoretically--to prove convergence and
optimality guarantees--and empirically--as an inductive bias. Deep metric
learning architectures that respect the triangle inequality rely, almost
exclusively, on Euclidean distance in the latent space. Though effective, this
fails to model two broad classes of subadditive distances, common in graphs and
reinforcement learning: asymmetric metrics, and metrics that cannot be embedded
into Euclidean space. To address these problems, we introduce novel
architectures that are guaranteed to satisfy the triangle inequality. We prove
our architectures universally approximate norm-induced metrics on
$\mathbb{R}^n$, and present a similar result for modified Input Convex Neural
Networks. We show that our architectures outperform existing metric approaches
when modeling graph distances and have a better inductive bias than non-metric
approaches when training data is limited in the multi-goal reinforcement
learning setting.
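To make the guarantee concrete: since $z \mapsto \|Wz\|_2$ is a seminorm and a sum of seminorms is again a seminorm, a metric of the form $d(x, y) = \sum_i \|W_i(x - y)\|_2$ satisfies symmetry and the triangle inequality by construction, not by training. Below is a minimal sketch in this spirit (PyTorch; component counts and sizes are illustrative, and this is not the paper's exact architecture, which also covers asymmetric and deep variants):

```python
import torch
import torch.nn as nn

class WideNormMetric(nn.Module):
    """d(x, y) = sum_i ||W_i (x - y)||_2.

    Each map z -> ||W_i z||_2 is a seminorm, and a sum of seminorms is a
    seminorm, so symmetry and the triangle inequality hold exactly."""

    def __init__(self, dim: int, n_components: int = 8, rank: int = 16):
        super().__init__()
        self.maps = nn.ModuleList(
            [nn.Linear(dim, rank, bias=False) for _ in range(n_components)]
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        z = x - y
        return sum(torch.linalg.norm(m(z), dim=-1) for m in self.maps)

metric = WideNormMetric(dim=4)
a, b, c = torch.randn(3, 4)
# The triangle inequality holds exactly (up to float error):
assert metric(a, c) <= metric(a, b) + metric(b, c) + 1e-5
```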
Related papers
- Geometry-aware Distance Measure for Diverse Hierarchical Structures in Hyperbolic Spaces [48.948334221681684]
We propose a geometry-aware distance measure in hyperbolic spaces, which dynamically adapts to varying hierarchical structures. Our approach consistently outperforms learning methods that use fixed distance measures. Visualization shows clearer class boundaries and improved prototype separation in hyperbolic spaces.
arXiv Detail & Related papers (2025-06-23T11:43:39Z)
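As background for this entry, a minimal sketch of the fixed Poincaré-ball distance that geometry-aware hyperbolic measures adapt; the paper's adaptive measure itself is not reproduced here (numpy; point values illustrative):

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance in the Poincare ball model of hyperbolic space."""
    uu, vv = np.dot(u, u), np.dot(v, v)
    assert uu < 1.0 and vv < 1.0, "points must lie inside the unit ball"
    diff = np.dot(u - v, u - v)
    return float(np.arccosh(1.0 + 2.0 * diff / ((1.0 - uu) * (1.0 - vv))))

# Points near the boundary are exponentially far apart, which is what
# makes hyperbolic space a natural fit for tree-like hierarchies.
u = np.array([0.1, 0.0])
v = np.array([0.0, 0.95])
print(poincare_distance(u, v))
```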
- Riemannian Metric Learning: Closer to You than You Imagine [1.6574413179773761]
This review provides a structured and accessible overview of key methods, applications, and recent advances in Riemannian metric learning.
It describes a powerful generalization that leverages differential geometry to model data according to their underlying Riemannian manifold.
The authors argue that the review should serve as a valuable resource for researchers and practitioners.
arXiv Detail & Related papers (2025-03-07T11:00:29Z)
- Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making [66.27188304203217]
Temporal distances lie at the heart of many algorithms for planning, control, and reinforcement learning.
Prior attempts to define such temporal distances in stochastic settings have been stymied by an important limitation: these quantities need not satisfy the triangle inequality.
We show how successor features learned by contrastive learning form a temporal distance that does satisfy the triangle inequality.
arXiv Detail & Related papers (2024-06-24T19:36:45Z)
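The limitation referenced above is that naively defined temporal distances can violate the triangle inequality. A generic check for such violations in a learned pairwise-distance matrix (my illustration, not the paper's method; it works for asymmetric distances too):

```python
import numpy as np

def violates_triangle(D: np.ndarray, tol: float = 1e-8) -> bool:
    """True if D[i, k] > D[i, j] + D[j, k] for some i, j, k.

    D is an (n, n) matrix of pairwise distances; asymmetry is allowed."""
    via = D[:, :, None] + D[None, :, :]    # via[i, j, k] = D[i, j] + D[j, k]
    return bool((D[:, None, :] > via + tol).any())

rng = np.random.default_rng(0)
D = rng.random((5, 5))                     # arbitrary "learned" distances
np.fill_diagonal(D, 0.0)
print(violates_triangle(D))                # random matrices typically violate it
```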
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
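For reference, standard per-class IoU and mIoU from a confusion matrix, in a minimal numpy sketch; per the summary above, the fine-grained variants additionally stratify these scores (e.g., by object size), which is omitted here:

```python
import numpy as np

def per_class_iou(conf: np.ndarray) -> np.ndarray:
    """Per-class IoU from a (C, C) confusion matrix (rows: ground truth,
    cols: prediction). Classes absent from both are returned as NaN."""
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    iou = np.full(conf.shape[0], np.nan)
    present = union > 0
    iou[present] = inter[present] / union[present]
    return iou

conf = np.array([[50, 2, 0],
                 [3, 40, 1],
                 [0, 4, 10]])
iou = per_class_iou(conf)
print(iou, np.nanmean(iou))    # per-class IoU and mIoU
```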
- Tight and fast generalization error bound of graph embedding in metric space [54.279425319381374]
We show that graph embedding in a non-Euclidean metric space can outperform graph embedding in Euclidean space with much less training data than the existing bound suggests.
Our new upper bound is significantly tighter and faster than the existing one, which can be exponential in $R$ and $O(\frac{1}{S})$ at the fastest.
arXiv Detail & Related papers (2023-05-13T17:29:18Z)
- A principled distributional approach to trajectory similarity measurement [8.316979146894989]
This paper proposes a powerful way to represent trajectories and measure the similarity between two trajectories using a distributional kernel.
A distributional kernel is used for the very first time for trajectory representation and similarity measurement.
We show the generality of this new approach in three applications: (a) trajectory anomaly detection, (b) anomalous sub-trajectory detection, and (c) trajectory pattern mining.
arXiv Detail & Related papers (2023-01-01T12:35:07Z)
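To illustrate the distributional idea (treating a trajectory as a sample of points and comparing kernel mean embeddings), a minimal sketch with an ordinary Gaussian kernel; the specific kernel here is my substitution, not the paper's:

```python
import numpy as np

def gaussian_gram(a: np.ndarray, b: np.ndarray, gamma: float) -> np.ndarray:
    """Gaussian kernel matrix between point sets of shape (n, d) and (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def trajectory_similarity(t1: np.ndarray, t2: np.ndarray, gamma: float = 1.0) -> float:
    """Inner product of the kernel mean embeddings of the two trajectories:
    order-invariant and well-defined for trajectories of different lengths."""
    return float(gaussian_gram(t1, t2, gamma).mean())

t1 = np.cumsum(np.random.default_rng(0).normal(size=(100, 2)), axis=0)
t2 = t1 + 0.1                              # a slightly shifted copy
print(trajectory_similarity(t1, t2))
```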
- Mathematical Justification of Hard Negative Mining via Isometric Approximation Theorem [18.525368151998386]
In deep metric learning, the Triplet Loss has emerged as a popular method for learning embeddings in many computer vision and natural language processing tasks.
One issue that plagues the Triplet Loss is network collapse, an undesirable phenomenon where the network projects the embeddings of all data onto a single point.
Several strategies have been proposed to combat this collapse; while hard negative mining is the most effective of them, existing formulations lack strong theoretical justification for their empirical success.
arXiv Detail & Related papers (2022-10-20T11:21:17Z)
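For reference, the Triplet Loss and the collapse behavior described above, in a minimal PyTorch sketch (margin and batch size illustrative):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 1.0):
    """max(0, d(a, p) - d(a, n) + margin), averaged over the batch.

    If the network collapses (all embeddings equal), the loss sits at
    `margin` for every triplet; hard negative mining keeps triplets
    informative by picking, per anchor, the closest negative."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = torch.randn(3, 32, 8)            # batch of 32 embeddings, dim 8
print(triplet_loss(a, p, n))
```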
- Neural Bregman Divergences for Distance Learning [60.375385370556145]
We propose a new approach to learning arbitrary Bregman divergences in a differentiable manner via input convex neural networks.
We show that our method more faithfully learns divergences over a set of both new and previously studied tasks.
Our tests further extend to known asymmetric, but non-Bregman tasks, where our method still performs competitively despite misspecification.
arXiv Detail & Related papers (2022-06-09T20:53:15Z)
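A minimal sketch of the core construction, assuming a small two-layer input-convex network (sizes, init, and activations are illustrative; the paper's architecture differs): for convex $\phi$, the induced divergence $D_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla \phi(y), x - y \rangle$ is nonnegative and generally asymmetric.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """A small input-convex network: nonnegative hidden-to-hidden weights
    plus convex, non-decreasing activations make phi(x) convex in x."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.in0 = nn.Linear(dim, hidden)
        self.in1 = nn.Linear(dim, hidden)          # skip connection from x
        self.zz = nn.Parameter(0.1 * torch.rand(hidden, hidden))
        self.out = nn.Parameter(0.1 * torch.rand(1, hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = F.softplus(self.in0(x))
        z = F.softplus(F.linear(z, self.zz.clamp(min=0)) + self.in1(x))
        return F.linear(z, self.out.clamp(min=0)).squeeze(-1)

def bregman(phi: nn.Module, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>.

    Nonnegative (and generally asymmetric) whenever phi is convex."""
    y = y.requires_grad_(True)
    phi_y = phi(y)
    (grad_y,) = torch.autograd.grad(phi_y.sum(), y, create_graph=True)
    return phi(x) - phi_y - ((x - y) * grad_y).sum(dim=-1)

phi = ICNN(dim=3)
x = torch.randn(3)
y = torch.randn(3)
print(bregman(phi, x, y))                          # >= 0 up to float error
```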
- GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out of distribution samples as well as the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z)
- New Methods for Detecting Concentric Objects With High Accuracy [0.0]
Fitting geometric objects to digitized data is an important problem in many areas such as iris detection, autonomous navigation, and industrial robotics operations.
There are two common approaches to fitting geometric shapes to data: the geometric (iterative) approach and the algebraic (non-iterative) approach; a sketch of the latter follows this entry.
We develop new estimators, which can be used as reliable initial guesses for other iterative methods.
arXiv Detail & Related papers (2021-02-16T08:19:18Z)
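As an example of the algebraic (non-iterative) family, a minimal sketch of the classic Kåsa circle fit, the kind of closed-form estimator commonly used to initialize iterative geometric fits; this is standard background, not the paper's new estimators:

```python
import numpy as np

def kasa_circle_fit(pts: np.ndarray):
    """Algebraic circle fit: solve the linear least-squares system
    x^2 + y^2 + D*x + E*y + F = 0 for (D, E, F), then recover center/radius."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x ** 2 + y ** 2)
    (D, E, Fc), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r = np.sqrt(cx ** 2 + cy ** 2 - Fc)
    return (cx, cy), r

# Noisy points on a circle of radius 2 centered at (1, -1).
t = np.linspace(0, 2 * np.pi, 50)
pts = np.column_stack([1 + 2 * np.cos(t), -1 + 2 * np.sin(t)])
pts += 0.01 * np.random.default_rng(0).normal(size=pts.shape)
print(kasa_circle_fit(pts))
```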
- Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
We show that the enlarged margin is beneficial to the generalization ability by using the theoretical technique of algorithmic robustness.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.