An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
- URL: http://arxiv.org/abs/2002.05825v3
- Date: Mon, 6 Jul 2020 20:06:56 GMT
- Title: An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
- Authors: Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
- Abstract summary: Distances are pervasive in machine learning and serve as similarity measures, loss functions, and learning targets.
Deep metric learning architectures that respect the triangle inequality rely, almost exclusively, on Euclidean distance in the latent space.
We show that our architectures outperform existing metric approaches when modeling graph distances and have a better inductive bias than non-metric approaches when training data is limited.
- Score: 38.95108641775682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distances are pervasive in machine learning. They serve as similarity
measures, loss functions, and learning targets; it is said that a good distance
measure solves a task. When defining distances, the triangle inequality has
proven to be a useful constraint, both theoretically--to prove convergence and
optimality guarantees--and empirically--as an inductive bias. Deep metric
learning architectures that respect the triangle inequality rely, almost
exclusively, on Euclidean distance in the latent space. Though effective, this
fails to model two broad classes of subadditive distances, common in graphs and
reinforcement learning: asymmetric metrics, and metrics that cannot be embedded
into Euclidean space. To address these problems, we introduce novel
architectures that are guaranteed to satisfy the triangle inequality. We prove
our architectures universally approximate norm-induced metrics on
$\mathbb{R}^n$, and present a similar result for modified Input Convex Neural
Networks. We show that our architectures outperform existing metric approaches
when modeling graph distances and have a better inductive bias than non-metric
approaches when training data is limited in the multi-goal reinforcement
learning setting.
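To make the guarantee concrete: since $z \mapsto \|Wz\|_2$ is a seminorm and a sum of seminorms is again a seminorm, a metric of the form $d(x, y) = \sum_i \|W_i(x - y)\|_2$ satisfies symmetry and the triangle inequality by construction, not by training. Below is a minimal sketch in this spirit (PyTorch; component counts and sizes are illustrative, and this is not the paper's exact architecture, which also covers asymmetric and deep variants):

```python
import torch
import torch.nn as nn

class WideNormMetric(nn.Module):
    """d(x, y) = sum_i ||W_i (x - y)||_2.

    Each map z -> ||W_i z||_2 is a seminorm, and a sum of seminorms is a
    seminorm, so symmetry and the triangle inequality hold exactly."""

    def __init__(self, dim: int, n_components: int = 8, rank: int = 16):
        super().__init__()
        self.maps = nn.ModuleList(
            [nn.Linear(dim, rank, bias=False) for _ in range(n_components)]
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        z = x - y
        return sum(torch.linalg.norm(m(z), dim=-1) for m in self.maps)

metric = WideNormMetric(dim=4)
a, b, c = torch.randn(3, 4)
# The triangle inequality holds exactly (up to float error):
assert metric(a, c) <= metric(a, b) + metric(b, c) + 1e-5
```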
Related papers
- Geometry-aware Distance Measure for Diverse Hierarchical Structures in Hyperbolic Spaces [48.948334221681684]
We propose a geometry-aware distance measure in hyperbolic spaces, which dynamically adapts to varying hierarchical structures. Our approach consistently outperforms learning methods that use fixed distance measures. Visualization shows clearer class boundaries and improved prototype separation in hyperbolic spaces.
arXiv Detail & Related papers (2025-06-23T11:43:39Z)
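As background for this entry, a minimal sketch of the fixed Poincaré-ball distance that geometry-aware hyperbolic measures adapt; the paper's adaptive measure itself is not reproduced here (numpy; point values illustrative):

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance in the Poincare ball model of hyperbolic space."""
    uu, vv = np.dot(u, u), np.dot(v, v)
    assert uu < 1.0 and vv < 1.0, "points must lie inside the unit ball"
    diff = np.dot(u - v, u - v)
    return float(np.arccosh(1.0 + 2.0 * diff / ((1.0 - uu) * (1.0 - vv))))

# Points near the boundary are exponentially far apart, which is what
# makes hyperbolic space a natural fit for tree-like hierarchies.
u = np.array([0.1, 0.0])
v = np.array([0.0, 0.95])
print(poincare_distance(u, v))
```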
- Riemannian Metric Learning: Closer to You than You Imagine [1.6574413179773761]
This review provides a structured and accessible overview of key methods, applications, and recent advances in Riemannian metric learning.
It describes a powerful generalization that leverages differential geometry to model data according to their underlying Riemannian manifold.
The authors argue that the review should serve as a valuable resource for researchers and practitioners.
arXiv Detail & Related papers (2025-03-07T11:00:29Z)
- Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making [66.27188304203217]
Temporal distances lie at the heart of many algorithms for planning, control, and reinforcement learning.
Prior attempts to define such temporal distances in stochastic settings have been stymied by an important limitation: these quantities need not satisfy the triangle inequality.
We show how successor features learned by contrastive learning form a temporal distance that does satisfy the triangle inequality.
arXiv Detail & Related papers (2024-06-24T19:36:45Z)
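The limitation referenced above is that naively defined temporal distances can violate the triangle inequality. A generic check for such violations in a learned pairwise-distance matrix (my illustration, not the paper's method; it works for asymmetric distances too):

```python
import numpy as np

def violates_triangle(D: np.ndarray, tol: float = 1e-8) -> bool:
    """True if D[i, k] > D[i, j] + D[j, k] for some i, j, k.

    D is an (n, n) matrix of pairwise distances; asymmetry is allowed."""
    via = D[:, :, None] + D[None, :, :]    # via[i, j, k] = D[i, j] + D[j, k]
    return bool((D[:, None, :] > via + tol).any())

rng = np.random.default_rng(0)
D = rng.random((5, 5))                     # arbitrary "learned" distances
np.fill_diagonal(D, 0.0)
print(violates_triangle(D))                # random matrices typically violate it
```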
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
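For reference, standard per-class IoU and mIoU from a confusion matrix, in a minimal numpy sketch; per the summary above, the fine-grained variants additionally stratify these scores (e.g., by object size), which is omitted here:

```python
import numpy as np

def per_class_iou(conf: np.ndarray) -> np.ndarray:
    """Per-class IoU from a (C, C) confusion matrix (rows: ground truth,
    cols: prediction). Classes absent from both are returned as NaN."""
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    iou = np.full(conf.shape[0], np.nan)
    present = union > 0
    iou[present] = inter[present] / union[present]
    return iou

conf = np.array([[50, 2, 0],
                 [3, 40, 1],
                 [0, 4, 10]])
iou = per_class_iou(conf)
print(iou, np.nanmean(iou))    # per-class IoU and mIoU
```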
- Tight and fast generalization error bound of graph embedding in metric space [54.279425319381374]
We show that graph embedding in a non-Euclidean metric space can outperform graph embedding in Euclidean space with much less training data than the existing bound suggests.
Our new upper bound is significantly tighter and faster than the existing one, which can be exponential in $R$ and $O(\frac{1}{S})$ at the fastest.
arXiv Detail & Related papers (2023-05-13T17:29:18Z)
- A principled distributional approach to trajectory similarity measurement [8.316979146894989]
This paper proposes a powerful way to represent trajectories and measure the similarity between two trajectories using a distributional kernel.
A distributional kernel is used for the very first time for trajectory representation and similarity measurement.
We show the generality of this new approach in three applications: (a) trajectory anomaly detection, (b) anomalous sub-trajectory detection, and (c) trajectory pattern mining.
arXiv Detail & Related papers (2023-01-01T12:35:07Z)
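To illustrate the distributional idea (treating a trajectory as a sample of points and comparing kernel mean embeddings), a minimal sketch with an ordinary Gaussian kernel; the specific kernel here is my substitution, not the paper's:

```python
import numpy as np

def gaussian_gram(a: np.ndarray, b: np.ndarray, gamma: float) -> np.ndarray:
    """Gaussian kernel matrix between point sets of shape (n, d) and (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def trajectory_similarity(t1: np.ndarray, t2: np.ndarray, gamma: float = 1.0) -> float:
    """Inner product of the kernel mean embeddings of the two trajectories:
    order-invariant and well-defined for trajectories of different lengths."""
    return float(gaussian_gram(t1, t2, gamma).mean())

t1 = np.cumsum(np.random.default_rng(0).normal(size=(100, 2)), axis=0)
t2 = t1 + 0.1                              # a slightly shifted copy
print(trajectory_similarity(t1, t2))
```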
- Mathematical Justification of Hard Negative Mining via Isometric Approximation Theorem [18.525368151998386]
In deep metric learning, the Triplet Loss has emerged as a popular method for learning embeddings in many computer vision and natural language processing tasks.
One issue that plagues the Triplet Loss is network collapse, an undesirable phenomenon where the network projects the embeddings of all data onto a single point.
Several strategies have been proposed to combat this collapse; while hard negative mining is the most effective of them, existing formulations lack strong theoretical justification for their empirical success.
arXiv Detail & Related papers (2022-10-20T11:21:17Z)
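For reference, the Triplet Loss and the collapse behavior described above, in a minimal PyTorch sketch (margin and batch size illustrative):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 1.0):
    """max(0, d(a, p) - d(a, n) + margin), averaged over the batch.

    If the network collapses (all embeddings equal), the loss sits at
    `margin` for every triplet; hard negative mining keeps triplets
    informative by picking, per anchor, the closest negative."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = torch.randn(3, 32, 8)            # batch of 32 embeddings, dim 8
print(triplet_loss(a, p, n))
```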
- Neural Bregman Divergences for Distance Learning [60.375385370556145]
We propose a new approach to learning arbitrary Bregman divergences in a differentiable manner via input convex neural networks.
We show that our method more faithfully learns divergences over a set of both new and previously studied tasks.
Our tests further extend to known asymmetric, but non-Bregman tasks, where our method still performs competitively despite misspecification.
arXiv Detail & Related papers (2022-06-09T20:53:15Z)
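A minimal sketch of the core construction, assuming a small two-layer input-convex network (sizes, init, and activations are illustrative; the paper's architecture differs): for convex $\phi$, the induced divergence $D_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla \phi(y), x - y \rangle$ is nonnegative and generally asymmetric.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """A small input-convex network: nonnegative hidden-to-hidden weights
    plus convex, non-decreasing activations make phi(x) convex in x."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.in0 = nn.Linear(dim, hidden)
        self.in1 = nn.Linear(dim, hidden)          # skip connection from x
        self.zz = nn.Parameter(0.1 * torch.rand(hidden, hidden))
        self.out = nn.Parameter(0.1 * torch.rand(1, hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = F.softplus(self.in0(x))
        z = F.softplus(F.linear(z, self.zz.clamp(min=0)) + self.in1(x))
        return F.linear(z, self.out.clamp(min=0)).squeeze(-1)

def bregman(phi: nn.Module, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>.

    Nonnegative (and generally asymmetric) whenever phi is convex."""
    y = y.requires_grad_(True)
    phi_y = phi(y)
    (grad_y,) = torch.autograd.grad(phi_y.sum(), y, create_graph=True)
    return phi(x) - phi_y - ((x - y) * grad_y).sum(dim=-1)

phi = ICNN(dim=3)
x = torch.randn(3)
y = torch.randn(3)
print(bregman(phi, x, y))                          # >= 0 up to float error
```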
- GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out of distribution samples as well as the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z)
- New Methods for Detecting Concentric Objects With High Accuracy [0.0]
Fitting geometric objects to digitized data is an important problem in many areas such as iris detection, autonomous navigation, and industrial robotics operations.
There are two common approaches to fitting geometric shapes to data: the geometric (iterative) approach and the algebraic (non-iterative) approach; a sketch of the latter follows this entry.
We develop new estimators, which can be used as reliable initial guesses for other iterative methods.
arXiv Detail & Related papers (2021-02-16T08:19:18Z)
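As an example of the algebraic (non-iterative) family, a minimal sketch of the classic Kåsa circle fit, the kind of closed-form estimator commonly used to initialize iterative geometric fits; this is standard background, not the paper's new estimators:

```python
import numpy as np

def kasa_circle_fit(pts: np.ndarray):
    """Algebraic circle fit: solve the linear least-squares system
    x^2 + y^2 + D*x + E*y + F = 0 for (D, E, F), then recover center/radius."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x ** 2 + y ** 2)
    (D, E, Fc), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r = np.sqrt(cx ** 2 + cy ** 2 - Fc)
    return (cx, cy), r

# Noisy points on a circle of radius 2 centered at (1, -1).
t = np.linspace(0, 2 * np.pi, 50)
pts = np.column_stack([1 + 2 * np.cos(t), -1 + 2 * np.sin(t)])
pts += 0.01 * np.random.default_rng(0).normal(size=pts.shape)
print(kasa_circle_fit(pts))
```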
- Towards Certified Robustness of Distance Metric Learning [53.96113074344632]
We advocate imposing an adversarial margin in the input space so as to improve the generalization and robustness of metric learning algorithms.
We show that the enlarged margin is beneficial to the generalization ability by using the theoretical technique of algorithmic robustness.
arXiv Detail & Related papers (2020-06-10T16:51:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.