Monotonic Differentiable Sorting Networks
- URL: http://arxiv.org/abs/2203.09630v1
- Date: Thu, 17 Mar 2022 21:51:29 GMT
- Title: Monotonic Differentiable Sorting Networks
- Authors: Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen
- Abstract summary: Differentiable sorting algorithms allow training with sorting and ranking supervision, where only the ordering or ranking of samples is known.
One problem of current differentiable sorting methods is that they are non-monotonic.
We propose a novel relaxation of conditional swap operations that guarantees monotonicity in differentiable sorting networks.
- Score: 29.75063301688965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable sorting algorithms allow training with sorting and ranking
supervision, where only the ordering or ranking of samples is known. Various
methods have been proposed to address this challenge, ranging from optimal
transport-based differentiable Sinkhorn sorting algorithms to making classic
sorting networks differentiable. One problem of current differentiable sorting
methods is that they are non-monotonic. To address this issue, we propose a
novel relaxation of conditional swap operations that guarantees monotonicity in
differentiable sorting networks. We introduce a family of sigmoid functions and
prove that they produce differentiable sorting networks that are monotonic.
Monotonicity ensures that the gradients always have the correct sign, which is
an advantage in gradient-based optimization. We demonstrate that monotonic
differentiable sorting networks improve upon previous differentiable sorting
methods.
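As a rough sketch of the proposed relaxation (names and parameters are illustrative; the paper analyzes a whole family of sigmoid functions, of which the Cauchy CDF used here is one member):

```python
import numpy as np

def cauchy_sigmoid(x, beta=1.0):
    # Cauchy CDF; one of the sigmoid functions yielding monotonic
    # differentiable sorting networks. beta controls the steepness
    # of the relaxation (smaller = closer to a hard comparison).
    return np.arctan(x / beta) / np.pi + 0.5

def soft_cswap(a, b, beta=0.25):
    # Relaxed conditional swap: soft min and max are convex
    # combinations of the inputs, weighted by how confidently
    # b exceeds a. The two outputs always sum to a + b.
    s = cauchy_sigmoid(b - a, beta)
    return s * a + (1 - s) * b, (1 - s) * a + s * b

lo, hi = soft_cswap(2.0, -1.0)
print(lo, hi)  # close to (-1.0, 2.0), differentiable everywhere
```

Monotonicity here means the soft outputs never decrease in an input that should increase them, so gradients from ordering supervision always carry the correct sign.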
Related papers
- Generalizing Stochastic Smoothing for Differentiation and Gradient Estimation [59.86921150579892]
We deal with the problem of gradient estimation for differentiable relaxations of algorithms, operators, simulators, and other non-differentiable functions.
We develop variance reduction strategies for differentiable sorting and ranking, differentiable shortest-paths on graphs, differentiable rendering for pose estimation, as well as differentiable cryo-ET simulations.
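A minimal sketch of the underlying estimator (simplified; names, constants, and the baseline choice are illustrative, and the paper develops stronger variance-reduction strategies):

```python
import numpy as np

def smoothed_grad(f, x, eps=0.1, n_samples=2048, seed=0):
    # Score-function estimator for the Gaussian-smoothed objective
    #   f_eps(x) = E_z[f(x + eps * z)], z ~ N(0, I), via
    #   grad f_eps(x) = E_z[(f(x + eps * z) - f(x)) * z] / eps.
    # Subtracting f(x) is a simple control-variate baseline.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, x.size))
    vals = np.array([f(x + eps * zi) for zi in z])
    return ((vals - f(x))[:, None] * z).mean(axis=0) / eps

# Example: a piecewise-constant ranking quantity becomes usable
# for gradient descent once smoothed.
rank_of_first = lambda v: float(np.sum(v > v[0]))
print(smoothed_grad(rank_of_first, np.array([0.3, 0.1, 0.5])))
```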
arXiv Detail & Related papers (2024-10-10T17:10:00Z)
- Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions [44.71975181739874]
We consider sorting problems for more abstract yet expressive inputs, e.g., multi-digit images and image fragments, through a neural sorting network.
To learn a mapping from a high-dimensional input to an ordinal variable, the differentiability of sorting networks needs to be guaranteed.
arXiv Detail & Related papers (2023-10-11T03:47:34Z)
- Diffsurv: Differentiable sorting for censored time-to-event data [0.3303008003874494]
Currently, the most common approach to survival analysis is based on Cox's partial likelihood.
We propose a novel method called Diffsurv to handle censored tasks.
Our experiments show that Diffsurv outperforms established baselines in various simulated and real-world risk prediction scenarios.
arXiv Detail & Related papers (2023-04-26T14:42:31Z)
- Learning by Sorting: Self-supervised Learning with Group Ordering Constraints [75.89238437237445]
This paper proposes a new variation of the contrastive learning objective, Group Ordering Constraints (GroCo)
It exploits the idea of sorting the distances of positive and negative pairs and computing the loss from how many positive pairs have a larger distance than the negative pairs and are thus not ordered correctly.
We evaluate the proposed formulation on various self-supervised learning benchmarks and show that it not only improves on vanilla contrastive learning but is also competitive with comparable methods in linear probing and outperforms current methods in k-NN performance.
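As a rough illustration of the ordering idea (a simplified pairwise relaxation, not GroCo's exact sorting-based formulation):

```python
import torch

def soft_ordering_violations(d_pos, d_neg, tau=0.1):
    # Every positive-pair distance should fall below every
    # negative-pair distance; a sigmoid relaxes the 0/1
    # "misordered" indicator so the count is differentiable.
    diff = d_pos[:, None] - d_neg[None, :]  # > 0 means misordered
    return torch.sigmoid(diff / tau).mean()

d_pos = torch.tensor([0.2, 0.9], requires_grad=True)
d_neg = torch.tensor([0.5, 1.1])
loss = soft_ordering_violations(d_pos, d_neg)
loss.backward()  # gradients push positive pairs below negative ones
print(loss.item(), d_pos.grad)
```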
arXiv Detail & Related papers (2023-01-05T11:17:55Z)
- Learning to Hash Naturally Sorts [84.90210592082829]
We introduce Naturally-Sorted Hashing (NSH) to train a deep hashing model with sorted results end-to-end.
NSH sorts the Hamming distances of samples' hash codes and accordingly gathers their latent representations for self-supervised training.
We describe a novel Sorted Noise-Contrastive Estimation (SortedNCE) loss that selectively picks positive and negative samples for contrastive learning.
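A sketch of the distance step only (illustrative; the gathering scheme and the SortedNCE loss are in the paper):

```python
import torch

def soft_hamming(codes):
    # For binary codes b in {-1, +1}^d the Hamming distance equals
    # (d - <b_i, b_j>) / 2; with tanh-relaxed codes this stays
    # differentiable, so the induced ordering can drive training.
    d = codes.shape[1]
    return (d - codes @ codes.T) / 2

codes = torch.tanh(torch.randn(5, 16))  # relaxed hash codes
dist = soft_hamming(codes)
print(torch.argsort(dist, dim=1)[0])    # sample 0's neighbour ranking
```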
arXiv Detail & Related papers (2022-01-31T16:19:02Z)
- Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision [19.437400671428737]
We propose differentiable sorting networks by relaxing their pairwise conditional swap operations.
We show that bitonic sorting networks enable stable training on large input sets of up to 1024 elements.
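A compact sketch of that construction (using the simpler odd-even transposition network for brevity; bitonic networks wire the same relaxed swaps more efficiently):

```python
import numpy as np

def logistic_cswap(a, b, beta=0.1):
    # Pairwise conditional swap relaxed with a logistic sigmoid;
    # smaller beta is closer to a hard compare-and-swap.
    s = 1.0 / (1.0 + np.exp(-(b - a) / beta))
    return s * a + (1 - s) * b, (1 - s) * a + s * b

def soft_sort_network(x, beta=0.1):
    # Odd-even transposition network: n rounds of alternating
    # neighbour swaps, every one of them differentiable, so
    # ordering supervision backpropagates to the inputs.
    x = np.array(x, dtype=float)
    n = len(x)
    for rnd in range(n):
        for i in range(rnd % 2, n - 1, 2):
            x[i], x[i + 1] = logistic_cswap(x[i], x[i + 1], beta)
    return x

print(soft_sort_network([3.0, 1.0, 2.0]))  # approximately [1, 2, 3]
```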
arXiv Detail & Related papers (2021-05-09T20:39:03Z)
- PiRank: Learning To Rank via Differentiable Sorting [85.28916333414145]
We propose PiRank, a new class of differentiable surrogates for ranking.
We show that PiRank exactly recovers the desired metrics in the limit of zero temperature.
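For intuition, here is a NeuralSort-style relaxed permutation matrix, the kind of temperature-controlled operator on which PiRank builds its surrogates (tau and names are illustrative):

```python
import torch

def relaxed_permutation(s, tau=1.0):
    # Row i softly selects the i-th largest entry of s; as
    # tau -> 0 the rows become one-hot and hard sorting is
    # recovered (the zero-temperature limit mentioned above).
    n = s.shape[0]
    A = (s[:, None] - s[None, :]).abs()               # |s_i - s_j|
    c = (n + 1 - 2 * torch.arange(1, n + 1)).float()  # n-1, n-3, ...
    scores = c[:, None] * s[None, :] - A.sum(0)[None, :]
    return torch.softmax(scores / tau, dim=1)

s = torch.tensor([0.1, 2.0, 1.0])
print(relaxed_permutation(s, tau=0.05) @ s)  # approx. sorted descending
```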
arXiv Detail & Related papers (2020-12-12T05:07:36Z)
- Sparse Sinkhorn Attention [93.88158993722716]
We propose Sparse Sinkhorn Attention, a new efficient and sparse method for learning to attend.
We introduce a meta sorting network that learns to generate latent permutations over sequences.
Given sorted sequences, we are then able to compute quasi-global attention with only local windows.
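The core normalization, sketched (the meta network producing the logits and the block-local attention are omitted):

```python
import torch

def sinkhorn(logits, n_iters=20):
    # Alternately normalize rows and columns in log space; the
    # exponential converges toward a doubly stochastic matrix,
    # which acts as a relaxed (latent) permutation.
    for _ in range(n_iters):
        logits = logits - torch.logsumexp(logits, dim=1, keepdim=True)
        logits = logits - torch.logsumexp(logits, dim=0, keepdim=True)
    return logits.exp()

P = sinkhorn(torch.randn(4, 4))
print(P.sum(0), P.sum(1))  # rows and columns each sum to about 1
```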
arXiv Detail & Related papers (2020-02-26T04:18:01Z)
- Fast Differentiable Sorting and Ranking [36.40586857569459]
We propose the first differentiable sorting and ranking operators with $O(n \log n)$ time and $O(n)$ space complexity.
We achieve this feat by constructing differentiable operators as projections onto the permutahedron, the convex hull of permutations, and using a reduction to isotonic optimization.
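A compact sketch of that reduction for soft sorting (illustrative; the paper's operators also cover ranking and differentiate through the isotonic solve):

```python
import numpy as np

def isotonic_nonincreasing(y):
    # Pool-adjacent-violators: least-squares fit of y subject to
    # v[0] >= v[1] >= ... >= v[n-1], in linear time.
    vals, counts = [], []
    for yi in y:
        vals.append(float(yi)); counts.append(1)
        while len(vals) > 1 and vals[-2] < vals[-1]:
            v2, c2 = vals.pop(), counts.pop()
            v1, c1 = vals.pop(), counts.pop()
            vals.append((v1 * c1 + v2 * c2) / (c1 + c2))
            counts.append(c1 + c2)
    return np.repeat(vals, counts)

def soft_sort_desc(theta, eps=1.0):
    # Soft sort as the projection of rho/eps (rho = n, ..., 1)
    # onto the permutahedron of theta: one O(n log n) sort plus
    # one isotonic regression; smaller eps approaches hard sort.
    n = len(theta)
    z = np.arange(n, 0, -1) / eps
    w = np.sort(theta)[::-1]
    return z - isotonic_nonincreasing(z - w)

print(soft_sort_desc(np.array([0.3, 1.2, -0.5]), eps=0.1))
# approximately [1.2, 0.3, -0.5]
```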
arXiv Detail & Related papers (2020-02-20T17:11:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.