Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions
- URL: http://arxiv.org/abs/2310.07174v2
- Date: Thu, 14 Mar 2024 00:39:43 GMT
- Title: Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions
- Authors: Jungtaek Kim, Jeongbeen Yoon, Minsu Cho
- Abstract summary: We consider sorting problems for more abstract yet expressive inputs, e.g., multi-digit images and image fragments, through a neural sorting network.
To learn a mapping from a high-dimensional input to an ordinal variable, the differentiability of sorting networks needs to be guaranteed.
- Score: 44.71975181739874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sorting is a fundamental operation in all computer systems and has long been a significant research topic. Beyond the problem formulation of traditional sorting algorithms, we consider sorting problems for more abstract yet expressive inputs, e.g., multi-digit images and image fragments, through a neural sorting network. To learn a mapping from a high-dimensional input to an ordinal variable, the differentiability of sorting networks needs to be guaranteed. In this paper, we define the softening error incurred by a differentiable swap function and develop an error-free swap function that satisfies a non-decreasing condition while remaining differentiable. Furthermore, a permutation-equivariant Transformer network with multi-head attention is adopted to capture dependencies among given inputs and to leverage its model capacity via self-attention. Experiments on diverse sorting benchmarks show that our methods perform better than or comparably to baseline methods.
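As a rough illustration of the idea (not the paper's exact construction), a conditional swap can be softened with a sigmoid, and the softening error can then be removed from the forward pass with a straight-through-style trick; the function names and the `steepness` parameter below are illustrative assumptions.

```python
import torch

def soft_swap(a, b, steepness=10.0):
    """Differentiable conditional swap: returns (soft_min, soft_max).

    The sigmoid weight blends the inputs, so each output is a convex
    combination of a and b; the gap to the true min/max is the
    "softening error" whenever a != b.
    """
    w = torch.sigmoid(steepness * (b - a))  # ~1 when a < b, ~0 when a > b
    return w * a + (1 - w) * b, w * b + (1 - w) * a

def error_free_swap(a, b, steepness=10.0):
    """Straight-through-style sketch of an error-free swap: the forward
    pass emits the exact min/max (zero softening error), while gradients
    follow the soft surrogate."""
    soft_lo, soft_hi = soft_swap(a, b, steepness)
    hard_lo, hard_hi = torch.minimum(a, b), torch.maximum(a, b)
    # .detach() moves the forward value to the hard result without
    # changing the gradient of the soft surrogate.
    lo = soft_lo + (hard_lo - soft_lo).detach()
    hi = soft_hi + (hard_hi - soft_hi).detach()
    return lo, hi
```

Composing such swaps along a fixed sorting network then yields a differentiable sorter whose forward outputs match a hard sort.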
Related papers
- Variation Matters: from Mitigating to Embracing Zero-Shot NAS Ranking Function Variation [18.672184596814077]
We propose taking the variation of the ranking function's output into account by treating it as a random variable representing a proxy performance metric.
During the search process, we strive to construct an ordering of the performance metrics to determine the best architecture.
Our experiments show that the proposed ordering can effectively boost performance of a search on standard benchmark search spaces.
arXiv Detail & Related papers (2025-02-27T01:01:22Z)
- Going Beyond Neural Network Feature Similarity: The Network Feature Complexity and Its Interpretation Using Category Theory [64.06519549649495]
We provide the definition of what we call functionally equivalent features.
These features produce equivalent output under certain transformations.
We propose an efficient algorithm named Iterative Feature Merging.
arXiv Detail & Related papers (2023-10-10T16:27:12Z)
- Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined canonicalization functions.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z)
- Graph Neural Networks with Adaptive Readouts [5.575293536755126]
We show the effectiveness of neural readouts on more than 40 datasets spanning different domains and graph characteristics.
We observe a consistent improvement over standard readouts across different numbers of neighborhood aggregation iterations and different convolutional operators.
arXiv Detail & Related papers (2022-11-09T15:21:09Z)
- Set Interdependence Transformer: Set-to-Sequence Neural Networks for Permutation Learning and Structure Prediction [6.396288020763144]
Set-to-sequence problems occur in natural language processing, computer vision and structure prediction.
Previous attention-based methods require $n$ layers of their set transformations to explicitly represent $n$-th order relations.
We propose a novel neural set encoding method called the Set Interdependence Transformer, capable of relating the set's permutation invariant representation to its elements within sets of any cardinality.
arXiv Detail & Related papers (2022-06-08T07:46:49Z)
- Monotonic Differentiable Sorting Networks [29.75063301688965]
Differentiable sorting algorithms allow training with sorting and ranking supervision, where only the ordering or ranking of samples is known.
One problem of current differentiable sorting methods is that they are non-monotonic.
We propose a novel relaxation of conditional swap operations that guarantees monotonicity in differentiable sorting networks.
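A minimal sketch of one such monotonic relaxation, under the assumption that the sigmoid-shaped Cauchy CDF is substituted for the logistic function (one of the options discussed in that line of work); the `steepness` value is an arbitrary choice:

```python
import math
import torch

def cauchy_sigmoid(x, steepness=10.0):
    """Cauchy CDF: a sigmoid-shaped function mapping R to (0, 1),
    with heavier tails than the logistic sigmoid."""
    return torch.atan(steepness * x) / math.pi + 0.5

def monotonic_soft_swap(a, b, steepness=10.0):
    """Conditional swap whose soft min/max outputs remain monotonic in
    the inputs, avoiding the non-monotonic behavior of a plain logistic
    relaxation (a sketch under the assumptions above)."""
    w = cauchy_sigmoid(b - a, steepness)
    return w * a + (1 - w) * b, w * b + (1 - w) * a
```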
arXiv Detail & Related papers (2022-03-17T21:51:29Z)
- Discovering Non-monotonic Autoregressive Orderings with Variational Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
arXiv Detail & Related papers (2021-10-27T16:08:09Z)
- Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision [19.437400671428737]
We propose differentiable sorting networks by relaxing their pairwise conditional swap operations.
We show that bitonic sorting networks can achieve stable training on large input sets of up to 1024 elements.
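For a flavor of how such a network composes relaxed swaps, the sketch below applies a logistic soft swap along the standard bitonic compare-exchange schedule (input length a power of two); `soft_swap` and `steepness` are illustrative assumptions, not the paper's exact relaxation.

```python
import torch

def soft_swap(a, b, steepness=10.0):
    # logistic relaxation of a compare-exchange: returns (soft_min, soft_max)
    w = torch.sigmoid(steepness * (b - a))
    return w * a + (1 - w) * b, w * b + (1 - w) * a

def bitonic_layers(n):
    """Compare-exchange schedule of a bitonic sorting network;
    n must be a power of two. Each triple is (i, j, ascending)."""
    layers, k = [], 2
    while k <= n:
        j = k // 2
        while j >= 1:
            layer = []
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    layer.append((i, partner, (i & k) == 0))
            layers.append(layer)
            j //= 2
        k *= 2
    return layers

def soft_bitonic_sort(x, steepness=10.0):
    """Differentiable sort of a 1-D tensor via soft compare-exchanges."""
    vals = list(x.unbind(-1))  # avoid in-place ops so autograd is happy
    for layer in bitonic_layers(len(vals)):
        for i, j, ascending in layer:
            lo, hi = soft_swap(vals[i], vals[j], steepness)
            vals[i], vals[j] = (lo, hi) if ascending else (hi, lo)
    return torch.stack(vals, dim=-1)

# usage: gradients flow back to the unsorted scores
scores = torch.randn(8, requires_grad=True)
soft_bitonic_sort(scores).sum().backward()
```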
arXiv Detail & Related papers (2021-05-09T20:39:03Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions so that each learns in its task-specific domain while staying close to the others.
This facilitates cross-fertilization, in which data collected across different domains helps improve learning performance on every task.
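As a toy illustration of parameter coupling (a hypothetical proximity penalty, not necessarily the paper's formulation), one could pull each task's parameter vector toward the task average:

```python
import torch

def cross_learning_penalty(task_params, coupling=0.1):
    """Hypothetical proximity term: each task's parameters are pulled
    toward their mean, so tasks fit their own data while staying close.
    Add this to the sum of per-task losses during training."""
    stacked = torch.stack(task_params)           # (num_tasks, dim)
    center = stacked.mean(dim=0, keepdim=True)   # shared reference point
    return coupling * ((stacked - center) ** 2).sum()
```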
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Fast Differentiable Sorting and Ranking [36.40586857569459]
We propose the first differentiable sorting and ranking operators with $O(n \log n)$ time and $O(n)$ space complexity.
We achieve this feat by constructing differentiable operators as projections onto the permutahedron, the convex hull of permutations, and using a reduction to isotonic optimization.
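The isotonic subproblem at the heart of that reduction can be solved in linear time with pool-adjacent-violators; here is a minimal sketch of that subroutine (the surrounding permutahedron projection the paper composes around it is omitted):

```python
def isotonic_l2(y):
    """Pool-adjacent-violators: the non-decreasing sequence x that
    minimizes sum((x - y)**2), computed in O(n)."""
    blocks = []  # each block: [total, count, mean]
    for v in y:
        blocks.append([v, 1, v])
        # merge while the last two block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][2] > blocks[-1][2]:
            s2, c2, _ = blocks.pop()
            s1, c1, _ = blocks.pop()
            blocks.append([s1 + s2, c1 + c2, (s1 + s2) / (c1 + c2)])
    out = []
    for _, c, m in blocks:
        out.extend([m] * c)
    return out

# e.g. isotonic_l2([1.0, 3.0, 2.0, 4.0]) -> [1.0, 2.5, 2.5, 4.0]
```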
arXiv Detail & Related papers (2020-02-20T17:11:09Z)
- Learn to Predict Sets Using Feed-Forward Neural Networks [63.91494644881925]
This paper addresses the task of set prediction using deep feed-forward neural networks.
We present a novel approach for learning to predict sets with unknown permutation and cardinality.
We demonstrate the validity of our set formulations on relevant vision problems.
arXiv Detail & Related papers (2020-01-30T01:52:07Z)