Related papers: Learning Distributions over Permutations and Rankings with Factorized Representations

URL: http://arxiv.org/abs/2505.24664v1
Date: Fri, 30 May 2025 14:53:40 GMT
Title: Learning Distributions over Permutations and Rankings with Factorized Representations
Authors: Daniel Severo, Brian Karrer, Niklas Nolte
Abstract summary: Learning distributions over permutations is a fundamental problem in machine learning. We propose a novel approach that leverages alternative representations for permutations. We show that our method learns nontrivial distributions even in the least expressive mode.
Score: 6.51628774380971
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning distributions over permutations is a fundamental problem in machine learning, with applications in ranking, combinatorial optimization, structured prediction, and data association. Existing methods rely on mixtures of parametric families or neural networks with expensive variational inference procedures. In this work, we propose a novel approach that leverages alternative representations for permutations, including Lehmer codes, Fisher-Yates draws, and Insertion-Vectors. These representations form a bijection with the symmetric group, allowing for unconstrained learning using conventional deep learning techniques, and can represent any probability distribution over permutations. Our approach enables a trade-off between expressivity of the model family and computational requirements. In the least expressive and most computationally efficient case, our method subsumes previous families of well-established probabilistic models over permutations, including the Mallows model and the Repeated Insertion Model. Experiments indicate our method significantly outperforms current approaches on the jigsaw puzzle benchmark, a common task for permutation learning. However, we argue this benchmark is limited in its ability to assess learning probability distributions, as the target is a delta distribution (i.e., a single correct solution exists). We therefore propose two additional benchmarks: learning cyclic permutations and re-ranking movies based on user preference. We show that our method learns non-trivial distributions even in the least expressive mode, while traditional models fail to even generate valid permutations in this setting.
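To illustrate the bijection the abstract mentions, here is a minimal sketch of the standard Lehmer code, one of the three representations named above. It is not the authors' implementation; the function names `lehmer_encode` and `lehmer_decode` are hypothetical, chosen for this example. Position i of the code can take any value in 0..n-1-i independently of the other positions, which is presumably what lets a model predict each position without a validity constraint.

```python
from itertools import permutations


def lehmer_encode(perm):
    """Return the Lehmer code of a permutation of 0..n-1.

    code[i] counts the entries to the right of position i
    that are smaller than perm[i], so 0 <= code[i] <= n-1-i.
    """
    n = len(perm)
    return [sum(perm[j] < perm[i] for j in range(i + 1, n)) for i in range(n)]


def lehmer_decode(code):
    """Invert lehmer_encode: at step i, take the code[i]-th smallest unused element."""
    remaining = list(range(len(code)))
    return [remaining.pop(c) for c in code]


# Every permutation of 4 elements maps to a distinct code and decodes back.
all_perms = [list(p) for p in permutations(range(4))]
all_codes = [lehmer_encode(p) for p in all_perms]
round_trip_ok = all(lehmer_decode(c) == p for p, c in zip(all_perms, all_codes))
```

Because any vector with entries in the allowed ranges decodes to a valid permutation, sampling each position independently always yields a valid permutation.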

Related papers

A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization [7.378582040635655]
Current deep learning approaches rely on generative models that yield exact sample likelihoods. This work introduces a method that lifts this restriction and opens the possibility to employ highly expressive latent variable models. We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.
arXiv Detail & Related papers (2024-06-03T17:55:02Z)
TERM Model: Tensor Ring Mixture Model for Density Estimation [48.622060998018206]
In this paper, we use tensor ring decomposition for density estimation, which significantly reduces the number of permutation candidates. A mixture model that incorporates multiple permutation candidates with adaptive weights is further designed, resulting in increased expressive flexibility. This approach acknowledges that suboptimal permutations can offer distinctive information besides that of optimal permutations.
arXiv Detail & Related papers (2023-12-13T11:39:56Z)
Probabilistic Invariant Learning with Randomized Linear Classifiers [24.485477981244593]
We show how to leverage randomness and design models that are both expressive and invariant but use fewer resources. Inspired by randomized algorithms, we propose a class of binary classification models called Randomized Linear Classifiers (RLCs).
arXiv Detail & Related papers (2023-08-08T17:18:04Z)
Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics. Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z)
Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations. The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
Towards Improved and Interpretable Deep Metric Learning via Attentive Grouping [103.71992720794421]
Grouping has been commonly used in deep metric learning for computing diverse features. We propose an improved and interpretable grouping method to be integrated flexibly with any metric learning framework.
arXiv Detail & Related papers (2020-11-17T19:08:24Z)
Regularizing Towards Permutation Invariance in Recurrent Models [26.36835670113303]
We show that RNNs can be regularized towards permutation invariance, and that this can result in compact models. Existing solutions mostly suggest restricting the learning problem to hypothesis classes which are permutation invariant by design. We show that our method outperforms other permutation invariant approaches on synthetic and real world datasets.
arXiv Detail & Related papers (2020-10-25T07:46:51Z)
Scalable Normalizing Flows for Permutation Invariant Densities [0.0]
A promising approach defines a family of permutation invariant densities with continuous normalizing flows. We demonstrate how calculating the trace, a crucial step in this method, raises issues that occur both during training and inference. We propose an alternative way of defining permutation equivariant transformations that gives a closed-form trace.
arXiv Detail & Related papers (2020-10-07T07:51:30Z)
Performance-Agnostic Fusion of Probabilistic Classifier Outputs [2.4206828137867107]
We propose a method for combining probabilistic outputs of classifiers to make a single consensus class prediction. Our proposed method works well in situations where accuracy is the performance metric. It does not output calibrated probabilities, so it is not suitable in situations where such probabilities are required for further processing.
arXiv Detail & Related papers (2020-09-01T16:53:29Z)
Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers. We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model. Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings. We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity. We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.