Related papers: SynJax: Structured Probability Distributions for JAX

URL: http://arxiv.org/abs/2308.03291v3
Date: Sun, 15 Oct 2023 23:51:32 GMT
Title: SynJax: Structured Probability Distributions for JAX
Authors: Miloš Stanojević and Laurent Sartran
Abstract summary: SynJax provides an efficient vectorized implementation of inference algorithms for structured distributions. With it, we can build large-scale differentiable models that explicitly model structure in the data.
Score: 3.4447129363520337
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The development of deep learning software libraries enabled significant progress in the field by allowing users to focus on modeling, while letting the library take care of the tedious and time-consuming task of optimizing execution for modern hardware accelerators. However, this has benefited only particular types of deep learning models, such as Transformers, whose primitives map easily to vectorized computation. The models that explicitly account for structured objects, such as trees and segmentations, did not benefit equally because they require custom algorithms that are difficult to implement in a vectorized form. SynJax directly addresses this problem by providing an efficient vectorized implementation of inference algorithms for structured distributions covering alignment, tagging, segmentation, constituency trees and spanning trees. This is done by exploiting the connection between algorithms for automatic differentiation and probabilistic inference. With SynJax we can build large-scale differentiable models that explicitly model structure in the data. The code is available at https://github.com/google-deepmind/synjax
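The connection between automatic differentiation and probabilistic inference mentioned in the abstract can be illustrated with a minimal JAX sketch. This is not SynJax's API: the toy linear-chain tagging model, the helper names `log_partition` and `brute_force_marginals`, and the random scores are all assumptions made for illustration. The sketch relies on a standard fact: the gradient of the log-partition function with respect to a score equals the marginal probability of the part that score belongs to.

```python
import itertools

import jax
import numpy as np
from jax.scipy.special import logsumexp


def log_partition(unary, trans):
    """Forward algorithm for a toy linear-chain model (hypothetical helper).

    unary: (n, k) per-position tag scores; trans: (k, k) transition scores.
    """
    alpha = unary[0]
    for t in range(1, unary.shape[0]):
        # alpha[j] = logsumexp_i(alpha[i] + trans[i, j]) + unary[t, j]
        alpha = logsumexp(alpha[:, None] + trans, axis=0) + unary[t]
    return logsumexp(alpha)


def brute_force_marginals(unary, trans):
    """Reference tag marginals obtained by enumerating every tag sequence."""
    unary, trans = np.asarray(unary), np.asarray(trans)
    n, k = unary.shape
    seqs = list(itertools.product(range(k), repeat=n))
    scores = np.array([
        unary[np.arange(n), list(s)].sum()
        + sum(trans[a, b] for a, b in zip(s, s[1:]))
        for s in seqs
    ])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    marg = np.zeros((n, k))
    for p, s in zip(probs, seqs):
        marg[np.arange(n), list(s)] += p
    return marg


# Random scores for a sequence of 4 positions and 3 tags (illustrative only).
key_u, key_t = jax.random.split(jax.random.PRNGKey(0))
unary = jax.random.normal(key_u, (4, 3))
trans = jax.random.normal(key_t, (3, 3))

# Differentiating log Z with respect to the unary scores yields the
# per-position tag marginals, with no hand-written backward pass.
marginals = jax.grad(log_partition)(unary, trans)
```

Each row of `marginals` is a distribution over tags at one position, and the result can be compared against `brute_force_marginals(unary, trans)` on small inputs. The enumeration grows exponentially with sequence length, so it serves only as a check on the differentiated forward pass.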

Related papers

A projection-based framework for gradient-free and parallel learning [50.96641619247761]
We introduce PJAX, a JAX-based software framework that enables this paradigm. PJAX composes projection operators for elementary operations, automatically deriving the solution operators for the feasibility problems. We train diverse architectures (MLPs, CNNs, RNNs) on standard benchmarks using PJAX, demonstrating its generality.
arXiv Detail & Related papers (2025-06-06T08:44:56Z)
NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding [54.88765757043535]
This work rethinks data structures for statistical n-gram language models to enable fast and parallel operations for GPU-optimized inference. Our approach, named NGPU-LM, introduces customizable greedy decoding for all major ASR model types with less than 7% computational overhead. The proposed approach can eliminate more than 50% of the accuracy gap between greedy and beam search for out-of-domain scenarios while avoiding significant slowdown caused by beam search.
arXiv Detail & Related papers (2025-05-28T20:43:10Z)
A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs [50.982315553104975]
Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms. Most SMMs are manually built by human experts using bottom-up procedures. We propose a novel graph-based algorithm that automatically generates conceptual spaces and SMMs in a top-down manner.
arXiv Detail & Related papers (2024-12-02T12:06:41Z)
BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving [11.596474985695679]
We release the StructuredOR dataset, annotated with comprehensive labels that capture the complete mathematical modeling process. We propose BPP-Search, an algorithm that integrates reinforcement learning into a tree-of-thought structure.
arXiv Detail & Related papers (2024-11-26T13:05:53Z)
Show and Grasp: Few-shot Semantic Segmentation for Robot Grasping through Zero-shot Foundation Models [5.792788640304759]
The ability of a robot to pick an object, known as robot grasping, is crucial for several applications. In such tasks, selecting the right target to pick is as essential as inferring a correct configuration of the gripper. A common solution to this problem relies on semantic segmentation models, which often show poor generalization to unseen objects.
arXiv Detail & Related papers (2024-04-19T08:58:52Z)
CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity [110.50364486645852]
Communication complexity has become a major bottleneck for speeding up training and scaling up machine numbers. We propose Common Random Reconstruction (CORE), which can be used to compress information transmitted between machines.
arXiv Detail & Related papers (2023-09-23T08:45:27Z)
Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags. Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other. We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
pyGSL: A Graph Structure Learning Toolkit [14.000763778781547]
pyGSL is a Python library that provides efficient implementations of state-of-the-art graph structure learning models. pyGSL is written in GPU-friendly ways, allowing one to scale to much larger network tasks.
arXiv Detail & Related papers (2022-11-07T14:23:10Z)
Machine Learning Techniques to Construct Patched Analog Ensembles for Data Assimilation [0.0]
We study general and variational autoencoders for the machine learning component of cAnEnOI. We propose using patching schemes to divide the global spatial domain into digestible chunks. Testing this new algorithm on a 1D toy model, we find that larger patch sizes make it harder to train an accurate generative model.
arXiv Detail & Related papers (2021-02-27T20:47:27Z)
Captum: A unified and generic model interpretability library for PyTorch [49.72749684393332]
We introduce a novel, unified, open-source model interpretability library for PyTorch. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms. It can be used for both classification and non-classification models.
arXiv Detail & Related papers (2020-09-16T18:57:57Z)
PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives. We develop novel data reuse analysis algorithms using the polyhedral model. We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
Efficient Second-Order TreeCRF for Neural Dependency Parsing [23.426500262860777]
In the deep learning (DL) era, parsing models are extremely simplified with little loss in performance. This paper presents a second-order TreeCRF extension to the biaffine parser. We propose an effective way to batchify the inside and Viterbi algorithms for direct large matrix operation.
arXiv Detail & Related papers (2020-05-03T03:18:59Z)
Learning Autoencoders with Relational Regularization [89.53065887608088]
A new framework is proposed for learning autoencoders of data distributions. We minimize the discrepancy between the model and target distributions, with a relational regularization. We implement the framework with two scalable algorithms, making it applicable for both probabilistic and deterministic autoencoders.
arXiv Detail & Related papers (2020-02-07T17:27:30Z)
Torch-Struct: Deep Structured Prediction Library [138.5262350501951]
We introduce Torch-Struct, a library for structured prediction. Torch-Struct includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API.
arXiv Detail & Related papers (2020-02-03T16:43:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.