Higher-order Derivatives of Weighted Finite-state Machines
- URL: http://arxiv.org/abs/2106.00749v2
- Date: Wed, 27 Sep 2023 18:02:34 GMT
- Title: Higher-order Derivatives of Weighted Finite-state Machines
- Authors: Ran Zmigrod, Tim Vieira, Ryan Cotterell
- Abstract summary: This work examines the computation of higher-order derivatives with respect to the normalization constant for weighted finite-state machines.
We provide a general algorithm for evaluating derivatives of all orders, which has not been previously described in the literature.
Our algorithm is significantly faster than prior algorithms.
- Score: 68.43084108204741
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weighted finite-state machines are a fundamental building block of NLP
systems. They have withstood the test of time -- from their early use in noisy
channel models in the 1990s up to modern-day neurally parameterized conditional
random fields. This work examines the computation of higher-order derivatives
with respect to the normalization constant for weighted finite-state machines.
We provide a general algorithm for evaluating derivatives of all orders, which
has not been previously described in the literature. In the case of
second-order derivatives, our scheme runs in the optimal $\mathcal{O}(A^2 N^4)$
time where $A$ is the alphabet size and $N$ is the number of states. Our
algorithm is significantly faster than prior algorithms. Additionally, our
approach leads to a significantly faster algorithm for computing second-order
expectations, such as covariance matrices and gradients of first-order
expectations.
Related papers
- Fast Minimization of Expected Logarithmic Loss via Stochastic Dual
Averaging [8.990961435218544]
We propose a first-order algorithm named $B$-sample dual averaging with the logarithmic barrier.
For the Poisson inverse problem, our algorithm attains an $varepsilon$ solution in $smashtildeO(d3/varepsilon2)$ time.
When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $varepsilon$-optimal solution in $smashtildeO(d3/varepsilon2)$ time.
arXiv Detail & Related papers (2023-11-05T03:33:44Z) - On adaptive low-depth quantum algorithms for robust multiple-phase
estimation [11.678822620192438]
We present robust multiple-phase estimation (RMPE) algorithms with Heisenberg-limited scaling.
These algorithms are particularly suitable for early fault-tolerant quantum computers.
arXiv Detail & Related papers (2023-03-14T17:38:01Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithm that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z) - Improved Rate of First Order Algorithms for Entropic Optimal Transport [2.1485350418225244]
This paper improves the state-of-the-art rate of a first-order algorithm for solving entropy regularized optimal transport.
We propose an accelerated primal-dual mirror descent algorithm with variance reduction.
Our algorithm may inspire more research to develop accelerated primal-dual algorithms that have rate $widetildeO(n2/epsilon)$ for solving OT.
arXiv Detail & Related papers (2023-01-23T19:13:25Z) - Private estimation algorithms for stochastic block models and mixture
models [63.07482515700984]
General tools for designing efficient private estimation algorithms.
First efficient $(epsilon, delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z) - Quantum Algorithms for Sampling Log-Concave Distributions and Estimating
Normalizing Constants [8.453228628258778]
We develop quantum algorithms for sampling logconcave distributions and for estimating their normalizing constants.
We exploit quantum analogs of the Monte Carlo method and quantum walks.
We also prove a $1/epsilon1-o(1)$ quantum lower bound for estimating normalizing constants.
arXiv Detail & Related papers (2022-10-12T19:10:43Z) - Scheduling with Speed Predictions [10.217102227210669]
We study the speed-robust scheduling problem where the speeds of the machines, instead of the processing times of the jobs, are unknown.
Our main result is an algorithm that achieves a $mineta2 (1+epsilon)2 (1+epsilon)(2 + 2/alpha)$ approximation.
When the predictions are accurate, this approximation improves over the previously best known approximation of $2-1/m$ for speed-robust scheduling.
arXiv Detail & Related papers (2022-05-02T23:39:41Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-critical optimization that finds an $O(epsilon)$epsilon point in the optimal product.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.