On the Communication Complexity of Approximate Pattern Matching
- URL: http://arxiv.org/abs/2403.18812v2
- Date: Wed, 09 Oct 2024 12:08:17 GMT
- Title: On the Communication Complexity of Approximate Pattern Matching
- Authors: Tomasz Kociumaka, Jakob Nogler, Philip Wellnitz,
- Abstract summary: We prove an upper bound of $O(n/m cdot k log2 m)$ bits, thus establishing the optimal communication complexity up to logarithmic factors.
We also show that $O(n/m cdot k log2 m)$ bits allow encoding, for each $k$-error occurrence of $P$ in $T$, the shortest sequence of edits needed to make the occurrence exact.
- Score: 2.4167127333650202
- License:
- Abstract: The decades-old Pattern Matching with Edits problem, given a length-$n$ string $T$ (the text), a length-$m$ string $P$ (the pattern), and a positive integer $k$ (the threshold), asks to list all fragments of $T$ that are at edit distance at most $k$ from $P$. The one-way communication complexity of this problem is the minimum amount of space needed to encode the answer so that it can be retrieved without accessing the input strings $P$ and $T$. The closely related Pattern Matching with Mismatches problem (defined in terms of the Hamming distance instead of the edit distance) is already well understood from the communication complexity perspective: Clifford, Kociumaka, and Porat [SODA 2019] proved that $\Omega(n/m \cdot k \log(m/k))$ bits are necessary and $O(n/m \cdot k\log (m|\Sigma|/k))$ bits are sufficient; the upper bound allows encoding not only the occurrences of $P$ in $T$ with at most $k$ mismatches but also the substitutions needed to make each $k$-mismatch occurrence exact. Despite recent improvements in the running time [Charalampopoulos, Kociumaka, and Wellnitz; FOCS 2020 and 2022], the communication complexity of Pattern Matching with Edits remained unexplored, with a lower bound of $\Omega(n/m \cdot k\log(m/k))$ bits and an upper bound of $O(n/m \cdot k^3\log m)$ bits stemming from previous research. In this work, we prove an upper bound of $O(n/m \cdot k \log^2 m)$ bits, thus establishing the optimal communication complexity up to logarithmic factors. We also show that $O(n/m \cdot k \log m \log (m|\Sigma|))$ bits allow encoding, for each $k$-error occurrence of $P$ in $T$, the shortest sequence of edits needed to make the occurrence exact. We leverage the techniques behind our new result on the communication complexity to obtain quantum algorithms for Pattern Matching with Edits.
Related papers
- Quantum Algorithm for the Multiple String Matching Problem [0.0]
We consider a sequence of $m$ strings, denoted by $S$, which we refer to as a dictionary.
The objective is to identify all instances of strings from the dictionary within the text.
We propose a quantum algorithm with $O(n+sqrtmLlog nlog n)$ query complexity and $O(n+sqrtmLlog n)=O*(n+sqrtmL)$ time complexity.
arXiv Detail & Related papers (2024-11-22T10:50:43Z) - Near-Optimal-Time Quantum Algorithms for Approximate Pattern Matching [2.4167127333650202]
We consider the two most common distances: Hamming distance in Pattern Matching with Mismatches and edit distance in Pattern Matching with Edits.
We present quantum algorithms with a time complexity of $tildeO(sqrtnk+sqrtn/mcdot k3.5)$ for Pattern Matching with Mismatches and $hatO(sqrtnk+sqrtn/mcdot k3.5)$ for Pattern Matching with Edits.
arXiv Detail & Related papers (2024-10-09T12:05:26Z) - Nearly-Optimal Consensus Tolerating Adaptive Omissions: Why is a Lot of Randomness Needed? [4.465134753953128]
We study the problem of reaching agreement in a synchronous distributed system by $n$ autonomous parties, when the communication links from/to faulty parties can omit messages.
We design a randomized algorithm that works in $O(sqrtnlog2 n)$ rounds and sends $O(n2log3 n)$ communication bits.
We prove that no MC algorithm can work in less than $Omega(fracn2maxR,nlog n)$ rounds if it uses less than $O(R)$ calls to
arXiv Detail & Related papers (2024-05-08T02:17:10Z) - Near-Optimal Quantum Algorithms for Bounded Edit Distance and Lempel-Ziv
Factorization [2.684542790908823]
We present a quantum $tildeO(sqrtnk+k2)$-time algorithm that uses $tildeO(sqrtnz)$ queries, where $tildeO(cdot)$ hides polylogarithmic factors.
Our second main contribution is a quantum algorithm that achieves the optimal time complexity of $tildeO(sqrtnz)$.
arXiv Detail & Related papers (2023-11-03T09:09:23Z) - Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning [54.806166861456035]
We study the episodic reinforcement learning (RL) problem modeled by finite-horizon Markov Decision Processes (MDPs) with constraint on the number of batches.
We design a computational efficient algorithm to achieve near-optimal regret of $tildeO(sqrtSAH3Kln (1/delta))$tildeO(cdot) hides logarithmic terms of $(S,A,H,K)$ in $K$ episodes.
Our technical contribution are two-fold: 1) a near-optimal design scheme to explore
arXiv Detail & Related papers (2022-10-15T09:22:22Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(log T)$ regret with just $O(sqrtT)$ hints under a natural query model.
We also show that $o(sqrtT)$ hints cannot guarantee better than $Omega(sqrtT)$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $ellsqrtbinomdell (1+log n)ell-1,$ sum to at most $cellsqrtbinomdell (1+log n)ell-1,$ where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - The quantum query complexity of composition with a relation [78.55044112903148]
Belovs gave a modified version of the negative weight adversary method, $mathrmADV_relpm(f)$.
A relation is efficiently verifiable if $mathrmADVpm(f_a) = o(mathrmADV_relpm(f))$ for every $a in [K]$.
arXiv Detail & Related papers (2020-04-14T12:07:20Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the
Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n2)$-upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both emphcompatible with oracles, and achieves complexity bounds of $tildeO(n2+nsqrtL/mu)log (1/epsilon)$ and $O(nsqrtL/epsilon)$, for $mu>0$ and $mu=0$
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.