Quantum Algorithm for the Multiple String Matching Problem
- URL: http://arxiv.org/abs/2411.14850v1
- Date: Fri, 22 Nov 2024 10:50:43 GMT
- Title: Quantum Algorithm for the Multiple String Matching Problem
- Authors: Kamil Khadiev, Danil Serov
- Abstract summary: We consider a sequence of $m$ strings, denoted by $S$, which we refer to as a dictionary.
The objective is to identify all instances of strings from the dictionary within the text.
We propose a quantum algorithm with $O(n+\sqrt{mL\log n}+m\log n)$ query complexity and $O(n+\sqrt{mL\log n}\log b+m\log n)=O^*(n+\sqrt{mL})$ time complexity.
- Abstract: We consider the Multiple String Matching Problem. The input is a long string $t$ of length $n$, referred to as the text, and a sequence $S$ of $m$ strings, referred to as the dictionary. The total length of all strings in the dictionary is denoted by $L$. The objective is to identify all occurrences of dictionary strings within the text. The standard classical solution to this problem is the Aho-Corasick algorithm, which has $O(n+L)$ query and time complexity. The classical lower bound for the problem matches it: $\Omega(n+L)$. We propose a quantum algorithm with $O(n+\sqrt{mL\log n}+m\log n)$ query complexity and $O(n+\sqrt{mL\log n}\log b+m\log n)=O^*(n+\sqrt{mL})$ time complexity, where $b$ is the maximal length of strings from the dictionary. This improvement is particularly significant for dictionaries comprising long words. The complexity of our algorithm matches the quantum lower bound $\Omega(n + \sqrt{mL})$ up to a log factor. In some sense, our algorithm can be viewed as a quantum analogue of the Aho-Corasick algorithm.
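For reference, the classical baseline named in the abstract can be sketched as follows. This is a minimal illustration of the standard Aho-Corasick automaton, not the quantum algorithm proposed in the paper; the function names `build_automaton` and `find_all` are illustrative choices, and the sketch assumes every dictionary string is non-empty.

```python
from collections import deque


def build_automaton(dictionary):
    """Build the trie, failure links, and output lists for the dictionary."""
    goto = [{}]   # goto[state][char] -> next state (trie edges)
    fail = [0]    # fail[state] -> state of the longest proper suffix in the trie
    out = [[]]    # out[state] -> indices of dictionary strings ending here
    for idx, word in enumerate(dictionary):
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)

    # Breadth-first traversal: shallower states get their links first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, dictionary):
    """Return (start_position, dictionary_index) for every occurrence."""
    goto, fail, out = build_automaton(dictionary)
    matches = []
    state = 0
    for pos, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we hit the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            matches.append((pos - len(dictionary[idx]) + 1, idx))
    return matches


if __name__ == "__main__":
    print(find_all("ushers", ["he", "she", "his", "hers"]))
```

Building the automaton reads every dictionary character, which is where the $L$ term of the classical $O(n+L)$ bound comes from. As a rough comparison, if all $m$ dictionary strings have the same length $b$, then $L = mb$ and $\sqrt{mL} = m\sqrt{b}$, so the gap between the two bounds grows with the word length.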
Related papers
- Near-Optimal Quantum Algorithm for Finding the Longest Common Substring between Run-Length Encoded Strings [0.8057006406834466]
We give a near-optimal quantum algorithm for the longest common substring (LCS) problem between two run-length encoded (RLE) strings.
Our algorithm costs $\tilde{\mathcal{O}}(n^{2/3}/d^{1/6-o(1)}\cdot\mathrm{polylog}(\tilde{n}))$ time, while the query lower bound for the problem is $\tilde{\Omega}(n^{2/3}/d^{1/6})$.
arXiv Detail & Related papers (2024-10-21T15:52:08Z) - Efficient Algorithms for Recognizing Weighted Tree-Adjoining Languages [104.90415092306219]
Four formalisms are equivalent to tree-adjoining grammars (TAG), linear indexed grammars (LIG), pushdown-adjoining automata (PAA) and embedded pushdown automata (EPDA).
We design new algorithms for computing their stringsums (the weight of all derivations of a string) and allsums (the weight of all derivations).
For EPDA, our algorithm is both more space-efficient and time-efficient than the algorithm of Alonso et al. (2001) by factors of $\mathcal{O}(|\Gamma|^2)$ and ...
arXiv Detail & Related papers (2023-10-23T18:26:00Z) - A sublinear time quantum algorithm for longest common substring problem between run-length encoded strings [0.951828574518325]
We give a sublinear quantum algorithm for the longest common substring (LCS) problem on run-length encoded (RLE) inputs.
Our algorithm costs $\tilde{O}(n^{5/6})\cdot O(\mathrm{polylog}(\tilde{n}))$ time, where $n$ and $\tilde{n}$ are the encoded and decoded length of the inputs, respectively.
arXiv Detail & Related papers (2023-10-02T08:14:34Z) - Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets.
We present an improved version of the "$q$-means" algorithm for clustering.
We also present a "dequantized" algorithm for $\varepsilon$-$k$-means which runs in $O\big(\frac{k^2}{\varepsilon^2}(\sqrt{k}d + \log(Nd))\big)$ time.
arXiv Detail & Related papers (2023-08-18T17:52:12Z) - Quantum Algorithms for the Shortest Common Superstring and Text Assembling Problems [11.048346250166073]
We consider two versions of the Text Assembling problem.
We are given a sequence of strings $s^1,\dots,s^n$ of total length $L$ that is a dictionary, and a string $t$ of length $m$ that is a text.
For both problems, we suggest new quantum algorithms that work better than their classical counterparts.
arXiv Detail & Related papers (2023-06-18T14:16:49Z) - Quantum Resources Required to Block-Encode a Matrix of Classical Data [56.508135743727934]
We provide circuit-level implementations and resource estimates for several methods of block-encoding a dense $N\times N$ matrix of classical data to precision $\epsilon$.
We examine resource tradeoffs between the different approaches and explore implementations of two separate models of quantum random access memory (QRAM).
Our results go beyond simple query complexity and provide a clear picture into the resource costs when large amounts of classical data are assumed to be accessible to quantum algorithms.
arXiv Detail & Related papers (2022-06-07T18:00:01Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $\ell$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions $d=1,2$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Classical and Quantum Algorithms for Constructing Text from Dictionary Problem [0.0]
We study algorithms for solving the problem of constructing a text from a dictionary (a sequence of small strings).
The problem has an application in bioinformatics and has a connection with the Sequence assembly method for reconstructing a long DNA sequence from small fragments.
arXiv Detail & Related papers (2020-05-28T22:44:01Z) - Quantum Algorithms for the Most Frequently String Search, Intersection of Two String Sequences and Sorting of Strings Problems [0.0]
We study algorithms for solving three problems on strings.
The first one is the Most Frequently String Search Problem.
The second is finding the intersection of two sequences of strings.
The third problem is sorting of $n$ strings of length $k$.
arXiv Detail & Related papers (2020-01-07T07:22:02Z)