Upper and lower bounds for dynamic data structures on strings

We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give both conditional and unconditional lower bounds for variants of exact matching with wildcards, inner product, and Hamming distance computation via a sequence of reductions. As an example, we show that there does not exist an $O(m^{1/2-\varepsilon})$ time algorithm for a large range of these problems unless the online Boolean matrix-vector multiplication conjecture is false. We also provide nearly matching upper bounds for most of the problems we consider.


Introduction
The search for lower bounds provides one of the greatest challenges in computer science. Progress in finding better truly unconditional lower bounds continues in slow but steady steps. There appears however, in the short term at least, to be no realistic prospect of finding unconditional lower bounds which are polynomial in the size of the input. One of the most exciting discoveries in recent years has been that such polynomial lower bounds can be given for a range of problems in P conditional on the hardness of a small set of well-known problems that are conjectured to be hard [2,3,14,10,1,16]. The underlying assumptions include the Strong Exponential Time Hypothesis (SETH), 3-SUM and online Boolean matrix-vector product (OMv).
In this paper we study the hardness of a number of simply stated dynamic string problems and show both conditional lower bounds based on the OMv conjecture (see Conjecture 1 for a precise statement) as well as unconditional lower bounds. We will also give new upper bounds which in many cases will nearly match our new conditional lower bounds. Each problem will have the following form.
Problem 1. Consider a text $T$ of length $n$ and a pattern $P$ of length $m$. An update to the pattern (or text) is a pair $(j, \sigma)$ which indicates that the letter at index $j$ in the pattern (or text) is to be substituted with the letter $\sigma$. The task is to develop a dynamic data structure on $P$ and $T$ that supports the following queries: given a position $i$ of $T$, output $f(P, T[i, \ldots, i+m-1])$.
Unless stated otherwise, we allow updates to both the pattern $P$ and the text $T$. The different functions $f$ we will consider are Hamming distance (DynHD), inner product (DynIP) and exact matching with wildcards (DynEM). These functions have formed the core of pattern matching with errors and wildcards for many years and have been extensively studied in the standard offline pattern matching setting and, to a lesser extent, in the online and streaming settings. To the best of our knowledge, this is the first exploration of the complexity of pattern matching with errors and wildcards as a fully dynamic data structure problem. By way of preparation, we give $O(\sqrt{m \log m})$ query and update times for exact inner product, exact matching with wildcards, and dynamic Hamming distance over constant-sized alphabets, as well as an $O(m^{3/4} \log^{1/4} m)$-time algorithm for dynamic Hamming distance over polynomial-size alphabets. These algorithms are derived via a lazy rebuilding scheme. We then show in Theorem 4 that there does not exist an $O(m^{1/2-\varepsilon})$ time solution to any of these problems unless the online Boolean matrix-vector multiplication conjecture is false. The lower bound for dynamic exact matching with wildcards is particularly interesting as it is exponentially higher than the known $O(\log m)$ time complexity for dynamic exact matching without wildcards.
Our conditional lower bound also extends to $(1+\varepsilon)$-approximate DynIP, DynIP modulo 2 and, remarkably, to DynHD modulo 2 with a ternary input alphabet. This latter result is in stark contrast to the complexity of DynHD modulo 2 with a binary input alphabet, which we show in Lemma 7 can be solved in $O(\log m/\log\log m)$ query and update time.
We complement all these conditional lower bounds with a set of unconditional lower bounds derived via reductions from different 2d-dynamic range counting problems. First we show that DynIP is at least as hard as weighted 2d-range counting. As a result, we get an unconditional lower bound of $\Omega((\log m/\log\log m)^2)$ for DynIP. This matches the highest unconditional lower bound known for any dynamic data structure problem. We then go on to show $\Omega((\log^{1/2} m/\log\log m)^3)$ unconditional lower bounds for DynHD over binary alphabets, DynIP modulo 2 over binary alphabets and DynHD modulo 2 over ternary alphabets. These lower bounds are derived from a recent breakthrough in the complexity of the unweighted version of 2d-range counting. To finish our unconditional lower bounds we then show $\Omega(\log m/\log\log m)$ unconditional lower bounds for DynHD modulo 2 over binary alphabets, DynEM and $(1+\varepsilon)$-approximate DynIP.
As our final set of dynamic problems, we move on to consider $(1+\varepsilon)$-approximate DynHD, for which we do not have matching conditional lower bounds, despite its superficial similarity to approximate DynIP. Unlike for approximate DynIP and exact DynHD, in Section 4 we show markedly different upper bounds for approximate DynHD depending on whether updates may occur in only one of the pattern and the text, or in both. For the former case we derive $O(\varepsilon^{-c}\,\mathrm{polylog}\, m)$ time algorithms via Johnson-Lindenstrauss sketching. The exact value of $c$ depends on the size of the input alphabet, and in fact for some update operations the dependency of the running time on $\log m$ is completely removed. For the latter case, with updates in both the pattern and text, our upper bound is $O(\varepsilon^{-2}\sqrt{m}\,\mathrm{polylog}\, m)$ time. It is an interesting open question whether there exist matching conditional lower bounds for these versions of approximate DynHD as well. We give a summary of the results in Table 1.

Related work
In the dynamic setting we consider, with single character updates, the most closely related previous work considers the problem of dynamic exact matching. In [8] an $O(\log\log m)$ time algorithm was shown for dynamic exact matching when updates are only permitted in the text. In [5] a more general data structure was developed supporting insertion and deletion of characters and movements of arbitrarily large blocks of text. This was improved in a succession of papers culminating in the work of [21], which gives a data structure that supports, amongst other operations, concatenation, splitting and equality testing in $O(\log m)$ update and $O(1)$ query time. The same data structure solves, for example, the dynamic exact matching problem without wildcards in $O(\log m)$ time. At the expense of $O(\log^2 m)$-time updates this latter work also supports finding occurrences of a specified pattern $P$ in $O(|P|)$ time. A separate line of work has considered the static data structure problem of text indexing for approximate matching [12,7,19,13,23,13,9].
There have also been a number of papers on conditional hardness for other types of string problems. Larsen et al. proved lower bounds for document retrieval and forbidden pattern document retrieval conditional on the hardness of Boolean matrix multiplication [29]. Later, Kopelowitz et al. showed 3SUM-conditional lower bounds for these two problems [26]. Backurs and Indyk [10] proved lower bounds for computing the edit distance of two strings. Bringmann and Künnemann [16] proved lower bounds for dynamic time warping and longest common subsequence. Finally, Backurs and Indyk [11] and follow-up work by Bringmann et al. [15] proved conditional lower bounds for regular expression matching.
In the DynEM problem we assume that $P$ and $T$ are strings over $\Sigma \cup \{?\}$, where $\Sigma$ is an integer alphabet and ? is a special wildcard symbol that matches any letter in $\Sigma$. We define $f(P, T[i, \ldots, i+m-1])$ to be equal to zero if $P$ matches $T[i, \ldots, i+m-1]$ and the number of mismatching positions otherwise. We define $n$ to be the length of the text, and $m$ to be the length of the pattern, $n \ge m$.
We will in fact present a general solution for dynamic string problems where $f$ can be represented in a particular form. DynIP, DynHD and DynEM will be seen as special cases. The restriction is simply that $f(P, T[i, \ldots, i+m-1]) = \sum_{j=1}^{m} g(P[j], T[i+j-1])$, where the function $g$ can be evaluated in constant time. This functional form is closely related to the idea of local distance functions that were key to the development of fast streaming pattern matching algorithms [18]. We first show that our string problems do indeed satisfy the stated requirements.

Lemma 1. If $f$ is inner product, Hamming distance, or exact matching with wildcards, then there exists a function $g$ such that $f(P, T[i, \ldots, i+m-1]) = \sum_{j=1}^{m} g(P[j], T[i+j-1])$, where the function $g$ can be evaluated in constant time.
Proof. If $f$ is inner product, we put $g(P_j, T_{i+j-1}) = P_j \cdot T_{i+j-1}$. In the case of Hamming distance, we define $g(P_j, T_{i+j-1}) = 0$ if $P_j = T_{i+j-1}$ and $g(P_j, T_{i+j-1}) = 1$ otherwise.
For DynEM we assume that wildcards are represented by the value 0. It is not hard to see that we can take $g(P_j, T_{i+j-1})$ to be the characteristic function of $(P_j - T_{i+j-1})^2 P_j T_{i+j-1} > 0$, and indeed this observation is the basis for one of the fastest offline exact matching with wildcards algorithms [17]. The key property we use is that either (a) one of $P_j$ and $T_{i+j-1}$ is a wildcard or $P_j = T_{i+j-1}$, and then $g(P_j, T_{i+j-1}) = 0$, or (b) neither letter is a wildcard and $P_j \neq T_{i+j-1}$, and then $g(P_j, T_{i+j-1}) > 0$. It follows that $f(P, T[i, \ldots, i+m-1])$ equals zero if and only if $P$ and $T[i, \ldots, i+m-1]$ match.
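The three choices of $g$ in the proof of Lemma 1 can be written down directly. The following is a minimal sketch, assuming 0-based indices, letters given as positive integers, and the wildcard encoded as 0 as in the proof; the function names are illustrative and do not come from the paper.

```python
WILDCARD = 0  # wildcards are represented by the value 0, as in the proof


def g_inner_product(p: int, t: int) -> int:
    return p * t


def g_hamming(p: int, t: int) -> int:
    return 0 if p == t else 1


def g_wildcard(p: int, t: int) -> int:
    # Characteristic function of (p - t)^2 * p * t > 0.
    # The product vanishes when either letter is the wildcard 0 or p == t.
    return 1 if (p - t) ** 2 * p * t > 0 else 0


def f(g, pattern, text, i):
    """Evaluate f(P, T[i..i+m-1]) for a 0-based start position i."""
    return sum(g(pattern[j], text[i + j]) for j in range(len(pattern)))
```

With these definitions, `f(g_wildcard, P, T, i)` is zero exactly when the pattern matches at position `i`, and otherwise counts the mismatching positions.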
We now show a solution for all dynamic string problems defined by a function $f$ that can be represented in the form above. We consider the most general update model, where we are allowed to update both the text and the pattern.
Theorem 1. Let $T$ be a text of length $n$, and $P$ be a pattern of length $m$. Assume $f$ can be represented as $f(P, T[i, \ldots, i+m-1]) = \sum_{j=1}^{m} g(P[j], T[i+j-1])$, where $g$ can be computed in constant time, and the values $f(P, T[1, \ldots, m]), f(P, T[2, \ldots, m+1]), \ldots, f(P, T[n-m+1, \ldots, n])$ can be computed in $T(n)$ time and $S(n)$ space. We can then solve the corresponding dynamic string problem in $O(\sqrt{T(n)})$ worst case update/query time using $O(S(n) + n)$ space.
Proof. Let us first show a solution with $O(\sqrt{T(n)})$ amortised time. We start by computing the values $A[i] = f(P, T[i, \ldots, i+m-1])$ in $T(n)$ time and $S(n)$ space. At all times, we maintain a list of updates $U$ that have occurred since the last moment we recomputed the values $A[i]$. Suppose that the size of $U$ is at most $\lceil\sqrt{T(n)}\rceil$ and a query $i$ arrives. We can then compute the answer from $A[i]$ and $U$ in the following way. We initialise $A'[i] = A[i]$, and consider each update in order. Suppose that an update changes the letter at position $k$ of $P$ or at position $k$ of $T[i, \ldots, i+m-1]$, and let $P'_k$ and $T'_{i+k-1}$ be the letters at these positions after the update. We remember $P'_k$ and $T'_{i+k-1}$, and set $A'[i] = A'[i] - g(P_k, T_{i+k-1}) + g(P'_k, T'_{i+k-1})$, where $P_k$ and $T_{i+k-1}$ are the letters before the update. Since $g$ can be evaluated in constant time, this step takes constant time as well. Therefore, the time to perform each query is $O(\sqrt{T(n)})$. When the size of $U$ reaches $\lceil\sqrt{T(n)}\rceil$, we apply the updates in $U$ to $T$ and $P$, empty $U$, and recompute the values $A[i]$ from scratch. The amortised cost of an update is therefore $O(T(n)/\sqrt{T(n)}) = O(\sqrt{T(n)})$.

We can de-amortise the solution in a standard way. Namely, we restart the computation of the values $A[i]$ every $\lceil\sqrt{T(n)}/2\rceil$ updates, and run $\Theta(\sqrt{T(n)})$ steps of the computation for each of the $\lceil\sqrt{T(n)}/2\rceil$ subsequent updates. While the computation is not over, we make use of the previously computed values $f(P, T[1, \ldots, m]), \ldots, f(P, T[n-m+1, \ldots, n])$ to answer queries. As before, we will need to correct the value of the function $g$ in at most $\lceil\sqrt{T(n)}/2\rceil$ positions. Note that apart from the space we need for computing the values $A[i]$, we use only $O(n)$ space.

Proof of Lemma 3. For both problems, $T(n) = O(n \log n) = O(m \log m)$. For inner product, this is a direct corollary of the FFT algorithm. The bound for exact matching with wildcards was demonstrated in [20,17]. The claim follows from Lemma 1 and Theorem 1.
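The amortised version of the lazy rebuilding scheme in the proof of Theorem 1 can be sketched as follows. This is an illustration under stated assumptions rather than the construction itself: indices are 0-based, a naive $O(nm)$ recomputation stands in for the $T(n)$-time offline algorithm, the rebuild threshold is the square root of that naive cost, and the class and method names are hypothetical. The de-amortisation step is not shown.

```python
import math


class LazyDynamicString:
    """Lazy rebuilding for f = sum_j g(P[j], T[i+j]) with 0-based indices."""

    def __init__(self, pattern, text, g):
        self.P, self.T, self.g = list(pattern), list(text), g
        self.pending = {}  # ('P' or 'T', index) -> letter not yet applied
        self._rebuild()

    def _rebuild(self):
        # Apply the buffered updates, then recompute every A[i] from scratch.
        for (which, k), letter in self.pending.items():
            (self.P if which == 'P' else self.T)[k] = letter
        self.pending = {}
        m, n = len(self.P), len(self.T)
        # Assumption: naive recomputation replaces the T(n)-time algorithm.
        self.A = [sum(self.g(self.P[j], self.T[i + j]) for j in range(m))
                  for i in range(n - m + 1)]
        self.budget = max(1, math.isqrt(n * m))

    def update(self, which, k, letter):
        self.pending[(which, k)] = letter
        if len(self.pending) >= self.budget:
            self._rebuild()

    def query(self, i):
        m = len(self.P)
        # Pattern offsets j whose aligned pair (P[j], T[i+j]) has changed.
        touched = set()
        for (which, k) in self.pending:
            j = k if which == 'P' else k - i
            if 0 <= j < m:
                touched.add(j)
        value = self.A[i]
        for j in touched:
            new_p = self.pending.get(('P', j), self.P[j])
            new_t = self.pending.get(('T', i + j), self.T[i + j])
            value += self.g(new_p, new_t) - self.g(self.P[j], self.T[i + j])
        return value
```

A query touches each buffered update once, so its cost is proportional to the size of the buffer, which the rebuild threshold keeps bounded.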
We now extend our solution to a general value of $n$. In this case there is also an additional cost of computing the full set of solutions before the first query or update is performed, which we omit from the following theorem.

Proof of Theorem 2. We first partition $T$ into blocks of length $2m$ overlapping by $m$ positions (the last block may be shorter). Note that for each $i$ the string $T[i, \ldots, i+m-1]$ is a substring of one such block, and each position of $T$ belongs to at most two blocks. Suppose that we have a solution for a text of length $2m$ and a pattern of length $m$ with update time $t_u$, query time $t_q$, and space $S$.
(a) If only updates to the text are allowed, we can apply this solution independently to each of the blocks. Note that an update of the text changes at most two blocks, and therefore we obtain a solution for $T$ with update time $O(t_u)$, query time $t_q$, and space $O(\frac{n}{m} \cdot S)$.

(b) If updates are allowed both to the text and to the pattern, an update to the pattern must be applied to every block, and we obtain a solution for $T$ with update time $O(\frac{n}{m} \cdot t_u)$, query time $t_q$, and space $O(\frac{n}{m} \cdot S)$.

The claim follows from Lemmas 2 and 3.
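The block decomposition used in this proof amounts to simple index arithmetic. The sketch below assumes 0-based positions and blocks $[bm, bm + 2m)$ for $b = 0, 1, \ldots$; the helper names are hypothetical.

```python
def block_of_query(i: int, m: int) -> int:
    """Index b of a block [b*m, b*m + 2m) that contains the window [i, i+m)."""
    return i // m


def blocks_of_position(p: int, m: int, n: int) -> list:
    """Blocks (at most two) that contain text position p, for a text of length n."""
    last_block = (n - m) // m  # largest block index any query can map to
    return [b for b in (p // m - 1, p // m) if 0 <= b <= last_block]
```

A text update at position `p` is forwarded to the blocks returned by `blocks_of_position`, and a query at `i` is answered by the single block `block_of_query(i, m)`.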

Upper bounds for dynamic approximate Hamming distance
In this section we develop algorithms for an approximate version of DynHD. We will refer to this version as DynApproxHD. In this problem a query $i$ must return a $(1+\varepsilon)$-approximation of the Hamming distance between $P$ and $T[i, \ldots, i+m-1]$, where $\varepsilon > 0$ is a parameter of the algorithm. Unlike the other problems we have considered, the complexity of DynApproxHD appears to have a strong dependence on whether updates are permitted in only one of the pattern and the text, or in both. At one extreme, when updates are only permitted in the pattern and the input alphabet is binary, we show in Theorem 3 a data structure that takes $O(1/\varepsilon)$ update and $O(1/\varepsilon^2)$ query time. However, if updates can occur in both the pattern and the text, then the complexity increases dramatically to be at least that of exact DynIP, DynEM and DynHD over binary alphabets.
In Section 3 we showed that the DynHD problem can be solved in $O(m^{1/2} \log^{1/2} m)$ query/update time for constant-size alphabets, and in $O(m^{3/4} \log^{1/4} m)$ query/update time for polynomial-size alphabets. We start our exploration of the complexity of DynApproxHD by showing that this dependence on the alphabet size is almost completely removed in this approximate setting. The solution we give is deterministic and is based on the mapping idea of Karloff [25].

Lemma 4 ([25]). Let $\Sigma$ be the alphabet of $P$ and $T$. There exist $\Theta((1/\varepsilon^2) \log^2 n)$ deterministic mappings $\mathrm{map}_j : \Sigma \to \{0, 1\}$ such that a $(1+\varepsilon)$-approximation of the Hamming distance between $P$ and $T$ at a particular alignment can be given by a normalised average of the Hamming distances between $\mathrm{map}_j(P) = \mathrm{map}_j(P_1) \ldots \mathrm{map}_j(P_m)$ and $\mathrm{map}_j(T) = \mathrm{map}_j(T_1) \ldots \mathrm{map}_j(T_n)$ at this alignment. Each mapping can be stored as a look-up table that allows each $\mathrm{map}_j(P_k)$ or $\mathrm{map}_j(T_k)$ to be computed in $O(1)$ time.

Proof of Corollary 1. We consider Karloff's mappings $\mathrm{map}_j$. For each $j$, we run our DynHD solution for constant-size alphabets (Lemma 2) on $\mathrm{map}_j(P)$ and $\mathrm{map}_j(T)$. The claim immediately follows.
We now present several randomised solutions for DynApproxHD in two special update models where we are allowed to update either only the text or only the pattern. We first assume a binary input alphabet, and then show how to extend our solutions to constant-size and then later polynomial-size alphabets as well. Each answer is correct with constant probability.
Proof of Theorem 3. Let us first assume the input alphabet is of constant size. We will make use of the sparse Johnson-Lindenstrauss transform by Kane and Nelson [24] defined by a random $\Theta(1/\varepsilon^2) \times n$ matrix $M$ such that its entries are from $\{-1, 0, 1\}$, and each of its columns contains $s = \Theta(1/\varepsilon)$ non-zero entries. The result of a transform, which we call a sketch, is defined to be equal to $s^{-1/2} M \cdot x$. Kane and Nelson showed how to choose a distribution on such matrices such that, with constant probability, the square of the $L_2$ norm of the difference of the sketches of two strings gives a $(1+\varepsilon)$-approximation of Hamming distance.

In case (b), the $(1+\varepsilon)$-approximation of the Hamming distance between each canonical substring $S_i$ and the corresponding substring of $P$ is computed from the sketches, and we sum up all approximations to obtain the answer. Since the probability of error on each pair of substrings is $\Theta(1/\log m)$, the total error probability is constant by the union bound.
Both algorithms can be extended to work for any constant-sized alphabet by expanding the input alphabet in unary. That is, we replace the letter $i$ with a binary vector $0 \ldots 010 \ldots 0$, where the set bit is in the $i$-th position.
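The reason a single-letter update costs only $O(s) = O(1/\varepsilon)$ time is that the sketch is linear and each column of $M$ has $s$ non-zero entries. The following sketch illustrates this with a simplified column-sparse sign matrix. It is an assumption-laden stand-in: the matrix distribution below is not the Kane-Nelson construction and carries no approximation guarantee, sketches are kept unscaled as integers, and all names are hypothetical.

```python
import random


def make_sparse_sign_matrix(rows, cols, s, rng):
    """For each column, s distinct rows with random signs in {-1, +1}."""
    return [[(row, rng.choice((-1, 1))) for row in rng.sample(range(rows), s)]
            for _ in range(cols)]


def sketch(matrix, rows, x):
    """Unscaled sketch M * x; the factor s^(-1/2) is applied in estimate()."""
    out = [0] * rows
    for col, value in enumerate(x):
        for row, sign in matrix[col]:
            out[row] += sign * value
    return out


def apply_update(sk, matrix, col, old, new):
    """Change x[col] from old to new by touching only s sketch entries."""
    for row, sign in matrix[col]:
        sk[row] += sign * (new - old)


def estimate(sk_a, sk_b, s):
    """Squared L2 norm of the difference of the scaled sketches."""
    return sum((a - b) ** 2 for a, b in zip(sk_a, sk_b)) / s
```

For binary strings the squared $L_2$ distance equals the Hamming distance, which is why `estimate` is used as the distance estimator in this illustration.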
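Corollary 2. For a text $T$ of length $n \ge m$, and a pattern $P$ of length $m$, and $\varepsilon > 1/n$, there is a randomised data structure for the DynApproxHD problem over polynomial-size alphabets with (a) $O((1/\varepsilon^3) \cdot \mathrm{polylog}\, n)$ update time, $O((1/\varepsilon^4) \cdot \mathrm{polylog}\, n)$ query time, and $O((1/\varepsilon^4) \cdot n\, \mathrm{polylog}\, n)$ space if only updates to the pattern are allowed; (b) $O((1/\varepsilon^4) \cdot \mathrm{polylog}\, n)$ update time, $O((1/\varepsilon^4) \cdot \mathrm{polylog}\, n)$ query time, and $O((1/\varepsilon^4) \cdot n\, \mathrm{polylog}\, n)$ space if only updates to the text are allowed. Each answer is correct with constant probability.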

Lower bounds
In this section we demonstrate conditional and unconditional lower bounds for different variants of DynEM, DynIP, and DynHD. The conditional lower bounds are derived from the hardness of a well-known problem, online Boolean matrix-vector product (OMv). Fig. 1 summarises the reductions we use.

Reductions between DynIP, DynHD and DynHD modulo 2
Before we get to our main lower bound results we will first establish the relationship between some of the dynamic string problems we consider.
Lemma 5. DynHD is at least as hard as DynIP over binary alphabets.
Proof. We map the input alphabet of the text and the pattern separately. Take an instance of DynIP where the input alphabet is binary. In order to transform it into an instance of DynHD, each 1 in the pattern or text is mapped to the string 111 in the DynHD instance. Similarly, a 0 in the pattern is mapped to the string 010 and a 0 in the text is mapped to the string 100. This transformation ensures that any two symbols that align in the DynIP instance will give Hamming distance 2 in the DynHD instance except when two 1s align. In this case the Hamming distance will be 0. We can therefore infer the inner product from the Hamming distance: if $m$ is the length of the original pattern and $h$ is the Hamming distance in the transformed instance, then $h$ equals twice the number of aligned pairs that are not both 1, so the inner product is equal to $m - h/2$.
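The encoding in the proof of Lemma 5 can be checked mechanically. The sketch below assumes 0-based alignments and uses a brute-force Hamming distance as a stand-in for a DynHD query; the function names are illustrative.

```python
def encode_pattern(bits):
    table = {1: (1, 1, 1), 0: (0, 1, 0)}  # pattern: 1 -> 111, 0 -> 010
    return [s for b in bits for s in table[b]]


def encode_text(bits):
    table = {1: (1, 1, 1), 0: (1, 0, 0)}  # text: 1 -> 111, 0 -> 100
    return [s for b in bits for s in table[b]]


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def inner_product_via_hamming(pattern, text, i):
    """Recover the inner product at alignment i from one Hamming distance."""
    m = len(pattern)
    p3, t3 = encode_pattern(pattern), encode_text(text)
    window = t3[3 * i: 3 * (i + m)]  # alignment i maps to position 3i
    return m - hamming(p3, window) // 2
```

A single-letter update in the DynIP instance becomes three single-letter updates in the DynHD instance, so the reduction preserves update and query times up to constant factors.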
We will later show both conditional and unconditional lower bounds not only for DynIP but also for DynIP modulo 2. The following two lemmas will lead to perhaps our most surprising result, which is that DynHD modulo 2 over ternary alphabets is exponentially harder to solve than DynHD modulo 2 over a binary alphabet. It is worth emphasising by way of contrast that in the standard offline pattern matching setting, the asymptotic complexity of computing the Hamming distance at all alignments of a pattern and text is identical for any constant-sized input alphabet.

Lemma 6. DynHD modulo 2 over a ternary alphabet is at least as hard as DynIP modulo 2 over a binary alphabet.
Proof. We again map the input alphabet of the text and pattern separately. Take an instance of DynIP modulo 2 where the input alphabet is binary. Each 1 in the pattern is mapped to the string 22 and each 0 in the pattern is mapped to the string 01. Each 1 in the text is mapped to the string 11 and each 0 in the text is mapped to the string 02. This transformation ensures that any two symbols that align in the DynIP modulo 2 instance will give Hamming distance 1 in the DynHD modulo 2 instance, except when two 1s align in the DynIP modulo 2 instance, in which case the resulting Hamming distance is 2. The Hamming distance $h$ in the transformed instance therefore equals $m$ plus the inner product, where $m$ is the length of the original pattern, and the inner product modulo 2 is equal to $(m - h) \bmod 2$.
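The ternary encoding of Lemma 6 can be verified in the same way. This is a small sketch with 0-based alignments, where the Hamming distance modulo 2 is computed by brute force in place of a DynHD modulo 2 query; the function name is illustrative.

```python
def ip_mod2_via_hd_mod2(pattern, text, i):
    """Inner product mod 2 at alignment i from one Hamming distance mod 2."""
    m = len(pattern)
    # pattern: 1 -> 22, 0 -> 01; text: 1 -> 11, 0 -> 02
    p2 = [s for b in pattern for s in ((2, 2) if b else (0, 1))]
    t2 = [s for b in text[i:i + m] for s in ((1, 1) if b else (0, 2))]
    hd_mod2 = sum(x != y for x, y in zip(p2, t2)) % 2
    return (m - hd_mod2) % 2
```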
However, DynHD modulo 2 over a binary alphabet is much easier than DynHD modulo 2 over a ternary alphabet.
Lemma 7. For a binary text $T$ of length $n \ge m$, and a binary pattern $P$ of length $m$, the DynHD modulo 2 problem can be solved in $O(\log m/\log\log m)$ update/query time using $O(n)$ space. There is a matching unconditional lower bound for update/query time as well.
Proof. As before, we divide the text $T$ into $2m$-length blocks overlapping by $m$ positions. We will show that for each block DynHD modulo 2 can be solved in $O(\log m/\log\log m)$ update/query time using $O(m)$ space, hence giving the claim.
Consider a $2m$-length block of $T$. In order to answer a query at alignment $i$ for DynHD modulo 2 we need only to sum, modulo 2, the number of 1s in the pattern and in the corresponding substring of the text $T[i, \ldots, i+m-1]$. This can be seen via a simple proof by induction as follows. As the base case consider two strings of length 1 and let all arithmetic be over $\mathbb{Z}_2$. In this case the Hamming distance is the sum of the Hamming weights of the two strings. For the inductive step, extend each of these two strings by one bit and observe that the new Hamming distance is the old Hamming distance before extending the strings plus the sum of the two new bits over $\mathbb{Z}_2$.
The Hamming weight of the pattern can be maintained straightforwardly. We argue that answering queries for the Hamming weight of substrings of the block is equivalent to the prefix sum problem modulo 2.
To reduce from this problem to prefix sum we need only observe that we can compute the number of 1s in T [i, . . ., i + m − 1] by subtracting the prefix sum up to index i − 1 from the prefix sum up to index i + m − 1.
To reduce from prefix sum to the DynHD modulo 2 problem we construct a text of length $2m$ with the first half all zeros and the second half as a copy of the prefix sum array. Setting the pattern to all 1s, we can compute the prefix sum modulo 2 up to index $i$ of its array of length $m$ by performing a query at index $i$ of the text. It follows from the upper and lower bounds of [32] that the complexity of DynHD modulo 2 over a binary alphabet is $\Theta(\log m/\log\log m)$.
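The upper-bound direction of Lemma 7 can be illustrated with any dynamic prefix-parity structure. The sketch below uses a Fenwick tree over $\mathbb{Z}_2$, which is an assumption made for brevity: it gives $O(\log m)$ time per operation rather than the $O(\log m/\log\log m)$ bound of [32]. Indices are 0-based and the class names are hypothetical.

```python
class ParityFenwick:
    """Prefix parities of a bit array under single-bit flips."""

    def __init__(self, bits):
        self.n = len(bits)
        self.bits = [0] * self.n
        self.tree = [0] * (self.n + 1)
        for k, b in enumerate(bits):
            if b:
                self.flip(k)

    def flip(self, k):
        self.bits[k] ^= 1
        k += 1
        while k <= self.n:
            self.tree[k] ^= 1
            k += k & -k

    def prefix(self, k):
        """Parity of bits[0:k]."""
        acc = 0
        while k > 0:
            acc ^= self.tree[k]
            k -= k & -k
        return acc


class DynHDMod2Binary:
    def __init__(self, pattern, text):
        self.m = len(pattern)
        self.p = list(pattern)
        self.p_parity = sum(pattern) % 2
        self.t = ParityFenwick(text)

    def update_pattern(self, j, bit):
        if self.p[j] != bit:
            self.p[j] = bit
            self.p_parity ^= 1

    def update_text(self, k, bit):
        if self.t.bits[k] != bit:
            self.t.flip(k)

    def query(self, i):
        # HD mod 2 = (weight of P + weight of T[i..i+m-1]) mod 2
        window = self.t.prefix(i + self.m) ^ self.t.prefix(i)
        return self.p_parity ^ window
```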

Conditional lower bounds
We will now give lower bounds for our dynamic string problems conditional on the hardness of a well-known problem. The OMv problem was introduced in [22] as a means to prove conditional lower bounds for a number of dynamic problems. In this problem we are first given an $r \times r$ Boolean matrix $M$. We then receive $r$ vectors $v_1, \ldots, v_r$, one by one. After seeing each vector $v_i$, we have to output the product $Mv_i$ (over the Boolean semi-ring) before we receive the next vector. A naive algorithm can solve this problem using $O(r^3)$ time in total, with the current fastest solution taking $O(r^3/2^{\Omega(\sqrt{\log r})})$ time [31]. The OMv conjecture is as follows:

Conjecture 1 (OMv Conjecture [22]). For any constant $\varepsilon > 0$, there is no $O(r^{3-\varepsilon})$-time algorithm that solves the OMv problem with error probability of at most 1/3.

Proof of Theorem 4. We first give a reduction from the online Boolean matrix-vector multiplication problem to DynEM. We create a text $T$ of length $2m = 2r^2$ from the matrix $M$ by concatenating the $r$ rows of $M$ one after another and filling the rest of $T$ with the symbol 1 repeated $r^2$ times. Now consider a single Boolean matrix-vector product $Mv_i$. The pattern $P$ has length $m = r^2$. Its first $r$ symbols are a copy of the vector $v_i$ but with all 0s replaced by the wildcard symbol ? and all 1s replaced by the symbol 0. The remaining $r^2 - r$ symbols are set to the wildcard symbol ?. To perform a Boolean matrix-vector multiplication we perform $r$ exact matching with wildcards queries at indices $1, r+1, 2r+1, \ldots, (r-1)r+1$. If the $j$-th of these queries returns a match then $(Mv_i)[j] = 0$, and $(Mv_i)[j] = 1$ otherwise. Each product uses $O(r)$ updates and $r$ queries. It follows that any algorithm for DynEM running in $O(m^{1/2-\varepsilon})$ time for the maximum of query and update time implies an $O(r^{3-\varepsilon})$-time algorithm that solves the online Boolean matrix-vector multiplication problem, thereby contradicting the OMv conjecture.
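The reduction from OMv to DynEM can be traced on small inputs. The sketch below is illustrative only: it uses 0-based indices, a brute-force wildcard matcher in place of a DynEM data structure, and hypothetical function names.

```python
WILD = '?'


def matches(pattern, text, i):
    """Brute-force stand-in for a DynEM query at 0-based position i."""
    return all(p == WILD or t == WILD or p == t
               for p, t in zip(pattern, text[i:i + len(pattern)]))


def omv_via_dynem(M, vectors):
    """Boolean products M v for each v, using only updates and match queries."""
    r = len(M)
    m = r * r
    # Text: the rows of M concatenated, followed by r^2 copies of the symbol 1.
    text = [M[j][k] for j in range(r) for k in range(r)] + [1] * m
    pattern = [WILD] * m
    results = []
    for v in vectors:
        for k in range(r):  # r pattern updates per vector
            pattern[k] = 0 if v[k] == 1 else WILD
        # Row j matches exactly when M[j][k] = 0 wherever v[k] = 1.
        results.append([0 if matches(pattern, text, j * r) else 1
                        for j in range(r)])
    return results
```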
DynIP and DynHD are at least as hard as DynIP modulo 2, so it suffices to show the lower bound for the latter. We give a similar reduction from OMv but this time with an extra randomisation step. We create a text $T$ of length $2m = 2r^2$ from the matrix $M$ by concatenating the $r$ rows of $M$ one after another and filling the rest of $T$ with the symbol 0 repeated $r^2$ times. Now consider a single Boolean matrix-vector product $Mv_i$. We create a pattern $P$ of length $m = r^2$ with the first $r$ symbols being a copy of $v_i$ and the remaining $r^2 - r$ symbols set to 0. We now flip each set bit in $P$ with probability 1/2 and compute inner product modulo 2 queries at indices $1, r+1, 2r+1, \ldots, (r-1)r+1$. If $(Mv_i)[j] = 0$ then the inner product query $j$ will always return 0. If $(Mv_i)[j] = 1$ then the inner product query will return 1 with probability 1/2. This gives a probability of at least 1/2 of giving the correct answer for each $(Mv_i)[j]$. We amplify the probabilities by repeating the randomised procedure $O(\log m)$ times, using the fact that we have one-sided error at each iteration. It then follows that there does not exist an algorithm running in $O(m^{1/2-\varepsilon})$ time for the maximum of query and update time for DynIP modulo 2 unless the OMv conjecture is false.
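The randomised step and its one-sided error can be seen in a few lines. The following sketch assumes 0-based indices and a brute-force inner product in place of a DynIP modulo 2 query; the function name and the `repetitions` parameter are hypothetical.

```python
import random


def omv_row_via_ip_mod2(M, v, repetitions=40, rng=random):
    """One Boolean product M v via randomised inner-product-mod-2 queries."""
    r = len(M)
    m = r * r
    text = [M[j][k] for j in range(r) for k in range(r)] + [0] * m
    out = [0] * r
    for _ in range(repetitions):
        # Keep each set bit of v with probability 1/2, clear it otherwise.
        pattern = [b & rng.getrandbits(1) for b in v] + [0] * (m - r)
        for j in range(r):
            window = text[j * r: j * r + m]
            parity = sum(p * t for p, t in zip(pattern, window)) % 2
            out[j] |= parity  # a 1 is never reported for a true 0
    return out
```

An entry whose true value is 1 is missed in one round with probability 1/2, so it is missed in all rounds with probability $2^{-\text{repetitions}}$.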
The lower bound for $(1+\varepsilon)$-approximate DynIP follows from the same reduction with the arithmetic performed over the reals rather than modulo 2 and without the randomisation step. This is because a $(1+\varepsilon)$-approximation must be able to distinguish zero and non-zero inner products, which is sufficient for our reduction from OMv.
The lower bound for DynHD modulo 2 over a ternary alphabet now follows from Lemma 6.

If updates are only allowed in the text then we derive the same lower bound as before by modifying our reductions. Let us take the reduction from the online Boolean matrix-vector multiplication problem to DynIP modulo 2 as an example. The other lower bounds follow analogously. We create a pattern $P$ of length $m = r^2$ from the matrix $M$ by concatenating the $r$ rows of $M$ one after another. The text is of length $2m = 2r^2$ and will be all 0s except for the substring $T[r^2 - r + 1, \ldots, r^2]$. In order to perform a single Boolean matrix-vector product $Mv_i$ the substring is updated so that $T[r^2 - r + 1, \ldots, r^2] = v_i$, and we then flip each set bit in $T$ with probability 1/2. We then compute inner product queries modulo 2 at indices $1, r+1, 2r+1, \ldots, (r-1)r+1$, which give the correct answer for each query with probability at least 1/2. We can amplify the probability as before, giving us the desired lower bound.
Our lower bound also holds for DynIP modulo c for any c ≥ 2.
Corollary 3. Let $c \ge 2$ be an integer. Assuming the OMv conjecture, there does not exist an algorithm running in $O(m^{1/2-\varepsilon})$ time for the maximum of query and update time for DynIP modulo $c$.
Proof. Let the input alphabet be binary as before and perform the same randomised reduction from OMv as in the proof of Theorem 4. If the inner product equals 0 then we always give the correct answer. If the inner product is greater than 0 then after flipping the set bits, the inner product modulo $c$ is greater than 0 with probability that tends asymptotically to $(c-1)/c$. We can then amplify the probabilities to ensure that every value in the matrix-vector product is correct with constant probability as before.

Unconditional lower bounds
In this section we will give unconditional lower bounds for all the problems we have considered except DynApproxHD. Although these bounds are necessarily much lower than the conditional lower bounds we gave previously, they nonetheless match in many cases the limits of what is known unconditionally for any dynamic data structure.
We first show lower bounds for the DynIP and the DynHD problems by reduction from the dynamic weighted range counting problem. In this problem, we are given an $r \times r$ grid $D$. The points in the grid are assigned integer weights, and at any moment there can be at most $r$ non-zero weights $w_i$. For our problem $r = m^{1/3}$. Updates may change the weight of a point and a query $(i, j)$ asks for $\sum_{x \le i,\, y \le j} D_{x,y}$. In [28] Larsen gave an $\Omega((\log r/\log\log r)^2)$ lower bound for the maximum of query and update time for dynamic weighted range counting. This lower bound does not hold however in the unweighted case (where the weights are in $\{0, 1\}$), and giving an $\omega(\log r)$ lower bound for this situation remained an important open problem for a number of years. Recently in [30] a new $\Omega((\log^{1/2} r/\log\log r)^3)$ lower bound was given for this unweighted range counting problem, which also holds over $\mathbb{F}_2$.
Theorem 5. The DynIP problem has an unconditional $\Omega((\log m/\log\log m)^2)$ lower bound for the maximum of query and update time for polynomial-size alphabets. DynHD over binary alphabets, DynIP modulo 2 over binary alphabets and DynHD modulo 2 over ternary alphabets have an $\Omega((\log^{1/2} m/\log\log m)^3)$ lower bound.
Proof. We give a reduction from dynamic range counting to DynIP. We take an instance of the problem for $r = m^{1/3}$ and create a text $T$ of length $2m$ and a pattern $P$ of length $m$. The text has all symbols set to 0 except $T_{m-m^{1/3}+1}, \ldots, T_m$, which are set to $w_1, \ldots, w_{m^{1/3}}$ respectively. For each of the $m^{2/3}$ different possible queries to $D$, a subset of the $w_i$'s will be included in the query. We create a pattern $P$ so that $P_{jm^{1/3}+i-1} = 1$ if weight $w_i$ is included in the range for query $j$ and $P_{jm^{1/3}+i-1} = 0$ otherwise.
To perform a range counting query, we need to align the relevant substring of the pattern of length $m^{1/3}$ with $T[m - m^{1/3} + 1, \ldots, m]$ and perform an inner product query. Our lower bounds then follow from the lower bounds for the weighted and $\mathbb{F}_2$ versions of dynamic range counting and Lemmas 5 and 6.
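The layout of the pattern and text in this reduction can be made concrete as follows. The sketch rests on assumptions that are not spelled out in the proof: the $r$ weighted points have fixed grid coordinates, indices are 0-based, the queries are numbered row by row, and the inner product is computed by brute force in place of a DynIP query. The function names are hypothetical.

```python
def build_instance(points, weights, r):
    """points[i] = (x, y) with 1 <= x, y <= r is the fixed location of w_i."""
    m = r ** 3
    text = [0] * (2 * m)
    text[m - r:m] = weights  # the only non-zero part of the text
    pattern = [0] * m
    for q in range(r * r):  # one block of r pattern symbols per query
        a, b = q // r + 1, q % r + 1
        for i, (x, y) in enumerate(points):
            if x <= a and y <= b:
                pattern[q * r + i] = 1
    return pattern, text


def range_count(pattern, text, r, a, b):
    """Sum of the weights w_i with x_i <= a and y_i <= b via one inner product."""
    m = r ** 3
    q = (a - 1) * r + (b - 1)
    s = m - r - q * r  # aligns block q of the pattern with text[m-r:m]
    return sum(p * t for p, t in zip(pattern, text[s:s + m]))
```

Changing the weight $w_i$ corresponds to the single text update `text[m - r + i] = w`.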
Finally, we give lower bounds for the DynEM and the $(1+\varepsilon)$-approximate DynIP problems by reduction from the dynamic range emptiness problem. In this problem, the set-up is exactly as in the unweighted dynamic range counting problem above, and a query $(i, j)$ asks if $\sum_{x \le i,\, y \le j} D_{x,y} = 0$. In [6], Alstrup et al. showed an $\Omega(\log r/\log\log r)$ lower bound for this problem.

Proof. Consider an instance of two dimensional range emptiness on $D$ for $r = m^{1/3}$. We take an instance of this problem and create a text $T$ of length $2m$ and a pattern $P$ of length $m$. The text has all values set to 0 except $T_{m-m^{1/3}+1}, \ldots, T_m$, which are set to $w_1, \ldots, w_{m^{1/3}}$ respectively. For each of the $m^{2/3}$ different possible queries to $D$ in the dynamic range emptiness problem, a subset of the $w_i$'s will be included in the query. We create a pattern $P$ so that $P_{jm^{1/3}+i-1} = 0$ if weight $w_i$ is included in the range for query $j$ and $P_{jm^{1/3}+i-1} = {?}$ otherwise. If an exact matching with wildcards query returns True then we know that all the weights in the corresponding range are 0. If it returns False then we know the range is not empty. We therefore have reduced from two dimensional range emptiness to DynEM, giving an $\Omega(\log m/\log\log m)$ lower bound for DynEM.
For the $(1+\varepsilon)$-approximate dynamic inner product problem we must be able to distinguish an inner product of zero from all other values. We therefore use the same reduction from the proof of Theorem 5 but this time only report whether the approximate inner product is greater than zero. The result of this query is sufficient to determine the answer to a range emptiness query and we therefore derive the same $\Omega(\log m/\log\log m)$ lower bound.

Lemma 2. For a text $T$ of length $m \le n \le 2m$, and a pattern $P$ of length $m$, the DynHD problem can be solved in $O(\sqrt{m \log m})$ query/update time for constant-size alphabets, and in $O(m^{3/4} \log^{1/4} m)$ query/update time for polynomial-size alphabets. Both solutions use $O(m)$ space, and both updates to the text and to the pattern are allowed.

Proof. If the alphabet is binary, the values $f(P, T[1, \ldots, m]), \ldots, f(P, T[n-m+1, \ldots, n])$ can be computed by running the FFT algorithm twice. Recall that the FFT algorithm computes the inner product for each alignment of two strings. By running the FFT algorithm on $P$ and $T$ for the first time, we obtain, for each $i$, the number of positions $j$ such that $P[j] = T[i+j-1] = 1$. By running it for the second time on the copies of $P$ and $T$ where each bit is flipped, we obtain, for each $i$, the number of positions $j$ such that $P[j] = T[i+j-1] = 0$. Subtracting both counts from $m$ gives the Hamming distance, so we can then compute the values $f(P, T[1, \ldots, m]), \ldots, f(P, T[n-m+1, \ldots, n])$ in linear time. For this algorithm, $T(n) = O(n \log n) = O(m \log m)$. For alphabets of constant size $|\Sigma|$, we run the FFT algorithm $|\Sigma|$ times, once for each letter $a \in \Sigma$, on the copies of $P$ and $T$ where $a$ is replaced with 1 and all letters in $\Sigma \setminus \{a\}$ are replaced with 0. Here $T(n) = O(m \log m)$ as well. For polynomial-size alphabets, the bounds $T(n) = O(n \sqrt{n \log n}) = O(m \sqrt{m \log m})$ and $S(n) = O(n) = O(m)$ were shown independently by Abrahamson [4] and Kosaraju [27] in 1987. The claim immediately follows from Lemma 1 and Theorem 1.

Lemma 3. For a text $T$ of length $m \le n \le 2m$, and a pattern $P$ of length $m$, problems DynIP and DynEM can be solved in $O(\sqrt{m \log m})$ query/update time using $O(m)$ space. Both updates to the text and to the pattern are allowed.
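The per-letter counting argument in the proof of Lemma 2 can be sketched as follows. To keep the example free of external dependencies, a naive correlation replaces the FFT, which is an assumption of this illustration; an FFT-based correlation would compute the same values in $O(n \log n)$ time. Indices are 0-based and the function names are hypothetical.

```python
def correlate(x, y):
    """c[i] = sum_j x[j] * y[i + j] for every alignment i (naive stand-in for FFT)."""
    m, n = len(x), len(y)
    return [sum(x[j] * y[i + j] for j in range(m)) for i in range(n - m + 1)]


def hamming_all_alignments(pattern, text, alphabet):
    """Hamming distance at every alignment via one correlation per letter."""
    m, n = len(pattern), len(text)
    agree = [0] * (n - m + 1)
    for a in alphabet:
        # Indicator strings: a is replaced with 1, every other letter with 0.
        pa = [1 if c == a else 0 for c in pattern]
        ta = [1 if c == a else 0 for c in text]
        for i, c in enumerate(correlate(pa, ta)):
            agree[i] += c  # positions where both strings carry the letter a
    return [m - c for c in agree]
```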

Theorem 2. For a text T of length n ≥ m, and a pattern P of length m, there is a linear-space data structure that solves (a) the DynHD problem in O(√(m log m)) query/update time for constant-size alphabets, and in O(m^{3/4} log^{1/4} m) query/update time for polynomial-size alphabets, and the DynIP and the DynEM problems in O(√(m log m)) query/update time if only updates to the text are allowed; (b) the DynHD problem in O((n/m) · √(m log m)) query/update time for constant-size alphabets, and in O((n/m) · m^{3/4} log^{1/4} m) query/update time for polynomial-size alphabets, and the DynIP and the DynEM problems in O((n/m) · √(m log m)) query/update time when updates are allowed both to the text and to the pattern.

Corollary 1. For a text T of length m ≤ n ≤ 2m, and a pattern P of length m, the DynApproxHD problem over polynomial-size alphabets can be solved in O((1/ε^2) √m · polylog m) query/update time and O((1/ε^2) m log^2 m) space.

Theorem 3. For a text T of length n ≥ m, and a pattern P of length m, there is a randomised data structure for the DynApproxHD problem over a constant-sized alphabet with (a) O(1/ε) update time, O(1/ε^2) query time, and O((1/ε^2) · n) space if only updates to the pattern are allowed; (b) O((1/ε) · polylog n) update time and O((1/ε^2) · polylog n) query time using O((1/ε^2) · n polylog n) space if only updates to the text are allowed.
Proof. (a) During the preprocessing step we compute the sketch of P and of each m-length substring of T. When an update to P arrives, we update its sketch in a naive way in O(1/ε) time. When a query i arrives, we can compute a (1 + ε)-approximation of the Hamming distance between P and T[i, ..., i + m − 1] by computing the L_2 norm of the difference of the sketches of P and T[i, ..., i + m − 1]. Since the sketches are vectors of length 1/ε^2, this can be done in O(1/ε^2) time.

(b) For this model, we will need a sketch that gives a (1 + ε)-approximation of the Hamming distance with error probability Θ(1/log m). This can be achieved by repeating the scheme Θ(log log m) times. During the preprocessing, we first compute Θ(log log m) sketches for each 2^k-length substring of the pattern P, where k = 1, 2, ..., log m. We then compute Θ(log log m) sketches for each substring T[i · 2^k + 1, ..., (i + 1) · 2^k]. We call such substrings of T canonical. When an update (i, σ) arrives, we need to fix the sketches of O(log m) canonical substrings (since T[i] belongs to O(log m) such substrings), which can be done in O((1/ε) log m log log m) time. A query i can be answered in O((1/ε^2) log m log log m) time: First, we partition T[i, ..., i + m − 1] into O(log m) canonical substrings S_1, ..., S_k. Secondly, we compute a

(a) O((1/ε^3) · polylog n) update time, O((1/ε^4) · polylog n) query time, and O((1/ε^4) · n polylog n) space if only updates to the pattern are allowed; (b) O((1/ε^4) · polylog n) update time, O((1/ε^4) · polylog n) query time, and O((1/ε^4) · n polylog n) space if only updates to the text are allowed.
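The first step of the query in part (b) of the proof of Theorem 3, partitioning a window of the text into canonical substrings, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: indices are 0-based, a canonical block of level k is the half-open range [i · 2^k, (i + 1) · 2^k), blocks of length 1 (level 0) are allowed so that every window can be covered, and the name `canonical_partition` is hypothetical.

```python
def canonical_partition(start, length):
    """Split [start, start + length) into aligned power-of-two blocks.

    Returns a list of (block_start, level) pairs, where a block of the given
    level covers [block_start, block_start + 2**level) and block_start is a
    multiple of 2**level.
    """
    blocks = []
    pos, end = start, start + length
    while pos < end:
        level = 0
        # Grow the block while it stays aligned and inside the window.
        while pos % (1 << (level + 1)) == 0 and pos + (1 << (level + 1)) <= end:
            level += 1
        blocks.append((pos, level))
        pos += 1 << level
    return blocks


print(canonical_partition(3, 8))  # [(3, 0), (4, 2), (8, 1), (10, 0)]
```

In this greedy scheme the block levels first increase and then decrease along the window, so the number of blocks is logarithmic in the window length, in line with the O(log m) canonical substrings used in the proof.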

Theorem 4. Assuming the OMv conjecture, there does not exist an algorithm for DynEM, DynIP, or DynHD whose maximum of query and update time is O(m^{1/2−ε}). The same lower bound holds for DynIP modulo 2, for (1 + ε)-approximate DynIP, and for DynHD modulo 2 over ternary alphabets. The same lower bound holds even when updates are permitted only in the pattern or only in the text.

Theorem 6. DynEM and (1 + ε)-approximate DynIP have unconditional Ω(log m / log log m) lower bounds for the maximum of query and update time.