Crossing the Logarithmic Barrier for Dynamic Boolean Data Structure Lower Bounds
Abstract
This paper proves the first superlogarithmic lower bounds on the cell probe complexity of dynamic boolean (a.k.a. decision) data structure problems, a longstanding milestone in data structure lower bounds.
We introduce a new method for proving dynamic cell probe lower bounds and use it to prove a $\tilde{\Omega}(\log^{1.5} n)$ lower bound on the operational time of a wide range of boolean data structure problems, most notably, on the query time of dynamic range counting over $\mathbb{F}_2$ ([Pat07]). Proving an $\omega(\lg n)$ lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Pǎtraşcu’s obituary [Tho13]. This result also implies the first $\omega(\lg n)$ lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D rectangle stabbing, and for the (non-boolean) problems of range selection and range median.
Our technical centerpiece is a new way of “weakly” simulating dynamic data structures using efficient one-way communication protocols with small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebyshev) polynomials which may be of independent interest, and offers an entirely new algorithmic angle on the “cell sampling” method of Panigrahy et al. [PTW10].
1 Introduction
Proving unconditional lower bounds on the operational time of data structures in the cell probe model [Yao81] is one of the holy grails of complexity theory, primarily because lower bounds in this model are oblivious to implementation considerations, hence they apply essentially to any imaginable data structure (and in particular, to the ubiquitous word-RAM model). Unfortunately, this abstraction makes it notoriously difficult to obtain data structure lower bounds, and progress over the past three decades has been very slow. In the dynamic cell probe model, where a data structure needs to maintain a database under an “online” sequence of operations (updates and queries) by accessing as few memory cells as possible, a number of lower bound techniques have been developed. In [FS89], Fredman and Saks proved $\Omega(\lg n/\lg\lg n)$ lower bounds for a list of dynamic problems. About 15 years later, Pǎtraşcu and Demaine [PD04, PD06] proved the first $\Omega(\lg n)$ lower bound ever shown for an explicit dynamic problem. The celebrated breakthrough work of Larsen [Lar12a] brought a near quadratic improvement on the lower bound frontier, where he showed an $\Omega((\lg n/\lg\lg n)^2)$ cell probe lower bound for the 2D range sum problem (a.k.a. weighted orthogonal range counting in 2D). This is the highest cell probe lower bound known to date.
Larsen’s result has one substantial caveat, namely, it inherently requires the queries to have large ($\Theta(\lg n)$-bit) output size. Therefore, when measured per output bit of a query, the highest lower bound remains only $\Omega(\lg n)$ per bit (for dynamic connectivity due to Pǎtraşcu and Demaine [PD06]).
In light of this, a concrete milestone that was identified en route to proving $\omega(\lg^2 n)$ dynamic cell probe lower bounds was to prove an $\omega(\lg n)$ cell probe lower bound for boolean (a.k.a. decision) data structure problems (the problem was explicitly posed in [Lar12a, Tho13, Lar13] and the caveat with previous techniques requiring large output has also been discussed in e.g. [Pat07, CGL15]). We stress that this challenge is provably a prerequisite for going beyond the $\lg^2 n$ barrier for general ($\Theta(\lg n)$-bit output) problems: Indeed, consider a dynamic data structure problem $\mathcal{P}$ maintaining a database $D$ with updates $\mathcal{U}$ and queries $\mathcal{Q}$, where each query outputs $\lg n$ bits. If one could prove an $\omega(\lg^2 n)$ lower bound for $\mathcal{P}$, this would directly translate into an $\omega(\lg n)$ lower bound for the following induced dynamic boolean problem $\mathcal{P}_{\mathrm{bool}}$: $\mathcal{P}_{\mathrm{bool}}$ has the same set of update operations $\mathcal{U}$, and has queries $\mathcal{Q} \times [\lg n]$. Upon a query $(q, i)$, the data structure should output the $i$th bit of the answer to the original query $q$ w.r.t. the database $D$. An $\omega(\lg n)$ lower bound then follows, simply because each query of $\mathcal{P}$ can be simulated by $\lg n$ queries of $\mathcal{P}_{\mathrm{bool}}$, and the update time is preserved. Thus, to break the $\lg^2 n$ barrier for cell probe lower bounds, one must first prove a superlogarithmic lower bound for some dynamic boolean problem. Of course, many classic data structure problems are naturally boolean (e.g., reachability, membership, etc.), hence studying decision data structure problems is interesting on its own.
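The bit-by-bit simulation behind this reduction can be illustrated with a minimal sketch. The toy problem `PrefixCounter` and the names `InducedBoolean`, `query_bit` and `answer_from_bits` are hypothetical stand-ins chosen for illustration; only the reduction itself (one boolean query per output bit, updates passed through unchanged) comes from the text above.

```python
class PrefixCounter:
    """Toy multi-bit dynamic problem (hypothetical stand-in):
    store integers, answer 'how many stored values are <= q'."""

    def __init__(self):
        self.values = []

    def update(self, v):
        self.values.append(v)

    def query(self, q):
        return sum(1 for v in self.values if v <= q)


class InducedBoolean:
    """Induced boolean problem: same updates, queries are pairs (q, i),
    and the answer is the i-th bit of the original answer to q."""

    def __init__(self, base):
        self.base = base

    def update(self, v):
        # Updates are passed through unchanged, so update time is preserved.
        self.base.update(v)

    def query_bit(self, q, i):
        return (self.base.query(q) >> i) & 1


def answer_from_bits(boolean_ds, q, out_bits):
    """Recover the multi-bit answer with one boolean query per output bit."""
    return sum(boolean_ds.query_bit(q, i) << i for i in range(out_bits))
```

Here any correct data structure for the boolean problem could replace `InducedBoolean`; the point is only that `answer_from_bits` issues `out_bits` boolean queries, so a fast boolean structure would yield a multi-bit structure that is slower by at most that factor.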
Technically speaking, the common reason why all previous techniques hitherto (e.g., [Pat07, Lar12a, WY16]) fail to prove superlogarithmic lower bounds for dynamic boolean problems, is that they all heavily rely on each query revealing a large amount of information about the database. In contrast, for boolean problems, each query could reveal at most one bit of information, and thus any such technique is doomed to fail. We elaborate on this excruciating obstacle and how we overcome it in the following subsection.
In this paper, we develop a fundamentally new lower bound method and use it to prove the first superlogarithmic lower bounds for dynamic boolean data structure problems. Our results apply to natural boolean versions of several classic data structure problems. Most notably, we study a boolean variant of the dynamic 2D range counting problem. In 2D range counting, points are inserted one by one into an integer grid, and given a query point $q = (x, y)$, the data structure must return the number of points $p = (x', y')$ dominated by $q$ (i.e., $x' \le x$ and $y' \le y$). This is one of the most fundamental data structure problems in computational geometry and spatial database theory (see e.g., [Aga04] and references therein). It is known that a variant of dynamic “range trees” solves this problem using amortized update time and worst case query time ([BGJS11]). We prove an $\tilde{\Omega}(\log^{1.5} n)$ lower bound even for a boolean version, called 2D range parity, where one needs only to return the parity of the number of points dominated by $q$. This is, in particular, the first $\omega(\lg n)$ lower bound for the (classical) 2D range counting problem. We are also pleased to report that this is the first progress made on the 5 important open problems posed in Mihai Pǎtraşcu’s obituary [Tho13].
In addition to the new results for 2D range parity, we also prove the first superlogarithmic lower bounds for the classic (non-boolean) problems of dynamic range selection and range median, as well as a similar lower bound for a boolean version of polynomial evaluation. We formally state these problems, our new lower bounds, and a discussion of previous state-of-the-art bounds in Section 1.2.
The following two subsections provide a streamlined overview of our technical approach and how we apply it to obtain new dynamic lower bounds, as well as discussion and comparison to previous related work.
1.1 Techniques
To better understand the challenge involved in proving superlogarithmic lower bounds for boolean data structure problems, and how our approach departs from previous techniques that fail to overcome it, we first revisit Larsen’s lower bound technique for problems with $\Theta(\lg n)$-bit output size, which is most relevant for our work. (We caution that a few variations [CGL15, WY16] of Larsen’s [Lar12a] approach have been proposed, yet all of them crucially rely on large query output size). The following overview is presented in the context of the 2D range sum problem for which Larsen originally proved his lower bound. 2D range sum is the variant of 2D range counting where each point is assigned a $\Theta(\lg n)$-bit integer weight, and the goal is to return the sum of weights assigned to points dominated by the query $q$. Clearly this is a harder problem than 2D range counting (which corresponds to all weights being $1$) and 2D range parity (which again has all weights being $1$, but now only $1$ bit of the output must be returned).
Larsen’s Lower Bound [Lar12a].
Larsen’s result combines the seminal chronogram method of Fredman and Saks [FS89] together with the cell sampling technique introduced by Panigrahy et al. [PTW10]. The idea is to show that, after random updates have been performed (each update inserts a random point and assigns it a random $\Theta(\lg n)$-bit weight), any data structure (with update time) must probe many cells when prompted on a random range query. To this end, the random updates are partitioned into epochs , where the th epoch consists of updates for . The goal is to show that, for each epoch , a random query must read in expectation memory cells whose last modification occurred during the th epoch . Summing over all epochs then yields a query lower bound.
To carry out this approach, one restricts the attention to epoch , assuming all remaining updates in other epochs () are fixed (i.e., only is random). For a data structure , let denote the set of memory cells associated with epoch , i.e., the cells whose last update occurred in epoch . Clearly, any cell that is written before epoch cannot contain any information about , while the construction guarantees there are relatively few cells written after epoch , due to the geometric decay in the lengths of epochs. Thus, “most” of the information provides on comes from cell probes to (hence, intuitively, the chronogram method reduces a dynamic problem into nearly independent static problems).
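The epoch structure can be made concrete with a small sketch. The epoch sizes $\beta^i$, the convention that epoch 1 is the most recent one, and the names `partition_into_epochs`, `updates_after` and `beta` are assumptions made for illustration; the text above only relies on the epoch lengths decaying geometrically over time.

```python
def partition_into_epochs(updates, beta):
    """Split `updates` (oldest first) into epochs whose sizes shrink
    geometrically towards the end of the sequence.

    Assumed convention: epoch i holds beta**i updates and epoch 1 is the
    most recent one. Returns a list of (epoch_index, chunk), oldest first.
    """
    epochs = []
    end = len(updates)
    i = 1
    while end > 0:
        size = min(beta ** i, end)  # the oldest epoch may be truncated
        epochs.append((i, updates[end - size:end]))
        end -= size
        i += 1
    return epochs[::-1]


def updates_after(epochs, i):
    """Number of updates performed after epoch i (i.e. in epochs j < i)."""
    return sum(len(chunk) for j, chunk in epochs if j < i)
```

Under these assumptions, the number of updates following epoch $i$ is smaller than the size of epoch $i$ itself by roughly a factor $\beta$, which is why a data structure with small update time can only have written relatively few cells after epoch $i$.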
The high-level idea is to now prove that, if a too-good-to-be-true data structure exists, which probes cells associated with epoch on an average query, then can be used to devise a compression scheme (i.e., a “one-way” communication protocol) which allows a decoder to reconstruct the random update sequence from an bit message, an information-theoretic contradiction.
Larsen’s encoding scheme has the encoder (Alice) find a subset of a fixed size, such that sufficiently many range queries can be resolved by , meaning that these queries can be answered without probing any cell in . Indeed, the assumption that the query algorithm of probes only cells from , implies that a random subset of size cells resolves at least a fraction of the possible queries, an observation first made in [PTW10]. This observation in turn implies that by sending the contents and addresses of , the decoder (Bob) can recover the answers to some specific subset of at least queries. Intuitively, if the queries of the problem are “sufficiently independent”, e.g., the answers to all queries are wise independent over a random , then answering or even any subset of of size would be sufficient to reconstruct the entire update sequence . Thus, by simulating the query algorithm and using the set to “fill in” his missing memory cells associated with , Bob could essentially recover . On the other hand, the update sequence itself contains at least bits of entropy, hence it cannot possibly be reconstructed from , yielding an information-theoretic contradiction. Here, and throughout the paper, $w$ denotes the number of bits in a memory cell. We make the standard assumption that $w = \Omega(\lg n)$, such that a cell has enough bits to store an index into the sequence of updates performed.
It is noteworthy that range queries do not directly possess such a “wise independence” property per se, but using (nontrivial) technical manipulations (à la [Pat07, Lar12a, WY16]) this argument can be made to work; see the discussion in Section 6.
Alas, a subtle but crucial issue with the above scheme is that Bob cannot identify the subset , that is, when simulating the query algorithm of on a given query, he can never know whether an unsampled () encountered cell in the query path in fact belongs to or not. This issue is also faced by Pǎtraşcu’s approach in [Pat07]. Larsen resolves this excruciating problem by having Alice further send Bob the indices of (a subset of) that already reveals enough information about to get a contradiction. In order to achieve the anticipated contradiction, the problem must therefore guarantee that the answer to a query reveals more information than it takes to specify the query itself ($\Theta(\lg n)$ bits for 2D range sum). This is precisely the reason why Larsen’s lower bound requires $\Theta(\lg n)$-bit weights assigned to each input point, whereas for the boolean 2D range parity problem, all bets are off.
1.1.1 Our Techniques
We develop a new lower bound technique which ultimately circumvents the aforementioned obstacle that stems from Bob’s inability to identify the subset . Our high-level strategy is to argue that an efficient dynamic data structure for a boolean problem induces an efficient one-way protocol from Alice (holding the entire update sequence as before) to Bob (who now receives a query and ), which enables Bob to answer his boolean query with some tiny yet nontrivial advantage over random guessing. For a dynamic boolean data structure problem , we denote this induced communication game (corresponding to the th epoch) by . The following “weak simulation” theorem, which is the centerpiece of this paper, applies to any dynamic boolean data structure problem :
Theorem 1 (One-Way Weak Simulation Theorem, informal).
Let be any dynamic boolean data structure problem, with random updates grouped into epochs followed by a single query . If admits a dynamic data structure with word-size , worst-case update time and average (over ) expected query time with respect to , satisfying , then there exists some epoch for which there is a one-way randomized communication protocol for in which Alice sends Bob a message of only bits, and after which Bob successfully computes with probability at least . (Throughout the paper, we use to denote .)
The formal statement and proof of the above theorem can be found in Section 4. Before we elaborate on the proof of Theorem 1, let us explain informally why such a seemingly modest guarantee suffices to prove superlogarithmic cell probe lower bounds on boolean problems with a certain “list-decoding” property. If we view query answering as mapping an update sequence to an answer vector (a vector containing one coordinate per query, whose value is the answer to this query), then answering a random query correctly with probability would correspond to mapping an update sequence to an answer vector that is far from the true answer vector defined by the problem. Intuitively, if the correct mapping defined by the problem is list-decodable in the sense that in the ball centered at any answer vector, there are very few codewords (which are the correct answer vectors corresponding to some update sequences), then knowing any vector within distance from the correct answer vector would reveal a lot of information about the update sequence. Standard probabilistic arguments [Vad12] show that when the code rate is (i.e., as for 2D range parity), a random code is “sufficiently” list-decodable with , i.e., for most data structure problems, the protocol in the theorem would reveal too much information if Bob can predict the answer with probability, say . Therefore, Theorem 1 would imply that the query time must be at least . Assuming the data structure has worst-case update time and standard word-size , the above bound gives . Indeed all our concrete lower bounds are obtained by showing a similar list-decoding property with , yielding a lower bound of . See Subsection 1.2 for more details.
Overview of Theorem 1 and the “Peak-to-Average” Lemma.
We now present a streamlined overview of the technical approach and proof of our weak one-way simulation theorem, the main result of this paper. Let be any boolean dynamic data structure problem and denote by the size of each epoch of random updates (where and ). Recall that in , Alice receives the entire sequence of epochs , Bob receives and , and our objective is to show that Alice can send Bob a relatively short message ( bits) which allows him to compute the answer to w.r.t , denoted , with advantage over .
Suppose admits a dynamic data structure with worst-case update time and expected query time with respect to and . Following Larsen’s cell sampling approach, a natural course of action for Alice is to generate the updated memory state of (w.r.t ), and send Bob a relatively small random subset of the cells associated with epoch , where each cell is sampled with probability . Since the expected query time of is and there are epochs, the average (over ) number of cells in probed by a query is , hence the probability that Alice’s random set resolves Bob’s random query is at least . Let us henceforth denote this desirable event by . It is easy to see that, if Alice further sends Bob all cells that were written (associated) with future epochs (which can be done using less than bits due to the geometric decay of epochs and the assumption that probes at most cells on each update operation), then conditioned on , Bob would have acquired all the necessary information to perfectly simulate the correct query path of on his query .
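The resolving probability of a random sample can be checked exactly on a toy instance. In the sketch below, the names `resolve_probability`, `num_cells`, `probed` and `p` are hypothetical, and the computation simply enumerates all possible samples; it illustrates the general fact that a query probing $t$ cells of the epoch is resolved with probability $p^t$ when every cell is kept independently with probability $p$.

```python
from fractions import Fraction
from itertools import combinations


def resolve_probability(num_cells, probed, p):
    """Exact probability that an independent p-sample of `num_cells` cells
    contains every cell in `probed` (so the query is fully resolved)."""
    probed = set(probed)
    total = Fraction(0)
    for r in range(num_cells + 1):
        for sample in combinations(range(num_cells), r):
            if probed <= set(sample):
                # Probability of drawing exactly this sample.
                total += p ** r * (1 - p) ** (num_cells - r)
    return total
```

For instance, with six cells, two probed cells and `p = Fraction(1, 4)`, the enumeration returns exactly `Fraction(1, 16)`, matching $p^2$. The difficulty discussed next is not this probability itself but the fact that Bob cannot tell whether the event occurred.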
Thus, if Bob could detect the event , the above argument would have already yielded an advantage of roughly (as Bob could simply output a random coin toss unless occurs), and this would have finished the proof. Unfortunately, certifying the occurrence of is prohibitively expensive, precisely for the same reason that identifying the subset is costly in Larsen’s argument. Abandoning the hope for certifying the event (while insisting on low communication) means that we must take a fundamentally different approach to argue that the noticeable occurrence of this event can somehow still be exploited implicitly so as to guarantee a nontrivial advantage. This is the heart of the paper, and the focal point of the rest of this exposition.
The most general strategy Bob has is to output his “maximum likelihood” estimate for the answer given the information he receives, i.e., the more likely posterior value of (for simplicity of exposition, we henceforth ignore the conditioning on and on the set of updates makes to future epochs which Alice sends as well). Assuming without loss of generality that the answer to the query is , when occurs, this strategy produces an advantage (“bias”) of (since when occurs, the answer is completely determined by and the updates to ), and when it does not occur, the strategy produces a bias of . Thus, the overall bias is . This quantity could be arbitrarily close to , since we have no control over the distribution of the answer conditioned on the complement event , which might even cause perfect cancellation of the two terms.
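The cancellation phenomenon can be seen in a two-line calculation. The sketch below assumes a particular normalization that is not taken from the text: the bias of a bit is $\Pr[\text{bit}=1]-\Pr[\text{bit}=0] \in [-1,1]$, and the overall bias is the mixture of the two conditional biases weighted by the probability of the resolving event. The function name `overall_bias` and its parameters are hypothetical.

```python
from fractions import Fraction


def overall_bias(p_event, bias_given_event, bias_given_complement):
    """Mixture of the conditional biases, weighted by the probability of
    the resolving event (assumed convention: bias = Pr[1] - Pr[0])."""
    return p_event * bias_given_event + (1 - p_event) * bias_given_complement
```

With `p_event = Fraction(1, 100)`, a fully determined answer on the event (bias 1) and a conditional bias of `Fraction(-1, 99)` on the complement, the mixture is exactly zero, so an adversarially tiny tilt on the complement is enough to erase the entire advantage. If the complement is unbiased instead, the mixture equals `p_event`.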
Nevertheless, one could hope that such unfortunate cancellation of our advantage can be avoided if Alice reveals to Bob some little extra “relevant” information. To be more precise, let be the set of memory addresses would have probed when invoked on the query according to Bob’s simulation. That is, Bob simulates until epoch , updates the contents for all cells that appear in Alice’s message, and simulates the query algorithm for on this memory state. In particular, if the event occurs, then is the correct set of memory cells the data structure probes. Of course, the set is extremely unlikely to be “correct” as is tiny, so should generally be viewed as an arbitrary subset of memory addresses. Now, the true contents of the cells (w.r.t the true memory state ) induce some posterior distribution on the correct answer (in particular, when occurs, the path is correct and its contents induce the true answer).
Imagine that Alice further reveals to Bob the true contents of some small subset , i.e., an assignment . The posterior distribution of the answer conditioned on is simply the convex combination of the posterior distributions conditioned on “” for all ’s that are consistent with (), weighted by the probability of () up to some normalizer. The contribution of each term in this convex combination (i.e., of each posterior distribution induced by a partial assignment ) to the overall bias, is precisely the average, over all full assignments to cells in which are consistent, of the posterior bias induced by the event “” (i.e., when the entire is revealed). For each full assignment , we denote its latter contribution by , hence the expected bias contributed by the event “” is nothing but the sum of over all ’s satisfying . Furthermore, we know that there is some assignment , namely the contents of when occurs, such that is “large” (recall the bias is in this event). Thus, the key question we pose and set out to answer, is whether it is possible to translate this “peak” of into a comparable lower bound on the “average” bias , by conditioning on the assignments to a small subset of coordinates . Indeed, if such exists, Alice can sample independently another set of memory cells and send it to Bob. With probability , all contents of are revealed to Bob, and we will have the desired advantage. In essence, the above question is equivalent to the following information-theoretic problem:
Let be a variate random variable and a uniform binary random variable in the same probability space, satisfying: (i) for some ; (ii) . What is the smallest subset of coordinates such that ?
The crux of our proof is the following lemma, which asserts that conditioning on only many coordinates suffices to achieve a nonnegligible average advantage .
Lemma 1 (Peak-to-Average Lemma).
Let be any real function satisfying: (i) ; and (ii) . Then there exists a subset of indices, , such that .
An indispensable ingredient of the proof is the usage of low-degree (multivariate) polynomials with “threshold”-like phenomena, commonly known as (discrete) Chebyshev polynomials (these are real polynomials defined on the hypercube, of degree and whose value is uniformly bounded by everywhere on the cube except the all point which attains the value ). The lemma can be viewed as an interesting and efficient way of “decomposing” a distribution into a small number of conditional distributions, “boosting” the effect of a single desirable event, hence the Peak-to-Average Lemma may be of independent interest (see Section 4.1 for a high-level overview and the formal proof). In Appendix B, we show that the lemma is in fact tight, in the sense that there are functions for which conditioning on of their coordinates provides no advantage at all.
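The “threshold”-like behaviour of such polynomials can be observed numerically. The sketch below uses a standard textbook construction, which is an assumption and not necessarily the exact polynomial used in the proof: the univariate Chebyshev polynomial $T_d$ is composed with an affine function of the Hamming weight, so that weights $0,\dots,k-1$ land in $[-1,1]$ (where $|T_d| \le 1$) and the all-ones point lands just outside. The names `chebyshev_T` and `threshold_poly_by_weight` are hypothetical.

```python
from fractions import Fraction


def chebyshev_T(d, t):
    """Chebyshev polynomial of the first kind via the recurrence
    T_0 = 1, T_1 = t, T_{j+1} = 2 t T_j - T_{j-1}."""
    prev, cur = Fraction(1), Fraction(t)
    if d == 0:
        return prev
    for _ in range(d - 1):
        prev, cur = cur, 2 * t * cur - prev
    return cur


def threshold_poly_by_weight(k, d):
    """Value of the degree-d polynomial x -> T_d(affine(|x|)) on {0,1}^k,
    listed by Hamming weight w = 0..k. Weights 0..k-1 map into [-1, 1];
    the all-ones point (w = k) maps slightly above 1."""
    values = []
    for w in range(k + 1):
        t = Fraction(2 * w - (k - 1), k - 1)
        values.append(chebyshev_T(d, t))
    return values
```

For `k = 16` and `d = 4` (roughly the square root of `k`), every value at weight below 16 is at most 1 in absolute value, while the value at the all-ones point exceeds 3. Since the Hamming weight is a linear function of the coordinates, the composed polynomial has degree `d` as a multivariate polynomial on the cube.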
To complete the proof of the simulation theorem, we apply the Peak-to-Average Lemma with , and . The lemma guarantees that Bob can find a small (specific) set of coordinates , such that his maximum-likelihood estimate conditioned on the true value of the coordinates in must provide an advantage of at least . Since is small, the probability that is contained in Alice’s second sample is . Overall, Bob’s maximum-likelihood strategy provides the desired advantage we sought.
1.2 Applications: New Lower Bounds
We apply our new technique to a number of classic data structure problems, resulting in a range of new lower bounds. This section describes the problems and the lower bounds we derive for them, in context of prior work. As a warmup, we prove a lower bound for a somewhat artificial version of polynomial evaluation:
Polynomial Evaluation.
Consider storing, updating and evaluating a polynomial $P$ over the Galois field $\mathbb{F}_{2^d}$. Here we assume that elements of $\mathbb{F}_{2^d}$ are represented by bit strings in $\{0,1\}^d$, i.e. there is some bijection between $\mathbb{F}_{2^d}$ and $\{0,1\}^d$. Elements are represented by the corresponding bit strings. Any bijection between elements and bit strings suffices for our lower bound to apply.
The least-bit polynomial evaluation data structure problem is defined as follows: A degree-$n$ polynomial $P(x) = \sum_{i=0}^{n} a_i x^i$ over $\mathbb{F}_{2^d}$ is initialized with all coefficients being $0$. An update is specified by a tuple $(i, b)$ where $i \in \{0, \dots, n\}$ is an index and $b$ is an element in $\mathbb{F}_{2^d}$. It changes the coefficient $a_i$ such that $a_i \leftarrow a_i + b$ (where addition is over $\mathbb{F}_{2^d}$). A query is specified by an element $x \in \mathbb{F}_{2^d}$ and one must return the least significant bit of $P(x)$. Recall that we make no assumptions on the concrete representation of the elements in $\mathbb{F}_{2^d}$, only that the elements are in a bijection with $\{0,1\}^d$ so that precisely half of all elements in $\mathbb{F}_{2^d}$ have a $1$ as the least significant bit.
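A naive reference implementation makes the update and query semantics explicit. The sketch below fixes several details that the problem statement leaves open, all of them assumptions: the field is $\mathbb{F}_{2^8}$ with the reduction polynomial $x^8+x^4+x^3+x+1$ (the one used by AES), elements are encoded in the usual polynomial basis, and the names `gf256_mul` and `LeastBitPolyEval` are hypothetical. It evaluates the polynomial by Horner's rule in time linear in the degree, so it is a specification of the problem rather than an efficient data structure.

```python
def gf256_mul(a, b, modulus=0x11B):
    """Multiply two elements of GF(2^8), encoded as integers 0..255,
    modulo x^8 + x^4 + x^3 + x + 1 (assumed representation)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= modulus
        b >>= 1
    return result


class LeastBitPolyEval:
    """Naive least-bit polynomial evaluation over GF(2^8)."""

    def __init__(self, degree):
        self.coeffs = [0] * (degree + 1)  # all coefficients start at 0

    def update(self, i, b):
        # a_i <- a_i + b; addition in characteristic 2 is bitwise XOR.
        self.coeffs[i] ^= b

    def query(self, x):
        # Horner's rule, then return the least significant bit of P(x).
        acc = 0
        for c in reversed(self.coeffs):
            acc = gf256_mul(acc, x) ^ c
        return acc & 1
```

Since the lower bound is stated for any bijection between field elements and bit strings, the particular encoding chosen here only affects which bit of the field element is called “least significant”.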
Using our weak one-way simulation theorem, Section 5 proves the following lower bound:
Theorem 2.
Any cell probe data structure for least-bit polynomial evaluation over , having cell size , worst case update time and expected average query time must satisfy:
Note that this lower bound is not restricted to have (corresponding to having polynomially many queries). It holds for arbitrarily large and thus demonstrates that our lower bound actually grows as log of the number of queries, times a . At least up to a certain (unavoidable) barrier (the bound in the min is precisely when the query time is large enough that the data structure can read all cells associated to more than half of the epochs). We remark that the majority of previous lower bound techniques could also replace a in the lower bounds by a for problems with queries. Our introduction focuses on the most natural case of polynomially many queries () for ease of exposition.
Polynomial evaluation has been studied quite intensively from a lower bound perspective, partly since it often allows for very clean proofs. The previous work on the problem considered the standard (nonboolean) version in which we are required to output the value , not just its least significant bit. Miltersen [Mil95] first considered the static version where the polynomial is given in advance and we disallow updates. He proved a lower bound of where is the space usage of the data structure in number of cells. This was improved by Larsen [Lar12b] to , which remains the highest static lower bound proved to date. Note that the lower bound peaks at for linear space . Larsen [Lar12b] also extended his lower bound to the dynamic case (though for a slightly different type of updates), resulting in a lower bound of . Note that none of these lower bounds are greater than per output bit and in that sense they are much weaker than our new lower bound.
In [GM07], Gál and Miltersen considered succinct data structures for polynomial evaluation. Succinct data structures are data structures that use space close to the information theoretic minimum required for storing the input. In this setting, they showed that any data structure for polynomial evaluation must satisfy when for any constant . Here is the redundancy, i.e. the additive number of extra bits of space used by the data structure compared to the information theoretic minimum. Note that even for data structures using just a factor more space than the minimum possible, the time lower bound reduces to the trivial . For data structures with nondeterminism (i.e., they can guess the right cells to probe), Yin [Yin10] proved a lower bound matching that of Miltersen.
On the upper bound side, Kedlaya and Umans [KU08] presented a word-RAM data structure for the static version of the problem, having space usage and worst case query time , getting rather close to the lower bounds. While not discussed in their paper, a simple application of the logarithmic method makes their data structure dynamic with an amortized update time of and worst case query time .
Parity Searching in Butterfly Graphs.
In a seminal paper [Pǎt08], Pǎtraşcu presented an exciting connection between an entire class of data structure problems. Starting from a problem of reachability oracles in the Butterfly graph, he gave a series of reductions to classic data structure problems. His reductions resulted in lower bounds for static data structures solving any of these problems.
We modify Pǎtraşcu’s reachability problem such that we can use it in reductions to prove new dynamic lower bounds. In our version of the problem, which we term parity searching in Butterfly graphs, the data structure must maintain a set of directed acyclic graphs (Butterfly graphs of the same degree , but different depths) under updates which assign binary weights to edges, and support queries that ask to compute the parity of weights assigned to edges along a number of paths in these graphs. The formal definition of this version of the problem is deferred to Section 6.2.
While this new problem might sound quite artificial and inconvenient to work with, we show that parity searching in Butterfly graphs in fact reduces to many classic problems, hence proving lower bounds on this problem is the key to many of our results. Indeed, our starting point is the following lower bound:
Theorem 3.
Any dynamic data structure for parity searching in Butterfly graphs of degree , with a total of edges, having cell size , worst case update time and expected average query time must satisfy:
In the remainder of this section, we present new lower bounds which we derive via reductions from parity searching in Butterfly graphs. For context, our results are complemented with a discussion of previous work.
2D Range Counting.
In 2D range counting, we are given points on a integer grid, for some . We must preprocess the points such that given a query point , we can return the number of points that are dominated by (i.e. and ). In the dynamic version of the problem, an update specifies a new point to insert. 2D range counting is a fundamental problem in both computational geometry and spatial databases and many variations of it have been studied over the past many decades.
Via a reduction from reachability oracles in the Butterfly graph, Pǎtraşcu [Pǎt08] proved a static lower bound of for this problem, even in the case where one needs only to return the parity of the number of points dominated by . Recall that this is the 2D range parity problem.
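For reference, the semantics of dynamic 2D range counting and 2D range parity can be written down as a brute-force structure. The class name `NaiveRangeCounting2D` and its method names are hypothetical, and domination is taken to mean coordinate-wise “less than or equal”; queries take time linear in the number of points, so this is a specification, not an efficient solution.

```python
class NaiveRangeCounting2D:
    """Brute-force dynamic 2D range counting and range parity."""

    def __init__(self):
        self.points = []

    def insert(self, x, y):
        self.points.append((x, y))

    def count(self, qx, qy):
        # Number of stored points dominated by the query point (qx, qy).
        return sum(1 for (x, y) in self.points if x <= qx and y <= qy)

    def parity(self, qx, qy):
        # 2D range parity: only the lowest bit of the count is returned.
        return self.count(qx, qy) & 1
```

The lower bounds discussed here apply already to `parity`, even though it reveals a single bit of the value returned by `count`.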
It turns out that a fairly easy adaptation of Pǎtraşcu’s reduction implies the following:
Theorem 4.
Any dynamic cell probe data structure for 2D range parity, having cell size , worst case update time and expected query time , gives a dynamic cell probe data structure for parity searching in Butterfly graphs (for any degree ) with cell size , worst case update time and average expected query time .
Combining this with our lower bound for parity searching in Butterfly graphs (Theorem 3), we obtain:
Corollary 1.
Any cell probe data structure for 2D range parity, having cell size , worst case update time and expected query time must satisfy:
In addition to Pǎtraşcu’s static lower bound, Larsen [Lar12a] studied the aforementioned variant of the range counting problem, called 2D range sum, in which points are assigned $\Theta(\lg n)$-bit integer weights and the goal is to compute the sum of weights assigned to points dominated by $q$. As previously discussed, Larsen’s lower bound for dynamic 2D range sum was $\Omega((\lg n/\lg\lg n)^2)$ and was the first lower bound to break the logarithmic barrier, though only for a problem with $\Theta(\lg n)$-bit output. Weinstein and Yu [WY16] later reproved Larsen’s lower bound, this time extending it to the setting of amortized update time and a very high probability of error. Note that these lower bounds remain below the logarithmic barrier when measured per output bit of a query. While 2D range counting (not the parity version) also has $\Theta(\lg n)$-bit outputs, it seems that the techniques of Larsen and Weinstein and Yu are incapable of proving an $\omega(\lg n)$ lower bound for it. Thus the strongest previous lower bound for the dynamic version of 2D range counting is just the static bound of (since one cannot build a data structure with space usage higher than in operations). As a rather technical explanation for why the previous techniques fail, it can be observed that they all argue that a collection of queries have bits of entropy in their output. But for 2D range counting, having queries means that on average, each query contains just new points, reducing the total entropy to something closer to . This turns out to be useless for the lower bound arguments. It is conceivable that a clever argument could show that the entropy remains , but this has so far resisted all attempts.
From the upper bound side, JáJá, Mortensen and Shi [JMS04] gave a static 2D range counting data structure using linear space and query time, which is optimal by Pǎtraşcu’s lower bound. For the dynamic case, Brodal et al. [BGJS11] gave a data structure with . Our new lower bound shrinks the gap between the upper and lower bound on to only a factor for .
2D Rectangle Stabbing.
In 2D rectangle stabbing, we must maintain a set of 2D axis aligned rectangles with integer coordinates, i.e. rectangles are of the form . We assume coordinates are bounded by a polynomial in . An update inserts a new rectangle. A query is specified by a point , and one must return the number of rectangles containing . This problem is known to be equivalent to 2D range counting via a folklore reduction. Thus all the bounds in the previous section, both upper and lower bounds, also apply to this problem. Furthermore, 2D range parity is also equivalent to 2D rectangle parity, i.e. returning just the parity of the number of rectangles stabbed.
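One standard way to see the stabbing-to-counting direction of this equivalence is inclusion-exclusion over rectangle corners. The sketch below is illustrative only and is not claimed to be the exact reduction the text refers to: the class names are hypothetical, the dominance counter is a brute-force stand-in for a real 2D range counting structure, and rectangles are assumed to have inclusive integer coordinates.

```python
from typing import List, Tuple

Point = Tuple[int, int]


class DominanceCounter:
    """Brute-force stand-in for a dynamic 2D range (dominance) counting structure."""

    def __init__(self) -> None:
        self.points: List[Point] = []

    def insert(self, x: int, y: int) -> None:
        self.points.append((x, y))

    def count(self, x: int, y: int) -> int:
        # Number of stored points (px, py) with px <= x and py <= y.
        return sum(1 for px, py in self.points if px <= x and py <= y)


class RectangleStabbing:
    """Stabbing counts from two dominance counters (hypothetical helper)."""

    def __init__(self) -> None:
        self.plus = DominanceCounter()   # corners counted with sign +1
        self.minus = DominanceCounter()  # corners counted with sign -1

    def insert(self, x1: int, x2: int, y1: int, y2: int) -> None:
        # Rectangle [x1, x2] x [y1, y2], inclusive integer coordinates (assumption).
        self.plus.insert(x1, y1)
        self.plus.insert(x2 + 1, y2 + 1)
        self.minus.insert(x2 + 1, y1)
        self.minus.insert(x1, y2 + 1)

    def stab(self, x: int, y: int) -> int:
        # Each rectangle contributes 1 if it contains (x, y) and 0 otherwise.
        return self.plus.count(x, y) - self.minus.count(x, y)
```

Each rectangle insertion costs four point insertions and each stabbing query costs two dominance queries, so bounds carry over up to constant factors. For the parity versions the signs do not matter, so in this sketch a single counter would suffice.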
Range Selection and Range Median.
In range selection, we are to store an array where each entry stores an integer bounded by a polynomial in . A query is specified by a triple . The goal is to return the index of the ’th smallest entry in the subarray . In the dynamic version of the problem, entries are initialized to . An update is specified by an index and a value and has the effect of changing the value stored in entry to . In case of multiple entries storing the same value, we allow returning any index tied for ’th smallest.
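As a reference point for the problem definition, a brute-force sketch is given below. The function names are hypothetical, and the indexing convention (0-based inclusive positions, 1-based rank) is an assumption made for the example, since the text's own convention is not fixed here. Ties are broken by index, which is one of the arbitrary choices the definition allows.

```python
from typing import List


def range_select(a: List[int], i: int, j: int, k: int) -> int:
    """Return an index of the k'th smallest entry of a[i..j].

    Assumptions of this sketch: i and j are 0-based and inclusive, k is 1-based.
    """
    if not (0 <= i <= j < len(a)):
        raise ValueError("invalid range")
    if not (1 <= k <= j - i + 1):
        raise ValueError("invalid rank")
    # Sort positions by (value, index); ties are broken by index.
    order = sorted(range(i, j + 1), key=lambda idx: (a[idx], idx))
    return order[k - 1]


def update(a: List[int], idx: int, value: int) -> None:
    """Overwrite the value stored in entry idx."""
    a[idx] = value


def prefix_select_parity(a: List[int], j: int, k: int) -> int:
    """Boolean variant: parity of the position of the k'th smallest in a[0..j]."""
    return range_select(a, 0, j, k) % 2
```

This takes time proportional to sorting the queried range per query; it is meant only to pin down the input and output behavior, not to suggest an efficient structure.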
We give a reduction from parity searching in Butterfly graphs:
Theorem 5.
Any dynamic cell probe data structure for range selection, having cell size , worst case update time and expected query time , gives a dynamic cell probe data structure for parity searching in Butterfly graphs (for any degree ) having cell size , worst case update time and expected average query time . Furthermore, this holds even if we force in queries and require only that we return whether the ’th smallest element in is stored at an even or odd position.
Since we assume , combining this with Theorem 3 immediately proves the following:
Corollary 2.
Any cell probe data structure for range selection, having cell size , worst case update time and expected query time must satisfy:
Furthermore, this holds even if we force in queries and require only that we return whether the ’th smallest element in is stored at an even or odd position.
While range selection is not a boolean data structure problem, it is still a fundamental problem and for the same reasons as mentioned under 2D range counting, the previous lower bound techniques seem incapable of proving lower bounds for the dynamic version. Thus we find our new lower bound very valuable despite the problem not being boolean. Also, we do in fact manage to prove the same lower bound for the boolean version where we need only determine whether the index of the ’th smallest element is even or odd.
For the static version of the problem, Jørgensen and Larsen [JL11] proved a lower bound of . Their proof was rather technical and a new contribution of our work is that their static lower bound now follows by reduction also from Pǎtraşcu’s lower bound for reachability oracles in the Butterfly graph. For the dynamic version of the problem, no lower bound stronger than the bound following from the static bound was previously known.
On the upper bound side, Brodal et al. [BGJS11] gave a linear space static data structure with query time . This matches the lower bound of Jørgensen and Larsen. They also gave a dynamic data structure with .
Since we prove our lower bound for the version of range selection where , also known as prefix selection, we can reexecute a reduction of Jørgensen and Larsen [JL11]. This means that we also get a lower bound for the fundamental range median problem. Range median is the natural special case of range selection where .
Corollary 3.
Any cell probe data structure for range median, having cell size , worst case update time and expected query time must satisfy:
Furthermore, this holds even if we are required only to return whether the median amongst is stored at an even or odd position.
We note that the upper bound of Brodal et al. for range selection is also the best known upper bound for range median.
2 Organization
In Section 3 we introduce both the dynamic cell probe model and the one-way communication model, which is the main proxy for our results. In Section 4 we state the formal version of Theorem 1 and give its proof as well as the proof of the Peak-to-Average lemma. Section 5 and onwards are devoted to applications of our new simulation theorem, starting with a lower bound for polynomial evaluation. In Section 6 we formally define parity searching in Butterfly graphs and prove a lower bound for it using our simulation theorem. Finally, Section 7 presents a number of reductions from parity searching in Butterfly graphs to various fundamental data structure problems, proving the remaining lower bounds stated in the introduction.
3 Preliminaries
The dynamic cell probe model.
A dynamic data structure in the cell probe model consists of an array of memory cells, each of which can store bits. Each memory cell is identified by a bit address, so the set of possible addresses is . It is natural to assume that each cell has enough space to address (index) all update operations performed on it, hence we assume that when analyzing a sequence of operations.
Upon an update operation, the data structure can perform read and write operations to its memory so as to reflect the update, by probing a subset of memory cells. This subset may be an arbitrary function of the update and the content of the memory cells previously probed during this process. The update time of a data structure, denoted by , is the number of probes made when processing an update (this complexity measure can be measured in the worst-case or in an amortized sense). Similarly, upon a query operation, the data structure performs a sequence of probes to read a subset of the memory cells in order to answer the query. Once again, this subset may be an arbitrary (adaptive) function of the query and previous cells probed during the processing of the query. The query time of a data structure, denoted by , is the number of probes made when processing a query.
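The cost measure can be made concrete with a small sketch of a memory that charges one unit per probe and nothing for computation. The class and parameter names are hypothetical, and treating unwritten cells as holding 0 is an assumption of the example rather than part of the model as stated.

```python
from typing import Dict


class ProbeCountingMemory:
    """Array of w-bit cells with w-bit addresses; only probes are charged."""

    def __init__(self, w: int) -> None:
        self.w = w
        self.cells: Dict[int, int] = {}  # address -> content
        self.probes = 0                  # total probes so far

    def _check(self, value: int, what: str) -> None:
        if not (0 <= value < 2 ** self.w):
            raise ValueError(f"{what} does not fit in {self.w} bits")

    def read(self, addr: int) -> int:
        self._check(addr, "address")
        self.probes += 1
        return self.cells.get(addr, 0)   # assumption: unwritten cells read as 0

    def write(self, addr: int, value: int) -> None:
        self._check(addr, "address")
        self._check(value, "content")
        self.probes += 1
        self.cells[addr] = value
```

The update or query time of an operation is then the difference in `probes` before and after it runs; any computation performed between probes is free, which is what makes lower bounds in this model apply so broadly.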
3.1 One-way protocols and “Epoch” communication games
A useful way to abstract the information-theoretic bottleneck of dynamic data structures is communication complexity. Our main results (both upper and lower bounds) are cast in terms of the following two-party communication games, which are induced by dynamic data structure problems:
Definition 1 (Epoch Communication Games ).
Let be a dynamic data structure problem, consisting of a sequence of update operations divided into epochs , where (and ), followed by a single query . For each epoch , the two-party communication game induced by is defined as follows:

Alice receives all update operations .

Bob receives (i.e., all updates except those in epoch ) and a query for .

The goal of the players is to output the correct answer to , that is, to output .
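The split of inputs in this game can be sketched as follows. The function name and the list-of-lists representation of epochs are assumptions made for illustration; the epoch sizes themselves are taken as given and are not fixed by the code.

```python
from typing import List, Optional, Tuple, TypeVar

U = TypeVar("U")  # type of an update operation
Q = TypeVar("Q")  # type of a query

BobInput = Tuple[List[Optional[List[U]]], Q]


def epoch_game_inputs(
    epochs: List[List[U]], i: int, query: Q
) -> Tuple[List[List[U]], BobInput]:
    """Split the inputs of the game induced by epoch i (hypothetical helper).

    Alice sees every update; Bob sees every update outside epoch i, plus the query.
    """
    if not (0 <= i < len(epochs)):
        raise IndexError("no such epoch")
    alice = [list(e) for e in epochs]
    # Epoch i is hidden from Bob; None marks the hidden block.
    bob_updates = [None if j == i else list(e) for j, e in enumerate(epochs)]
    return alice, (bob_updates, query)
```

Note that Alice does not see the query, so whatever she communicates must be useful for every query Bob might hold.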
We shall consider the following restricted model of communication for solving such communication games.
Definition 2 (One-Way Randomized Communication Protocols).
Let be a two-party boolean function. A one-way communication protocol for under input distribution proceeds as follows:

Alice and Bob have shared access to a public random string of their choice.

Alice sends Bob a single message, , which is only a function of her input and the public random string.

Based on Alice’s message, Bob must output a value .
We say that solves under with cost , if :

For any input , Alice never sends more than bits to Bob, i.e., , for all .

Let us denote by
the largest advantage achievable for predicting under via an bit one-way communication protocol. For example, when applied to the boolean communication problem , we say that has an bit one-way communication protocol with advantage , if . We remark that we sometimes use the notation to denote the message-length (i.e., number of bits ) of the communication protocol .
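To illustrate the notion of advantage on a toy problem, the sketch below enumerates all inputs of a small boolean function and computes the exact advantage of one particular one-way protocol. Everything here is an assumption made for the example: the function (inner product mod 2), the uniform input distribution, the deterministic protocol in which Alice sends a prefix of her input, and the reading of advantage as success probability minus 1/2. None of it is taken from the paper.

```python
from fractions import Fraction
from itertools import product
from typing import Sequence


def inner_product_mod2(x: Sequence[int], y: Sequence[int]) -> int:
    return sum(a & b for a, b in zip(x, y)) % 2


def prefix_protocol_advantage(n: int, c: int) -> Fraction:
    """Exact advantage, under uniform n-bit inputs, of a c-bit one-way protocol.

    Alice's message is the first c bits of x. Bob outputs the inner product
    restricted to those coordinates, i.e. he treats the unseen bits as 0.
    """
    correct = 0
    total = 0
    for x in product((0, 1), repeat=n):
        message = x[:c]  # Alice's single message, a function of x only
        for y in product((0, 1), repeat=n):
            guess = inner_product_mod2(message, y[:c])
            correct += int(guess == inner_product_mod2(x, y))
            total += 1
    return Fraction(correct, total) - Fraction(1, 2)
```

For instance, `prefix_protocol_advantage(4, 2)` evaluates to `Fraction(1, 8)`: the two unseen coordinates contribute 0 to the inner product with probability 5/8. The example only shows that a short message can already give a small but nonzero advantage, which is the regime the weak simulation operates in.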
4 One-Way Weak Simulation of Dynamic Data Structures
In this section we prove our main result, Theorem 1. For any dynamic decision problem , we show that if admits an efficient data structure with respect to a random sequence of updates divided into epochs , then we can use it to devise an efficient one-way communication protocol for the underlying two-party communication problem of some (large enough) epoch , with a nontrivial success (advantage over random guessing).
Throughout this section, let us denote the size of epoch by , where we require , and . We prove the following theorem.
Theorem 1 (restated).
Let be a dynamic boolean data structure problem, with random updates grouped into epochs , such that , followed by a single query . If admits a dynamic data structure with worst-case update time and average (over ) expected query time satisfying , then there exists some epoch for which
as long as for a constant .
Proof.
Consider the memory state of after the entire update sequence , and for each cell , define its associated epoch to be the last epoch in during which was probed (note that is a random variable over the random update sequence ). For each query , let be the random variable denoting the number of probes made by on query (on the random update sequence). For each query and epoch , let denote the number of probes on query to cells associated with epoch (i.e., cells for which ).
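The bookkeeping in this paragraph can be sketched directly from a probe log. The function names and the log format are hypothetical. Placing query probes to cells that were never probed during any update under the key `None` is an assumption of this sketch, not a convention taken from the text.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional, Tuple


def associated_epochs(update_probes: Iterable[Tuple[Hashable, int]]) -> Dict[int, Hashable]:
    """Map each cell address to the last epoch during which it was probed.

    update_probes lists (epoch, address) pairs in chronological order.
    """
    assoc: Dict[int, Hashable] = {}
    for epoch, addr in update_probes:
        assoc[addr] = epoch  # later probes overwrite earlier ones
    return assoc


def probes_per_epoch(
    query_probes: Iterable[int], assoc: Dict[int, Hashable]
) -> Counter:
    """Count the query's probes by the epoch associated with the probed cell."""
    counts: Counter = Counter()
    for addr in query_probes:
        key: Optional[Hashable] = assoc.get(addr)  # None: untouched by updates
        counts[key] += 1
    return counts
```

Summing the counts over all keys recovers the total number of probes made by the query, which is the decomposition that the averaging argument over epochs relies on.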
By definition, we have and . By averaging, there is an epoch such that . By Markov’s inequality and a union bound, there exists a subset of queries such that both
(1) 
for every query . By Markov’s inequality and union bound, for each , we have
(2) 
Note that, while Bob cannot identify the event “” (as it depends on Alice’s input as well), he does know whether his query is in or not, which is enough to certify (2).
Now, suppose that Alice samples each cell associated with epoch in independently with probability , where
(note that, by definition of , Alice can indeed generate the memory state and compute the associated epoch for each cell, as her input consists of the entire update sequence). Let be the resulting set of cells sampled by Alice. Alice sends Bob (both addresses and contents). For a query , let denote the event that the set of cells Bob receives, contains all cells associated with epoch and probed by the data structure. By Equation (2), we have that for every
(3) 
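Alice's sampling step, and the reason the sampled set is so unlikely to cover a query, can be sketched as follows. The names are hypothetical, the sampling probability is left as a parameter `p` rather than set to the value chosen in the proof, and `t` stands for the number of distinct epoch cells the query probes.

```python
import random
from fractions import Fraction
from typing import Dict, Hashable


def sample_epoch_cells(
    memory: Dict[int, int],
    assoc: Dict[int, Hashable],
    epoch: Hashable,
    p: float,
    rng: random.Random,
) -> Dict[int, int]:
    """Keep each cell associated with `epoch` independently with probability p.

    Returns addresses together with contents, as in Alice's message.
    """
    return {
        addr: memory[addr]
        for addr, e in assoc.items()
        if e == epoch and rng.random() < p
    }


def prob_all_sampled(p: Fraction, t: int) -> Fraction:
    """Probability that t fixed distinct cells all land in the sample."""
    return p ** t  # independence across cells
```

Since the covering probability decays exponentially in `t`, it is tiny whenever the query probes many cells of the epoch, which is why the rest of the proof has to extract an advantage from an event that Bob cannot detect.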
If Bob could detect the event , we would be done. Indeed, let denote the set of (addresses and contents of) cells associated with all future epochs , i.e., all the cells probed by succeeding epoch . Due to the geometrically decreasing sizes of epochs, sending requires less than bits of communication. Since Bob has all the updates preceding epoch , he can simulate the data structure and generate the correct memory state of right before epoch . In particular, Bob knows for every cell, assuming it is not probed since epoch (thus associated with some epoch ), what its content will be. Therefore, when he is further given the messages , Bob would be able to simulate the data structure perfectly on query , assuming the event occurs. If Bob could detect , he could simply output a random bit if it does not occur, and follow the data structure if it does. This strategy would have already produced an advantage of , which would have finished the proof. As explained in the introduction, Bob has no hope of certifying the occurrence of the event , hence we must take a fundamentally different approach for arguing that condition (3) can nevertheless be (implicitly) used to devise a strategy for Bob with a nontrivial advantage. This is the heart of the proof.
To this end, note that, given a query , a received sample and all cells associated with some epoch , Bob can simulate on his partial update sequence (), filling in the memory updates according to and , and pretending that all cells in the query-path of which are associated with epoch are actually sampled in (i.e., pretending that the event occurs). See Step 5 of Figure 1 for the formal simulation argument. Let denote the resulting memory state obtained by Bob’s simulation in the figure, given and his received sets of cells .
Now, let us consider the (deterministic) sequence of cells that would probe given query in the above simulation with respect to Bob’s memory state . Let us say that the triple is good for a query , if and . That is, is good for , if the posterior probability of is (relatively) high and is not too large. By Equation (3) and Markov’s inequality, the probability that the triple satisfies , is at least (indeed, the expectation in (3) can be rewritten as , since is a deterministic function of ). Note that when occurs, the value of is completely determined given , and , in which case , and thus the probability that is good is at least . From now on, let us focus only on the case that that Alice sends is good, since Bob can identify whether is good based on and Alice’s message, and if it is not, he will output a random bit.
We caution that is simply a set of memory addresses in , not necessarily the correct one. In particular, while the addresses of the cells are determined by the above simulation, the contents of these cells (in ) are not: they are a random variable of , as the sample is very unlikely to contain all the associated cells. For any assignment to the contents of the cells in , let us denote by
the probability that the memory content of the sequence of cells is equal to , conditioned on .
Every content assignment to , generates some posterior distribution on the correct query path (i.e., with respect to the true memory state ) and therefore on the output of the query with respect to . Hence we may look at the joint probability distribution of the event “” and the assignment which is
Now, consider the function
(4) 
Equivalently, conditioned on , and , is the bias of the random variable conditioned on , multiplied by the probability of .
Note that, since for every assignment , we have , and since is a probability distribution, this fact implies that: (i) . Furthermore, we shall argue that (as we always condition on good ), in which case the contents of are completely determined by (we postpone the formal argument to the Analysis section below). Denoting by the content assignment to induced by , we observe that conditioned on , will be precisely the correct set of cells probed by on , in which case is determined by . Formally, this fact means that: (ii) .
Conditions (i) and (ii) above imply that satisfies the premise of the Peak-to-Average Lemma (Lemma 1) with . Recall that the lemma guarantees there is a not-too-large subset of coordinates ( addresses) of , which Bob can privately compute,^{5} Indeed, is only a function of ,