Push-Down Trees: Optimal Self-Adjusting Complete Trees

Chen Avin, Kaushik Mondal, Stefan Schmid
Ben Gurion University of the Negev, Israel; University of Vienna, Austria
avin@cse.bgu.ac.il, mondal@post.bgu.ac.il, stefan_schmid@univie.ac.at

Abstract

Since Sleator and Tarjan’s seminal work on self-adjusting lists, heaps and binary search trees, researchers have been fascinated by dynamic datastructures and the questions related to their performance over time. This paper initiates the study of another classic datastructure, self-adjusting (binary) Complete Trees (CTs): trees which do not provide a simple search mechanism but allow efficient access to items given a global map. Our problem finds applications, e.g., in the context of warehouse optimization or self-adjusting communication networks which can adapt to the demand they serve.

We show that self-adjusting complete trees assume an interesting position between the complexity of self-adjusting (unordered) lists and binary search trees. In particular, we observe that in contrast to lists, a simple move-to-front strategy alone is insufficient to achieve a constant competitive ratio. Rather, and similarly to binary search trees, an additional (efficient) tree update rule is needed. Intriguingly, while it is unknown whether the working set is a lower bound for binary search trees, we show that this holds in our model. So while finding an efficient update rule is still an open problem for binary search trees, this paper shows that a simple, randomized update rule exists for complete trees.

Our main result is a dynamically optimal (i.e., constant competitive) self-adjusting CT called Push-Down Tree, on expectation against an oblivious adversary. At the heart of our approach lies a distributed algorithm called Random-Push: this algorithm approximates a natural notion of the Most Recently Used (MRU) tree (essentially an approximate working set) by first performing move-to-front, but then pushing less recently accessed items down the tree using a random walk.

1 Introduction

Self-adjusting datastructures, introduced over 30 years ago by Sleator and Tarjan [21], are an important and intensively studied concept in the algorithm research community. A self-adjusting datastructure has the appealing property that it can optimize itself to the workload, leveraging temporal locality, but without knowing the future. Ideally, self-adjusting datastructures should store items which will be accessed (frequently) in the future, in a way that they can be accessed quickly, while also accounting for reconfiguration costs.

In this work, we study a novel flavor of such self-adjusting structures where lookup is supported by a map (i.e., the structure does not have to be searchable). Let us consider the following motivating example. Consider a simplified packing & shipping scenario arising, e.g., in a book warehouse with a single worker, Bob. Bob’s desk is placed in the front of the warehouse, where he receives a continuous sequence of requests for items (here, books) that need to be packed and shipped, one item at a time. When a new request for an item arrives, Bob uses a map (e.g., drawn on a whiteboard) of the warehouse to look up the book’s rack; he then walks there to collect the book for shipping (or he sends a robot). Bob’s goal is to minimize the total walking distance (e.g., saving time and energy). Fortunately, the warehouse infrastructure supports dynamic placement of items. The warehouse contains a set of nodes, each of which can host a single bookrack. Nodes are connected by a network of tracks (railways) on which racks can be shifted (dragged) from one node location to another (along with the books they store). Upon receiving an item request, Bob first needs to access the item, but then he can also rearrange (i.e., self-adjust) rack locations as he wishes, taking into account the distance he needs to travel to do so. Subsequently, he can update the warehouse map (at no cost) for the next request. We ask: How can Bob optimize his actions without knowing the item request sequence $\sigma$ a priori? Can Bob exploit temporal locality in the sequence, if it exists?

We present a formal abstraction of this problem as a self-adjusting datastructure later, but a few observations are easy to make. If the network of tracks is an (infinite) line, then this problem can be formulated as a list update problem [21], for which it is well known that a simple Move-To-Front (MTF) rule results in a “dynamically optimal” self-adjusting list: an online algorithm On, which serves each access without any knowledge of future accesses, is said to be dynamically optimal if it serves every sequence $\sigma$ at cost $O(\mathrm{cost}(\mathrm{Opt}, \sigma))$, where $\mathrm{cost}(\mathrm{Opt}, \sigma)$ is the cost of an optimal offline algorithm Opt which knows $\sigma$ a priori. We also note that if the network is a line (i.e., a list), then the warehouse map is of no use: Bob can just walk along the single route until he finds the correct bookrack.
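To make the list baseline concrete, here is a minimal sketch of Move-To-Front cost accounting (in Python; the function name and the convention that accessing position $i$ costs $i+1$ with free exchanges for the accessed item are our own illustrative assumptions, not from the original list-update papers):

```python
def mtf_cost(requests):
    """Total cost of serving `requests` on a self-adjusting list with
    Move-To-Front: accessing position i costs i + 1, and moving the
    accessed item to the front uses free exchanges (standard model)."""
    items = sorted(set(requests))      # arbitrary initial order
    total = 0
    for x in requests:
        i = items.index(x)             # access cost: position in list
        total += i + 1
        items.insert(0, items.pop(i))  # move-to-front
    return total

# A workload with temporal locality is served cheaply:
print(mtf_cost([1, 1, 1, 2, 2, 5, 5, 5, 5]))  # mostly front accesses
```

On workloads with strong temporal locality, most accesses hit the first few positions; this is exactly the behavior the tree algorithms below try to replicate.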

In this paper we consider the case where the network of nodes (racks) is a (binary) Complete Tree (CT) and the distance is measured as the number of hops on the shortest path between nodes. A tree network is a generalization of the list and, as we mentioned, maintaining temporal locality in a list is simple: the move-to-front algorithm fulfills the most-recently-used property, i.e., the item furthest away from the front is the least recently used item. In the list, this property is enough to guarantee optimality [21], and it is essentially a working set property: the working set number $w_t(x)$ of an item $x$ at time $t$ is the number of distinct items accessed since the last access of $x$ prior to time $t$, including $x$; a data structure has the working-set property if the amortized cost of accessing $x$ is $O(\log w_t(x))$. Naturally, we wonder whether the most-recently-used property is enough to guarantee optimality in binary trees. The answer turns out to be non-trivial. Our first contribution is to show that if we count only access cost (ignoring any rearrangement cost), the answer is affirmative: the most-recently-used tree is what is called access optimal. But securing this property, i.e., maintaining the most recently used items close to the root of the tree, introduces a new challenge: how to achieve this at low cost? In particular, assuming that swapping the locations of two neighboring items comes at unit cost, can the property be maintained at cost proportional to the access cost? As we show, strictly enforcing the most-recently-used property in a tree is too costly to achieve optimality. But when turning to an approximate most-recently-used property, we are able to show two important properties: i) such an approximation is good enough to guarantee access optimality; and ii) it can be maintained in expectation using a random-walk-based algorithm.

1.1 Our Contributions

Our main contribution is a dynamically optimal, i.e., constant competitive (on expectation), self-adjusting (binary) Complete Tree (CT) called Push-Down Tree.

Theorem 1

Push-Down Tree is dynamically optimal on expectation.

In particular, we show that for CTs, an efficient tree update rule exists: an open problem in the context of self-adjusting binary search trees. At the heart of our technical contribution lies the study and approximation of a natural notion of Most Recently Used (MRU) tree, which can be seen as a working set property: an MRU tree $T_t$, at any given time $t$, has the property that for any item pair $(x, y)$, if $x$ has been accessed more recently than $y$, the depth of $x$ in $T_t$ is no larger than the depth of $y$ in $T_t$.

Since maintaining strict MRU properties is costly (in terms of adjustment costs), we define the approximate MRU tree (resp. approximate working set). In a $c$-approximate MRU tree, short MRU($c$) tree, it holds for any item which is at depth $k$ in an MRU tree that it is within depth $k + c$ in the MRU($c$) tree. We investigate different algorithms to ensure the approximate MRU property. In particular, we introduce a distributed algorithm Random-Push which provides a constant MRU approximation: it promotes accessed items to the tree root immediately. However, unlike existing self-adjusting binary search tree algorithms, in order to make space for the new item at the root, it uses a simple strategy, based on random walks, to push items down the tree in a balanced manner, preserving the most recently used items close to the root.

To the best of our knowledge, the design of self-adjusting CTs (which do not provide a simple search operation) has not been investigated in the literature so far.

1.2 Novelty and Relationship to Prior Work

Role of the Map. It is important to discuss the role of the map in our model. In our setting, the map is global and centralized, and allows us to trivially access a node (or item) at distance $d$ from the front at a cost $d$. In our example, Bob does not need to search for an item in the network; rather, there is an instruction for how to reach the item before he leaves his desk. This is in striking contrast to self-adjusting binary search trees, where no global map is needed and Bob can reach an item at distance $d$, at a cost $d$, by greedily searching for it in the tree after leaving his desk. Interestingly, since Sleator and Tarjan introduced self-adjusting binary search trees over 30 years ago [22], the quest for a constant competitive ratio, i.e., a “dynamically optimal” algorithm, remains a major open problem. Nevertheless, there are self-adjusting binary search trees that are known to be access optimal [6], but their rearrangement cost is too high.

This positions our model, self-adjusting binary trees with a map, at an intriguing new location on the spectrum between dynamic lists and binary search trees. The novelty of our model is that searching for items is done centrally (at no or negligible cost), while accessing and rearranging items is distributed and comes at a cost.

One practical motivation (besides Bob’s warehouse) to study this model stems from its applications in distributed and networked systems, which are becoming increasingly flexible. For example, emerging optical technologies make it possible to adjust the physical topology of a datacenter or wide-area network in an online manner (e.g., [17]). In particular, a tree may describe a reconfigurable network topology where a communication source arranges its communication partners in the form of a bounded-degree tree [5, 19]. Accordingly, in this paper we are interested in distributed algorithms for self-adjusting trees in which items (e.g., communication partners) can be swapped between neighboring nodes at unit cost. This poses two key challenges: (1) we need to strike a balance between the benefits of adjustments (e.g., reduced communication costs) and their costs (the reconfigurations); and (2) we need to provide an efficient routing algorithm on the unordered tree. We discuss this further in Section 5.

Dynamic List Update: Linked List (LL). The dynamically optimal linked list datastructure is a seminal result in the area [21]: algorithms such as Move-To-Front (MTF), which moves each accessed element to the front of the list, are known to be 2-competitive, which is optimal [1, 21, 4]. We note that the Move-To-Front algorithm results in the Most Recently Used property, where items that were more recently used are closer to the head of the list. The best known competitive ratio for randomized algorithms for LLs is 1.6, which almost matches the randomized lower bound of 1.5 [3, 23].

Binary Search Tree (BST). Self-adjusting BSTs were among the first self-adjusting datastructures studied in the literature. In contrast to CTs, self-adjustments in BSTs are based on rotations (which are assumed to have unit cost). While BSTs have the working set property, a matching lower bound is missing: the Dynamic Optimality Conjecture, the question of whether splay trees [22] are dynamically optimal, continues to puzzle researchers, even in the randomized case [2]. On the positive side, over the last years, many deep insights into the properties of self-adjusting BSTs have been obtained [10], including improved (but non-constant) competitive ratios [7], results on weaker properties such as working set, static, dynamic, lazy, and weighted fingers, results on pattern avoidance [9], and so on. It is also known (under the name dynamic search-optimality) that if the online algorithm is allowed to make rotations for free after each request, dynamic optimality can be achieved [6]. Known lower bounds are due to Wilber [24], Demaine et al. [13] (the interleaves bound, a variation of Wilber’s), and Derryberry et al. [14] (based on graphical interpretations). It is not known today whether any of these lower bounds is tight.

Unordered Tree (UT). We are not the first to consider unordered trees, and it is known that existing lower bounds for (offline) algorithms on BSTs also apply to UTs that use rotations: Wilber’s theorem can be generalized [15]. However, it is also known that this correspondence between ordered and unordered trees no longer holds under weaker measures such as key-independent processing costs, and in particular Iacono’s measure [16]: the expected cost of the sequence which results from a random assignment of keys from the search tree to the items specified in an access request sequence. Iacono’s work is also one example of prior work which shows that for specific scenarios, working set and dynamic optimality properties are equivalent. Regarding the current work, we note that the reconfiguration operations in UTs are more powerful than the swapping operations considered in our paper: a rotation allows moving entire subtrees at unit cost, while the corresponding cost in CTs is linear in the subtree size. We also note that in our model, items cannot move freely between levels; moves can only occur between parent and child. In contrast to UTs, CTs are bound to be balanced.

Skip List (SL) and B-Trees (BT). Intriguingly, although SLs and BSTs can be transformed into each other [12], Bose et al. [8] were able to prove dynamic optimality for (a restricted kind of) SLs as well as BTs. Similarly to our paper, the authors rely on a connection between dynamic optimality and the working set: they show that the working set property is sufficient for their restricted SLs (for BSTs, it is known that the working set is an upper bound, but it is not known yet whether it is also a lower bound). However, the quest for proving dynamic optimality for general skip lists remains an open problem: two restricted types of models were considered in [8], bounded and weakly bounded. In the bounded model, the adversary can never move forward more than a bounded number of times on a given skip list level without going down in the search; in the weakly bounded model, the highest levels contain a bounded number of elements. Optimality only holds when these bounds are constant. The weakly bounded model is related to a complete $k$-ary tree (similar to our complete binary tree), but there is no obvious or direct connection between our result and the weakly bounded optimality. Due to the relationship between SLs and BSTs, a dynamically optimal SL would imply a working set lower bound for BSTs. Moreover, while both in their model and ours, proving the working set property is key, the problems turn out to be fundamentally different. In contrast to SLs, CTs revolve around unordered (and balanced) trees (that do not provide a simple search mechanism), rely on a different reconfiguration operation (i.e., swapping an item with its parent comes at unit cost), and, as we show in this paper, actually provide dynamic optimality in their general form.

Online Paging. More generally, our work is also reminiscent of distributed heaps [18] resp. online paging models for hierarchies of caches [25], which aim to keep high-capacity nodes resp. frequently accessed items close to each other, however without accounting for the reconfiguration cost over time. Similarly to the discussion above, self-adjusting CTs differ from paging models in that in our model, items cannot move arbitrarily and freely between levels (but only between parent and child, at unit cost).

1.3 Paper Organization

Section 2 presents our model and Section 3 introduces access-optimal MRU trees. Section 4 describes how to maintain such trees and hence implement self-adjusting trees, and Section 5 discusses applications to networks. We conclude our contribution in Section 6. Some technical details are postponed to the Appendix.

2 Model and Preliminaries

We are given a complete binary tree $T$ of $n$ nodes which store $n$ items, one item per node. We will denote by $r$ the root of the tree $T$ (or simply the root, when $T$ is clear from context), and by $\mathrm{left}(v)$ resp. $\mathrm{right}(v)$ the left resp. right child of a node $v$. For any node $v$ and any time $t$, we will denote by $\mathrm{item}_t(v)$ the item mapped to $v$ at time $t$. Similarly, $\mathrm{host}_t(x)$ denotes the node hosting item $x$. Note that if $\mathrm{item}_t(v) = x$ then $\mathrm{host}_t(x) = v$.

All access requests to items originate from the root $r$. Access requests occur over time, forming a (finite or infinite) sequence $\sigma = \sigma_1, \sigma_2, \ldots$, where $\sigma_t = x$ denotes that item $x$ is requested and needs to be accessed at time $t$. This sequence (henceforth also called the workload) is revealed one-by-one to an online algorithm On.

The depth of a node $v$ is fixed and describes its distance from the root; it is denoted by $\mathrm{depth}(v)$. The depth of an item $x$ at time $t$ is denoted by $\mathrm{depth}_t(x)$, and is given by the depth of the node to which $x$ is mapped at time $t$. Note that $\mathrm{depth}_t(x) = \mathrm{depth}(\mathrm{host}_t(x))$.

In order to leverage temporal locality and eventually achieve dynamic optimality, we aim to keep recently accessed items close to the root. Accordingly, we are interested in the time of the most recent use (the most recent access) of an item $x$, henceforth denoted by $\mathrm{mru}(x)$. Both serving the request and adjusting the configuration come at a cost:

Definition 1 (Cost)

The cost incurred by an algorithm Alg to serve a request $\sigma_t$ to access item $x$ is denoted by $\mathrm{cost}(\mathrm{Alg}, \sigma_t)$, short $\mathrm{cost}(\sigma_t)$. It consists of two parts, access cost and adjustment cost. We define the access cost simply as $\mathrm{depth}_t(x)$, since Alg can maintain a global map and access $x$ via the shortest path. The adjustment cost is the total number of swaps performed by items which change location subsequently, where a single swap exchanges an item with its parent. The total cost incurred by Alg on a sequence $\sigma$ is then $\mathrm{cost}(\mathrm{Alg}, \sigma) = \sum_t \mathrm{cost}(\sigma_t)$.

Our main objective is to design online algorithms that perform almost as well as optimal offline algorithms (which know $\sigma$ ahead of time), even in the worst case. In other words, we want to devise online algorithms which minimize the competitive ratio:

Definition 2 (Competitive Ratio)

We consider the standard definition of the (strict) competitive ratio $\rho$, i.e.,

$\rho = \max_{\sigma} \frac{\mathrm{cost}(\mathrm{On}, \sigma)}{\mathrm{cost}(\mathrm{Opt}, \sigma)},$

where $\sigma$ is any input sequence and where Opt denotes the optimal offline algorithm.

If an online algorithm is constant competitive, independently of the problem input, it is called dynamically optimal.

Definition 3 (Dynamic Optimality)

An (online) algorithm On achieves dynamic optimality if it asymptotically matches the offline optimum on every access sequence. In other words, the algorithm On is $O(1)$-competitive.

We also consider a weaker form of competitiveness (similar to the notion of search-optimality in related work [6]), and say that On is access-competitive if we consider only the access cost of On (and ignore any adjustment cost) when comparing it to Opt (which needs to pay both for access and adjustment). For a randomized algorithm, we consider an oblivious online adversary which does not know the random bits of the online algorithm a priori.

3 Access Optimality of MRU Trees

While for fixed trees we know that keeping frequent items close to the root is optimal (cf. Appendix), the design of online algorithms for adjusting trees is more involved. In particular, it is known that a Most-Frequently Used (MFU) policy is not optimal for lists [21]. A natural strategy could be to try to keep items close to the root which have been frequent “recently”. However, this raises the question over which time interval to compute the frequencies. Moreover, changing from one MFU tree to another may entail high adjustment costs. This section introduces a natural counterpart to the MFU tree for a dynamic setting: the Most Recently Used (MRU) tree. Intuitively, the MRU tree tries to keep the “working set”, resp. the recently accessed items, close to the root.

Interestingly, we find that maintaining trees in which recently accessed items are close to the root is the key to the dynamic optimality of our Push-Down Trees. While the move-to-front algorithm, known to be dynamically optimal for self-adjusting lists, naturally provides such a “most recently used” property, generalizing it to the tree is non-trivial.

Accordingly, we present our Push-Down Tree and its underlying online algorithms in two stages: first, we show that any algorithm that maintains an additive approximation of an MRU tree is access-competitive; subsequently, we discuss several options to maintain an approximate MRU tree.

At the heart of our approach lies an algorithm to maintain a constant approximation of the MRU tree at any time. In particular, we prove in the following that a constant additive approximation is sufficient to obtain dynamic optimality. On the other hand, also note that the notion of MRU tree itself needs to be understood as an approximation, and MRU trees are not to be confused with optimal trees: when taking into account both access and adjustment costs, moving an item which is accessed only once to the root (as required by the MRU property) is not optimal. With this in mind, let us first formally define MRU and MRU-approximate trees, namely MRU($c$) trees.

Definition 4 (MRU Tree)

For a given time $t$, $T_t$ is an MRU tree if and only if, for every pair of items $x, y$:

$\mathrm{mru}(x) > \mathrm{mru}(y) \;\Rightarrow\; \mathrm{depth}_t(x) \le \mathrm{depth}_t(y) \qquad (1)$

Alternatively, we can define an MRU tree using the ranks of items, which explicitly define the levels in the MRU tree. The rank of an item $x$ at time $t$, denoted $\mathrm{rank}_t(x)$, is equal to its working set number $w_t(x)$. It follows from Definition 4 that a tree is an MRU tree if and only if $\mathrm{depth}_t(x) = \lfloor \log \mathrm{rank}_t(x) \rfloor$ for every item $x$. Therefore, an MRU tree has the working-set property.

The notion of rank helps us understand the structure of an MRU tree. The root of the tree (level zero) will always host an item of rank one. More generally, nodes in level $i$ will host items that have a rank between $2^i$ and $2^{i+1} - 1$. Upon a request of an item, say with rank $k$, the rank of the requested item is updated to one, and only the ranks of items with rank smaller than $k$ are increased, each by 1. Therefore, the ranks of items with rank higher than $k$ do not change and their level (i.e., depth) in the MRU tree remains unchanged (but they may switch location within the same level). Next we define MRU($c$) trees for any constant $c$.
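To make the rank concrete, the following sketch (with a hypothetical helper name, not from the paper) computes the working set number directly from the access sequence; in an MRU tree, the item of rank $k$ then sits at depth $\lfloor \log_2 k \rfloor$:

```python
import math

def rank_of(item, t, sigma):
    """Working set number of `item` after the t-th request (1-indexed):
    the number of distinct items requested since `item`'s last request,
    including `item` itself."""
    seen = set()
    for s in reversed(sigma[:t]):
        seen.add(s)
        if s == item:
            return len(seen)
    return None  # item never requested so far

sigma = ['a', 'b', 'c', 'b', 'a']
# 'a' was just requested at t=5: rank 1, MRU depth floor(log2(1)) = 0.
k = rank_of('a', 5, sigma)
print(k, math.floor(math.log2(k)))   # -> 1 0
# 'c' was requested 3rd; since then {c, b, a} were seen: rank 3, depth 1.
k = rank_of('c', 5, sigma)
print(k, math.floor(math.log2(k)))   # -> 3 1
```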

Definition 5 (MRU($c$) Tree)

A tree $T$ is called an MRU($c$) tree if it holds for any item $x$ that $\mathrm{depth}_T(x) \le \mathrm{depth}_{T'}(x) + c$, where $T'$ is an MRU tree. Or equally, using the rank, $\mathrm{depth}_T(x) \le \lfloor \log \mathrm{rank}(x) \rfloor + c$.

Recall that the rank of an item is the same in both trees, since the rank is independent of the item location. Note that any MRU tree is also an MRU($c$) tree.

Definition 6 (MRU($c$) Algorithm)

An online algorithm On is an MRU($c$) algorithm if it uses move-to-front and additionally, for each time $t$, the tree that On maintains is an MRU($c$) tree.

Next we show that MRU($c$) algorithms are access-competitive. Recall that the access cost to an item at depth $k$ is $k$.

Theorem 2

For any constant $c$, any MRU($c$) algorithm On is constant access-competitive.

The full proof is in the appendix; we present the main points here. For simplicity, first assume that On is a (strict) MRU algorithm. We employ a potential function argument: our potential function is based on the difference in the items’ locations between On’s tree and Opt’s tree. From the definition of the MRU tree, $\mathrm{depth}_t(x) < \mathrm{depth}_t(y)$ implies $\mathrm{mru}(x) > \mathrm{mru}(y)$, i.e., $x$ has been accessed more recently. Accordingly, we define a pair of items $(x, y)$ as bad on Opt’s tree if $\mathrm{mru}(x) > \mathrm{mru}(y)$ but $\mathrm{depth}_t(x) > \mathrm{depth}_t(y)$, i.e., $x$ is at a lower level although it has been accessed more recently. Note that for an MRU algorithm, none of the pairs is bad, since it maintains a perfect MRU tree at all times. Hence bad pairs appear only in Opt’s tree. Hereafter in this proof, we use $\mathrm{depth}(x)$ resp. $\mathrm{mru}(x)$ to indicate $\mathrm{depth}_t(x)$ resp. $\mathrm{mru}_t(x)$ if not otherwise mentioned. For a given item $x$, let $\beta(x)$ be equal to one plus the number of bad pairs in which $x$ participates as the more recently accessed item.

We define the potential function $\Phi$ as a sum over all items of a term depending on $\beta$. We consider the occurrence of events in the following order. Upon a request, On adjusts its tree; then Opt performs the rearrangements it requires. Opt needs to access an item before it can swap it with one of its neighbors (e.g., to move it closer to the root). The access cost is equal to the depth of the item, each of the swaps costs one unit, and a swap is legal only between neighbors. There can be multiple swaps of items by Opt between two accesses of On, after the first item is accessed.

Now consider the potential at time $t$ (i.e., before On’s adjustment for serving request $\sigma_t$ and Opt’s rearrangements between requests $\sigma_t$ and $\sigma_{t+1}$), denoted $\Phi_t$. Moreover, consider the potential after On adjusted its tree, $\Phi_t^{\mathrm{On}}$. The potential change due to On’s adjustment is

(2)

We assume that the initial potential is zero (i.e., no item was accessed yet). Since the potential is always non-negative by definition, we can use it to bound the amortized cost of On. Consider a request at time $t$ to an item at depth $k$ in the tree of On. The access cost is $k$, and we would like to bound the amortized cost of On by a constant times Opt’s cost. Assume that the requested item is at depth $k'$ in Opt’s tree, so Opt must pay at least an access cost of $k'$. First we consider the main case; the remaining cases are presented in the appendix.

Let us compute the potential after On updated its MRU tree. For all items other than the accessed item, the value of $\beta$ is unchanged: only the access time of the last accessed item changed. The potential term of the accessed item will change, since its last access time changed to $t$. Now:

(3)

The result follows by multiplying and dividing by the appropriate term; recall also the definition of $\beta$.

Now consider the overall change in potential:

(4)

Now we compute the potential change due to Opt’s rearrangements between accesses. Consider the potential after Opt adjusted its tree, $\Phi_t^{\mathrm{Opt}}$. Then the potential change due to Opt’s adjustment is $\Phi_t^{\mathrm{Opt}} - \Phi_t^{\mathrm{On}}$.

Let Opt access an item $y$ at level $\ell$ and raise it to level $\ell - 1$ by swapping it with its parent. For all items other than $y$ and its former parent, the $\beta$ values remain unchanged, as only $y$ moves from level $\ell$ to level $\ell - 1$. For the former parent, $\beta$ may increase, as all the items in its new layer may become bad with respect to it. For $y$ itself, $\beta$ can only decrease: since $y$ moves from level $\ell$ to level $\ell - 1$, the bad pairs, if any, associated with $y$ at level $\ell$ become good. Similarly, if Opt brings $y$ up to level $\ell - j$ by performing $j$ swaps, then we have (see appendix for details),

(5)

Now we compute the potential change due to Opt’s swaps:

(6)

Opt may replace several items between two accesses. Let Opt replace $m$ items between the $t$-th and the $(t+1)$-th access. For the $j$-th replacement, the potential change is less than or equal to the cost Opt pays for it. Putting it all together, we get

and finally

(7)

The remaining cases are presented in the appendix. We now turn our attention to the problem of efficiently maintaining an approximate MRU tree. To achieve optimality, we need the tree adjustment cost to be proportional to the access cost.

4 Self-Adjusting MRU Trees

We are now left with the challenge that we need to design a tree which on the one hand provides a good approximation of MRU to capture temporal locality by providing fast access (resp. routing) to items, and on the other hand is also adjustable at low cost over time.

Let us now assume that a certain item $x$ is accessed at some time $t$. In order to re-establish the (strict) MRU property, $x$ needs to be promoted to the root. And in fact, as we will see, this fast promotion to the root is also performed by our self-adjusting tree algorithm described later on. This however raises the question of where to move the item currently located at the root, let us call it $y$. In order to make space for $x$ at the root while preserving locality, we propose to push down items from the root, including item $y$. However, note that simply pushing items down along the path between the root and $x$ (as done in lists) will result in a poor performance in the tree. To see this, let us denote the sequence of items along the path from the root to $x$, before the adjustment, by $y = z_0, z_1, \ldots, z_k = x$. Now assume that the access sequence repeatedly cycles through this sequence of $k+1$ items. The resulting cost per request is in the order of $k$. However, an algorithm which assigns (and then fixes) these items to the top levels of the tree will converge to a cost of only $O(\log k)$ per request: an exponential improvement.
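A quick numeric check of this gap, under the unit-cost model above (illustrative values only):

```python
import math

# Cycling through the k+1 items of a root-to-depth-k path: pushing down
# along the access path keeps every request at depth ~k, while pinning
# those items to the top levels costs ~log2(k+1) per request.
k = 1023
path_push_cost_per_request = k                             # Theta(k)
top_levels_cost_per_request = math.ceil(math.log2(k + 1))  # O(log k)
print(path_push_cost_per_request, top_levels_cost_per_request)  # 1023 10
```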

A better approach seems to be to push down the root item along paths “in a balanced manner”: the root item should be pushed down (by swapping items locally) along a path of items which have not been accessed recently. Note that in order to keep the adjustment cost comparable to the access cost, the push-down operation should also be limited to depth $k$, the depth of $x$ before the adjustment. Since the item at depth $k$ along this balanced path can be different from $x$, let us call it $z$; the node previously hosting $x$ is now free. Accordingly, we propose to move $z$ to this node, noting that we would like all these operations to be performed at cost $O(k)$, i.e., proportional to the original access cost. Figure 2 presents the outline of this type of algorithm. More systematically, we consider the following three intuitive strategies to push down:

  1. Min-Push: Push down, for each level, the least recently used item.

  2. Rotate-Push: Every node pushes down in an alternating (“round-robin”) manner.

  3. Random-Push: The push down occurs along a simple random walk path.

Before discussing the three strategies, let us assume a request to an item $x$ which is at depth $k$. Let us first consider the Min-Push strategy. The strategy chooses, for each depth $i \le k$, the least recently accessed item $m_i$ from level $i$. We then push each $m_i$ to the host of $m_{i+1}$. It is not hard to see that this strategy will actually maintain a perfect MRU tree. However, the least recently used items at different levels, i.e., $m_i$ and $m_{i+1}$, may not lie on a connected path. So to push $m_i$ to $m_{i+1}$, we may need to travel all the way from $m_i$ to the root and then from the root to $m_{i+1}$, resulting in a cost of $O(i)$ per level. This accumulates a rearrangement cost of $O(k^2)$ to push down the items with least $\mathrm{mru}$ at each level up to level $k$. This is not proportional to the original access cost $k$ of the requested item and therefore leads to a non-constant competitive ratio.

Next, we consider the Rotate-Push strategy. This deterministic strategy is very appealing. Its adjustment cost is $O(k)$, the same as the access cost. Moreover, it can be shown to guarantee an approximate MRU tree for a wide range of adversaries. This strategy hence seems to be a natural candidate for a dynamically optimal deterministic strategy. However, we are not able to prove this here. In fact, we can show that the Rotate-Push strategy is not able to maintain a constant MRU approximation against every adversary. For example, a clever (but costly) adversary can cause the tree to store the recently requested items only on the nodes along the path from the root to the right-most leaf (see Appendix for details). But in a perfect MRU tree, these items would be hosted at the top levels of the tree. Thus, we turn to the third option.

4.1 The Random-Push Strategy

To overcome the problems of the above two strategies, we propose the Random-Push strategy. This is a simple randomized strategy which selects a random path starting at the root by stepping down the tree to depth $k$ (the depth of the accessed item), choosing uniformly at random between the two children of each node. This can be seen as a simple random walk of length $k$ in a directed version of the tree, starting from the root. Clearly, the adjustment cost of Random-Push is also $O(k)$, and its actions are independent of any oblivious online adversary.
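A minimal sketch of the push-down step (in Python, assuming an array-based complete tree with heap indexing; the representation and function name are our own, not prescribed by the paper):

```python
import random

def random_push(tree, k):
    """Shift items down along a uniformly random root-to-depth-k path.

    `tree` is a complete binary tree stored as an array with heap
    indexing: the root is tree[0]; the children of index i are
    2*i + 1 and 2*i + 2.  Returns the evicted overflow item that was
    stored at the end of the path (the caller relocates it)."""
    path = [0]
    for _ in range(k):                       # k fair coin flips
        path.append(2 * path[-1] + 1 + random.randrange(2))
    overflow = tree[path[-1]]                # item z at depth k
    for j in range(len(path) - 1, 0, -1):    # k unit-cost swaps
        tree[path[j]] = tree[path[j - 1]]    # push one level down
    return overflow  # tree[0] may now be overwritten by the accessed item
```

The walk costs exactly $k$ swaps, matching the access cost by construction.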

Theorem 3

Random-Push maintains an MRU($c$) tree (Definition 5) on expectation for a constant $c$, i.e., the expected depth of the item with rank $k$ is $\log k + O(1)$ for any sequence $\sigma$ and any time $t$.

Proof: To analyze Random-Push we will define several random variables for an arbitrary rank $k$ and time $t$ (so we ignore them in the notation). W.l.o.g., let $x$ be the item with rank $k$ and let $D_k$ denote the depth of $x$. First we note that the support of $D_k$ is the set of possible depths in the tree. Next we show the following claim.

Lemma 4

For every $j < k$, we have that $\mathbb{E}[D_j] \le \mathbb{E}[D_k]$.

Proof: The result will use stochastic domination [20] (see Theorem 9 in the appendix): we show that for $j < k$, $D_k$ is stochastically larger than $D_j$, denoted by $D_j \preceq D_k$. It will then follow from Theorem 9 that $\mathbb{E}[D_j] \le \mathbb{E}[D_k]$. Let $y$ be the item with rank $j < k$; hence, it was requested more recently than $x$. The dominance follows from the fact that, conditioned on $y$ and $x$ first reaching the same depth (after the last request of $y$), their expected progress in depth will be the same. More formally, let $A$ be a random variable that denotes the depth at which $y$’s depth equals the depth of $x$ for the first time (since the last request of $y$, at which its depth is set to 0); and $A = \infty$ if this never happens. Then by the law of total probability, $\Pr[D_j = d] = \sum_a \Pr[A = a] \Pr[D_j = d \mid A = a]$ (and similarly for $D_k$). But since the random walk (i.e., push) is independent of the items’ ranks, we have for $a < \infty$ that $\Pr[D_j = d \mid A = a] = \Pr[D_k = d \mid A = a]$. But additionally there is the possibility that the two items will never be at the same depth (after the last request of $y$), in which case $x$ always has a higher depth; so $D_j \preceq D_k$, and the claim follows.

Figure 1: The Markov chain that is used to prove Theorem 3 and Lemma 5: possible depths for the item of rank $k$ in the complete tree.

To understand and upper bound $\mathbb{E}[D_k]$, we will use a Markov chain $M$ over the integers which denote the possible depths in the tree, see Figure 1. For each depth $i$ in the tree, the probability to move to depth $i+1$ is $2^{-i}$ and the probability to stay at $i$ is $1 - 2^{-i}$; the last level is an absorbing state. This chain captures the idea that the probability of an item at level $i$ to be pushed by a random walk down the tree (to a level larger than $i$) is $2^{-i}$. The chain does not describe our algorithm exactly, but we will use it to prove our bound. First, we consider a random walk described exactly by the Markov chain $M$ with an initial state $0$. Let $M_m$ denote the random variable of the state of a random walk of length $m$ on $M$. Then we can show:
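As a sanity check on this chain (a simulation sketch, not part of the proof; for simplicity we drop the absorbing boundary at the tree height):

```python
import math, random

def simulate_chain(m, trials=2000):
    """Average state of the depth chain after m steps, starting at 0:
    from state i, move to i + 1 with probability 2**(-i), else stay."""
    total = 0
    for _ in range(trials):
        state = 0
        for _ in range(m):
            if random.random() < 2.0 ** -state:
                state += 1
        total += state
    return total / trials

for m in (8, 64, 512):
    print(m, round(simulate_chain(m), 2), round(math.log2(m), 2))
# The empirical mean stays within a small additive constant of log2(m).
```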

Lemma 5

The expected state of $M_m$ satisfies $\mathbb{E}[M_m] \le \log m + O(1)$, and $\mathbb{E}[M_m]$ is concave in $m$.

Proof: Before going into the proof, we note the following corollary.

Corollary 6
(8)

First note that $\mathbb{E}[M_m]$ is strictly monotonic in $m$ and can be shown to be concave using the decreasing rate of increase: the increments $\mathbb{E}[M_{m+1}] - \mathbb{E}[M_m]$ are non-increasing in $m$. To bound $\mathbb{E}[M_m]$ we consider another random walk that starts at state $\log m$ in a modified chain $M'$. The modified chain is identical to $M$ up to state $\log m$, but for all states beyond it the probability to move to the next state is $1/m$ and the probability to stay is $1 - 1/m$. So clearly $M'$ makes faster progress than $M$ from state $\log m$ onward. The expected progress of a walk of length $m$ on $M'$, starting from state $\log m$, is now easier to bound and can be shown to be constant. But since $M'$ starts at state $\log m$, we have $\mathbb{E}[M_m] \le \log m + O(1)$.

Next we bound the expected number of times that $x$ could be pushed down by a random push. Let $R$ be a random variable that denotes the number of requests for items with higher depth than $x$, from $x$’s last request until time $t$.

Lemma 7

The expected number of requests for items with higher depth than $x$, since $x$ was last requested, is bounded by $\mathbb{E}[R] \le 2k$.

Proof: We can divide $R$ into two types of requests: $R_1$, which counts the number of requests for items with rank higher than $k$, and $R_2$, which counts the number of requests for items with lower rank than $k$ (but with higher depth at the time of the request). Then $R = R_1 + R_2$. Clearly $R_1 \le k$, since every such request increases the rank of $x$ and this happens at most $k$ times (note that some of these requests may be for items with lower depth than $x$). $R_2$ is harder to analyze. How many requests are for items that are below $x$ in the tree (i.e., have higher depth than $x$) but also have lower rank than $x$? (Note that such requests do not increase $x$’s rank, but may increase its depth.) Let $y$ be an item with rank $j < k$; hence $y$ was more recently requested than $x$. Let $Z_j$ denote the number of requests for $y$ (since $x$ was last requested) in which $y$ had a higher depth than $x$. Then $R_2 = \sum_{j < k} Z_j$. We now claim that $\mathbb{E}[Z_j] \le 1$. Assume by contradiction that $\mathbb{E}[Z_j] > 1$. But then this implies that the expected depth of $y$ is larger than the expected depth of $x$, contradicting Lemma 4. Putting it all together:

(9)

We now have all we need to prove Theorem 3. The proof follows by showing that $\mathbb{E}[D_k] \le \log k + O(1)$. Let $D_k^r$ be a random variable that denotes the depth of $x$ conditioned on there having been $r$ requests for items with higher depth than $x$ since the last request for $x$. Note that by the law of total probability, we have $\mathbb{E}[D_k] = \sum_r \Pr[R = r] \cdot \mathbb{E}[D_k^r]$. Next we claim that $\mathbb{E}[D_k^r] \le \mathbb{E}[M_r]$. This is true since the transition probabilities (to increase the depth) in the Markov chain $M$ are at least as high as in the chain that describes $D_k^r$: when item $x$ is at some level $i$, only requests for items with higher depth can increase $x$’s depth, and the probability that a random walk to a depth larger than $i$ visits $x$ (and pushes it down) is exactly $2^{-i}$. Clearly we also have that $\mathbb{E}[M_r]$ is increasing in $r$. Consider $\mathbb{E}[M_R]$, a function of the random variable $R$. Recall that $\mathbb{E}[M_m]$ is concave in $m$; then by Jensen’s inequality [11] and Lemma 7 we get:

(10)

4.2 Push-Down Trees Are Dynamically Optimal

To conclude, upon access of an item, Push-Down Tree proceeds as follows, see also Algorithm 1 and Figure 2 for an illustration.

Figure 2: Illustration of the algorithmic framework: After finding the accessed item $x$, we promote it to the root. The former root item $y$ is then pushed (resp. shifted) down according to a push-down path, up to depth $k$. The item previously stored at the end of the path, $z$, is then moved to the former position of $x$.
  1. Access item $x$: Upon a request to $x$, access $x$ along the shortest path from the root.

  2. Move-to-root: Item $x$ is moved to the root, evicting the current item at the root, call it $y$.

  3. Push down items in a balanced manner: Execute Random-Push up to depth $k$, the former depth of $x$.

  4. Solve overflow: Move the last item on the push-down path, $z$, to $x$’s former node.

In summary, if $x$ was at depth $k$, then using Random-Push, we have an adjustment cost of $k$, in addition to the access cost of $k$. Note that the move of $z$ to $x$’s former node does not require any additional swaps, and all the participating nodes have already been accessed, so we assume that no additional cost is required. Even if we assume that this comes at a cost, it is bounded by $O(k)$ and does not change the main results. Taking the adjustment cost into account, with the help of Theorem 3, we get that Push-Down Tree is 12-competitive, which is our main result.
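For concreteness, a compact end-to-end sketch of the access routine, mirroring Algorithm 1 and the cost accounting of Definition 1 (array-based tree with heap indexing as in our earlier sketches; names and representation are our own assumptions):

```python
import random

def access(tree, pos, x):
    """Serve a request for item x; returns (access_cost, adjust_cost).

    `tree` is the array-based complete tree (heap indexing: children
    of index i are 2*i+1 and 2*i+2); `pos` is the global map from
    items to indices, which may be updated for free in this model."""
    i = pos[x]
    k = (i + 1).bit_length() - 1            # depth of x: access cost
    # Steps 1-2: access x and move it to the root (evicting item y).
    # Step 3: push the evicted items down a random path of length k.
    path = [0]
    for _ in range(k):
        path.append(2 * path[-1] + 1 + random.randrange(2))
    z = tree[path[-1]]                      # item at the end of the path
    for j in range(len(path) - 1, 0, -1):   # shift items one level down
        tree[path[j]] = tree[path[j - 1]]
    tree[0] = x
    if path[-1] != i:                       # step 4: solve the overflow
        tree[i] = z                         # z takes x's former node
    for idx, item in enumerate(tree):       # refresh the map (free)
        pos[item] = idx
    return k, k                             # access and adjustment cost
```

The returned pair matches Definition 1: $k$ for the access and $k$ for the swaps along the push-down path; if the random path happens to end at $x$’s freed node, the overflow step is vacuous.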

Theorem 1 (restated). Push-Down Tree is dynamically optimal on expectation.

Proof: According to Random-Push, if the $t$-th requested item has rank $k$, then the access cost is its depth, in expectation at most $\log k + O(1)$, and the push-down strategy requires the same number of additional swaps (we assume that the last step of moving the last item of the push-down path to the last requested node has constant cost; assuming a cost proportional to the depth would increase the competitive ratio but would not change the conceptual result). Hence the maintenance cost equals the access cost, so the expected total cost is two times the access cost on the MRU($c$) tree. Formally, using Theorem 2 and Theorem 3:

(11)
(12)

5 Application: Self-Adjusting Tree Network

We are particularly interested in distributed applications, and hence we will now interpret the Push-Down Tree as a communication network between a source and its communication partners. Whenever the source wants to communicate with a partner, it routes to it via the tree. Since nodes in the tree do not know the locations of other nodes, a routing algorithm is needed: nodes need to know whether to forward a given request to the left or the right child toward the destination.

Note that in a distributed setting, Random-Push is performed by iteratively shifting items of neighboring nodes along the downward path. We can observe that both the access and adjustment operations can be performed at cost $O(k)$: all operations are limited to the depth $k$ of the accessed node. To achieve this in a distributed setting, we propose the following. When the root receives a request for item $x$, it needs to route to $\mathrm{host}(x)$. For this purpose, each node in the tree needs to know, for each request, whether it needs to be forwarded to the left or the right child toward $x$. A simple way to achieve this in a distributed setting is to employ source routing: in principle, a message passed between nodes can include, for each node it passes, a bit indicating to which child to forward the message next. This simple solution requires $O(\log n)$ bits in the message header, where $n$ is the number of nodes, and can be used both for accessing the node and for the random push-down path.
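Under the array-based layout assumed in our earlier sketches, the source-routing bit string can be read directly off the binary representation of the destination index (an illustrative encoding, not prescribed by the paper):

```python
def routing_bits(dest):
    """Bits guiding a message from the root (index 0) to index `dest`:
    0 = forward to the left child, 1 = to the right child.

    With heap indexing (children of i at 2i+1, 2i+2), the path is the
    binary expansion of dest + 1 without its leading 1 bit."""
    bits = []
    while dest:
        bits.append((dest - 1) & 1)   # 0 if dest is a left child
        dest = (dest - 1) // 2        # climb to the parent
    return bits[::-1]                 # root-to-destination order

print(routing_bits(0))   # []      the root itself
print(routing_bits(1))   # [0]     left child
print(routing_bits(6))   # [1, 1]  right child, then right child
```

The header length is the destination’s depth, hence at most $\log n$ bits, as stated above.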

The source routing header can be built based on a dynamic global map of the tree that is maintained at the source node. The source node is a direct neighbor of the root of the tree and aware of all requests, and therefore it can maintain the map. Implementing, updating and accessing such a map is memory-based and can be done efficiently with negligible cost.

6 Conclusion

This paper presented a dynamically optimal datastructure, called Push-Down Tree, which is based on complete trees. Our algorithms so far are randomized, and the main open theoretical question concerns deterministic constructions. Push-Down Trees are also distributed and find interesting applications in emerging self-adjusting communication networks. In this regard, we understand our work as a first step, and the design of fully decentralized and self-adjusting communication networks constitutes our main research vision for the future.

References

  • [1] Susanne Albers. A competitive analysis of the list update problem with lookahead. Mathematical Foundations of Computer Science 1994, pages 199–210, 1994.
  • [2] Susanne Albers and Marek Karpinski. Randomized splay trees: theoretical and experimental results. Information Processing Letters, 81(4):213–221, 2002.
  • [3] Susanne Albers, Bernhard Von Stengel, and Ralph Werchner. A combined bit and timestamp algorithm for the list update problem. Information Processing Letters, 56(3):135–139, 1995.
  • [4] Susanne Albers and Jeffery Westbrook. Self-organizing data structures. In Online algorithms, pages 13–51. Springer, 1998.
  • [5] Chen Avin, Kaushik Mondal, and Stefan Schmid. Demand-aware network designs of bounded degree. In Proc. International Symposium on Distributed Computing (DISC), 2017.
  • [6] Avrim Blum, Shuchi Chawla, and Adam Kalai. Static optimality and dynamic search-optimality in lists and trees. In Proc. 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1–8, 2002.
  • [7] Prosenjit Bose, Karim Douïeb, Vida Dujmović, and Rolf Fagerberg. An O(log log n)-competitive binary search tree with optimal worst-case access times. In Scandinavian Workshop on Algorithm Theory, pages 38–49. Springer, 2010.
  • [8] Prosenjit Bose, Karim Douïeb, and Stefan Langerman. Dynamic optimality for skip lists and b-trees. In Proc. 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1106–1114, 2008.
  • [9] Parinya Chalermsook, Mayank Goswami, László Kozma, Kurt Mehlhorn, and Thatchaphol Saranurak. Pattern-avoiding access in binary search trees. In Proc. Foundations of Computer Science (FOCS), 2015 IEEE 56th Annual Symposium on, pages 410–423. IEEE, 2015.
  • [10] Parinya Chalermsook, Mayank Goswami, László Kozma, Kurt Mehlhorn, and Thatchaphol Saranurak. The landscape of bounds for binary search trees. arXiv preprint arXiv:1603.04892, 2016.
  • [11] T.M. Cover and J. Thomas. Elements of information theory. Wiley, 2006.
  • [12] Brian C. Dean and Zachary H. Jones. Exploring the duality between skip lists and binary search trees. In Proceedings of the 45th Annual Southeast Regional Conference, ACM-SE 45, pages 395–399, New York, NY, USA, 2007. ACM.
  • [13] Erik D. Demaine, Dion Harmon, John Iacono, and Mihai Patrascu. Dynamic optimality - almost. SIAM J. Comput., 37(1):240–251, 2007.
  • [14] Jonathan Derryberry, Daniel Dominic Sleator, and Chengwen Chris Wang. A lower bound framework for binary search trees with rotations. School of Computer Science, Carnegie Mellon University, 2005.
  • [15] Michael L. Fredman. Generalizing a theorem of Wilber on rotations in binary search trees to encompass unordered binary trees. Algorithmica, 62(3-4):863–878, 2012.
  • [16] John Iacono. Key-independent optimality. Algorithmica, 42(1):3–10, 2005.
  • [17] M. Ghobadi et al. ProjecToR: Agile reconfigurable data center interconnect. In Proc. ACM SIGCOMM, pages 216–229, 2016.
  • [18] Christian Scheideler and Stefan Schmid. A distributed and oblivious heap. In Proc. International Colloquium on Automata, Languages, and Programming (ICALP), pages 571–582, 2009.
  • [19] Stefan Schmid, Chen Avin, Christian Scheideler, Michael Borokhovich, Bernhard Haeupler, and Zvi Lotker. SplayNet: Towards locally self-adjusting networks. IEEE/ACM Transactions on Networking (ToN), 2016.
  • [20] Moshe Shaked and J George Shanthikumar. Stochastic orders. Springer Science & Business Media, 2007.
  • [21] Daniel D. Sleator and Robert E. Tarjan. Amortized efficiency of list update and paging rules. Commun. ACM, 28(2):202–208, February 1985.
  • [22] Daniel Dominic Sleator and Robert Endre Tarjan. Self-adjusting binary search trees. J. ACM, 32(3):652–686, July 1985.
  • [23] Boris Teia. A lower bound for randomized list update algorithms. Information Processing Letters, 47(1):5–9, 1993.
  • [24] Robert Wilber. Lower bounds for accessing binary search trees with rotations. SIAM Journal on Computing, 18(1):56–67, 1989.
  • [25] Gala Yadgar, Michael Factor, Kai Li, and Assaf Schuster. Management of multilevel, multiclient cache hierarchies with application hints. ACM Transactions on Computer Systems (TOCS), 29(2):5, 2011.

Appendix

Appendix A Optimal Fixed Trees

The key difference between binary search trees and binary trees is that the latter provide more flexibility in how items can be arranged in the tree. Accordingly, one may wonder whether this additional flexibility renders the optimal datastructure design problem algorithmically simpler or harder.

In this section, we consider the static problem variant, and investigate offline algorithms to compute optimal trees for a fixed frequency distribution over the items. To this end, we assume that for each item $x_i$, we are given a frequency $p_i$, where $\sum_i p_i = 1$.

Definition 7 (Optimal Fixed Tree)

We call a tree $T$ an optimal fixed tree if it minimizes the expected path length $\sum_i p_i \cdot \mathrm{depth}_T(x_i)$.

Our objective is to design an optimal fixed tree according to Definition 7. Now, let us define the following notion of the Most Frequently Used (MFU) tree, which keeps items of larger empirical frequencies closer to the root:

Definition 8 (MFU Tree)

A tree in which for every pair of items $x_i, x_j$ it holds that if $p_i > p_j$ then $\mathrm{depth}(x_i) \le \mathrm{depth}(x_j)$, is called an MFU tree.

Observe that MFU trees are not unique; rather, there are many MFU trees. In particular, the positions of items at the same depth can be exchanged arbitrarily without violating the MFU property.

Theorem 8 (Optimal Fixed Trees)

Any MFU tree is an optimal fixed tree.

Proof: Recall that by definition, MFU trees have the property that for all item pairs $x_i, x_j$: $p_i > p_j \Rightarrow \mathrm{depth}(x_i) \le \mathrm{depth}(x_j)$. For the sake of contradiction, assume that there is a tree which achieves the minimum expected path length but for which there exists at least one item pair $x_i, x_j$ which violates this property, i.e., it holds that $p_i > p_j$ but $\mathrm{depth}(x_i) > \mathrm{depth}(x_j)$. From this, we can derive a contradiction to the minimum expected path length: by swapping the positions of items $x_i$ and $x_j$, we obtain a tree with an expected path length which is shorter by $(p_i - p_j)(\mathrm{depth}(x_i) - \mathrm{depth}(x_j)) > 0$.

MFU trees can also be constructed very efficiently, e.g., by performing the following ordered insertion: we insert the items into the tree in a top-down, left-to-right manner, in descending order of their frequencies (i.e., item $x_i$ is inserted before item $x_j$ if $p_i \ge p_j$); a sketch follows below.
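A sketch of this ordered insertion under an array layout, where level order (top-down, left-to-right) is exactly ascending heap index (illustrative names, not from the paper):

```python
def build_mfu_tree(freq):
    """Place items in level order (top-down, left-to-right) in
    descending order of frequency; the result is an MFU tree."""
    return sorted(freq, key=freq.get, reverse=True)

def expected_path_length(tree, freq):
    # depth of heap index i is floor(log2(i + 1))
    return sum(freq[x] * ((i + 1).bit_length() - 1)
               for i, x in enumerate(tree))

freq = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}
tree = build_mfu_tree(freq)
print(tree, expected_path_length(tree, freq))
# ['a', 'b', 'c', 'd']  0.4*0 + 0.3*1 + 0.2*1 + 0.1*2 = 0.7
```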

Appendix B Adversary for Rotate-Push

Figure 3: Example for Rotate-Push not maintaining the working set on a CT.

We show a complete binary tree of height four in Figure 3, where certain nodes are labeled. The root is at level zero and the leaves are at level four. Let us assume that each node pushes to its left child first and then alternates every time. We construct a sequence of requests which prevents Rotate-Push from maintaining the working set. Let the requests come from levels 4, 1, 2, 1, 3, 1, 4, 1. Consider the first request, from level 4; according to the strategy, the corresponding push-down path can be traced in Figure 3, and the paths for the subsequent requests follow analogously. Now if the adversary continues to generate requests from these nodes, maintaining the above sequence, then Rotate-Push is unable to maintain the working set property, which would require keeping these nine requested elements within level three of the tree.

Interestingly, although Rotate-Push does not maintain the working set, we are unable to prove that it fails to guarantee a constant approximation of the optimal. This raises the following question: is it necessary to maintain the working set property to guarantee dynamic optimality on a complete binary tree? We are unable to answer this, but we believe that Rotate-Push maintains dynamic optimality on a complete binary tree. We leave it as a conjecture.

Appendix C Deferred Pseudo-Code

1:   route to $x$ along tree branches (cost: $k = \mathrm{depth}_t(x)$)
2:   let $y$ be the item at the current root $r$
3:   move $x$ to the root node $r$
4:   employ Random-Push to shift $y$ down, up to depth $k$ (cost: $k$)
5:   let $z$ be the item at the end of the push-down path, at depth $k$
6:   move $z$ to $x$’s former node (cost: constant, cf. Section 4.2)
Algorithm 1 Upon access to item $x$ in Push-Down Tree

Appendix D Deferred Proofs

D.1 Proof of Theorem 2

For simplicity, first assume that On is a (strict) MRU algorithm. We employ a potential function argument: our potential function is based on the difference in the items’ locations between On’s tree and Opt’s tree. From the definition of the MRU tree, $\mathrm{depth}_t(x) < \mathrm{depth}_t(y)$ implies $\mathrm{mru}(x) > \mathrm{mru}(y)$, i.e., $x$ has been accessed more recently. Accordingly, we define a pair of items $(x, y)$ as bad on Opt’s tree if $\mathrm{mru}(x) > \mathrm{mru}(y)$ but $\mathrm{depth}_t(x) > \mathrm{depth}_t(y)$, i.e., $x$ is at a lower level although it has been accessed more recently. Note that for an MRU algorithm, none of the pairs is bad, since it maintains a perfect MRU tree at all times. Hence bad pairs appear only in Opt’s tree. Hereafter in this proof, we use $\mathrm{depth}(x)$ resp. $\mathrm{mru}(x)$ to indicate $\mathrm{depth}_t(x)$ resp. $\mathrm{mru}_t(x)$ if not otherwise mentioned. For a given item $x$, let $\beta(x)$ be equal to one plus the number of bad pairs in which $x$ participates as the more recently accessed item.

We define the potential function $\Phi$ as a sum over all items of a term depending on $\beta$. We consider the occurrence of events in the following order. Upon a request, On adjusts its tree; then Opt performs the rearrangements it requires. Opt needs to access an item before it can swap it with one of its neighbors (e.g., to move it closer to the root). The access cost is equal to the depth of the item, each of the swaps costs one unit, and a swap is legal only between neighbors. There can be multiple swaps of multiple items by Opt between two accesses of On, after the first item is accessed.

Now consider the potential at time $t$ (i.e., before On’s adjustment for serving request $\sigma_t$ and Opt’s rearrangements between requests $\sigma_t$ and $\sigma_{t+1}$), denoted $\Phi_t$. Moreover, consider the potential after On adjusted its tree, $\Phi_t^{\mathrm{On}}$. Then the potential change due to On’s adjustment is

(13)

We assume that the initial potential is zero (i.e., no item was accessed yet). Since the potential is always non-negative by definition, we can use it to bound the amortized cost of On. Consider a request at time $t$ to an item at depth $k$ in the tree of On. The access cost is $k$, and we would like to bound the amortized cost of On by a constant times Opt’s cost. Assume that the requested item is at depth $k'$ in Opt’s tree, so Opt must pay at least an access cost of $k'$. First we consider the main case.

Let us compute the potential after On updated its MRU tree. For all items other than the accessed item, the value of $\beta$ is unchanged: only the access time of the last accessed item changed. The potential term of the accessed item will change, since its last access time changed to $t$. Now:

(14)
(15)

The second line follows by multiplying and dividing by the appropriate term; recall also the definition of $\beta$.

Now consider the overall change in potential.