Oblivious Storage with Low I/O Overhead
Abstract
We study oblivious storage (OS), a natural way to model privacy-preserving data outsourcing where a client, Alice, stores sensitive data at an honest-but-curious server, Bob. We show that Alice can hide both the content of her data and the pattern in which she accesses her data, with high probability, using a method that achieves O(1) amortized rounds of communication between her and Bob for each data access. We assume that Alice and Bob exchange small messages, of size O(N^{1/c}), for some constant c ≥ 1, in a single round, where N is the size of the data set that Alice is storing with Bob. We also assume that Alice has a private memory of size O(N^{1/c}). These assumptions model real-world cloud storage scenarios, where trade-offs occur between latency, bandwidth, and the size of the client's private memory.
1 Introduction
Outsourced data management is a large and growing industry. For example, as of July 2011, Amazon S3 [2] reportedly stores more than 400 billion objects, four times as many as the year before, and the Windows Azure service [15], which was started in late 2008, is now a multibillion-dollar enterprise.
With the growing impact of online cloud storage technologies, there is a corresponding growing interest in methods for privacy-preserving access to outsourced data. Namely, it is anticipated that many customers of cloud storage services will desire or require that their data remain private. A necessary component of private data access, of course, is to encrypt the objects being stored. But information can be leaked from the way that data is accessed, even if it is encrypted (see, e.g., [5]). Thus, privacy-preserving data access must involve both encryption and techniques for obfuscating the patterns in which users access data.
Oblivious RAM Simulation
One proposed approach to privacy-preserving data access involves oblivious random access machine (ORAM) simulation [8]. In this approach, the client, Alice, is modeled as a CPU with a limited-size cache that accesses a large indexed memory managed by the owner of the data service, Bob. The goal is for Alice to perform an arbitrary RAM computation while completely obscuring from Bob the data items she accesses and the access pattern. Unfortunately, although known ORAM simulations [1, 6, 9, 8, 10, 17, 14, 20, 22] can be adapted to the problem of privacy-preserving access to outsourced data, they do not naturally match the interfaces provided by existing cloud storage services, which are not organized according to the RAM model (e.g., see [4]).
Oblivious Storage
A notable exception to this aspect of previous work on ORAM simulation is a recent paper by Boneh et al. [4], who introduce the oblivious storage (OS) model. In this model, the storage provided by Bob is viewed more realistically as a collection of key-value pairs, and the query and update operations supported by his API are likewise more accurately viewed in terms of operations dealing with key-value pairs, which we also call items. An OS solution is oblivious in this context if an honest-but-curious polynomial-time adversary is unable to distinguish, beyond a negligible probability, between the (obfuscated) versions of two possible access sequences of equal length and maximum set size, which are polynomially related. Although the solution to the OS problem given by Boneh et al. is somewhat complicated, it is nevertheless considerably simpler than most of the existing ORAM solution techniques. In particular, it avoids additional details required of ORAM simulations that must deal with the obfuscation of an arbitrary RAM algorithm. Thus, an argument can be made that the OS approach is both more realistic and supports simpler oblivious simulations. The goal of this paper, then, is to explore further simplifications and improvements to achieve practical solutions to the oblivious storage problem.
1.1 Related Previous Work
Table 1: Comparison of the main performance measures (access overhead, both online and amortized; message size; client memory; and server storage) of our two methods with those of the methods of Shi et al. [19], Williams et al. [23], Goodrich et al. [10], and Boneh et al. [4].
Research on oblivious simulation of one computational model by another began with Pippenger and Fischer [18], who show that one can obliviously simulate a computation of length T of a one-tape Turing machine with a two-tape Turing machine computation of length O(T log T). That is, they show how to perform such an oblivious simulation with a computational overhead that is O(log T).
Goldreich and Ostrovsky [8] show that one can perform an oblivious RAM (ORAM) simulation using an outsourced data server, and they prove a lower bound implying that such simulations require an overhead of at least Ω(log n), for a RAM memory of size n, under some reasonable assumptions about the nature of such simulations. For the case where Alice has only a constant-size private memory, they show how Alice can easily achieve an overhead of O(√n log n), using a scheme called the "square-root solution," with O(n + √n) storage at Bob's server. With a more complicated scheme, they also show how Alice can achieve an overhead of O(log³ n) with O(n log n) storage at Bob's server, using a scheme called the "hierarchical solution."
Williams and Sion [22] provide an ORAM simulation for the case when the data owner, Alice, has a private memory of size O(√n). They achieve an expected amortized time overhead of O(log² n) using O(n log n) memory at the external data provider, Bob. Additionally, Williams et al. [23] claim a result that uses an O(√n)-sized private memory and achieves O(log n log log n) amortized time overhead with a linear-sized outsourced storage, but some researchers (e.g., see [17]) have raised concerns with the assumptions and analysis of this result. Likewise, Pinkas and Reinman [17] published an ORAM simulation result for the case where Alice maintains a constant-size private memory, claiming that Alice can achieve an expected amortized overhead of O(log² n) while using O(n) storage space, but Kushilevitz et al. [14] have raised correctness issues with this result as well. Goodrich and Mitzenmacher [9] show that one can achieve an overhead of O(log² n) in an ORAM simulation, with high probability, for a client with constant-sized local memory, and O(log n), for a client with O(n^ν) memory, for a constant ν > 0. Kushilevitz et al. [14] also show that one can achieve an overhead of O(log² n / log log n) in an ORAM simulation, with high probability, for a client with constant-sized local memory. Ajtai [1] proves that ORAM simulation can be done with polylogarithmic overhead without cryptographic assumptions about the existence of random hash functions, as are made in the papers mentioned above (and this paper), and a similar result is given by Damgård et al. [6].
The importance of privacy protection in outsourced data management naturally raises the question of the practicality of the previous ORAM solutions. Unfortunately, the above-mentioned theoretical results contain several complications and hidden constant factors that make these solutions less than ideal for real-world use. Stefanov et al. [20] study the ORAM simulation problem from a practical point of view, with the goal of reducing the worst-case bounds for data accesses. They show that one can achieve an amortized overhead of O(log n) and worst-case performance of O(√n), with εn storage on the client, for a constant ε > 0, and an amortized overhead of O(log² n) and similar worst-case performance, with a client-side storage of O(√n), both of which have smaller hidden constant factors than previous ORAM solutions. Goodrich et al. [10] similarly study methods for improving the worst-case performance of ORAM simulation, showing that one can achieve a worst-case overhead of O(log n) with a client-side memory of size O(n^ν), for any constant ν > 0.
As mentioned above, Boneh et al. [4] introduce the oblivious storage (OS) problem and argue how it is more realistic and natural than the ORAM simulation problem. They study methods that separate access overheads from the overheads needed for rebuilding the data structures on the server, providing, for example, O(log n) amortized overhead for accesses with O(√n) amortized overhead for rebuilding operations, assuming a similar O(√n) bound for the size of the private memory on the client.
1.2 Our Results
In this paper, we study the oblivious storage (OS) problem, providing solutions that are parameterized by the two critical components of an outsourced storage system:

N: the number of items that are stored at the server.

M: the maximum number of items that can be sent or received in a single message, which we refer to as the message size.
We assume that the objects being outsourced to Bob's cloud storage are all of the same size, since this is a requirement to achieve oblivious access. Thus, we can simply refer to the memory and message sizes in terms of the number of items that are stored. This notation is borrowed from the literature on external-memory algorithms (e.g., see [21]), since it closely models the scenario where the memory needed by a computation exceeds its local capacity so that external storage is needed. In keeping with this analogy to external-memory algorithms, we refer to each message that is exchanged between Alice and Bob as an I/O, each of which, as noted above, is of size at most M. We additionally assume that Alice's memory is of size at least 2M, so that she can hold two messages in her local memory. In our case, however, we additionally assume that M ≥ N^{1/c}, for some constant c ≥ 1. This assumption is made for the sake of realism, since even with c = 3, we can model Bob storing exabytes for Alice, while she and he exchange individual messages measured in megabytes. Thus, we analyze our solutions in terms of the constant c.
We give practical solutions to the oblivious storage problem that achieve an efficient amortized number of I/Os exchanged between Alice and Bob in order to perform put and get operations.
We first present a simple "square-root" solution, which assumes that M is Θ(√N), so c = 2. This solution is not oblivious, however, if the client requests items that are not in the set. So we show how to convert any oblivious storage solution that cannot tolerate requests for missing items into a solution that can also support such requests obliviously. With these tools in hand, we then show how to define an inductive solution to the oblivious storage problem that achieves a constant amortized number of I/Os for each access, assuming M is Θ(N^{1/c}). We believe that c = 2, 3, and 4 are reasonable choices in practice, depending on the relative sizes of N and M.
The operations in these solutions are factored into access operations and rebuild operations, as in the approach advocated by Boneh et al. [4]. Access operations simply read or write individual items to/from Bob's storage and are needed to retrieve the requested item, whereas rebuild operations may additionally restructure the contents of Bob's storage so as to mask Alice's access patterns. In our solutions, access operations use messages of size O(1), while messages of size up to M are used only for rebuild operations.
An important ingredient in all oblivious storage and oblivious RAM solutions is a method to obliviously "shuffle" a set of elements so that Bob cannot correlate the location of an element before the shuffle with that after the shuffle. This is usually done by using an oblivious sorting algorithm, and our methods can utilize such an approach, such as the external-memory oblivious sorting algorithm of Goodrich and Mitzenmacher [9].
In this paper, we also introduce a new simple shuffling method, which we call the buffer shuffle. We show that this method can shuffle with high probability with very little information leakage, which is likely to be sufficient in practice in most real-world oblivious storage scenarios. Of course, if perfectly oblivious shuffling is desired, then this shuffle method can be replaced by external-memory sorting, which increases the I/O complexity of our results by at most a constant factor (which depends on c).
In Table 1, we summarize our results and compare the main performance measures of our solutions with those of selected previous methods that claim to be practical.
1.3 Organization of the Paper
The rest of this paper is organized as follows. In Section 2, we overview the oblivious storage model and its security properties and describe some basic techniques used in previous work. Our buffer shuffle method is presented and analyzed in Section 3. We give a preliminary miss-intolerant square-root solution in Section 4. Section 5 derives a miss-tolerant solution from our square-root solution using a cuckoo hashing scheme. In Section 6, we show how to reduce the storage requirement at the client. Finally, in Section 7, we describe our experimental results and provide estimates of the actual time overhead and monetary cost of our method, obtained by a prototype implementation and simulation of the use of our solution on the Amazon S3 storage service.
2 The Oblivious Storage Model
In this section, we discuss the OS model using the formalism of Boneh et al. [4], albeit with some minor modifications. As mentioned above, one of the main differences between the OS model and the classic ORAM model is that the storage unit in the OS model is an item consisting of a key-value pair. Thus, we measure the size of messages and of the storage space at the client and server in terms of the number of items.
2.1 Operations and Messages
Let S be the set of data items. The server supports the following operations on S.

get(k): if S contains an item, (k, v), with key k, then return the value, v, of this item; else return null.

put(k, v): if S contains an item, (k, v'), with key k, then replace the value of this item with v; else add to S a new item (k, v).

remove(k): if S contains an item, (k, v), with key k, then delete this item from S and return its value, v; else return null.

getRange(k1, k2, m): return the first m items (by key order) in S with keys in the range [k1, k2]. Parameter m is a cutoff to avoid data overload at the client because of an error. If there are fewer than m such items, then all the items with keys in the range are returned.

removeRange(k1, k2): remove from S all items with keys in the range [k1, k2].
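For concreteness, the server-side API above can be sketched as a small in-memory model (the class and method names here are our own illustration, not part of any real cloud storage API):

```python
import bisect

class OSServer:
    """Minimal in-memory model of the server's key-value API (illustrative only)."""

    def __init__(self):
        self._keys = []   # keys, kept in sorted order
        self._store = {}  # key -> value

    def get(self, k):
        return self._store.get(k)

    def put(self, k, v):
        if k not in self._store:
            bisect.insort(self._keys, k)
        self._store[k] = v

    def remove(self, k):
        if k not in self._store:
            return None
        self._keys.remove(k)
        return self._store.pop(k)

    def get_range(self, k1, k2, m):
        # first m items (in key order) with keys in [k1, k2]
        i = bisect.bisect_left(self._keys, k1)
        out = []
        while i < len(self._keys) and self._keys[i] <= k2 and len(out) < m:
            out.append((self._keys[i], self._store[self._keys[i]]))
            i += 1
        return out

    def remove_range(self, k1, k2):
        for k, _ in self.get_range(k1, k2, len(self._keys)):
            self.remove(k)
```

Keeping the keys sorted is what makes the range operations, and hence the key-ordered passes used later for shuffling, cheap for the server.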
The interactions between the client, Alice, and the server, Bob, are implemented with messages, each of which is of size at most M, i.e., it contains at most M items. Thus, Alice can send Bob a single message consisting of M put operations, each of which adds a single item. Such a message would count as a single I/O. Likewise, the response to a getRange operation requires ⌈m/M⌉ I/Os; hence, Alice may wish to limit m to be O(M). Certainly, Alice would want to limit m to be at most the size of her private memory in most cases, since she would otherwise be unable to locally store the entire result of such a query if it reaches its cutoff size.
As mentioned above, our use of parameter M is done for the sake of practicality, since it is unreasonable to assume that Alice and Bob can only communicate via constant-sized messages. Indeed, with network connections measured in gigabits per second but with latencies measured in milliseconds, the number of rounds of communication is likely to be the bottleneck, not bandwidth. Thus, because of this orders-of-magnitude difference between bandwidth and latency, we assume that
M ≥ N^{1/c}, for some fixed constant c ≥ 1, but that Alice's memory is smaller than N. Equivalently, we assume that log N / log M is a constant. For instance, as highlighted above, if Bob's memory is measured in exabytes and we take c = 3, then we are reasonably assuming that Alice and Bob can exchange messages whose sizes are measured in megabytes. To assume otherwise would be akin to trying to manage a large reservoir with a pipe the size of a drinking straw.
We additionally assume that Alice has a private memory in which she can perform computations that are hidden from the server, Bob. To motivate the need for Alice outsourcing her data, while also allowing her to communicate effectively with Bob, we assume that this memory has size at least 2M but less than N.
2.2 Basic Techniques
Our solution employs several standard techniques previously introduced in the oblivious RAM and oblivious storage literature. To prevent Bob from learning the original keys and values, and to make it hard for Bob to associate subsequent accesses to the same item, Alice replaces the original key, k, of an item with a new key h(k||r), where h is a cryptographic hash function (i.e., one-way and collision-resistant) and r is a secret, randomly-generated nonce that is periodically changed by Alice so that a subsequent access to the same item uses a different key. Note that Bob learns the modified keys of the items. However, he cannot derive from them the original keys, due to the one-way property of the cryptographic hash function used. Also, the uniqueness of the new keys occurs with overwhelming probability due to collision resistance.
Likewise, before storing an item's value, v, with Bob, Alice encrypts v using a probabilistic encryption scheme. E.g., the ciphertext is computed as E(v||s), where E is a deterministic encryption algorithm and s is a random nonce that gets discarded after decryption. Thus, a different ciphertext for v is generated each time the item is stored with Bob. As a consequence, Bob cannot determine whether v was modified and cannot track an item by its value. The above obfuscation capabilities are intended to make it difficult for Bob to correlate the items stored in his memory at different times and locations, as well as to make it difficult for Bob to determine the contents of any value.
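The key-replacement and probabilistic-encryption steps above can be sketched as follows (a toy illustration only: SHA-256 stands in for the hash h, and an XOR pad derived from a secret and a fresh nonce stands in for a real cipher; this is not secure encryption):

```python
import hashlib
import os

def obfuscated_key(k: bytes, r: bytes) -> bytes:
    # substitute key h(k || r); the nonce r is changed at each rebuild,
    # so a later access to the same item uses a different server-side key
    return hashlib.sha256(k + r).digest()

def prob_encrypt(v: bytes, secret: bytes) -> bytes:
    # probabilistic encryption: a fresh nonce s makes repeated ciphertexts differ
    s = os.urandom(16)
    pad = hashlib.sha256(secret + s).digest()
    ct = bytes(a ^ b for a, b in zip(v.ljust(32, b"\0"), pad))
    return s + ct  # the nonce travels with the ciphertext

def prob_decrypt(blob: bytes, secret: bytes) -> bytes:
    s, ct = blob[:16], blob[16:]
    pad = hashlib.sha256(secret + s).digest()
    return bytes(a ^ b for a, b in zip(ct, pad)).rstrip(b"\0")
```

Note that encrypting the same value twice yields different ciphertexts, which is exactly what prevents Bob from tracking an item by its value.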
We distinguish two types of OS solutions. We say that an oblivious storage solution is miss-intolerant if it does not allow for get requests that return null. Thus, Alice must know in advance that Bob holds an item with the given key. In applications that by design avoid requests for missing items, this restriction allows us to design an efficient oblivious-storage solution, since we don't have to worry about any information leakage that comes from queries for missing keys. Alternatively, if an oblivious storage solution is oblivious even when accesses can be made to keys that are not in the set, then we say that the solution is miss-tolerant.
2.3 Security Properties
Our OS solution is designed to satisfy the following security properties, where the adversary refers to Bob (the server) or a third party that eavesdrops on the communication between Alice (the client) and Bob. The adversary is assumed to have polynomially bounded computational power.
 Confidentiality.

Except with negligible probability, the adversary should be unable to determine the contents (key or value) of any item stored at the server. This property is assured by the techniques described in the previous subsection.
 Hardness of Correlation.

Except with negligible probability, or with probability only very slightly above 1/2, the adversary should be unable to distinguish between any two possible access sequences of equal length and maximum set size. That is, consider two possible access sequences, σ1 and σ2, that consist of m operations, get, put, and remove, that could be made by Alice, on a set of size up to n, where m is polynomial in n. Then an oblivious storage (OS) solution has correlation hardness if it applies an obfuscating transformation so that, after seeing the sequence of I/Os performed by such a transformation, the probability that Bob can correctly guess whether Alice has performed (the transformed version of) σ1 or σ2 exceeds 1/2 by at most O(1/n^d) or a negligible amount, depending on the degree of obfuscation desired, where d ≥ 1 is a constant. (We assume m ≥ n in this case.)
Note that n is used in the definition of "correlation hardness" both in the upper bound on the size of Alice's set and in the probability of Bob correctly distinguishing between two of her possible access sequences. Of course, the efficiency of an OS solution should also be measured in terms of n.
3 The Buffer Shuffle Method
One of the key techniques in our solutions is the use of oblivious shuffling. The input to any shuffle operation is a set, X, of n items. Because of the inclusion of the getRange operation in the server's API, we can view the items in X as being ordered by their keys. Moreover, this functionality also allows us to access a contiguous run of such items, starting from a given key. The output of a shuffle is a reordering of the items in X with replacement keys, so that all permutations are equally likely. During a shuffle, the server, Bob, can observe Alice read (and remove) groups of M of the items he is storing for her, and then write back M more items, which provides some degree of obfuscation of how the items in these read and write groups are correlated. An additional desire for the output of a shuffle is that, for any item x in the input, the adversary should be able to correlate x with any item in the output only with probability that is very close to 1/n (which is what he would get from a random guess).
During such a shuffle, we assume that Alice is wrapping each of her key-value pairs, (k, v), as (k', (k, v)), where k' is the new key that is chosen to obfuscate k. Indeed, it is likely that in each round of communication that Alice makes, she will take a wrapped (input) pair, (k'1, (k, v)), and map it to a new (output) pair, (k'2, (k, v)), where the second component is assumed to be a re-encryption of (k, v). The challenge is to define an encoding strategy for the k'1 and k'2 wrapper keys so that it is difficult for the adversary to correlate inputs and outputs.
3.1 Theoretical Choice: Oblivious Sorting
One way to do this is to assign each item a random key, k', from a very large universe, which is separate and distinct from the key that is a part of this key-value pair, and obliviously sort [9] the items by these keys. That is, we can wrap each key-value pair, (k, v), as (k', (k, v)), where k' is the new random key, and then wrap these wrapped pairs in a way that allows us to implement an oblivious sorting algorithm in the OS model based on comparisons involving the k' keys. Specifically, during this sorting process, we would further wrap each wrapped item, (k', (k, v)), as (a, (k', (k, v))), where a is an address or index used in the oblivious sorting algorithm. So as to distinguish such keys even further, Alice can also add a prefix to each such address a, such as "Addr:a" or "Addr:i:a", where i is a counter (which could, for instance, be counting the steps in Alice's sorting algorithm). Using such addresses as "keys" allows Alice to consider Bob's storage as if it were an array or the memory of a RAM. She can then use this scheme to simulate an oblivious sorting algorithm.
If the randomly assigned keys are distinct, which will occur with very high probability, then this achieves the desired goal. And even if the new keys are not distinct, we can repeat this operation until we get a set of distinct new keys without revealing any datadependent information to the server.
From a theoretical perspective, it is hard to beat this solution. It is well-known, for instance, that shuffling by sorting items via randomly-assigned keys generates a random permutation such that all permutations are equally likely (e.g., see [13]). In addition, since the means to go from the input to the output is data-oblivious with respect to the I/Os (simulated using the address keys), the server who is watching the inputs and outputs cannot correlate any set of values. That is, independent of the set of I/Os, any input item, x, at the beginning of the sort can be mapped to any output item, y, at the end. Thus, for any item x in the input, the adversary can correlate x with any item in the output with probability exactly 1/n. Finally, we can use the external-memory deterministic oblivious-sorting algorithm of Goodrich and Mitzenmacher [9], for instance, so as to use messages of size M, which will result in an algorithm that sorts in O(n/M) I/Os. That is, such a sorting algorithm uses a constant amortized number of I/Os for each group of M items.
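As a quick sanity check of the shuffle-by-random-keys idea, the following sketch draws a fresh random key for every item (redrawing on the unlikely collision), sorts by those keys, and empirically confirms that all orderings occur roughly equally often:

```python
import random
from collections import Counter

def shuffle_by_random_keys(items, universe=2**64):
    """Permute items by sorting them under independently drawn random keys.
    If two keys collide (vanishingly rare for a large universe), redraw."""
    while True:
        keyed = [(random.randrange(universe), it) for it in items]
        if len({k for k, _ in keyed}) == len(keyed):
            keyed.sort()                      # sort by the random keys
            return [it for _, it in keyed]

# all 3! = 6 orderings of three items should appear roughly equally often
counts = Counter(tuple(shuffle_by_random_keys([0, 1, 2])) for _ in range(6000))
```

With distinct keys, each of the n! orderings of the keys is equally likely, which is exactly the uniform-permutation property cited above.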
But using an oblivious sorting algorithm requires a fairly costly overhead, as the constant factors and details of this algorithm are somewhat nontrivial. Thus, it would be nice in applications that don’t necessarily require perfectly oblivious shuffling to have a simple substitute that could be fast and effective in practice.
3.2 The Buffer Shuffle Algorithm
So, ideally, we would like a different oblivious shuffle algorithm, whose goal is still to obliviously permute the collection, X, of values, but with a simpler algorithm. The buffer-shuffle algorithm is such an alternative:

Perform a scan of X, M items at a time. With each step, we read in M wrapped items from X, each of the form (k', (k, v)), and randomly permute them in Alice's local memory.

We then generate a new random key, k'', for each such wrapped item, (k', (k, v)), in this group, and we output all of those new key-value pairs back to the server.

We then repeat this operation with the next M items, and so on, until Alice has made a pass through all the items in X.
Call this a single pass. After such a pass, we can view the new keys as being sorted at the server (as observed above, by the properties of the OS model). Thus, we can perform another pass over these new key-value pairs, generating an even newer set of wrapped key-value pairs. (This functionality is supported by range queries, for example, so there is little overhead for the client in implementing each such pass.) Finally, we repeat this process a constant number of times, which is established in our analysis below. This is the buffer-shuffle algorithm.
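The pass structure above can be sketched compactly as follows (a simulation of Alice's role; wrapper-key collisions are simply ignored here, as they are vanishingly rare for a 64-bit key universe):

```python
import random

def buffer_shuffle(store, M, passes=4, universe=2**64):
    """store maps wrapper key -> wrapped item. Each pass reads M items at a
    time in the server's key order, permutes them in Alice's local memory,
    and writes them back under fresh random wrapper keys; after the pass,
    the server again views the items as sorted by the new keys."""
    for _ in range(passes):
        new_store = {}
        items = [store[k] for k in sorted(store)]  # server-side key order
        for i in range(0, len(items), M):
            group = items[i:i + M]
            random.shuffle(group)                  # local permutation
            for it in group:
                new_store[random.randrange(universe)] = it
        store = new_store
    return store
```

Each pass costs O(n/M) I/Os in the OS model, since the groups are fetched with range queries of size M.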
3.3 BufferShuffle Analysis
To analyze the buffer-shuffle algorithm, we first focus on the following goal: we show, with probability 1 − o(1), that after four passes one cannot guess the location of an initial key-value pair with probability greater than (1 + o(1))/n, assuming M ≥ √n, where n is the number of items being permuted. After we prove this, we discuss how the proof extends to obtain improved probabilities of success and tighter bounds on the probability of tracking, so that they are closer to 1/n, as well as how to extend to cases where M = n^{1/c} for integers c > 2.
We think of the keys at the beginning of each pass as being in key-sorted order, in groups of size M. Let q_j(t) be the perceived probability that after t passes the key we are tracking is in group j, according to the view of the tracker, Bob. Note that Bob can see, for each group on each pass, the set of keys that correspond to that group at the beginning and end of the pass, and use that to compute values corresponding to their perceived probabilities. Without loss of generality, we consider tracking the first key, so q_1(0) = 1.
Our goal will be to show that q_j(t) = (1 ± o(1)) M/n, for t = 3 and for all j, conditioned on some events regarding the random assignment of keys at each pass. The events we condition on will hold with probability 1 − o(1). This yields that the key being tracked appears to a tracker to be (up to lower-order terms) in a group chosen uniformly at random. As the key values in each group are randomized at the next pass, this will leave the tracker with a probability of only (1 + o(1))/n of guessing the item, again assuming the bad events do not occur.
Let k_{i,j}(t) be the number of keys that go from group i to group j in pass t. One can quickly check that max_{i,j} k_{i,j}(t) is O(log n) with probability near 1. Indeed, the probability that k_{i,j}(t) ≥ β is bounded above by (M choose β)(M/n)^β ≤ (eM²/(βn))^β.
We have the recurrence q_j(t+1) = Σ_i q_i(t) · k_{i,j}(t+1)/M.
The explanation for this recurrence is straightforward. The probability that the key being tracked is in the j-th group after pass t+1 is the sum, over all groups i, of the probability that the key was in group i, given by q_i(t), times the probability that the corresponding new key was mapped to group j, which is k_{i,j}(t+1)/M.
Our goal now is to show, over successive passes, that as long as the k_{i,j}(t) values behave nicely, the q_j(t) will quickly converge to roughly M/n. We sketch an argument that this holds with probability 1 − o(1), and then comment on how the o(1) term can be reduced to any inverse-polynomial probability in a constant number of passes. Our main approach is to note that bounding the k_{i,j}(t) corresponds to a type of balls-and-bins problem, in which case negative dependence can be applied to get a suitable concentration result via a basic Chernoff bound.
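A small simulation of the recurrence illustrates how quickly the perceived probabilities flatten (a toy model, assuming each key of a group lands in a uniformly random group in every pass; parameters are illustrative):

```python
import random

def track_probabilities(n, M, passes, seed=1):
    """Evolve Bob's perceived probabilities q_j that the tracked key lies in
    group j, via q_j' = sum_i q_i * k_ij / M, where k_ij counts the keys
    that move from group i to group j in a pass."""
    random.seed(seed)
    g = n // M                   # number of groups
    q = [1.0] + [0.0] * (g - 1)  # the tracked key starts in group 1
    for _ in range(passes):
        k = [[0] * g for _ in range(g)]
        for i in range(g):
            for _ in range(M):
                k[i][random.randrange(g)] += 1  # key of group i lands in a random group
        q = [sum(q[i] * k[i][j] / M for i in range(g)) for j in range(g)]
    return q

q = track_probabilities(n=1024, M=32, passes=4)
```

After four passes, the distribution is far flatter than the initial point mass on group 1, matching the convergence toward M/n per group argued below.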
Theorem 1
When M ≥ √n, after four passes, Bob cannot guess the location of an initial key-value pair with probability greater than (1 + o(1))/n.
Proof: We consider passes in succession.

Pass 1: It is easy to check that, with probability 1 − o(1) (using just union bounds and the binomial distribution to bound the number of keys from group 1 that land in every other group), there are at most O(M) groups for which k_{1,j}(1) ≥ 2 and 0 groups for which k_{1,j}(1) ≥ log n. There are therefore Θ(M) groups for which q_j(1) ≥ 1/M and O(M) groups for which q_j(1) ≥ 2/M.

Pass 2: Our interpretation here (and going forward) is that each key in group i after pass t has a "weight" q_i(t)/M that it gives to the group it lands in in pass t+1; the sum of the weights in a group then yields q_j(t+1).
With this interpretation, with probability 1 − o(1), there are Θ(M) keys at the end of pass 1 with positive weight (of either 1/M or O((log n)/M)). These keys are re-randomized, so at the end of pass 2, the number of keys with positive weight in a given bucket is expected to be constant, and again simple binomial and union bounds imply that the maximum number of keys with positive weight in any bucket is at most O(log n) with probability 1 − o(1). Indeed, one can further show at the end of pass 2 that the number of groups with q_j(2) > 0 must be at least βn/M, for a constant β > 0, with probability 1 − o(1); this follows from the fact that, for example, if X_j is the 0/1 random variable that represents whether group j received at least one weighted key, then E[Σ_j X_j] = Θ(n/M), and the X_j are negatively associated, so Chernoff bounds apply. (See, for example, Chapter 3 of [7].)

Pass 3: Conditioned on the events from the first two passes, at the end of the second pass there are Θ(n) keys with positive weight going into pass 3, and the possible weight values for each key are bounded above by b(log n)/n for some constant b. The expected weight for each group after pass 3 is obviously M/n. The weights of the keys within a group are negatively associated, so we can apply a Chernoff bound to the weight associated with each group, noting that to apply the Chernoff bound we should rescale so that the range of the weights is [0, 1]. Consider the first group, and let w_i be the weight of the i-th key in the first group (scaled by multiplying the weight by n/(b log n)). Let W = Σ_i w_i. Then Pr[|W − E[W]| ≥ δE[W]] ≤ 2 exp(−δ²E[W]/3).
Or, rescaling back, the weight in the first group is within a 1 ± o(1) factor of M/n with high probability, and a union bound suffices to show that the same holds for all groups.

Pass 4: After pass 3, with probability 1 − o(1), each key has weight (1 ± o(1))/n, and so after randomizing, assuming the events of probability 1 − o(1) all hold, the probability that any key is the original one being tracked, from Bob's point of view, is (1 + o(1))/n.
Extending the argument
We remark that the failure probability can be reduced to any inverse polynomial by a combination of choosing the constants in the analysis to be sufficiently high and/or repeating passes a (constant) number of times to reduce the probability of the bad events. (For example, if pass 1 fails with probability p, repeating it t times reduces the failure probability to p^t; the failure probabilities are all inverse polynomial in n in the proof above.)
Similarly, one can bring the probability that any key is the tracked key to within 1/n^d of 1/n, for any constant d, by increasing the number of passes further, but still keeping the number of passes constant. Specifically, note that we have shown that after the first four passes, with high probability, the weight of each key is bounded between (1 − ε)/n and (1 + ε)/n for some ε = o(1), and the total key weight is 1. We recenter the weights around 1/n and multiply their deviations from 1/n by 1/ε; now the new reweighted weights sum to 1. We can now reapply the argument above; after four more passes, we know that the reweighted weights for each key will again be between (1 − ε)/n and (1 + ε)/n. Undoing the rescaling, this means the weights for the keys are now bounded between (1 − ε²)/n and (1 + ε²)/n, and we can continue in this fashion to obtain the desired closeness to 1/n.
Finally, we note that the assumption that we can read in √n key-value pairs and assign them new random key values can be reduced to n^{1/c} pairs for any integer c > 2. We sketch the proof. Each step, as in the original proof, holds with probability 1 − o(1).
In this case we have n^{1−1/c} groups. In the first pass, the weight from the shuffling is spread to n^{1/c} key-value pairs, following the same reasoning as for pass 1 above. Indeed, we can continue this argument; in the next pass, the weight will spread to roughly n^{2/c} key-value pairs, and so on, until after c − 1 passes there are Θ(n^{1−1/c}) keys with nonzero weight with high probability, with one small modification in the analysis: at each pass, we can ensure that each group has fewer than log n weighted keys with high probability.
Then, following the same argument as in pass 2 above, one can show that after the following pass Θ(n) keys have nonzero weight, and the maximum weight is bounded above by b(log n)^c/n for some constant b. Applying the Chernoff-bound argument for pass 3 above to the next pass, we find that the weight within each of the groups is (1 ± o(1))M/n after this pass, and again this suffices, by the recurrence, to show that at most one more pass is necessary for each key-value pair to have weight (1 ± o(1))/n.
4 A Square-Root Solution
As is common practice in ORAM simulation papers, starting with the work of Goldreich and Ostrovsky [8], before we give our more sophisticated solutions to the oblivious storage problem, we first give a simple square-root solution. Our general solution is an inductive extension of this solution, so the square-root solution also serves as the basis for this induction.
In this square-root solution, we assume that Alice has a local memory of size at least proportional to √n, and that she and Bob can exchange a message of size up to at least this amount in a single I/O. In addition, we assume that this solution provides an API for performing oblivious dictionary operations where every get or put operation is guaranteed to be for a key that is contained in the set that Alice is outsourcing to Bob. That is, we give a miss-intolerant solution to the oblivious storage problem.
Our solution is based on the observation that we can view Alice's internal memory as a miss-tolerant solution to the OS problem. That is, Alice can store items in her private memory in some dictionary data structure, and each time she queries her memory for a key she can determine whether it is present without leaking any data-dependent information to Bob.
4.1 The Construction
Let us assume we have a miss-tolerant dictionary that provides a solution to the OS problem for sets up to a given size, with a bounded amortized number of I/Os of bounded size per access. Certainly, a dictionary stored in Alice's internal memory suffices for this purpose (and it, in fact, doesn't even need any I/Os per access) for the case when the set size is at most the size of Alice's internal memory.
The memory organization of our solution, which we describe here, consists of two caches:

A cache, , which is of size and is implemented using an instance of a solution.

A larger cache, which is stored as a dictionary of key–value pairs using Bob's storage.
The extra space in Bob's dictionary is for storing “dummy” items, which have keys indexed from a range that is outside of the universe used for the real keys. The set stored by Bob consists of the items from the original set, plus items with these dummy keys (along with null values), minus any items currently in Alice's cache. Initially, Alice's cache is empty and Bob stores the entire set plus the items with dummy keys. For the sake of obliviousness, each item in the set is mapped to a substitute key by a pseudorandom hash function keyed with a nonce, where the nonce is a random number chosen at the time Alice asks Bob to build (or rebuild) his dictionary. In addition, each value is encrypted with a secret key known only to Alice. Thus, each item is stored by Bob as a masked, encrypted key–value pair.
To perform an access, either for a get or a put, Alice first performs a lookup for the key in her cache, using its technology for achieving obliviousness. If she does not find an item with that key (as she won't initially), then she requests the item from Bob by issuing a get request to him. Note that, since the key is in the outsourced set and it is not in Alice's cache, it must be a key in Bob's dictionary, by the fact that we are constructing a miss-intolerant OS solution. Thus, there will be an item returned from this request. From this returned item, Alice decrypts the value and stores the item in her cache, possibly changing the value if she is performing a put operation. Then she asks Bob to remove the item with that key from his dictionary.
If, on the other hand, in performing an access for a key, Alice finds a matching item in her cache, then she uses that item and issues a dummy request to Bob by asking him to perform a get operation on the next dummy key, indexed by a counter she keeps in her local memory. In this case, she inserts this dummy item into her cache and asks Bob to remove the item with the dummy key from his dictionary. Therefore, from Bob's perspective, Alice is always requesting a random key for an item in his dictionary and then immediately removing that item. Indeed, her behavior is always that of doing a get from her cache, a get from Bob's dictionary, a remove from Bob's dictionary, and then a put in her cache.
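To make the access protocol concrete, the following is a minimal sketch in Python. It is our own illustrative code, not the paper's construction: the server is modeled as a local dict, Python's `hash` with a per-epoch nonce stands in for the pseudorandom function, encryption is omitted, and all names (`SqrtOS`, `access`, and so on) are hypothetical.

```python
import secrets

class SqrtOS:
    """Illustrative sketch of the square-root access protocol.

    Simplifications (assumptions, not the paper's actual construction):
    encryption is omitted, the server is a local dict, and hash() with a
    per-epoch nonce stands in for the pseudorandom function.
    """

    def __init__(self, items, capacity):
        self.capacity = capacity              # number of dummy items
        self.nonce = secrets.token_bytes(8)   # chosen fresh at each (re)build
        self.cache = {}                       # Alice's private cache
        self.dummy_ctr = 0                    # counter for the next dummy key
        # Bob stores the real items plus `capacity` dummies, under masked keys.
        self.server = {self._mask(k): v for k, v in items.items()}
        for i in range(capacity):
            self.server[self._mask(("dummy", i))] = None

    def _mask(self, key):
        # Stand-in for the keyed PRF; NOT cryptographically secure.
        return hash((self.nonce, key))

    def access(self, key, new_value=None):
        if key in self.cache:
            # Cache hit: issue an indistinguishable dummy get + remove.
            dummy_key = ("dummy", self.dummy_ctr)
            self.dummy_ctr += 1
            self.cache[dummy_key] = self.server.pop(self._mask(dummy_key))
        else:
            # Cache miss: fetch the real item and remove it from the server.
            self.cache[key] = self.server.pop(self._mask(key))
        if new_value is not None:           # a put operation
            self.cache[key] = new_value
        return self.cache[key]
```

Note that every access, hit or miss, performs exactly one get and one remove against the server under a fresh random-looking key, matching the uniform access pattern argued for above.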
After Alice has performed enough accesses to fill her cache to capacity, she pauses her performance of accesses and enters a rebuilding phase. In this phase, she rebuilds a new version of the dictionary that is being maintained by Bob.
The new set to be maintained by Bob is the current one unioned with the items in Alice's cache (including the dummy items). So Alice resets her dummy counter back to its initial value. She then performs an oblivious shuffle of this set. This oblivious shuffle is performed either with an external-memory sorting algorithm [9] or with the buffer-shuffle method described above, depending, respectively, on whether Alice desires perfect obscurity or can tolerate a small amount of information leakage, as quantified above. Finally, after this random shuffle completes, Alice chooses a new random nonce for her pseudorandom function. She then makes one more pass over the set of items (which are masked and encrypted) that are now stored by Bob (using getRange operations as in the buffer-shuffle method), maps each item to a newly masked and encrypted pair, and asks Bob to store this item in his memory. This begins a new “epoch” for Alice to use for the next batch of accesses that she needs to make.
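The rebuilding phase might be sketched as follows. This is again illustrative: `mask` stands in for the nonce-keyed pseudorandom function, encryption is omitted, and `random.shuffle` stands in for the oblivious or buffer shuffle.

```python
import random
import secrets

def rebuild(remaining, cache, mask):
    """Sketch of the epoch rebuild: the new set is the union of Bob's
    remaining items with everything in Alice's cache (including the dummy
    items she pulled), shuffled and re-masked under a fresh nonce.

    `remaining` and `cache` are plain dicts of (key, value) pairs, and
    `mask(nonce, key)` is a stand-in for the pseudorandom function.
    """
    nonce = secrets.token_bytes(8)      # new nonce for the new epoch
    merged = {**remaining, **cache}     # union of Bob's remainder and the cache
    entries = list(merged.items())
    random.shuffle(entries)             # stand-in for the oblivious shuffle
    new_server = {mask(nonce, k): v for k, v in entries}
    return nonce, new_server
```

Because the nonce is fresh, the masked keys of the new epoch are unlinkable to those of the previous one, which is the property the security argument below relies on.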
Let us consider an amortized analysis of this solution. For the sake of amortization, we charge each of the previous accesses for the effort in performing a rebuild. Since such a rebuild takes a number of I/Os proportional to the number of accesses in an epoch (provided the message size is large enough, for some constant), this means we charge a constant number of I/Os to each of these previous accesses. Thus, we have the following.
Lemma 2
Suppose we are given a miss-tolerant OS solution which achieves a given amortized I/O bound per access for messages up to a given size, when applied to a set of a given size. Then we can use this as a component of a miss-intolerant OS solution that achieves a corresponding amortized I/O bound per access for messages of bounded size, for some constant. The private memory required for this solution matches that of the component.
Proof: The amortized number of I/Os per access from the component solution carries over directly. The total number of I/Os needed to do a rebuild is bounded, assuming the message size is large enough, for some constant. The number of items moved in a rebuild equals the number of previous accesses; hence, the amortized number of I/Os per access for the rebuild work is constant. The performance bounds follow immediately from the above discussion and the simple charging scheme we used for the sake of an amortized analysis. For the proof of security, note that each access that Alice makes to Bob's dictionary will be either for a real item or for a dummy item. Either way, Alice makes exactly the same number of requests before she rebuilds the dictionary stored with Bob. Moreover, from the adversary's perspective, every request is to an independent, uniformly random key, which is then immediately removed and never accessed again. Therefore, the adversary cannot distinguish between actual requests and dummy requests. In addition, he cannot correlate any request with one from a previous epoch, since Alice randomly shuffles the set of items and uses a new pseudorandom function in each epoch.
By then choosing the parameters appropriately, we have the following.
Theorem 3
The square-root solution achieves a constant amortized number of I/Os for each data access, allowing a client, Alice, to obliviously store n items in a miss-intolerant way with an honest-but-curious server, Bob, using messages of size at most proportional to √n and local memory of size at least proportional to √n. The probability that this simulation fails to be oblivious is exponentially small for a polynomial-length access sequence if oblivious sorting is used for shuffling, and polynomially small if buffer shuffling is used.
Proof: Plugging Alice's internal-memory dictionary into Lemma 2 gives us the complexity bound. The obliviousness follows from the fact that, with an internal memory of this size, Alice can easily implement a miss-tolerant OS solution in her internal memory, which achieves the conditions needed for the component cache.
Note that the constant factor in the amortized I/O overhead in the square-root solution is quite small.
Note, in addition, that by the obliviousness definition in the OS model, it does not matter how many accesses Alice makes to the solution, provided that her number of accesses is not self-revealing of her data items themselves.² (² An access sequence would be self-revealing, for example, if Alice reads a value and then performs a number of accesses equal to this value.)
5 Miss-Tolerance
An important functionality that is lacking from the square-root solution is that it does not allow for accesses to items that are not in the outsourced set. That is, it is a miss-intolerant OS solution. Nevertheless, we can leverage the square-root solution to allow for such accesses in an oblivious way, by using a hashing scheme.
5.1 Review of Cuckoo Hashing
The main idea behind this extension is to use a miss-intolerant solution to obliviously implement a cuckoo hashing scheme [16]. In cuckoo hashing, we have two hash tables, T1 and T2, and two associated pseudorandom hash functions, h1 and h2. An item with key x is stored at T1[h1(x)] or T2[h2(x)]. When inserting an item x, we add it to T1[h1(x)]. If that cell is occupied by another item, y, we evict that item and place it in T2[h2(y)]. Again, we may need to evict an item. This sequence of evictions continues until we put an item into a previously empty cell or we detect an infinite loop (in which case we rehash all the items). Cuckoo hashing achieves constant expected time for all operations, with high probability. This probability can be boosted even higher by using a small cache, known as a stash [12], to hold items that would have otherwise caused infinite insertion loops. With some additional effort (e.g., see [3]), cuckoo hashing can be de-amortized to achieve a constant number of memory accesses, with very high probability, for insert, remove, and lookup operations.
In most real-world OS solutions, standard cuckoo hashing should suffice for our purposes. But, to avoid inadvertent data leakage and to ensure high-probability performance bounds, let us assume we will be using de-amortized cuckoo hashing.
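For concreteness, here is a minimal cuckoo hash table with a stash, sketched in Python. The names and the eviction bound are illustrative, and as a simplification we fall back to the stash instead of rehashing after too many evictions.

```python
import random

class CuckooHash:
    """Minimal cuckoo hash table with a stash (illustrative sketch only)."""

    MAX_EVICTIONS = 32  # loop-detection bound before falling back to the stash

    def __init__(self, m):
        self.m = m
        self.t1 = [None] * m        # table T1
        self.t2 = [None] * m        # table T2
        self.stash = {}             # small cache for failed insertions
        self.seeds = (random.random(), random.random())

    def _h(self, i, key):
        # Stand-in for the pseudorandom hash functions h1, h2.
        return hash((self.seeds[i], key)) % self.m

    def get(self, key):
        # Miss-tolerant lookup: probes T1[h1(key)], T2[h2(key)], then the stash.
        for i, t in enumerate((self.t1, self.t2)):
            slot = t[self._h(i, key)]
            if slot is not None and slot[0] == key:
                return slot[1]
        return self.stash.get(key)  # None if the key is absent

    def put(self, key, value):
        # Update in place if the key is already stored somewhere.
        for i, t in enumerate((self.t1, self.t2)):
            pos = self._h(i, key)
            if t[pos] is not None and t[pos][0] == key:
                t[pos] = (key, value)
                return
        if key in self.stash:
            self.stash[key] = value
            return
        # Otherwise insert, evicting occupants back and forth between tables.
        item = (key, value)
        for _ in range(self.MAX_EVICTIONS):
            pos = self._h(0, item[0])
            if self.t1[pos] is None:
                self.t1[pos] = item
                return
            item, self.t1[pos] = self.t1[pos], item   # evict from T1
            pos = self._h(1, item[0])
            if self.t2[pos] is None:
                self.t2[pos] = item
                return
            item, self.t2[pos] = self.t2[pos], item   # evict from T2
        self.stash[item[0]] = item[1]  # simplification: stash, don't rehash
```

Every operation touches at most two table cells plus the stash, which is the constant-access property the construction below exploits.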
5.2 Implementing Cuckoo Hashing with a Miss-Intolerant OS Solution
Let us assume we have a miss-intolerant solution to the OS problem which achieves a constant I/O complexity for accesses, using messages of bounded size.
A standard or de-amortized cuckoo hashing scheme provides an interface for performing get and put operations, such that get operations are miss-tolerant. These operations are implemented using pseudorandom hash functions in the random-access memory (RAM) model, i.e., using a collection of memory cells, where each such cell is uniquely identified with an index i. To implement such a scheme using the miss-intolerant OS solution, we simulate a read of cell i with a get(i) operation and we simulate a write of a value v to cell i with a put(i, v) operation. Thus, each access using the OS solution is guaranteed to return an item, namely a cell in the memory (tables and variables) used to implement the cuckoo-hashing scheme. Whenever we access a cell with index i, we actually perform a request for (an encryption of) this cell's contents using the obliviousness mechanism provided by the OS solution.
That is, to implement a standard or de-amortized cuckoo hashing scheme this way, we assume that every (non-dummy) key in Alice's simulation is an index in the memory used to implement the hashing scheme. Thus, each access is guaranteed to return an item. Moreover, because inserts, removals, and lookups achieve a constant number of memory accesses, with very high probability, in a de-amortized cuckoo-hashing scheme (or with constant expected-time performance in a standard cuckoo hashing scheme), each operation in this simulation involves a constant number of accesses with very high probability. Therefore, using a de-amortized cuckoo-hashing scheme, we have the following result.
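The cell-to-OS mapping can be sketched as follows. `DictOS` is a trivial, hypothetical stand-in for the miss-intolerant OS solution (its get/put names are ours), included only so the example is self-contained.

```python
class DictOS:
    """Trivial stand-in for a miss-intolerant OS solution: get(k) assumes
    k is present (it raises KeyError otherwise, mirroring miss-intolerance)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store[key]       # miss-intolerant: the key must exist

    def put(self, key, value):
        self._store[key] = value


class CellMemory:
    """Back a flat array of RAM cells with a miss-intolerant OS solution.

    Every cell index is pre-populated at construction, so every later get
    is guaranteed to find its key, which is what lets a miss-intolerant OS
    simulate the memory holding the cuckoo tables and variables."""

    def __init__(self, os_solution, num_cells):
        self.os = os_solution
        for i in range(num_cells):
            self.os.put(i, None)      # pre-populate all cell indexes

    def read(self, i):
        return self.os.get(i)         # simulate a read of cell i

    def write(self, i, value):
        self.os.put(i, value)         # simulate a write to cell i
```

A cuckoo table built on `CellMemory` never queries a missing key, so every OS access is a guaranteed hit, as the argument above requires.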
Theorem 4
Given a miss-intolerant OS solution that achieves a given amortized I/O performance with messages of a given size and achieves confidentiality and hardness of correlation, we can implement a miss-tolerant solution that achieves the same asymptotic amortized I/O performance and also achieves confidentiality and hardness of correlation.
A standard cuckoo-hashing scheme yields instead the following result.
Theorem 5
Given a miss-intolerant OS solution that achieves a given expected amortized I/O performance, with messages of a given size, we can implement a miss-tolerant solution that achieves the same asymptotic expected amortized I/O performance.
Our use of cuckoo hashing in the above construction is, by the way, quite different from previous uses of cuckoo hashing for oblivious RAM simulation [9, 14, 11]. In those papers, the server, Bob, gets to see the actual indexes and memory addresses used in the cuckoo hashing scheme. Thus, the adversary in these other schemes can see where items are placed in cuckoo tables (unless their construction is itself oblivious) and when and where they are removed; hence, special care must be taken to construct and use the cuckoo tables in an oblivious way. In our scheme, the locations in the cuckoo-hashing scheme are instead obfuscated because they are themselves built on top of an OS solution.
Also, in previous schemes, cuckoo tables were chosen for the reason that, once items are inserted, their locations are determined by pseudorandom functions. Here, cuckoo tables are used only for the fact that they have constant-time insert, remove, and lookup operations, which holds with very high probability for de-amortized cuckoo tables and as an expected-time bound for standard cuckoo tables.
6 An Inductive Solution
The miss-tolerant square-root method given in Section 5 provides a solution to the oblivious storage problem with amortized constant I/O performance for each access, but requires Alice to have a local memory of size proportional to √n and messages of that size during the rebuilding phase (although constant-size messages are exchanged during the access phase). In this section, we show how to recursively apply this method to create a more efficient solution.
For an integer parameter, consider a parameterized family of miss-tolerant oblivious storage solutions with the following properties:

It supports a dictionary of items.

It requires local memory of size at the client.

It uses messages of size .

It executes amortized I/Os per operation (each get or put), where the constant factor in this bound depends on the constant .

It achieves confidentiality and hardness of correlation.
Note that, using this notation, the square-root method derived in Section 5 using cuckoo hashing is an oblivious storage solution of this form.
6.1 The Inductive Construction
For our inductive construction, we assume the existence of an oblivious storage solution from the previous level of the induction. We can use this to build a miss-tolerant oblivious storage solution with the next message size as follows:

Use the construction of Lemma 2 to build a miss-intolerant OS solution from the inductively assumed solution. This solution will have a constant amortized number of I/Os per access, with very high probability, using messages of the appropriate size and a private memory requirement matching that of the component solution, since the component uses memory of that size.

Use the construction of Theorem 4 to take this miss-intolerant solution and convert it to a miss-tolerant solution. This solution uses a constant amortized number of I/Os, with high probability, using messages of the appropriate size, and it has the performance bounds necessary for the next level of the induction.
The intuition for our construction is as follows. We number each level of the construction so that the first level is the topmost and the last is the lowest; hence there are a constant number of levels. The top level consists of the main memory of size n and uses the rest of the construction as a cache for items, which we refer to simply as the cache. This cache is the beginning of our inductive construction and, hence, is itself an OS over fewer items. The inductive construction continues such that each level contains a miss-tolerant data structure and the levels below it are used as its cache. The construction terminates when we reach level 2, since the size of the cache at level 2 is equal to the message size, which Alice can request using a single access or store in her own memory. We give an illustration of our construction for two parameter settings in Figures 1 and 2.
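As a rough illustration of why the number of levels is constant, suppose (a simplifying assumption of ours, not a claim from the paper) that each level of the recursion square-roots the size of its cache until a cache fits in a single message of size n to the power ε. The level count then depends only on ε, not on n:

```python
def num_levels(n, eps):
    """Count recursion levels under the simplifying assumption that each
    cache level square-roots the item count, stopping once a cache fits in
    one message of size n**eps. Illustrative only; the paper's exact
    parameters differ. Note n itself drops out: square-rooting halves the
    exponent, so we only track the exponent of n at each level."""
    exponent = 1.0            # current cache size is n**exponent
    levels = 0
    while exponent > eps:     # cache still bigger than one message
        exponent /= 2.0       # square-rooting halves the exponent
        levels += 1
    return levels
```

For example, with message-size exponent 1/4 this count is 2 regardless of n, which matches the intuition that the constant factor in the I/O bound depends on the constant exponent but the level count does not grow with n.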
Theorem 6
The above construction results in an oblivious storage solution that is miss-tolerant, supports a dictionary of n items, requires client-side local memory of bounded size, uses messages of bounded size, and achieves a constant amortized number of I/Os for each get and put operation, where the constant factor in this bound depends on the constant parameter. In addition, this method achieves confidentiality and hardness of correlation.
7 Performance
We have built a system prototype of our oblivious storage method to estimate the practical performance of our solution and compare it with that of other OS solutions. In our simulation, we record the number of access operations to the storage server for every original data request by the client. Our prototype specifically simulates the use of Amazon S3 as the provider of remote storage, based on their current API. In particular, we make use of the get, put, copy, and delete operations in the Amazon S3 API. Since Amazon S3 does not support range queries, we substitute the getRange operation of our OS model with concurrent get requests, which could be issued by parallel threads running at the client to minimize latency. The removeRange operation is handled similarly, with concurrent delete operations. We have run the simulation for two configurations of our OS solution. We consider two item sizes, 1KB and 64KB. The size (number of items) of the messages exchanged by the client and server depends on n, the number of items in the outsourced data set, which varies from 10,000 to 1,000,000.
Storage overhead. The overall storage space (number of items) used by our solution on the server is linear in n for both configurations. Our method has storage overhead comparable to that of Boneh et al. [4] and much smaller than the space used by other approaches.
Access overhead. In the table below, we show the number of I/Os to the remote data repository during the oblivious simulation of requests. Recall that the number of I/Os is the number of round-trips the simulation makes. Thus, the getRange operation is counted as one I/O. In the table, the column Minimum gives the number of I/Os performed by Alice to receive the requested item. The remaining I/Os are performed for reshuffling. For the first configuration, this number is 2, since the client sends a get request to retrieve either an actual item or a dummy item, followed by a delete request. For the second configuration, this number is slightly higher, since Alice needs to simulate accesses to a cuckoo table through an OS interface. We compare our I/O overhead and the total amount of data transferred with those of Boneh et al. [4]. They also achieve a constant request overhead and exchange messages of comparable size with the server. Our new buffer-shuffle algorithm makes our approach more efficient in terms of data transferred and the number of operations the user makes to the server.
Time overhead. Given the trace of the user's operations during the simulation and empirical measurements of the round-trip times of operations on the Amazon S3 system (see the operation-pricing table below), we estimate the access latency of our OS solutions in the two tables below, for 1KB items and 64KB items, respectively.
Cost overhead. Finally, we provide estimates of the monetary cost of our OS solution in the tables below, using the pricing scheme of Amazon S3 (see the operation-pricing table and http://aws.amazon.com/s3/pricing/).³ (³ Accessed on 9/21/2011.) Since our results outperform other approaches in terms of the number of accesses to the server, we expect that our monetary cost will also be lower.
I/Os per request (Minimum/Amortized)

n           First configuration   Second configuration
10,000      2/13                  7/173
100,000     2/13                  7/330
1,000,000   2/13                  7/416
I/O Overhead and Data Transferred (#items)

            Boneh et al. [4]        Our Method
n           Minimum   Amortized     Minimum   Amortized
10,000      3/9       13/1.3        2/1       13/1.1
100,000     3/10      17/5.2        2/1       13/3.5
1,000,000   3/12      20/2          2/1       13/1.1
Operation   Price               RTT, 1KB (ms)   RTT, 64KB (ms)
Get         $0.01/10,000 req    36              56
Put         $0.01/1,000 req     65              86
Copy        free                70              88
Delete      free                31              35
Access time and total cost (1KB items)

            First configuration              Second configuration
            Access Time          Total       Access Time          Total
n           Minimum   Amortized  Cost        Minimum   Amortized  Cost
10,000      67ms      500ms      $55         400ms     8s         $177
100,000     67ms      500ms      $1,744      400ms     12s        $7,262
1,000,000   67ms      500ms      $55,066     400ms     18s        $170,646
Access time and total cost (64KB items)

            First configuration              Second configuration
            Access Time          Total       Access Time          Total
n           Minimum   Amortized  Cost        Minimum   Amortized  Cost
10,000      91ms      800ms      $55         500ms     12s        $177
100,000     91ms      800ms      $1,744      500ms     18s        $7,262
1,000,000   91ms      800ms      $55,066     500ms     24s        $170,646
Acknowledgments
This research was supported in part by the National Science Foundation under grants 0721491, 0915922, 0953071, 0964473, 1011840, and 1012060, and by the Kanellakis Fellowship at Brown University.
References
 [1] M. Ajtai. Oblivious RAMs without cryptographic assumptions. In Proc. of the 42nd ACM Symp. on Theory of Computing (STOC), pages 181–190. ACM, 2010.
 [2] Amazon. Amazon S3 Service. http://aws.amazon.com/s3sla/.
 [3] Y. Arbitman, M. Naor, and G. Segev. Deamortized cuckoo hashing: Provable worstcase performance and experimental results. In Int. Conf. Automata, Languages and Programming (ICALP), pages 107–118. Springer, 2009.
 [4] D. Boneh, D. Mazières, and R. A. Popa. Remote oblivious storage: Making oblivious RAM practical. Technical report, CSAIL, MIT, 2011. http://dspace.mit.edu/handle/1721.1/62006.
 [5] S. Chen, R. Wang, X. Wang, and K. Zhang. Sidechannel leaks in Web applications: a reality today, a challenge tomorrow. In 31st IEEE Symp. on Security and Privacy, pages 191–206, 2010.
 [6] I. Damgård, S. Meldgaard, and J. B. Nielsen. Perfectly secure oblivious RAM without random oracles. In 8th Theory of Cryptography Conference (TCC), pages 144–163, 2011.
 [7] D. Dubhashi and A. Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, New York, NY, USA, 2009.
 [8] O. Goldreich and R. Ostrovsky. Software protection and simulation on oblivious RAMs. J. ACM, 43(3):431–473, 1996.
 [9] M. T. Goodrich and M. Mitzenmacher. Privacypreserving access of outsourced data via oblivious RAM simulation. In 38th Int. Colloq. on Automata, Languages and Programming (ICALP), pages 576–587, 2011.
 [10] M. T. Goodrich, M. Mitzenmacher, O. Ohrimenko, and R. Tamassia. Oblivious RAM simulation with efficient worstcase access overhead. CoRR, abs/1107.5093, 2011. To appear in Proc. ACM Cloud Computing Security Workshop (CCSW) 2011.
 [11] M. T. Goodrich, M. Mitzenmacher, O. Ohrimenko, and R. Tamassia. Privacypreserving group data access via stateless oblivious RAM simulation. CoRR, abs/1105.4125, 2011. To appear in SODA 2012.
 [12] A. Kirsch, M. Mitzenmacher, and U. Wieder. More robust hashing: cuckoo hashing with a stash. SIAM J. Comput., 39:1543–1561, 2009.
 [13] D. E. Knuth. Seminumerical Algorithms, volume 2 of The Art of Computer Programming. AddisonWesley, Reading, MA, 3rd edition, 1998.
 [14] E. Kushilevitz, S. Lu, and R. Ostrovsky. On the (in)security of hashbased oblivious RAM and a new balancing scheme. Cryptology ePrint Archive, Report 2011/327, 2011. http://eprint.iacr.org/. To appear in SODA 2012.
 [15] Microsoft Corp. Windows Azure. http://www.microsoft.com/windowsazure.
 [16] R. Pagh and F. Rodler. Cuckoo hashing. Journal of Algorithms, 52:122–144, 2004.
 [17] B. Pinkas and T. Reinman. Oblivious RAM revisited. In T. Rabin, editor, Advances in Cryptology (CRYPTO), volume 6223 of LNCS, pages 502–519. Springer, 2010.
 [18] N. Pippenger and M. J. Fischer. Relations among complexity measures. J. ACM, 26(2):361–381, 1979.
 [19] E. Shi, H. Chan, E. Stefanov, and M. Li. Oblivious RAM with worstcase cost. Cryptology ePrint Archive, Report 2011/407, 2011. http://eprint.iacr.org/. To appear in AsiaCrypt 2011.
 [20] E. Stefanov, E. Shi, and D. Song. Towards Practical Oblivious RAM. CoRR, abs/1106.3652, June 2011.
 [21] J. S. Vitter. External sorting and permuting. In M.Y. Kao, editor, Encyclopedia of Algorithms. Springer, 2008.
 [22] P. Williams and R. Sion. Usable PIR. In NDSS, 2008.
 [23] P. Williams, R. Sion, and B. Carbunar. Building castles out of mud: practical access pattern privacy and correctness on untrusted storage. In CCS, pages 139–148, 2008.