Tiered cloud storage via two-stage, latency-aware bidding
In cloud storage, the digital data is stored in logical storage pools, backed by heterogeneous physical storage media and computing infrastructure that are managed by a Cloud Service Provider (CSP). One of the key advantages of cloud storage is its elastic pricing mechanism, in which the users need only pay for the resources/services they actually use, e.g., depending on the storage capacity consumed, the number of file accesses per month, and the negotiated Service Level Agreement (SLA). To balance the tradeoff between service performance and cost, CSPs often employ different storage tiers, for instance, cold storage and hot storage. Storing data in hot storage incurs high storage cost yet delivers low access latency, whereas cold storage is able to inexpensively store massive amounts of data and thus provides lower cost with higher latency.
In this paper, we address a major challenge confronting CSPs that use such a tiered storage architecture: how to maximize overall profit over a variety of storage tiers with distinct characteristics, as well as over file placement and access request scheduling policies. To this end, we propose a scheme in which the CSP runs a two-stage auction for (a) requesting storage capacity, and (b) requesting accesses with latency requirements. Our two-stage bidding scheme provides a hybrid storage and access optimization framework with the objective of maximizing the CSP’s total net profit over four dimensions: file acceptance decision, placement of accepted files, file access decision and access request scheduling policy. The proposed optimization is a mixed-integer nonlinear program that is hard to solve. We propose an efficient heuristic to relax the integer optimization and to solve the resulting nonlinear stochastic programs. The algorithm is evaluated under different scenarios and with different storage system parameters, and insightful numerical results are reported by comparing the proposed approach with other profit-maximization models. In certain simulation scenarios, the proposed method increases profit by over 60% compared to other schemes.
The demand for online data storage is increasing at an unprecedented rate due to growing trends such as cloud computing, big data analytics, and e-commerce activities, and more recently due to the rise of social networks. Cloud storage service is now provided by multiple cloud service providers (CSPs), through offerings such as Amazon’s S3, Amazon’s Cloud Drive, Dropbox, Google Drive, and Microsoft Azure. Amazon S3 offers three major storage classes for different use cases: i) Amazon S3 Standard for general-purpose storage of frequently accessed data; ii) Amazon S3 Standard for Infrequent Access for long-lived, but less frequently accessed data; and iii) Amazon Glacier for long-term archive. Dropbox has a simple pricing framework, providing two types of storage (Standard and Advanced) for individuals and one type for enterprises.
Many cloud storage service providers offer throughput or IOPS (Input/Output Operations Per Second) guarantees; however, as a large number of files is stored, the latency of accessing stored files becomes an important criterion for evaluating the effectiveness of these storage services. The above pricing schemes do not consider the latency of accessing the stored data on a shorter time scale. Thus, a user who needs to access a file often and with low latency may not be able to get that service on a given day.
I-B Market Architecture
We consider a two-stage market for providing lower latencies to the users for a certain period in an auction mechanism. All the files are stored in the back-up storage. In stage 1, the users bid in order to store their files in the cold or hot storage which are faster compared to the back-up storage. Additionally, there is a second market which runs more frequently compared to the first stage where the users can update their bids if they require low latency or faster access. Using our two-stage bidding platform, the CSP can maximize the total profits over file storage and file access while meeting the users’ access requirements. On the other hand, the users with lower latency requirements will be able to get their required quality of service. Thus, the users’ utilities will be maximized.
One major challenge confronting service providers these days is: given the price customers are willing to pay, and the expectation of future access rates, how can a service provider maximize its overall profit over a variety of file storage decisions, file access decisions, and access request scheduling policies? Further, the provider should also ensure reliable, efficient storage that meets customers’ latency requirements. This challenge necessitates novel pricing mechanisms that go beyond existing approaches such as resource-based pricing, usage-based pricing, and time-dependent pricing in cloud computing and online storage.
In order to store the files in the cold storage or hot storage, we propose a systematic framework for two-stage, latency-dependent bidding, which aims to maximize the cloud storage provider’s net profit in tiered cloud storage systems where tenants may have different budgets, access patterns and performance requirements as described in Section III. The proposed two-stage, latency-aware bidding mechanism works as follows. The cloud service provider (CSP) has two tiers of storage: hot storage and cold storage with different service rates. Users can bid for storage and access, in two separate stages, without knowing how the CSP stores the contents. In the first stage (request for storage), the user specifies storage size, expected access rates, and latency requirements. If the CSP decides to accept the bid, it will place two copies of data: one in the cold storage and another one in either the hot storage or cold storage. In the second stage (request for access), the CSP can decide whether to accept the access requests based on the bid and where to retrieve the files from to meet the access latency requirements. The second-stage auction runs on a shorter time scale (every hour) and the first-stage auction runs on a longer time scale (every day) since the access pattern of files changes faster.
The second-stage decision inherently depends on the first-stage decision. For example, if the CSP decides to store both the original file and its copy in cold storage, the file can be accessed from the cold storage only. However, if accessing from cold storage does not meet the access latency requirement (storage servers might get congested due to high request arrival rates and low service rates), the CSP may not be able to serve the request at once. In this case, the CSP will lose profit due to the loss of the access bids from the users. Conversely, the optimal first-stage decision inherently depends on the second-stage decision. For example, if a user bids at a low price for storage, the file may be stored in the cold storage; however, the user may then bid at a higher price with a lower latency requirement in the second stage. In that case, its bid may not be accepted, as the latency requirement may not be met because the file was stored in the cold storage in the first place. Unfortunately, the access bids, latency requirements, and access arrival rates are all random variables, and their realizations are not known beforehand.
We first formulate the second-stage decision problem, i.e., whether to accept the access bids and how to schedule the accepted requests (whether to access the file from the cold or hot storage) given a first-stage decision, as an integer program with non-convex constraints. Since the second-stage parameters are random, we consider multiple random realizations of these variables and average the objective function over these realizations (or scenarios). We then formulate the first-stage decision problem as a deterministic equivalent program in which we maximize the profit from storage plus the expected second-stage profit while satisfying the latency requirements for each scenario (Section IV). However, the problem again turns out to be an integer program with non-convex constraints. We first relax the integer constraints by using a sigmoid function as the penalty, which closely matches the required penalty function. The relaxed problem is smooth and we can obtain a local solution using the KKT conditions. The solution of the relaxed problem is then rounded to the nearest integers. Because of the sigmoid function, the solution of the relaxed problem and the feasible integer solution are quite close. In Section VI, we show that our proposed method achieves significantly higher profit than other algorithms that do not consider the second-stage recourse decision when taking the first-stage decision.
Our solution exploits a number of key design tradeoffs. First, any efficient cloud storage and access strategies must meet both the service provider’s constraints and customers’ requirements. The constraints from the service provider might come from tiered cloud storage architecture, storage-related costs, reliability level and capacities of each tier of storage. The requirements from customers include bidding prices of storage and access, latency requirements and expected access request arrival rates. Second, while placing as much content as possible in cold storage could potentially reduce storage cost, it may be insufficient to meet clients’ latency requirements. On the other hand, although storing more content in hot storage improves service latency, it results in higher storage price, which might cause customer churn. A solution exploiting this tradeoff is thus necessary to determine the optimal placement (and duplication strategy) of files in tiered storage. As a result, jointly scheduling all the file access requests to avoid congestion in each storage tier becomes challenging and must take into account the impact of request patterns and access decisions of all clients.
The main contributions of this paper can be summarized as follows:
1. Comprehensive future consideration: This paper aims to propose a systematic framework that integrates both file storage and file access, which optimizes the system over four dimensions: file acceptance decision, placement of accepted files, file access decision and access request scheduling. The proposed framework encompasses future access information such as bidding price for access, latency requirements and expected access request arrival rates.
2. Two-Stage, Latency-Aware Bidding: Most storage pricing schemes consider both storage and access at the same time; our scheme is novel as it allows users to bid for storage and access (with latency requirements) separately and gives the CSP more flexibility in optimizing the tiered-storage to maximize the profits.
3. Computational Efficiency: We quantify the service latency with respect to both hot and cold storage. The proposed optimization is modeled as a mixed-integer nonlinear program (MINLP), which is hard to solve. We propose an efficient heuristic to relax the integer optimization and solve the non-convex problem.
4. Insightful Numerical Results: The performance of the proposed approach is evaluated in various cases. It is observed that the profits obtained from the proposed method are higher than those of other methods, and the access request acceptance rate (ARAR) also dominates that of other methods as the capacity of the cold storage or the service rate of hot storage increases. For example, in our simulation scenario the proposed method increases profit by over 60% compared to other schemes once the capacity of cold storage exceeds 500TB.
The rest of the paper is organized as follows. Section II describes the related work. The system model for the tiered architecture and the two-stage auction framework is described in Section III, and the two-stage optimization problem is formally defined in Section IV. Section V gives the proposed solution for the mixed integer non-linear program and Section VI validates our proposed policy and evaluates its performance using numerical studies. Finally, Section VII presents our conclusions.
II Related Literature
Tiered storage has been used in many contexts so as to achieve better cost-performance tradeoffs by placing the workload on a hybrid storage that includes multiple hot and cold storage tiers. However, the pricing solutions for multi-tier cloud storage are largely limited to resource/usage-based pricing, as shown in . Some of the recent pricing schemes for online storage providers include those of AWS S3, Dropbox, Google Drive, etc., and their current pricing plans can be found at , , , respectively. Typically, they offer a flat price for the storage service with a limited storage capacity or access rate. For example, Amazon provides three types of storage facilities depending on the access rates. However, our model is different from the existing practices. First, we consider a two-stage auction model where, in the first stage, users can move their files to (tiered) cold/hot storage by adjusting their bids. In the second stage, the users bid to access the files. Note that the first-stage auction is run once a day (or week), while the second-stage auction is run once an hour (or day). Thus, it gives users greater flexibility to adjust their bids according to their daily requirements. In contrast, a user has to pay a flat-rate price for a month to achieve a faster access rate on Amazon. Second, in contrast to the pricing mechanisms of Amazon and Dropbox, we consider the latency requirements of the users while accepting the bids even at the first stage.
Pricing for cloud computing has been widely studied [14, 15, 16, 17, 18, 19]. Game theory and auctions are broadly adopted as mechanisms for cloud service. For example, in , a game theoretical model is used to induce a truthful cloud storage selection mechanism where the service providers bid the quality of service; in , an online procurement auction mechanism is proposed to maximize the long-term social welfare; a Vickrey-Clarke-Groves (VCG) auction-based dynamic pricing scheme is proposed for cloud services in . Recently, a Stackelberg game model is proposed in  to derive the pricing scheme. The Stackelberg game consists of two stages: i) in the first stage, the service provider determines a price which is both time and location dependent; ii) in the second stage, the users decide the schedule of the mobile traffic depending on the prices. However, compared to the above papers, we consider a scenario where the users bid in two stages: in the first stage, the users bid in order to store their files in the hot or cold storage; in the second stage, the users bid again, specifying their latency requirements and access request arrival rates. The cloud service provider in the first stage is unaware of the bids of the users in the second stage. However, the optimal first-stage decision inherently depends on the second-stage decisions. Thus, the problem is inherently challenging, and turns out to be a non-convex mixed-integer problem.
To the best of our knowledge, auction mechanisms of this kind have not been considered in the literature yet. Additionally, the above papers mainly considered Vickrey-Clarke-Groves (VCG) type auctions or their variants. However, our problem turns out to be a complex non-convex optimization problem. A VCG-type auction will have high complexity and optimality cannot be guaranteed because of the non-convexity of the problem.
III System Model
III-A Tiered Architecture
We consider a cloud storage provider (CSP) which has a tiered storage architecture. Each file is stored in an inexpensive back-up storage facility. For example, Amazon Web Services (AWS) charges per GB per month for standard storage. The back-up storage can be considered to consist of hard disk drives (HDDs), which are inexpensive, but whose service rate is slow and unreliable. Because it is inexpensive and a lot of files can be stored in it, its latency cannot be guaranteed. In order to provide a faster service, the CSP can offer two types of storage: i) cold storage and ii) hot storage. Cold storage is made of SSHDs (a combination of solid state drive (SSD) and HDD), which are expensive compared to HDDs; however, the service rate is faster and it is more reliable against disk failure. The hot storage is the most expensive one as it is made of SSDs; however, its service rate is also the fastest. Thus, if files are stored in the hot storage, they will have faster access.
III-B Two-Stage Auction Framework
In order to store the files in the cold storage or hot storage, the CSP will operate a market. In the first stage, the users bid to store their files in the upgraded storage facilities. (Footnote 1: We denote all the clients of cloud service providers as users. Thus, users may be individuals, enterprises, or organizations.) The CSP decides whether to accept the file and where to store the original file and its copies. We model the storage platform as providing dual replication of files, so each file has a duplicated copy. (Footnote 2: Multiple copies of the file can be created in practice. However, this will increase the storage requirement and the computational complexity of computing the acceptance/rejection of bids and the access probabilities. The consideration of the scenario where any specific number of copies can be stored is left for future work.) To ensure data durability and availability, data replication is broadly adopted by data center storage systems, such as Hadoop Distributed File System, RAMCloud, and Google File System. If a file is accepted for storage, the user pays the bidding price; otherwise, it pays nothing. The CSP stores one copy in the cold storage and decides whether to store the other copy in the hot or the cold storage.
In the second stage, the users whose files were accepted for storage bid again for accessing the files. The CSP needs to decide whether to accept the access requests and, if accepted, from where the files should be accessed (either cold or hot storage) in order to meet the access latency requirements. If the user’s request is accepted, it pays the bidding price; otherwise, it pays nothing. The second stage decision inherently depends on the first stage decision. For example, if the CSP decides to store both the original file and its copy in cold storage, the file can be accessed from the cold storage only. However, if accessing from cold storage does not meet the access latency requirement (storage servers might get congested due to high request arrival rates and low service rates), the CSP may not be able to serve the request at once. In this case, the CSP will lose profit due to the loss of the access bids from the users. Fig. 1 depicts graphically the major considerations in the two-stage problem.
The optimal first stage decision of the CSP inherently depends on the second stage decisions. For example, if the CSP decides to store a file in the cold storage because of its low storage bid, the user can still bid a high value for access in the second stage. However, the CSP may not accept the bid because of the lower service rate of the cold storage. Hence, the CSP’s profit will be reduced. These access bids, the access arrival requests, and the latency requirements are random variables which cannot be known during the first stage decision process, which makes finding an optimal first stage decision inherently difficult. We assume that the two bidding stages take place at different time-scales. In particular, while users’ files typically remain in the storage system for a long time period (e.g., a day, or several days in Stage 1), the latency-dependent file access decisions (in Stage 2) can be adjusted more frequently on a much smaller time-scale (e.g., every hour), for instance to distinguish busy and off-peak hours. Intuitively, the user’s need to access a file changes on a shorter time scale compared to its storage decision. Hence, the second stage auction must be run more frequently. Note that the frequency of the second stage auction can be changed depending on how quickly the access request rates of the files change.
Note that not all the access bids of the files stored in the cold or hot storage will be accepted. The acceptance depends on the access bids and the latency requirements. However, it is still useful for the user to participate in the first stage, i.e., to pay a higher price to store its file in the cold or hot storage. This is because the user may have to access the file only for a certain number of hours in a day: the user can participate in the first stage auction, where its files will be stored either in the cold storage or the hot storage at the start of the day, and then bid in the second stage auction when it needs to access the file. Note that since both the cold storage and the hot storage have higher service rates than the back-up storage, the users can access files at much faster rates compared to the traditional back-up storage even if their access bids are not accepted at all.
Also note that we have a back-up storage for all the files. Initially, all the files are stored in back-up storage. The users then bid in order to store their files in the slightly faster cold storage or the fastest hot storage. After this first-stage auction, the files that are accepted will be copied and moved to cold or hot storage, while the rest of the files remain in the back-up storage. If a user’s storage bid is rejected, she will still be able to access those files from the back-up storage. Our second stage bidding is only designed for premium data access, while a standard, basic service to access data is provided to all files stored in the system. If a user’s access bid is not accepted by the CSP in the second stage, she will still be able to access the file from hot or cold storage. Thus, service availability is indeed guaranteed. However, there will be no guarantee on the latency or the speed of accessing the files in the above two cases.
IV Problem Formulation
In this section, we formally define the two-stage optimization problem.
IV-A First-Stage Decision
In the first stage, the CSP decides i) whether to store a file or not, and ii) if it decides to store the file, whether to keep the duplicated copy of the file in the hot storage or cold storage (the original copy of an accepted file is always stored in cold storage). (Footnote 3: Note that we consider storing the first copy in cold storage due to its relatively low cost and large capacity, while the analysis and optimization remain the same if it is replaced by any other type of storage tier.) Let be the total number of files that participate in the first-stage auction.
We consider a first price auction where the user pays the price it bids. This auction is run once in a day or once in a week. Let denote that the file is accepted for storage; if it is not accepted. Let denote that the copy of the file is stored in the hot storage; otherwise, . Note that if , then the file is not stored anywhere, thus, . However, if , can be either or . Nonetheless, if , must be .
Also note that if , there are two possibilities: (i) , thus, both of the original and duplicated copies will be stored in cold storage (hence the number of copies of file stored in the cold storage is ); or, (ii) the file storage bid is rejected (). Therefore, the number of copies of file stored in the hot storage and cold storage is and respectively.
Let be the size of the file and be the capacity of storage , where denotes the cold storage and denotes the hot storage. Since the total stored files cannot exceed the capacity,
Since must be if is 1,
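The first-stage bookkeeping described above can be illustrated with a minimal sketch. Because the original symbols did not survive in this copy, every name below (`accept`, `hot_copy`, `sizes`, `capacity`, and both function names) is a hypothetical stand-in. The sketch assumes only what the text states: an accepted file always has one copy in cold storage, its second copy is in hot storage when the hot indicator is 1 and in cold storage otherwise, a hot copy requires acceptance, and the stored volume in each tier may not exceed that tier's capacity.

```python
COLD, HOT = 0, 1  # hypothetical tier indices


def copies_per_tier(accept, hot_copy):
    """Return (cold_copies, hot_copies) for one file.

    accept and hot_copy are 0/1 indicators. With dual replication,
    an accepted file has 2 - hot_copy copies in cold storage.
    """
    if hot_copy > accept:
        raise ValueError("a hot copy requires the file to be accepted")
    return 2 * accept - hot_copy, hot_copy


def first_stage_feasible(sizes, accept, hot_copy, capacity):
    """Check the coupling constraint and both capacity constraints."""
    used = [0.0, 0.0]
    for size, a, h in zip(sizes, accept, hot_copy):
        if h > a:  # hot copy of a rejected file is not allowed
            return False
        cold, hot = copies_per_tier(a, h)
        used[COLD] += cold * size
        used[HOT] += hot * size
    return used[COLD] <= capacity[COLD] and used[HOT] <= capacity[HOT]
```

For instance, with file sizes `[10, 20, 5]`, acceptance `[1, 1, 0]` and hot indicators `[1, 0, 0]`, the cold tier holds 1 x 10 + 2 x 20 = 50 units and the hot tier holds 10 units.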
IV-B Second Stage Decision Problem
After storing the files, the users bid for accessing the files in different time slots. While bidding, the user also gives the access request arrival rates and the latency requirements in each slot. This market is run on a shorter time scale (e.g., the duration can be an hour or half an hour). The user can update its bid at different time slots depending on its requirements.
The access request arrival rates, the access bid prices, and the latency requirements are random variables, which are governed by the user’s requirements. We assume that the random variables can be modeled by realizations of the random variables ( or scenarios). The decision is time-based. The second stage runs at epochs (e.g., every hour), and different scenarios, i.e., the user’s bid, latency and arrival rate, can vary over these epochs. We take access decisions at each epoch based on the bids. The first stage runs at every periods, (e.g., every day), and we take storage decision by considering the possible scenarios and the associated probabilities across all the periods.
Workloads for accessing data follow some pattern. However, the CSP is unaware of the exact joint distribution function of the bidding prices, access arrival request rates, and the latency requirements. In the scenario-based approach, we do not need to know any specific distribution function. Specifically, we can generate the empirical distribution from the bidding history. For example, from Fig. 1 we know that the first stage auction runs on a longer time scale (e.g., every day) and the second stage auction runs on a shorter time scale (e.g., every hour). Then, on the following day, the CSP can learn the (joint) empirical distribution of bid price, latency and arrival rate based on the access information from the last (few) day(s). Thus, our approach can be applied without learning an explicit workload pattern.
For each scenario , we denote the latency requirement of file as , the access bid price as , and the access request arrival rate as . The scenarios can be generated from the past history of the user’s data. We assume that scenario occurs with probability . We do not put any restriction on the dependence among the access bids, the latency requirements, and the access arrival rates. Specifically, they can be drawn from a joint distribution. However, the CSP is unaware of the distribution. It learns from the bidding history and updates the set of scenarios.
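As a rough illustration of how scenarios and their probabilities might be obtained from bidding history, the sketch below counts how often each observed combination of access bid, latency requirement, and arrival rate occurred. The function name and the representation of an observation as a tuple are assumptions made for this example only; the text does not specify how the history is stored or discretized.

```python
from collections import Counter


def empirical_scenarios(history):
    """Map each distinct past observation to its empirical probability.

    history: list of hashable observations, e.g. one
    (access_bid, latency_requirement, arrival_rate) tuple per past slot.
    """
    counts = Counter(history)
    total = sum(counts.values())
    return {scenario: n / total for scenario, n in counts.items()}


# Four past slots, two distinct joint realizations (illustrative numbers).
history = [(5.0, 0.2, 3.0), (5.0, 0.2, 3.0), (8.0, 0.1, 6.0), (5.0, 0.2, 3.0)]
probabilities = empirical_scenarios(history)
```

Because whole tuples are counted, any dependence between bid, latency requirement, and arrival rate in the history is preserved in the resulting scenario set.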
IV-B1 Access arrival rates
The access requests are independent, and in a given time slot the number of these requests is an integer that can be considered independent of the past requests. It is often assumed that the inter-arrival time follows an exponential distribution. Thus, we consider a Poisson arrival process. We use an M/G/1 queuing model. (Footnote 4: In this paper, we consider one disk for each hierarchy, because multiple servers will reduce the bandwidth of each server. Thus, the capacity of serving requests from each server will be reduced. As a result, the latency of each request will be increased.) The M/M/1 or M/G/1 queuing model is also used in .
IV-B2 Access request acceptance
The CSP decides whether to accept the access bid of each file. Let denote the decision whether the file is accepted in scenario . indicates that the access bid is accepted; indicates that the access bid is rejected.
Note that when the second stage decision is taken, the first-stage decision variables and are known. If , then for all since file is not stored in the cold or hot storage, then its access bid cannot be accepted. On the other hand if , can be either or . This is because even if , it cannot be guaranteed that the access bid will be accepted in scenario . The access bid will be accepted based on how much profit will be made and whether the latency requirement can be satisfied by accepting the bid. Hence,
IV-B3 Probabilistic Scheduling
Probabilistic scheduling has been successfully applied to the display ad allocation problem on the Internet and to high-aggregate-bandwidth switches. Such a strategy has also been shown to be nearly optimal in cloud storage.
Recall that if file is accepted for storage, the original copy would be stored in cold storage and the duplicated copy will be either stored in cold storage ( or in hot storage (). As we have copies of a file in both hot and cold storage in the latter case (), the CSP needs to decide where the file should be accessed according to its bidding price and latency requirement. In probabilistic scheduling, each request for file has a certain probability to be scheduled to each storage . For the -th scenario, we have to decide which denotes the probability that the file will be fetched from storage , for the -th scenario. Intuitively, denotes how often the file should be fetched from storage for scenario . Needless to say, if , then for all . Hence,
And for files which have not been accepted for access requests. Thus,
Recall that denotes the access request arrival rate of file in scenario within a slot. Thus, the total expected file access request rate for file to storage in the -th scenario within the slot is given by . The total expected file access request rate to storage must be less than the file service rate (in Mb/s) of storage ; otherwise, the queue length will grow without bound and the storage cannot handle requests. Hence,
Our scheduling approach will be optimal in an expected sense. However, the scheduling approach may be sub-optimal for a given scenario. Obtaining an optimal deterministic schedule is an NP-hard problem in general for a given scenario.
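The scheduling constraints of this subsection, for a single scenario, can be collected into one feasibility check. This is only a sketch: all names are hypothetical because the original notation is missing from this copy, and the load test follows the text literally by comparing the summed request rate routed to a tier against that tier's service rate, which assumes the two quantities are expressed in commensurate units.

```python
def scheduling_feasible(stored, hot_copy, accepted, pi, arrival,
                        service_rate, tol=1e-9):
    """Check one scenario's access and scheduling decisions.

    stored, hot_copy : first-stage 0/1 indicators per file
    accepted         : 0/1 access-acceptance indicator per file
    pi               : per file, (prob. fetched from cold, prob. from hot)
    arrival          : access request arrival rate per file
    service_rate     : (cold service rate, hot service rate)
    """
    load = [0.0, 0.0]
    rows = zip(stored, hot_copy, accepted, pi, arrival)
    for x, h, a, (p_cold, p_hot), lam in rows:
        if a > x:  # access can be accepted only for stored files
            return False
        if min(p_cold, p_hot) < -tol or max(p_cold, p_hot) > 1 + tol:
            return False
        if abs(p_cold + p_hot - a) > tol:  # probabilities sum to acceptance
            return False
        if h == 0 and p_hot > tol:  # no hot copy, so nothing fetched from hot
            return False
        load[0] += lam * p_cold
        load[1] += lam * p_hot
    # Each tier's aggregate request rate must stay below its service rate.
    return load[0] < service_rate[0] and load[1] < service_rate[1]
```

Note that a rejected access request is forced to have both probabilities equal to zero by the sum condition, which matches the statement that files not accepted for access are not scheduled.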
IV-B4 Latency Analysis
Latency is the sum of the time a file access request spends in the queue for service (waiting time) and the service time.
The users strictly prefer a low latency. Studies show that in internet application even s increase in the latency can significantly reduce the profit . The latency for file will inherently depend on the probabilistic scheduling decision , arrival rate , and the service rate of the storage . Given the same number of files with same sizes are being served, a file will spend less time for service because of the higher service rate of the hot storage compared to the cold storage. Thus the latency will be shorter in the hot storage. However, a user may have to pay more for accessing. In the following, we provide the expression for the expected latency of a file. Before that, we introduce a notation which we use throughout.
Let denote the expected latency for file request at scenario .
Let denote the waiting time at storage for scenario . Recall that denotes the probability with which file will be fetched from storage in scenario . Hence, the expected waiting time for file at scenario is . Recall that is the service rate in Mb/s for storage . Since the size of the file is and the probability that the request for file will be sent to storage at scenario is , the expected service time for file in scenario is
From Definition 1 we have
The next result characterizes .
The mean waiting time at storage for scenario , is given as follows.
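The expression itself is missing from this copy. For orientation only, the standard Pollaczek-Khinchine mean waiting time for an M/G/1 queue is written out below; it is not necessarily the exact form used by the authors. All symbols are assumptions introduced here: $\lambda_i^k$ is the arrival rate of file $i$ in scenario $k$, $\pi_{i,j}^k$ the probability of fetching file $i$ from storage $j$, $S_i$ the file size, $\mu_j$ the service rate of storage $j$, and $X_j^k$ the service time of a request at storage $j$. The second line further assumes that a request for file $i$ at storage $j$ takes the deterministic time $S_i/\mu_j$.

```latex
% Pollaczek-Khinchine formula with aggregate arrival rate \Lambda_j^k
W_j^k = \frac{\Lambda_j^k \, \mathbb{E}\big[(X_j^k)^2\big]}{2\,(1 - \rho_j^k)},
\qquad
\Lambda_j^k = \sum_i \lambda_i^k \pi_{i,j}^k,
\qquad
\rho_j^k = \Lambda_j^k \, \mathbb{E}\big[X_j^k\big] < 1 .

% Under the deterministic per-file service time assumption X = S_i / \mu_j:
W_j^k = \frac{\sum_i \lambda_i^k \pi_{i,j}^k \, S_i^2}
             {2\,\mu_j^2 \left(1 - \sum_i \lambda_i^k \pi_{i,j}^k \, S_i / \mu_j\right)} .
```

The denominator shows why the waiting time grows sharply as the load routed to a tier approaches its service capacity.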
In order to simplify notations, we introduce three auxiliary functions: , , and .
Note that by differentiating twice one can easily discern that is convex in each . However, is jointly non-convex in . This is because of the terms , which is not jointly convex in and .
Note that the latency depends on the file size : if the file size is large, the latency will be large. Thus, it shows that for the same access bid the files of smaller sizes will be preferred (given that its latency requirement is satisfied) as it will allow the CSP to accept more access requests. Also note that if is large for some , then the latency again increases, hence, the latency of storage facility increases if too many requests are directed towards . Thus, the CSP has to judiciously select . If a large number of requests are directed towards the hot storage, the latency requirement may not be satisfied which may decrease the CSP’s profit. Also note that if the file is not accepted for accessing. Recall that the latency requirement for file in scenario is . Hence, we must have
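A small numerical sketch can make the dependence of latency on file size, arrival rate, and scheduling probabilities concrete. It follows the decomposition stated above (expected waiting time plus expected service time, each weighted by the scheduling probabilities) and uses the standard M/G/1 Pollaczek-Khinchine waiting time. The deterministic per-file service time `size / service_rate`, like every name in the code, is an assumption of this example rather than the paper's stated model.

```python
def expected_latency(sizes, arrival, pi, service_rate):
    """Per-file expected latency for one scenario.

    pi[i][j] is the probability that file i is fetched from tier j.
    """
    n_tiers = len(service_rate)
    waits = []
    for j in range(n_tiers):
        mu = service_rate[j]
        files = list(zip(sizes, arrival, pi))
        # Utilisation of tier j: sum_i lambda_i * pi_ij * (S_i / mu_j).
        rho = sum(lam * p[j] * s / mu for s, lam, p in files)
        if rho >= 1.0:
            raise ValueError(f"tier {j} is overloaded (utilisation {rho:.3f})")
        # Aggregate rate times second moment of the service time.
        second = sum(lam * p[j] * (s / mu) ** 2 for s, lam, p in files)
        waits.append(second / (2.0 * (1.0 - rho)))
    return [
        sum(p[j] * (waits[j] + s / service_rate[j]) for j in range(n_tiers))
        for s, p in zip(sizes, pi)
    ]


def meets_requirements(latency, requirement):
    """Compare each file's expected latency with its latency requirement."""
    return [lat <= req for lat, req in zip(latency, requirement)]
```

Increasing a file's size or routing more traffic to one tier raises both the service term and, through the utilisation, the waiting term, which is the effect described in the paragraph above.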
IV-B5 Second Stage Optimization Problem
The second stage profit of the CSP if the scenario is realized is given by
Recall that is the access bid for file in scenario . Hence, the second stage optimization problem if scenario is realized is given by
Note that if a user bids high for access, but its file is large or the arrival rate is high, then the latency (11) may increase and the CSP will lose profit, as it may satisfy only a few of the latency requirements. Problem (P2) is an integer nonlinear program, which is nontrivial to solve. Hence, it is not a priori clear how the CSP should select the access bids.
IV-C Deterministic Equivalent Program
Now, we formally formulate the first-stage stochastic program. Let be the bid price of file for storage. Let and denote the total cost incurred by the service provider for storing a file in the hot and cold storage, respectively. Recall that (, resp.) denotes that the storage is cold (hot, resp.). Hence, the profit obtained by the CSP for storage is
Since the second-stage decision variables inherently depend on the first-stage decisions and the CSP wants to maximize the total profit, the CSP needs to consider the second-stage decisions while making the first-stage decisions. Hence, the first-stage decision problem differs from the standard knapsack problem.
Note that the CSP knows that the access bid price for file in scenario is . Recall that the probability with which scenario is generated is . Hence, the expected profit from the second stage decision is
is the total number of slots where the access auctions are run. In the first stage, the CSP wants to maximize the total expected profit. However, the expected profit also depends on the second stage decision variables. Therefore, we should find and for each possible scenario. We formulate the first-stage decision problem as the so-called deterministic equivalent program  in the following:
Note that the constraints in (20)-(25) are for the second stage decisions. Also note that though we solve for and , the decision variables are of interest in the first stage, which are and . After and are decided, the optimization problem (P2) is solved if scenario is realized. In the deterministic equivalent program, the number of scenarios may be very large which increases the number of constraints and the decision space. One remedy is to discard those scenarios which occur with very low probability.
The CSP is unaware of the specific scenario in the first stage. Thus, we consider that the CSP decides whether to accept a bid, and whether to store the file in the cold or the hot storage, while maximizing the expected revenue over all the scenarios that can arise in the second stage. Note that in the second stage, the CSP is aware of the bids of the users. Thus, the CSP is aware of the specific scenario while making the second-stage decision.
Problems (P1) and (P2) are non-convex.
First, the decision variables , and are binary, which makes the problem non-convex. Second, (cf. (11)) has the term , which is not jointly convex in and . ∎
Hence, standard convex optimization solvers such as CVX, MOSEK or integer linear programming optimization solvers such as CPLEX cannot be used.
Problem (P1) is the first stage problem and (P2) is the second stage problem. Note that while solving the first stage problem (P1) the CSP needs to consider the second stage parameters– the access bids, the arrival rates, and the latency requirements. This is because the second stage optimal decision (and thus, the optimal profit) inherently depends on the first stage decisions. Thus, the CSP needs to consider the second stage decision while taking the first stage optimal decisions. We consider a scenario based approach where we optimize the expected profit over all the scenarios and obtain the first stage decision variables. In the second stage, the CSP optimizes (P2) for a specific scenario which has been realized. Note that while solving the second stage problem, the first stage decisions are known.
V Solution Methodology
V-A Discussion of Computational Complexity
We now demonstrate the computational complexity of the original problem with an example. Suppose we have 10 users and 1 time period with 3 scenarios, so that there are 50 binary decision variables (10, 10, 10, 10 and 10). The total number of branches in a decision tree is 2^50 (about 1.1 × 10^15), which increases exponentially with the number of users, time periods and scenarios. Thus, the problem scale is very large even with a small number of users and scenarios. In addition, the latency constraint is nonlinear, which makes our problem even harder, as we cannot use MILP solvers.
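The growth of the search space in this example can be checked with a few lines of Python. The split of the 50 binary variables into two first-stage variables per file and one second-stage variable per file and scenario is our assumption for illustration; only the totals (10 users, 3 scenarios, 50 variables) come from the example above.

```python
def count_binary_variables(users, scenarios, slots=1):
    # Assumed split: two first-stage binaries per file (acceptance and
    # placement) and one second-stage binary per file, slot and scenario.
    first_stage = 2 * users
    second_stage = users * slots * scenarios
    return first_stage + second_stage


def count_branches(num_binary_variables):
    # A full enumeration tree over n binary variables has 2**n leaves.
    return 2 ** num_binary_variables


total_vars = count_binary_variables(users=10, scenarios=3)
branches = count_branches(total_vars)
branches_20_users = count_branches(count_binary_variables(users=20, scenarios=3))
```

Doubling the number of users from 10 to 20 doubles the number of variables and therefore squares the number of leaves, which is why exhaustive enumeration is impractical even for small instances.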
V-B Integer Relaxation
Problems (P1) and (P2) are non-convex as the variables and are binary. If we relax the binary constraints, the rest of the problem will still be non-convex as the latency function ( cf.(11)) is still jointly non-convex in . However, if we relax the integer constraints, then the objective function and the constraints become differentiable. We can then use a solver such as CONOPT  to find a locally optimal solution. CONOPT is generally used for smooth continuous functions; it finds a solution that satisfies the KKT conditions. If the gradient is nonlinear, it is approximated via a Taylor series up to the first-order term.
However, if we relax the integer constraint, the solution may not be integer, but rather a value in the interval (0, 1). To eliminate those solutions, we need to add a penalty function which imposes a high penalty ( for optimality) when the solution is neither 0 nor 1, and zero penalty when the solution is indeed 0 or 1 (Fig. 3).
The sigmoid function is an S-shaped function which can closely approximate the step function . Since we have to put zero penalty when the solution is 0 or 1 and a high penalty when it is in between, we consider the following function
Fig. 3 shows that becomes close to at the value . Fig. 3 also shows that for , closely matches the penalty function that we desire. The function shifts the penalty function to have zero value on the desired extremes and a negative value in the desired range. The next result shows that .
The value of function is zero for and , or .
Thus, we note that as the function gives penalty when is or . For any , as . Further, even for finite , we note that for . Thus, for large , this function will match the ideal penalty function. Note that we do not lose any differentiability property, as is differentiable. With this penalty function our problem reduces as follows.
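To make the shape of such a penalty concrete, the following Python sketch builds one possible sigmoid-based penalty: a difference of two shifted sigmoids that is close to zero at 0 and 1 and close to -1 in between. The specific form, the steepness `alpha` and the offset `delta` are hypothetical choices for illustration and are not necessarily the function used in this paper.

```python
import math


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def penalty(x, alpha=200.0, delta=0.05):
    """Illustrative penalty on a relaxed binary variable x in [0, 1].

    The first sigmoid switches on just below 1 and the second just above 0,
    so their difference is near 0 at the end points and near -1 inside.
    """
    return sigmoid(alpha * (x - (1.0 - delta))) - sigmoid(alpha * (x - delta))


values = {x: penalty(x) for x in (0.0, 0.25, 0.5, 0.75, 1.0)}
```

Since the objective is maximized, a fractional value is discouraged by the negative penalty once it is scaled by a sufficiently large weight, while the function stays differentiable everywhere. A larger `alpha` makes the transition near the end points sharper.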
is the weight corresponding to the penalty functions. Note that the solution will be integer if . and are decided by solving (P3). After and are solved for a given realization , the second stage decisions are taken. In the second stage, the following optimization problem is solved
The decision variables are and . Note that we do not need to decide and in the second stage; hence, we do not need constraints (1)-(3). Note that the solution will be optimal and integer if . Since the problem is non-convex, we cannot guarantee that the solution obtained by CONOPT will be optimal. However, we can infer the following if we find an optimal solution.
The optimal solution of the relaxed problem (i.e. (P3), (P4)) is also the optimal solution of the original problem (i.e. (P1), (P2)) as .
V-C Feasible solution from the relaxed problem
When is , both the first-stage and second-stage decision solutions , and will be integers. If will match the ideal penalty function. However, in practice, neither nor can be set at . Hence, we may find a solution which is not feasible, i.e. it is not either or . Note that setting alone to the very high value will not make the solution integer. One also has to make high to give larger penalty to the fractional solution. However, has to be larger for smaller . In the following, we discuss how to find the feasible solution for finite and .
Also note that if is very high, in an optimal solution the solution will only be away from the integral solution by a nominal amount. One can then convert the non-integral solution of either to the nearest integer. However, the above does not guarantee that the capacity constraints or the latency requirements will be satisfied. For example, consider that in a solution of the relaxed problem , where is very small, and is the nearest integer solution to the relaxed problem. However, if , then which violates the constraint in (1). Thus, simply converting the solution of the relaxed problem to the nearest integer may not give a feasible solution. However, in the following, we provide a strategy which guarantees that even if the solution of the relaxed problem is converted to the nearest integer, it will not violate the original constraints.
For every and , there exists an such that if and , such that if the solution of the relaxed problem (i.e., (P3), (P4)) is converted to the nearest integer (if the value is , it will be converted to ) then they will be feasible solution of the original problem (i.e., (P1), (P2)).
Intuitively, if we make and , we solve a restricted problem. Thus, even when we convert the non-integer solutions of the relaxed problem to the nearest integers we will not violate the original constraints. Note that if is very large, we need a very small as the solutions of the relaxed problem and will be close to the integers. As , . is also larger if is low. In our numerical results, we set as , as , and as which gives the feasible solutions as mentioned in the above proposition.
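The effect of rounding, and of solving a restricted problem with reduced capacity, can be illustrated with a small Python sketch. All numbers below (file sizes, capacity, shrink factor `theta`) and the hand-picked relaxed solutions are hypothetical; the sketch only checks a single capacity constraint and does not solve the relaxed problem itself.

```python
def used_capacity(sizes_mb, x):
    """Storage consumed when file i is stored with (possibly fractional) x[i]."""
    return sum(s * xi for s, xi in zip(sizes_mb, x))


def round_to_nearest(x):
    """Round each relaxed indicator to 0 or 1 (a tie at 0.5 goes to 0, an assumption)."""
    return [1 if xi > 0.5 else 0 for xi in x]


sizes_mb = [100] * 10   # ten hypothetical files of 100 MB each
capacity_mb = 995       # hypothetical original capacity
theta = 0.1             # hypothetical shrink factor of the restricted problem

# A relaxed solution that satisfies the original capacity constraint
# (about 990 MB used) but violates it after rounding (1000 MB used).
x_relaxed = [0.99] * 10
fits_before = used_capacity(sizes_mb, x_relaxed) <= capacity_mb
fits_after = used_capacity(sizes_mb, round_to_nearest(x_relaxed)) <= capacity_mb

# A relaxed solution that satisfies the shrunken capacity (895.5 MB)
# still satisfies the original capacity after rounding (900 MB used).
restricted_capacity_mb = (1 - theta) * capacity_mb
x_restricted = [0.99] * 9 + [0.0]
fits_restricted = used_capacity(sizes_mb, x_restricted) <= restricted_capacity_mb
fits_restricted_rounded = (
    used_capacity(sizes_mb, round_to_nearest(x_restricted)) <= capacity_mb
)
```

The slack introduced by the shrunken capacity absorbs the small increase caused by rounding values slightly below 1 up to 1; the closer the relaxed values are to integers, the smaller the required slack.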
VI Numerical Studies
VI-A Simulation Setting
To validate our proposed policy and evaluate its performance, we conduct the following numerical studies. Unless stated otherwise, we consider a setting where there are 1,000 files, and the number of slots for the second-stage auction is . The capacities of cold and hot storage are 400 and 200 GB respectively. We consider five types of files: , , , and , which are of sizes 64, 128, 256, 512 and 1024 MB respectively. In the first stage, customers bid for storage. The bidding price for storage per MB is a random variable, uniformly distributed with a mean of 0.2 cents. Thus, the bid for file is distributed as . For example, if there is a 64 MB file and the realized price is 0.25 cents per MB (note that Amazon quotes its storage price per GB), the bidding price to store this file is 64 × 0.25 = 16 cents.
We consider different scenarios for the second-stage parameter. Specifically, we generate different instances of access request arrival rates, access bids, and the latency requirements. We consider that is generated independently according to the mean and per hour for the file sizes of 64, 128, 256, 512 and 1024 MB respectively. This is in accordance with the practice as the smaller size files are accessed more frequently. We assume that the latency requirements are related to the file sizes. Specifically, we generate independently according to the distribution in milliseconds.  shows that the utility is in general convex in the latency and concave in the arrival rates. In this paper, we consider that . The parameters are described in Table I. After generating the scenarios we compute the empirical distribution to find the number of times a scenario (prob. ) occurs out of the events. Then scenario is randomly generated among the scenarios where -th scenario occurs with probability .
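The scenario-generation step described above can be sketched in Python as follows. The two-level scenario labels and the sample count are hypothetical placeholders; only the procedure (count how often each scenario occurs, normalize to an empirical distribution, then draw the realized scenario from it) follows the text.

```python
import random
from collections import Counter


def empirical_distribution(samples):
    """Map each distinct scenario to its relative frequency in the samples."""
    counts = Counter(samples)
    total = len(samples)
    return {scenario: count / total for scenario, count in counts.items()}


def draw_scenario(distribution, rng):
    """Draw one scenario according to the empirical probabilities."""
    scenarios = list(distribution)
    weights = [distribution[s] for s in scenarios]
    return rng.choices(scenarios, weights=weights, k=1)[0]


rng = random.Random(0)
# Hypothetical scenario label: (arrival-rate level, latency-requirement level).
samples = [
    (rng.choice(["low", "high"]), rng.choice(["tight", "loose"]))
    for _ in range(1000)
]
distribution = empirical_distribution(samples)
realized = draw_scenario(distribution, rng)
```

Scenarios whose empirical probability is very low could be dropped from `distribution` (followed by renormalization) to keep the deterministic equivalent program small.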
Based on the above specifications, we compare the performance of the proposed method (PM) with three other methods, which are described as follows. We consider as and as in (P3) and (P4). The factor by which we reduce and is chosen to be . The solutions obtained by the relaxed problem and the proposed method are almost the same. Thus, we do not show the solution of the relaxed problem.
IS: Problem with Two Independent Stages:
Solve the first-stage problem without considering the second-stage recourse decisions to get the first-stage solution and for each .
Given the first-stage solution, solve for the realized scenario, i.e., solve the second-stage optimization problem (P2). We again solve the relaxed version (P4) and then find the optimal solution according to Proposition 4.2, as described in our proposed method.
GH I: Greedy Heuristic Based On :
In the second stage, we sort the bids based on in the descending order. We keep accepting bids according to the sorted order as long as the realized latency requirements are met.
GH II: Greedy Heuristic Based On :
In the second stage, we sort the bids based on in the descending order if scenario is realized. We keep accepting bids according to the sorted order as long as the realized latency requirements are met.
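The two greedy baselines can be sketched with one routine that differs only in the sorting key. In the Python sketch below, the field names, the example bids and the feasibility check `load_within_budget` are hypothetical; in particular, the check is a simple stand-in for the latency constraint, not the latency expression (11). The sorting keys (bid per unit of size and bid per unit of arrival rate) are our reading of the two heuristics.

```python
def greedy_accept(bids, score, latency_ok):
    """Accept bids in descending score order until the latency check first fails."""
    accepted = []
    for bid in sorted(bids, key=score, reverse=True):
        if not latency_ok(accepted + [bid]):
            break
        accepted.append(bid)
    return accepted


def load_within_budget(selected, budget=600.0):
    # Hypothetical stand-in for the latency constraint: the total offered
    # load (arrival rate times file size) must stay within a fixed budget.
    return sum(b["rate"] * b["size_mb"] for b in selected) <= budget


bids = [
    {"id": "a", "bid": 8.0, "size_mb": 64, "rate": 4.0},
    {"id": "b", "bid": 20.0, "size_mb": 256, "rate": 1.0},
    {"id": "c", "bid": 30.0, "size_mb": 1024, "rate": 0.5},
]

# GH I: prefer bids that pay more per unit of file size.
gh1 = greedy_accept(bids, lambda b: b["bid"] / b["size_mb"], load_within_budget)
# GH II: prefer bids that pay more per unit of access request rate.
gh2 = greedy_accept(bids, lambda b: b["bid"] / b["rate"], load_within_budget)

gh1_ids = [b["id"] for b in gh1]
gh2_ids = [b["id"] for b in gh2]
```

With these illustrative bids, the two orderings accept different sets, which shows how the choice of sorting key alone can change the second-stage profit.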
Profit in each algorithm is considered to be the sum of the first-stage and second-stage profits. Note that none of the above-mentioned baseline algorithms solves the first-stage decision problem by considering the second-stage recourse decision. Algorithm IS solves the second-stage decision problem given the solution of the first-stage decision. However, GH I and GH II are greedy heuristics which accept bids according to simple ordering rules in order to lower the complexity of finding the optimal solution of the second-stage decision problem. Intuitively, recall from (11) that the latency of a file in scenario inherently depends on the access request arrival rates and file size. Specifically, the latency increases as the file size increases or the access request arrival rate increases. Hence, the CSP should prefer the bids which give more profit per unit of size and per unit of access request arrival rate. GH I greedily prefers the bids which pay more per unit of size. On the other hand, GH II strictly prefers the bids which pay more per unit of access request rate. Before discussing the results, we introduce a notation which we use throughout this section.
The above metric shows how many bids are accepted in the second stage among the bids that are accepted in the first stage. This gives an idea of the fairness of the process. For each result, each algorithm is run times and the profit and ARAR are averaged over these runs.
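A minimal Python sketch of this metric and of the averaging over runs is given below. It assumes that the ARAR is the number of access bids accepted in the second stage divided by the number of files accepted for storage in the first stage, which is our reading of the description above; the run counts are hypothetical.

```python
def arar(num_access_accepted, num_storage_accepted):
    """Access request acceptance rate for one run (assumed ratio of counts)."""
    if num_storage_accepted == 0:
        return 0.0  # convention chosen here for the degenerate case
    return num_access_accepted / num_storage_accepted


def average_arar(runs):
    """Average the per-run ARAR over (access accepted, storage accepted) pairs."""
    return sum(arar(a, s) for a, s in runs) / len(runs)


# Hypothetical counts from three runs: per-run ARAR of 0.3, 0.5 and 0.25.
runs = [(30, 100), (45, 90), (20, 80)]
mean_arar = average_arar(runs)
```

Under this definition the ARAR can rise either because more access bids are accepted or because fewer files are accepted for storage, which matters when interpreting the trends in the following subsections.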
VI-B Impact of Storage Capacity
To demonstrate the effectiveness of our proposed heuristic, we fix the hot storage capacity at 200 GB and vary the capacity of cold storage () from 300 GB to 800 GB in steps of 20 GB, and plot the total profits and the access request acceptance rate (cf. (VI-A)) obtained by the different methods. Fig. 4(a) shows that as the capacity of cold storage ( or ratio ) increases, the profits obtained from all the algorithms except IS increase; however, the rate of increase decreases with the increase in . Fig. 4(d) provides the reason behind this variation. As increases, more files can be stored in cold storage, which increases profits. However, if is large enough, no more files can be stored; thus, the profit saturates. Fig. 4(c) shows that because of the lower service rate of the cold storage, the profit from accessing files does not increase with the capacity of cold storage. This is because files may be stored but cannot be accessed without violating the latency constraint. Thus, increasing the cold storage capacity without increasing the hot storage capacity will not yield more profit beyond a certain threshold. Note from Fig. 4(a) that the profit achieved by Algorithm IS increases initially, then decreases, and again increases as increases. Intuitively, Algorithm IS does not consider the second-stage decision variables in its first-stage decision. Hence, more files are stored in the cold storage as it has a lower cost. However, as almost all the files are stored in the cold storage, the files cannot be accessed fast enough, so the profit from accepting the access bids does not increase. Similarly, the profits earned by Algorithms GH I and GH II do not increase much as increases, as they do not consider the second-stage recourse decisions in the first stage. Also note that when is large, our algorithm outperforms the other baseline algorithms by 50%. This shows the virtue of considering the second-stage recourse decision in the first-stage decision.
From the results in Fig. 4(c), the Access Req. Acceptance Rate (ARAR, cf.(VI-A)) decreases as the capacity of cold storage increases. This is because increasing the capacity of cold storage increases the number of files accepted for storage, whereas the limited service rate of the cold storage limits the number of files accepted for access (which is also verified by Fig. 4(d)). Consequently, the ARAR decreases. Note that the ARAR corresponding to Algorithm IS is higher compared to our proposed method when is low, as the number of files stored by IS differs from that stored by our proposed method.
VI-C Impact of Service Rate of Hot Storage
In this subsection, we assume that the service rate of cold storage is 100 Gb/s, and the service rate of hot storage is varied from 100 to 2500 Gb/s in steps of 100 Gb/s. Fig. 5(a) shows that the profit increases as the service rate of hot storage increases. This is because more access requests can be served, so the number of accepted bids increases (Fig. 5(d)). Our proposed method outperforms the other methods. In fact, the profit can be increased by 100% compared to the other methods when the service rate is high.
Note from Fig. 5(b) that the profit from the second-stage auction increases significantly in our proposed method. However, the profit from storing files does not increase much. This is because when the service rate is low, mostly files that have lower access request rates or lower latency requirements (but can pay more) are accepted. As Fig. 5(d) suggests, when the service rate is high, more files are stored; however, the number of accepted access bids increases significantly. This suggests that when the service rate is high, files which bid lower prices for storage, but can still pay more because of their high access rates, are accepted for storing. Hence, the profit from storing the files remains constant as the storage capacities remain constant, whereas the profit from accepting access bids increases.
Fig. 5(c) shows that the ARAR increases with the service rate of the hot storage for all the algorithms. When the service rate is high, more access bids are accepted, while the number of bids accepted for storage remains the same (Fig. 5(d)). Hence, the ARAR is high (cf.(VI-A)). When the service rate is low, mostly files that have lower access request rates are stored. However, IS can still store files which have higher access rates if they pay more, because it solves the two stages independently. Hence, IS can achieve a higher ARAR in this case. However, when the service rate of the hot storage exceeds a threshold, the ARAR attained by our proposed method is the highest. The ARAR attained by the greedy heuristics GH I and GH II is strictly lower than that of our proposed method.
VI-D Impact of Storage Cost of Hot Storage
In this subsection, we assume that the storage cost of cold storage is cents per GB, and the storage cost of hot storage is varied from to cents per GB in steps of cents per GB. Fig. 6(a) shows that as the hot storage cost increases, the profit decreases. This is because most of the files are stored in the cold storage, which decreases the profit as fewer access requests are accepted; this is also verified by Fig. 6(d). Our proposed method outperforms the baseline algorithms by more than 60% when the storage cost is neither too high nor too low. When the hot storage cost is too low, more files are stored in the hot storage in the first stage and thus more profit can be attained in the second stage by accepting more access bids. Hence, the profit attained by IS is close to that of our proposed method when the hot storage cost is low. Note that the profit attained by IS is also very close to that of our proposed method when the hot storage cost is high. This is because IS inherently stores more files in the cold storage in the first stage. Since the greedy algorithms GH I and GH II do not optimize the second-stage decision, the profits attained by those are slightly lower than that of IS.
Note from Fig. 6(b) that the profit in our proposed method from accessing the files decreases with an increase in the cost of hot storage, as more files are stored in the cold storage. Though the overall profit decreases, the profit from storage increases. This is because the files which can pay more but do not have low latency requirements can be stored in the cold storage.
We also plot the impact of the storage cost of hot storage on the ARAR in Fig. 6(c). One interesting trend is that the ARAR obtained from the proposed method first increases and then decreases until it is close to the one obtained from IS, while the others decrease and then stabilize as the cost increases. This is because we combine the two-stage decision process; thus, when the cost is moderate, the number of files that are stored decreases without decreasing the number of accepted access bids, as shown in Fig. 6(d). Hence, the denominator in (VI-A) decreases, which increases the ARAR. Note that once the hot storage cost is very high, the number of accepted bids also decreases, as the latency requirements may not be met because too few files are stored in the expensive hot storage. Hence, the ARAR decreases at very high cost. On the other hand, in the other algorithms, as the cost of the hot storage increases, very few files are stored in the hot storage; thus, very few files can be accessed, which decreases the ARAR. However, if the cost is too high, no more files can be stored in the hot storage, which makes the ARAR constant.
VI-E Impact of Storage Cost of Cold Storage
In this subsection, we assume that the storage cost of hot storage is cents per GB, and vary the storage cost of cold storage from cents to cents per GB in steps of cents. Fig. 7(a) shows that as the cold storage cost increases, the profit attained by our proposed method decreases. Fig. 7(d) shows that when the cost of the cold storage increases, fewer files are stored. Hence, the profit from storage decreases (Fig. 7(b)). The profit from accepting the access bids remains the same, as the number of files stored in the hot storage remains almost the same. Note from Fig. 7(a) that the profits gained from GH I and GH II decrease first and then increase dramatically when the cold storage cost is cents per GB (i.e., the hot storage cost). The main reason is that beyond this point the hot storage has both a lower storage cost and a higher service rate, which means that files are preferentially stored in the hot storage rather than the cold storage. Thus, profits from accepting the access bids increase drastically for those greedy heuristics and become close to the optimal.
Note that as the cold storage cost exceeds the hot storage cost, fewer files are stored. However, the number of accepted access bids remains the same, as depicted in Fig. 7(d). Hence, the denominator of (VI-A) decreases without decreasing the numerator. Thus, the ARAR increases. Note that IS does better in terms of ARAR. The greedy heuristic GH I also gives a higher ARAR when the cold storage cost exceeds the hot storage cost. This is because in the first-stage decision the larger files are now mostly stored in the hot storage rather than the cold storage, as they pay more. However, the total number of files accepted for cold and hot storage decreases. The larger files are also likely to bid higher in the second stage; thus, the ratio in (VI-A) increases for GH I, as it accepts bids in the descending order of the bids per size. However, GH II accepts bids in a different manner; hence, the ARAR attained by GH II is strictly lower.
VII Conclusions and Future Work
To decide whether files should be stored in the cold storage or the hot storage, this paper proposes a systematic framework for two-stage, latency-dependent bidding, which aims to maximize the cloud storage provider’s net profit in tiered cloud storage systems where tenants may have different budgets, access patterns and performance requirements. In the proposed two-stage, latency-aware bidding mechanism, the users can bid for storage and access, in two separate stages, without knowing how the CSP stores the contents. The proposed optimization is modeled as a mixed-integer nonlinear program (MINLP), for which an efficient heuristic is proposed. The numerical results demonstrate that the profits obtained from the proposed method are higher than those of other methods, and the access request acceptance rate (ARAR) also dominates that of other methods as the capacity of the cold storage or the service rate of hot storage increases.
In reality, the users may be strategic. In other words, a user may optimize its bid in order to maximize its own profit. Our model captures some essence of strategic users. A user’s maximum possible bid will be the price that it would pay if it selected another CSP for the same guaranteed latency requirement. Thus, our approach can be easily extended to this scenario by considering an upper limit on the bid of a user. We, however, did not consider the full essence. For example, the users may not bid truthfully. We will consider such a scenario in future work.
Another interesting direction for the future is to extend the model for erasure coding storage system where multiple copies () of the files can be stored and a subset of those copies () are required to be fetched to get the original file. In that case, the CSP would need to select the storage systems to store the file and among those copies are needed to be fetched to get the original file.
The authors would like to thank Yu Xiang and Robin Chen of AT&T Labs-Research for helpful discussions. This work was supported in part by the National Science Foundation under Grant no. CNS-1618335.
Appendix A Moments of the service time of a file request
In this Appendix, we will derive the first and second moments of the service time of a file request at storage in -th scenario, which will be used further to prove Theorem 1. More precisely, we will show the following result.
, the service time of a file request at storage in -th scenario, has a distribution with mean
and second moment
The rest of the Section proves this result.
It is easy to verify that under our model, the arrival of file requests at storage in -th scenario forms a Poisson Process with rate , which is the superposition of Poisson Processes each with rate .
Let be the (random) requested file size at storage , which is a discrete random variable such that the probability of is . Let be the (random) service time of one MB at storage , which is exponentially distributed with mean . Then the expectation of the service time of a file request at storage is
and the associated second moment is
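The two moments can be checked numerically with the short Python sketch below. It assumes that the service time of a request is the product of the requested file size and an independent exponentially distributed per-unit service time with rate `mu`, and that a request is for a given file with probability proportional to that file's arrival rate at the storage; these modelling details, the function name and the inputs are our assumptions for illustration.

```python
def service_time_moments(sizes, rates, mu):
    """First and second moments of S = L * X under the stated assumptions.

    sizes : file sizes (in the size unit used for mu)
    rates : per-file request arrival rates at this storage
    mu    : service rate per size unit, so X is exponential with mean 1 / mu
    """
    total_rate = sum(rates)
    probs = [r / total_rate for r in rates]
    mean_size = sum(p * s for p, s in zip(probs, sizes))
    second_size = sum(p * s * s for p, s in zip(probs, sizes))
    # For an exponential X with rate mu: E[X] = 1 / mu and E[X^2] = 2 / mu^2.
    first_moment = mean_size / mu
    second_moment = 2.0 * second_size / mu ** 2
    return first_moment, second_moment


# Hypothetical example with two file types.
m1, m2 = service_time_moments(sizes=[64, 128], rates=[3.0, 1.0], mu=100.0)
```

When all files have the same size, the result reduces to the moments of an exponential service time, which gives a quick sanity check of the expressions.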
-  A. Luca and M. Bhide, Storage virtualization for dummies, Hitachi Data Systems Edition. John and Wiley Publishing, 2009.
-  Y. Xiang, T. Lan, V. Aggarwal, and R. Chen, “Joint latency and cost optimization for erasure-coded data center storage,” IEEE/ACM Trans. Netw, vol. 24, no. 4, pp. 2443–2457, 2016.
-  J. Guerra, H. Pucha, W. J. Glider, and R. Rangaswami, “Cost effective storage using extent based dynamic tiering,” in Proceedings of the 9th USENIX Conference on File and Storage Technologies. Usenix Association, 2011, pp. 20–20.
-  H. Kim, S. Seshadri, C. Dickey, and L. Chiu, “Evaluating phase change memory for enterprise storage systems: study of caching and tiering approaches,” in Proceedings of the 12th USENIX Conference on File and Storage Technologies. Santa Clara, CA, USA: Usenix Association, 2014, pp. 33–45.
-  Z. Li, A. Mukker, and E. Zadok, “On the importance of evaluating storage systems’ costs,” in Proceedings of the 6th USENIX Conference on Hot Topics in Storage and File Systems. Philadelphia PA USA: Usenix Association, 2014, pp. 6–6.
-  H. Wang and P. Varman, “Balancing fairness and efficiency in tiered storage systems with bottleneck-aware allocation,” in Proceedings of the 12th USENIX Conference on File and Storage Technologies. Santa Clara, CA, USA: Usenix Association, 2014, pp. 229–242.
-  O. Ben-Yehuda, M. Ben-Yehuda, A. Schuster, and D. Tsafrir, “Deconstructing amazon ec2 spot instance pricing,” in Proceedings of the IEEE 3rd International Conference on Cloud Computing Technology and Science. Athens, Greece: CloudCom 2011, 2011, pp. 304–311.
-  I. Drago, M. Mellia, M. Munafó, A. Sperotto, R. Sadre, and A. Pras, “Inside dropbox: understanding personal cloud storage services,” in Proceedings of the 12th ACM SIGCOMM Conference on Internet Measurement. Boston, MA, USA: IMC 12, 2012, pp. 481–494.
-  L. Youseff, M. Butrico, and D. D. Silva, “Toward a unified ontology of cloud computing,” in 2008 Grid Computing Environments Workshop, Nov 2008, pp. 1–10.
-  M. Naldi and L. Mastroeni, “Cloud storage pricing: A comparison of current practices,” in Proceedings of the 2013 International Workshop on Hot Topics in Cloud Services, ser. HotTopiCS ’13. New York, NY, USA: ACM, 2013, pp. 27–34. [Online]. Available: http://doi.acm.org/10.1145/2462307.2462315
-  Amazon, https://aws.amazon.com/s3/pricing/, 2017, accessed 8th Feb. 2017.
-  Dropbox, https://www.dropbox.com/business/pricing, 2017, accessed 8th Feb. 2017.
-  Google, https://cloud.google.com/storage/pricing, 2017, accessed 8th Feb. 2017.
-  H. Xu and B. Li, “Dynamic cloud pricing for revenue maximization,” IEEE Transactions on Cloud Computing, vol. 1, no. 2, pp. 158–171, July 2013.
-  Q. Wang, K. Ren, and X. Meng, “When cloud meets ebay: Towards effective pricing for cloud computing,” in 2012 Proceedings IEEE INFOCOM, March 2012, pp. 936–944.
-  H. Zhang, H. Jiang, B. Li, F. Liu, A. V. Vasilakos, and J. Liu, “A framework for truthful online auctions in cloud computing with heterogeneous user demands,” IEEE Transactions on Computers, vol. 65, no. 3, pp. 805–818, March 2016.
-  L. Zhang, Z. Li, and C. Wu, “Dynamic resource provisioning in cloud computing: A randomized auction approach,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, April 2014, pp. 433–441.
-  W. Shi, L. Zhang, C. Wu, Z. Li, and F. C. Lau, “An online auction framework for dynamic resource provisioning in cloud computing,” in The 2014 ACM International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS ’14. New York, NY, USA: ACM, 2014, pp. 71–83.
-  W. Y. Lin, G. Y. Lin, and H. Y. Wei, “Dynamic auction mechanism for cloud resource allocation,” in 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, May 2010, pp. 591–592.
-  C. Esposito, M. Ficco, F. Palmieri, and A. Castiglione, “Smart cloud storage service selection based on fuzzy logic, theory of evidence and game theory,” IEEE Transactions on Computers, vol. 65, no. 8, pp. 2348 – 2362, Aug. 2016.
-  R. Zhou, Z. Li, and C. Wu, “An online procurement auction for power demand response in storage-assisted smart grids,” in IEEE INFOCOM 2015- IEEE Conference on Computer Communications, April 2015, pp. 2641–2649.
-  Q. Wu, M. Zhou, Q. Zhu, and Y. Xia, “VCG auction-based dynamic pricing for multigranularity service composition,” IEEE Transactions on Automation Science and Engineering, vol. PP, no. 99, pp. 1–10, 2017.
-  Q. Ma, Y.-F. Liu, and J. Huang, “Time and location aware mobile data pricing,” IEEE Transactions on Mobile Computing, vol. 15, no. 10, pp. 2599–2613, October 2016.
-  H. R. Varian and C. Harris, “The VCG auction in theory and practice,” The American Economic Review, vol. 104, no. 5, pp. 442–445, 2014.
-  K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop distributed file system,” in 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST). Incline Village, NV, USA: IEEE, 2010, pp. 1–10.
-  D. Ongaro, S. M. Rumble, R. Stutsman, J. Ousterhout, and M. Rosenblum, “Fast crash recovery in RAMCloud,” in SOSP 2011 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, Cascais, Portugal, 2011, pp. 29–41.
-  S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” in SOSP 2003 Proceedings of the 19th ACM Symposium on Operating Systems Principles, New York, USA, 2003, pp. 29–43.
-  D. Niu, Z. Liu, B. Li, and S. Zhao, “Demand forecast and performance prediction in peer-assisted on-demand streaming systems,” in 2011 Proceedings IEEE INFOCOM Mini-Conference. Shanghai, China: IEEE, 2011, pp. 421–425.
-  G. Gürsun, M. Crovella, and I. Matta, “Describing and forecasting video access patterns,” in 2011 Proceedings IEEE INFOCOM Mini-Conference. Shanghai, China: IEEE, 2011, pp. 16–20.
-  G. Dán and N. Carlsson, “Dynamic content allocation for cloud-assisted service of periodic workloads,” in 2014 Proceedings IEEE INFOCOM. Toronto, ON, Canada: IEEE, 2014, pp. 853–861.
-  K. Gardner, S. Zbarsky, S. Doroudi, M. Harchol-Balter, E. Hyytiä, and A. Scheller-Wolf, “Reducing latency via redundant requests: Exact analysis,” in Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, Portland, OR, USA, 2015, pp. 347–360.
-  G. Joshi, E. Soljanin, and G. Wornell, “Efficient replication of queued tasks for latency reduction in cloud systems,” in 2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton, USA, 2015, pp. 107–114.
-  B. Haeupler, V. S. Mirrokni, and M. Zadimoghaddam, “Online stochastic weighted matching: improved approximation algorithms,” WINE 2011: Internet and Network Economics, pp. 170–181, 2011.
-  P. Giaccone, B. Prabhakar, and D. Shah, “Randomized scheduling algorithms for high-aggregate bandwidth switches,” IEEE Journal on Selected Areas in Communications, vol. 21, no. 4, pp. 546–559, May 2003.
-  N. Shalom, “Amazon found every 100ms of latency cost them 1% in sales.” http://blog.gigaspaces.com/amazon-found-every-100ms-of-latency-cost-them-1-in-sales/, 2008, accessed 8th Feb. 2017.
-  W. Chan, T.-C. Lu, and R.-J. Chen, “Pollaczek-Khinchin formula for the M/G/1 queue in discrete time with vacations,” IEE Proceedings-Computers and Digital Techniques, vol. 144, no. 4, pp. 222–226, 1997.
-  R. J.-B. Wets, “Stochastic programs with fixed recourse: The equivalent deterministic program,” SIAM Review, vol. 16, no. 3, pp. 309–339, 1974.
-  A. S. Drud, “CONOPT: A large-scale GRG code,” ORSA Journal on Computing, vol. 6, no. 2, pp. 207–216, 1994.
-  W. Chen and L. Sha, “An energy-aware data-centric generic utility based approach in wireless sensor networks,” in Proceedings of the Third International Symposium on Information Processing in Sensor Networks. Berkeley, California, USA: IPSN, 2004, pp. 215–224.