# Optimal Energy-Efficient Policies for Data Centers through Sensitivity-Based Optimization

## Abstract

In this paper, we propose a novel dynamic decision method that applies the sensitivity-based optimization theory to find the optimal energy-efficient policy of a data center with two groups of heterogeneous servers. Servers in Group 1 always work at high energy consumption, while servers in Group 2 may either work at high energy consumption or sleep at low energy consumption. An energy-efficient control policy dynamically determines the switching of servers in Group 2 between the work and sleep states. Since servers in Group 1 always work and serve jobs with high priority, a transfer rule is proposed to migrate jobs in Group 2 to idle servers in Group 1. To find the optimal energy-efficient policy, we set up a policy-based Poisson equation and provide an explicit expression for its unique solution of performance potentials by means of the RG-factorization. Based on this, we characterize the monotonicity and optimality of the long-run average profit with respect to the policies under different service prices. We prove that a bang-bang control is always optimal for this optimization problem, i.e., we should either keep all servers asleep or turn on servers so that the number of working servers equals the number of waiting jobs in Group 2. As an easily implementable policy form, we further study the threshold-type policy and obtain a necessary condition of the optimal threshold policy. We hope the methodology and results derived in this paper shed light on the study of more general energy-efficient data centers.

Keywords: Queueing; Data center; Energy-efficient policies; Sensitivity-based optimization; Markov decision process.

## 1 Introduction

Data centers have become a core part of the IT infrastructure for Internet services. Typically, hundreds of thousands of servers are deployed in a data center to provide ubiquitous computing environments. Tremendous energy consumption has become a significant operating expense of data centers. In 2014, the electricity consumption of data centers in the USA was estimated at 70 billion kWh, accounting for about 2% of the national electricity consumption [43]. The data centers in the USA are expected to consume 140 TWh of energy and spend $13 billion on energy bills by 2020 [37], while these figures in Europe will reach 104 TWh and $9.6 billion [7]. The energy consumption of data centers consists of three main parts: servers, networks and cooling, of which servers are the largest. It is estimated that servers consume around 70% of the total energy in a data center with tiered architectures [26]. On the other hand, reducing the energy consumption of servers also helps reduce the energy consumption of networking and cooling. Therefore, energy-efficient scheduling of servers is significant for the energy management of data centers.

During the last two decades, considerable attention has been paid to studying the energy efficiency of data centers. An early interesting observation by Barroso and Hölzle [1] demonstrated that many data centers were designed to handle peak loads effectively, which directly caused a significant number of servers (about 20%) to be idle in off-peak periods. Although the idle servers do not provide any service, they still consume a notable amount of energy. Therefore, it is necessary to design an energy-efficient mechanism for effectively saving the energy of idle servers. Previous studies suggest that the potential power savings could be as large as 40% [4]. For this purpose, a key technique, the energy-efficient state ‘sleep’ or ‘off’, was introduced to save energy for idle servers. See Gandhi et al. [14] and Kuehn and Mashaly [28] for further interpretations. In this setting, queueing models either with server energy-efficient states (e.g., work, idle, sleep, and off) or with server control policies (e.g., vacation, setup, and N-policy) were developed in the study of energy-efficient data centers. Queueing theory and Markov (reward or decision) processes have become two useful mathematical tools for analyzing energy-efficient data centers. See Gandhi [9] and Li et al. [31] for more details.

A number of studies have applied queueing theory and Markov processes to the performance analysis and optimization of energy-efficient data centers. Important examples in the recent literature are as follows. Gandhi et al. [11] considered a data center with multiple identical servers, the states of which include work, idle, sleep and off, and whose energy consumption has a decreasing order. One crucial technique given in Gandhi et al. [11, 13] was to develop some interesting queueing models, for example, the M/M/k queue with setup times. Since then, some multi-server queues have received attention (for example, queues with server vacations, and queues with either local setup times or N-policy), and they were successfully applied to energy-efficient management of data centers. Readers may refer to recent publications for more details, among which are Mazzucco et al. [34], Schwartz et al. [42], Gandhi and Harchol-Balter [12], Gandhi et al. [10], Maccio and Down [33], Phung-Duc [38], Chen et al. [6], and Li et al. [31].

In the study of energy-efficient data centers, it is key to develop effective optimization methods and dynamic control techniques. So far, two classes of optimization methods have been applied to the analysis of energy-efficient data centers. The first class is regarded as ‘static optimization’ with two basic steps. Step one is to set up the performance cost (i.e., a suitable performance-energy tradeoff) of a data center, where the performance cost can be expressed by means of queueing indexes of the data center. Step two is to optimize the performance cost with respect to some key parameters of the data center by using methods such as linear programming, nonlinear programming, integer programming, and bilevel programming. The second class is viewed as ‘dynamic optimization’, in which Markov decision processes or stochastic network optimization are applied to the energy-efficient management of data centers; see Benini et al. [2] and Yao et al. [57] for more details.

For the static optimization, some available works have been successfully conducted according to two key points. The first key point is to emphasize how to construct a suitable utility function for the performance-energy tradeoff, which needs to synchronously optimize several different performance measures, for example, reducing energy consumption, reducing system response time, and improving quality of service. The second key point is to minimize the performance cost with respect to some crucial parameters of data centers by means of, for instance, linear programming and nonlinear programming. On such a research line, Gandhi et al. [11] recalled two classes of performance-energy tradeoffs: (a) ERWS, the weighted sum w_1 E[R] + w_2 E[E] of the mean response time E[R] and the mean power cost E[E], where w_1 and w_2 are weighted coefficients; and (b) ERP, the product E[R]·E[E] of the mean response time and the mean power cost. For the ERP, Gandhi et al. [11] first described the data center as a queue to compute the two mean values E[R] and E[E], and then provided an optimization method to minimize the ERP. Also, they further analyzed the optimality or near-optimality of several different energy-efficient policies. In addition, Gandhi [9] gave some extended results and a systematic summarization with respect to minimizing the ERP. Maccio and Down [33] generalized the ERP of Gandhi [11] to a more general performance cost function as follows:

 f(β, w) = ∑_{i=1}^{M} β_i (E[R])^{w_{R,i}} (E[E])^{w_{E,i}} (E[C])^{w_{C,i}},

where E[C] is the expected cycle rate, and β_i, w_{R,i}, w_{E,i}, and w_{C,i} for i = 1, 2, …, M are nonnegative weighted coefficients. They used queueing models to compute the three mean values E[R], E[E] and E[C], and then provided some discussion on the optimality of the cost function f(β, w). Gebrehiwot et al. [15] made another interesting generalization of the ERP and ERWS of Gandhi [11] by introducing multiple intermediate sleep states. Under more general assumptions with general service and setup times, they computed the two mean values E[R] and E[E] by means of a queueing insensitivity property, and then discussed the optimality of the ERP and ERWS. Further, Gebrehiwot et al. [16, 17] generalized the FCFS queueing results of the data center with multiple intermediate sleep states to the processor-sharing discipline and the shortest remaining processing time (SRPT) discipline, respectively. Different from the ERP and ERWS, Mitrani [35, 36] considered a data center of identical servers whose first part is a reserved block of servers. The idle or work state of the reserved servers is controlled by two different thresholds: an up threshold and a down threshold. He designed a simple three-layer queue to describe the energy-efficient data center in terms of a new performance cost: a weighted sum of the average number of jobs present and the average energy consumption. He provided expressions for computing these two averages so that the performance cost can be optimized with respect to the three parameters (the size of the reserved block and the two thresholds).
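To make the roles of the weights concrete, here is a small Python sketch of this generalized cost; the function name and all numerical values are our own illustrative assumptions, with ERP and ERWS recovered as special weight choices:

```python
# Generalized performance cost in the spirit of Maccio and Down:
# f(beta, w) = sum_{i=1}^{M} beta_i (E[R])^{w_{R,i}} (E[E])^{w_{E,i}} (E[C])^{w_{C,i}}.
def generalized_cost(beta, w_R, w_E, w_C, ER, EE, EC):
    """beta, w_R, w_E, w_C: length-M weight lists; ER, EE, EC: mean values."""
    return sum(b * ER**wr * EE**we * EC**wc
               for b, wr, we, wc in zip(beta, w_R, w_E, w_C))

# ERP: one term, unit weights on E[R] and E[E]  ->  E[R] * E[E]
erp = generalized_cost([1.0], [1.0], [1.0], [0.0], ER=2.0, EE=5.0, EC=3.0)
# ERWS: two terms, each picking out one mean value  ->  w1*E[R] + w2*E[E]
erws = generalized_cost([0.7, 0.3], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0],
                        ER=2.0, EE=5.0, EC=3.0)
```

A single call thus covers both tradeoffs by varying only the weight vectors.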

However, for the dynamic optimization, little work has been done on applying Markov decision processes to set up optimal dynamic control policies for energy-efficient data centers. In general, such a study is more interesting, difficult and challenging due to the fact that a complicated queueing model with multiple simultaneous control objectives (e.g., reducing energy consumption, reducing system response time and guaranteeing quality of service) needs to be synthetically established in a Markov decision process. For a data center with multiple identical servers, Kamitsos et al. [23, 24, 25] constructed a discrete-time Markov decision process by uniformization and proved that the optimal sleep energy-efficient policy is simply hysteretic. Hence, this problem has a double-threshold structure by means of the optimal hysteretic policy given in Hipp and Holzbaur [19] and Lu and Serfozo [32]. On the other hand, as research closely related to energy-efficient data centers, it is worth noting that policy optimization and dynamic power management for electronic systems or equipment were well developed by means of Markov decision processes and stochastic network optimization. Important examples include: (a) discrete-time Markov decision processes by Benini et al. [2] and Yang et al. [56]; (b) continuous-time Markov decision processes by Qiu and Pedram [40] and Qiu et al. [41]; (c) stochastic network optimization by Yao et al. [57] and Huang and Neely [20]. In addition, it has become increasingly important to simplify the method of Markov decision processes so that more complicated stochastic networks can be analyzed effectively. On this research line, event-driven techniques of Markov decision processes have received considerable attention over the past decade. Important examples include the event-driven power management by Šimunić et al. [44], and the event-driven optimization techniques by Becker et al. [3], Cao [5], Koole [27], Engel and Etzion [8], and Xia et al. [52].

The purpose of this paper is to apply Markov decision processes to set up an optimal dynamic control policy for energy-efficient data centers. To do this, we first apply the sensitivity-based optimization theory to the study of data centers. Note that the sensitivity-based optimization is greatly refined from the Markov decision processes through re-expressing the Poisson equation (corresponding to the Bellman optimality equation) by means of several novel tools, for instance, the performance potential and the performance difference (see Cao’s book [5]). Also, the sensitivity-based optimization theory can be effectively related to Markov reward processes (e.g., see Li [29] and Li and Cao [30]), so that it is an effective dynamic decision method for performance optimization of many practical Markov systems. The key idea in the sensitivity-based optimization theory is a performance difference equation that quantifies the performance difference of a Markov system under any two different policies. The difference equation gives a clear relation that explains how the system performance varies with respect to policies. See the excellent book by Cao [5] for more details. So far, the sensitivity-based optimization theory has been applied to performance optimization of queueing systems (or networks). Important examples include an early invited overview by Xia and Cao [48]; the MAP/M/1 queue by Xia et al. [50]; the closed queueing networks by Xia and Shihada [54], Xia [46] and Xia and Jia [51]; and the open queueing networks by Xia [47] and Xia and Chen [49]. In addition, the sensitivity-based optimization theory has also been applied to network energy management, for example, the multi-hop wireless networks by Xia and Shihada [55] and the tandem queues with power constraints by Xia et al. [53].

The main contributions of this paper are twofold. The first is to apply the sensitivity-based optimization theory to study the optimal energy-efficient policies of data centers for the first time, in which we propose a job transfer rule among the server groups so that the sleep energy-efficient mechanism becomes more effective. Different from previous works in the literature that apply an ordinary Markov decision process to the dynamic control of data centers, we propose and develop an easier and more convenient dynamic decision method, sensitivity-based optimization, in the study of energy-efficient data centers. Crucially, this sensitivity-based optimization method may open a new avenue to the optimal energy-efficient policy of more complicated data centers. The second contribution is to characterize the optimal energy-efficient policy of data centers. We set up a policy-based Poisson equation and provide an explicit expression for its unique solution by means of the RG-factorization. Based on this, we analyze the monotonicity and optimality of the long-run average profit with respect to the energy-efficient policies under some restrained service prices, and we obtain the structure of the optimal energy-efficient policy. Specifically, we prove that the bang-bang control is optimal for this problem, which significantly reduces the large search space. We also provide an effective way to design and verify the threshold-type mechanism in practice, which is of great significance for solving the mechanism design problem of energy-efficient data centers. Therefore, the results of this paper give new insights into not only the mechanism design of energy-efficient data centers, but also the application of sensitivity-based optimization to the dynamic control of data centers. We hope that the methodology and results given in this paper shed light on the study of more general energy-efficient data centers.

The remainder of this paper is organized as follows. In Section 2, we describe the problem of an energy-efficient data center with two groups of different servers. In Section 3, for the energy-efficient data center, we first establish a policy-based continuous-time birth-death process with finite states. Then we define a suitable reward function with respect to states and policies of the birth-death process. Based on this, we formulate a dynamic optimization problem to find the optimal energy-efficient policy of the data center. In Section 4, we set up a policy-based Poisson equation and provide an explicit expression for its unique solution by means of the RG-factorization. In Section 5, we define a perturbation realization factor of the policy-based control process of the data center, and analyze how the service price impacts the perturbation realization factor. In Section 6, we use the Poisson equation to derive a useful performance difference equation. Based on this, we discuss the monotonicity and optimality of the long-run average profit with respect to the energy-efficient policies, and prove the optimality of the bang-bang control. In Section 7, we use the Poisson equation to further study a class of threshold energy-efficient policies, and obtain a necessary condition of the optimal threshold policy. Finally, we give some discussions and conclude this paper in Section 8.

## 2 Problem Description

In this section, we describe the energy-efficiency problem in data centers. Owing to the large variation of workloads in data centers, it is widely adopted to organize and operate a data center in a multiple-tier architecture such that on/off scheduling can be performed in different tiers to save energy [26]. In this paper, we study a data center with two groups of heterogeneous servers. There is no waiting room for jobs in the data center, which can thus be viewed as a loss queue. The loss-queue assumption is reasonable for data centers and is also widely used in telephone systems, computer networks, cloud computing, and so on [18, 21, 45]. In what follows we provide a detailed problem description of the data center.

Server groups: The data center contains two server groups: Groups 1 and 2, each of which is also one of the interactive subsystems of the data center. Groups 1 and 2 contain n and m servers, respectively. Hence the data center contains n + m servers. Servers in the same group are homogeneous, and servers in different groups are heterogeneous. Note that Group 1 is viewed as a base-line group whose servers are always at the work state to guarantee a necessary service capacity in the data center. Each server in Group 1 consumes an amount of energy per unit of time. By contrast, Group 2 is regarded as a reserved group whose servers may either work or sleep, so that each of its servers can switch its state between work and sleep. If a server in Group 2 is at the sleep state, then it consumes a smaller amount of energy to maintain the sleep state.

Power consumption: The power consumption rates (i.e., power consumption per unit of time) of the two groups of servers are described as follows: P_{1,W} and P_{2,W} for the work state in Groups 1 and 2, respectively; and P_{2,S} only for the sleep state in Group 2. We assume that P_{2,S} < P_{2,W}.

Arrival processes: Jobs arrive at the data center according to a Poisson process with arrival rate λ. Each arriving job is assigned to a server of the two groups according to the following allocation rules:

(a) Each server in Group 1 must be fully utilized, so that Group 1 provides priority service over Group 2. If Group 1 has some idle servers, then an arriving job immediately enters an idle server in Group 1 and receives service there. Furthermore, if all the servers of Group 1 are busy but Group 2 has some idle servers, then an arriving job immediately enters an idle server in Group 2 and receives service there.

(b) No waiting room. A job in Group 2 can be served at a working server or wait at a sleeping server. If all the servers of Groups 1 and 2 are occupied, then any arriving job is lost immediately. Note that each server may contain only one job; hence the total number of jobs in the data center cannot exceed n + m.

(c) Opportunity cost. Once the data center contains n + m jobs, any arriving job has to be lost immediately due to the lack of a waiting room. This leads to an opportunity cost with respect to the job loss.

Service processes: In Groups 1 and 2, the service times provided by each server are independent and exponential with service rates μ_1 and μ_2, respectively. We assume μ_1 > μ_2 as a fast condition, which supports the prior use of servers in Group 1.

Once a job enters the data center to receive or wait for service, it has to pay a holding cost in Group 1 or Group 2. We assume a so-called cheap condition: the holding cost in Group 1 is always cheaper than that in Group 2. The fast and cheap conditions are intuitive and guarantee the prior use of servers in Group 1. That is, the servers in Group 1 are not only faster but also cheaper than those in Group 2.

If a job finishes its service at a server and leaves the system, then the data center obtains a fixed service revenue R from each served job. The service discipline of each server in the data center is First Come First Served (FCFS).

Transfer rule: Based on the prior use of servers in Group 1, whenever a server in Group 1 becomes idle, an incomplete-service job (if one exists) in Group 2 is transferred to the idle server in Group 1 to save processing time. When a job is transferred from Group 2 to Group 1, the data center needs to pay a transfer cost.

Independence: We assume that all the random variables defined above in the data center are independent.

Finally, the data center, together with its operational mode and mathematical notations, is depicted in Figure 1.

###### Remark 1

A further interpretation of the transfer rule: When a job is being served at a server of Group 2, it can be transferred to an idle server of Group 1 and restart its service there as soon as an idle server in Group 1 becomes available. Note that each server in Group 1 is not only faster but also cheaper than those in Group 2; that is, the fast and cheap conditions guarantee that servers in Group 1 have priority over those in Group 2. By the memoryless property of exponential distributions, the restarted service in Group 1 is still faster and cheaper than the original service in Group 2. Therefore, this transfer rule effectively supports the energy-efficient management of the data center, since the servers of Group 1 are fully utilized while the servers of Group 2 are kept in the sleep state as much as possible.

###### Remark 2

Although some authors (e.g., see Gandhi et al. [11], Mitrani [35, 36] and Maccio and Down [33]) analyzed energy-efficient data centers with two groups of servers, where one is the base-line group and the other is the reserved or subsidiary group, all the servers in the two groups in their papers are assumed to be identical. From this point of view, the queueing models given in their works correspond only to Group 2 of our paper. On the other hand, note that in our queueing model, Group 1 has a large influence on the analysis of Group 2 due to the newly introduced transfer rule. In fact, our queueing model has been discussed in Li et al. [31] in a more general setting.

## 3 Optimization Model Formulation

In this section, for the energy-efficient data center, we first establish a policy-based continuous-time Markov process, and show that its infinitesimal generator has the simple structure of a birth-death process with finite states. Then, we define a suitable reward function with respect to both states and policies of the birth-death process. Based on this, we set up a dynamic optimization model to deal with the optimal energy-efficient policy of the data center.

In the data center with n servers in Group 1 and m servers in Group 2, we need to introduce both ‘states’ and ‘policies’ to express the stochastic dynamics of this data center. Let I(t) and J(t) be the number of jobs in Groups 1 and 2 at time t, respectively. Then (I(t), J(t)) is regarded as the state of the data center at time t. Let all the cases of the state form a state space Ω as follows.

 Ω={(0,0),(1,0),…,(n,0),(n,1),…,(n,m)}.

Note that a state (i, j) with 0 ≤ i ≤ n − 1 and 1 ≤ j ≤ m does not exist, according to the transfer rule. However, the policies are defined with a little more complication. Let d_{i,j} be the number of servers turned on in Group 2 at state (i, j), for (i, j) ∈ Ω. From the problem description in Section 2, it is easy to see that

 d_{i,j} = 0 for i = 0, 1, …, n − 1 and j = 0;  d_{n,0} = 0;  d_{n,j} ∈ {0, 1, …, m} for j = 1, 2, …, m.  (1)

Now, we provide an interpretation of the above expression: If i < n, then j = 0 due to the transfer rule. In this case, d_{i,0} is taken as zero for the sake of energy efficiency. Once i = n, j may be any element of the set {0, 1, …, m}. In the case i = n and j ≥ 1, d_{n,j} may be taken as any element of the set {0, 1, …, m}.

Corresponding to each state (i, j) ∈ Ω, we define a time-homogeneous policy d as

 d=(d0,0,d1,0,…,dn,0;dn,1,dn,2,…,dn,m).

It follows from (1) that

 d=(0,0,…,0;dn,1,dn,2,…,dn,m). (2)

Let all the possible policies given in (2) compose a policy space as follows.

 D = {d : d = (0, 0, …, 0; d_{n,1}, d_{n,2}, …, d_{n,m}),  d_{n,j} ∈ {0, 1, …, m},  j = 1, 2, …, m}.

Let X^{(d)}(t) = (I(t), J(t))^{(d)} be the system state at time t under any given policy d. Then {X^{(d)}(t) : t ≥ 0} is a policy-based continuous-time birth-death process on the state space Ω whose infinitesimal generator is given by

 B^{(d)} =
 ⎛ −λ        λ                                                                        ⎞
 ⎜ μ_1    −(λ+μ_1)    λ                                                               ⎟
 ⎜          ⋱           ⋱            ⋱                                                ⎟
 ⎜        nμ_1     −(λ+nμ_1)        λ                                                 ⎟
 ⎜                 ν(d_{n,1})   −[λ+ν(d_{n,1})]      λ                                ⎟
 ⎜                      ⋱              ⋱             ⋱                                ⎟
 ⎜                              ν(d_{n,m−1})   −[λ+ν(d_{n,m−1})]     λ                ⎟
 ⎝                                              ν(d_{n,m})       −ν(d_{n,m})         ⎠,  (3)

where ν(d_{n,j}) = nμ_1 + (d_{n,j} ∧ j)μ_2 for j = 1, 2, …, m, and a ∧ b denotes the minimum of two real numbers a and b. Since λ, μ_1, μ_2 > 0, it is clear that ν(d_{n,j}) > 0. Thus, the birth-death process must be irreducible, aperiodic and positive recurrent for any given policy d ∈ D. In this case, we write the stationary probability vector of the Markov process X^{(d)}(t) under a policy d as

 π(d)=(π(d)(0,0),π(d)(1,0),…,π(d)(n,0),π(d)(n,1),…,π(d)(n,m)). (4)
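To make the tridiagonal structure of (3) concrete, the following Python/NumPy sketch assembles B^{(d)} for given parameters; the function name and the parameter values used below are our own illustrative choices, not values from the paper:

```python
import numpy as np

# Assemble the birth-death generator B(d) of (3).
# States are ordered (0,0), ..., (n,0), (n,1), ..., (n,m).
def build_generator(n, m, lam, mu1, mu2, d):
    """d = (d_{n,1}, ..., d_{n,m}): servers turned on in Group 2."""
    size = n + 1 + m
    B = np.zeros((size, size))
    for i in range(1, n + 1):              # service completions in Group 1
        B[i, i - 1] = i * mu1
    for j in range(1, m + 1):              # total down rates nu(d_{n,j}) = n*mu1 + (d ^ j)*mu2
        B[n + j, n + j - 1] = n * mu1 + min(d[j - 1], j) * mu2
    for k in range(size - 1):              # Poisson arrivals at rate lam
        B[k, k + 1] = lam
    np.fill_diagonal(B, -B.sum(axis=1))    # generator rows must sum to zero
    return B
```

The resulting matrix is tridiagonal with zero row sums, as required of an infinitesimal generator.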

Obviously, the stationary probability vector π^{(d)} is the unique solution to the system of linear equations π^{(d)}B^{(d)} = 0 and π^{(d)}e = 1, where e is a column vector of ones of proper dimension. We write

 ξ_{i,0} = λ^i / (i! μ_1^i),  i = 0, 1, …, n,  (5)

and

 ξ^{(d)}_{n,j} = (λ^n / (n! μ_1^n)) · λ^j / ∏_{i=1}^{j} ν(d_{n,i}),  j = 1, 2, …, m,  (6)
 b^{(d)} = ∑_{i=0}^{n} ξ_{i,0} + ∑_{j=1}^{m} ξ^{(d)}_{n,j}.  (7)

It follows from Subsection 1.1.4 of Chapter 1 in Li [29] that

 π^{(d)}(i, 0) = ξ_{i,0} / b^{(d)},  i = 0, 1, …, n;  (8)
 π^{(d)}(n, j) = ξ^{(d)}_{n,j} / b^{(d)},  j = 1, 2, …, m.  (9)
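The closed-form solution (5)-(9) can be coded directly. The following sketch (function name and ordering conventions are our own) returns the stationary vector in the state order (0,0), …, (n,0), (n,1), …, (n,m):

```python
import math
import numpy as np

def stationary_vector(n, m, lam, mu1, mu2, d):
    """pi(d) via (5)-(9); d = (d_{n,1}, ..., d_{n,m})."""
    xi = [lam**i / (math.factorial(i) * mu1**i) for i in range(n + 1)]   # (5)
    prod = 1.0
    for j in range(1, m + 1):
        prod *= n * mu1 + min(d[j - 1], j) * mu2   # running product of nu(d_{n,i})
        xi.append(xi[n] * lam**j / prod)           # (6)
    return np.array(xi) / sum(xi)                  # normalize by b(d), (7)-(9)
```

As a sanity check, the vector satisfies the local balance relations of the birth-death process, e.g. π(n,0)·λ = π(n,1)·ν(d_{n,1}).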

For any two policies d^1, d^2 ∈ D, we say that d^1 ≥ d^2 if d^1_{i,j} ≥ d^2_{i,j} for any (i, j) ∈ Ω. The following proposition provides an interesting observation on how a policy d influences the stationary probability vector π^{(d)}.

###### Proposition 1

For any two given policies d^1, d^2 ∈ D with d^1 ≥ d^2, we have

 π^{(d^1)}(i, 0) ≥ π^{(d^2)}(i, 0),  i = 0, 1, …, n.

Proof: For any two given policies d^1, d^2 ∈ D with d^1 ≥ d^2, it follows from (5) that

 ξ^{(d^1)}_{i,0} = ξ^{(d^2)}_{i,0} = λ^i / (i! μ_1^i),  i = 0, 1, …, n.

If d^1 ≥ d^2, then for each j = 1, 2, …, m, it is clear that d^1_{n,j} ∧ j ≥ d^2_{n,j} ∧ j, which gives

 ν(d^1_{n,j}) = nμ_1 + (d^1_{n,j} ∧ j)μ_2 ≥ nμ_1 + (d^2_{n,j} ∧ j)μ_2 = ν(d^2_{n,j}),

hence it follows from (6) that

 ξ^{(d^1)}_{n,j} = (λ^n / (n! μ_1^n)) · λ^j / ∏_{i=1}^{j} ν(d^1_{n,i}) ≤ (λ^n / (n! μ_1^n)) · λ^j / ∏_{i=1}^{j} ν(d^2_{n,i}) = ξ^{(d^2)}_{n,j}.

It is easy to see from (7) that

 b^{(d^1)} = ∑_{i=0}^{n} ξ_{i,0} + ∑_{j=1}^{m} ξ^{(d^1)}_{n,j} ≤ ∑_{i=0}^{n} ξ_{i,0} + ∑_{j=1}^{m} ξ^{(d^2)}_{n,j} = b^{(d^2)}.

Thus, it follows from (8) that for each i = 0, 1, …, n,

 π^{(d^1)}(i, 0) = ξ_{i,0} / b^{(d^1)} ≥ ξ_{i,0} / b^{(d^2)} = π^{(d^2)}(i, 0).

This completes the proof.
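Proposition 1 can also be spot-checked numerically. The sketch below recomputes (5)-(9) inline for two comparable policies d^1 ≥ d^2 and compares the probabilities of the states (i, 0); the parameter values are arbitrary illustrative choices:

```python
import math

def pi_i0(n, m, lam, mu1, mu2, d):
    """Stationary probabilities of the states (i,0), i = 0..n, via (5)-(9)."""
    xi = [lam**i / (math.factorial(i) * mu1**i) for i in range(n + 1)]
    prod = 1.0
    for j in range(1, m + 1):
        prod *= n * mu1 + min(d[j - 1], j) * mu2   # nu(d_{n,j})
        xi.append(xi[n] * lam**j / prod)
    b = sum(xi)
    return [x / b for x in xi[:n + 1]]

n, m, lam, mu1, mu2 = 2, 3, 3.0, 2.0, 1.0
d1, d2 = [1, 2, 3], [0, 1, 1]                      # d1 >= d2 componentwise
p1 = pi_i0(n, m, lam, mu1, mu2, d1)
p2 = pi_i0(n, m, lam, mu1, mu2, d2)
```

Turning on more servers (larger d) enlarges the rates ν, shrinks the tail terms ξ^{(d)}_{n,j} and hence b^{(d)}, so every π^{(d)}(i, 0) increases, exactly as the proposition states.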

The following theorem provides a useful observation that some special policies have no effect on either the infinitesimal generator B^{(d)} or the stationary probability vector π^{(d)}. Note that this theorem will be necessary and useful for the analysis of policy monotonicity and optimality in our later study, for example, in the proof of Theorem 3.

###### Theorem 1

Suppose that two policies d^1, d^2 ∈ D satisfy the following two conditions for each j = 1, 2, …, m: (a) if d^1_{n,j} < j, then we take d^2_{n,j} = d^1_{n,j}; and (b) if d^1_{n,j} ≥ j, we take d^2_{n,j} as any element of the set {j, j + 1, …, m}. We have

 B(d1)=B(d2), \ π(d1)=π(d2).

Proof: It is easy to see from (3) that the first n + 1 rows of the matrix B^{(d^1)} are the same as those of the matrix B^{(d^2)}.

In what follows we compare the last m rows of the matrix B^{(d^1)} with those of the matrix B^{(d^2)}. For two policies d^1, d^2 satisfying conditions (a) and (b), either d^2_{n,j} = d^1_{n,j}, in which case ν(d^1_{n,j}) = ν(d^2_{n,j}) trivially, or both d^1_{n,j} ≥ j and d^2_{n,j} ≥ j, so that d^1_{n,j} ∧ j = j = d^2_{n,j} ∧ j; in the latter case it is clear that

 ν(d^1_{n,j}) = nμ_1 + jμ_2 = ν(d^2_{n,j}).

Thus, it follows from (3) that B^{(d^1)} = B^{(d^2)}, which also gives π^{(d^1)} = π^{(d^2)}. This completes the proof.

Based on the problem description in Section 2, we define a suitable reward function for the energy-efficient data center. For the convenience of readers, here we explain both the energy consumption cost and the system operational cost with respect to some key factors of the data center. We summarize five classes of costs and prices in the data center as follows.

(a) Energy consumption cost. It is seen from Section 2 that P_{1,W} and P_{2,W} are the energy consumption rates for the work state in Groups 1 and 2, respectively, while P_{2,S} is the energy consumption rate for the sleep state in Group 2. In addition, C_1 is the energy consumption price.

(b) Holding cost. Each job in the data center has to pay a holding cost C_2^{(1)} (resp. C_2^{(2)}) per unit of sojourn time in Group 1 (resp. Group 2). Moreover, we have two assumed conditions: a fast condition μ_1 > μ_2 and a cheap condition C_2^{(1)} < C_2^{(2)}.

(c) Transfer cost. If there are some idle servers in Group 1, then the jobs at the servers of Group 2 must be transferred to the idle servers in Group 1 as far as possible. Each such transfer incurs a transfer cost C_3 per job.

(d) Opportunity cost. Once the data center contains n + m jobs, any arriving job has to be lost immediately. This leads to an opportunity cost due to the job loss; hence C_4 is the opportunity cost for each lost job.

(e) Service price. If a job finishes its service at a server and leaves the system, then the data center gains a fixed service revenue (or earnings) R for each served job; that is, R is the service price.

Based on the above cost and price definitions, a reward function with respect to both states and policies is defined as a profit rate (i.e., the total revenue minus the total cost per unit of time). Therefore, the reward function at state (i, j) under policy d is defined as

 f^{(d)}(i, j) = R[iμ_1 + (d_{i,j} ∧ j)μ_2] − [nP_{1,W} + d_{i,j}P_{2,W} + (m − d_{i,j})P_{2,S}]C_1 − [iC_2^{(1)} + jC_2^{(2)}] − iμ_1 1_{{j>0}} C_3 − λ 1_{{i=n, j=m}} C_4,  (10)

where 1_{A} is an indicator function whose value is 1 when the event A happens, and 0 otherwise. Furthermore, the job transfer rate from Group 2 to Group 1 is given by iμ_1 1_{{j>0}}. If j = 0, then no job is waiting in Group 2 and the transfer rate is 0. If j > 0, then i = n by the transfer rule, and the transfer rate is nμ_1.

For convenience of readers, it is necessary to explain the reward function from four different cases as follows.

Case (a): For i = 0 and j = 0,

 f(0, 0) = −(nP_{1,W} + mP_{2,S})C_1.  (11)

Case (b): For i = 1, 2, …, n and j = 0,

 f(i, 0) = Riμ_1 − (nP_{1,W} + mP_{2,S})C_1 − iC_2^{(1)}.  (12)

Note that in Cases (a) and (b) there is no job in Group 2, thus each server of Group 2 is at the sleep state. However, in the following two Cases (c) and (d), there are some jobs in Group 2, hence the policy d will play a key role in opening or closing servers of Group 2 in order to save energy effectively.

Case (c): For i = n, j = 1, 2, …, m − 1, and d_{n,j} ∈ {0, 1, …, m},

 f^{(d)}(n, j) = R[nμ_1 + (d_{n,j} ∧ j)μ_2] − [nP_{1,W} + d_{n,j}P_{2,W} + (m − d_{n,j})P_{2,S}]C_1 − [nC_2^{(1)} + jC_2^{(2)}] − nμ_1 C_3.  (13)

To further simplify or compute (13), we need to deal with d_{n,j} ∧ j. To this end, if d_{n,j} ≤ j, then d_{n,j} ∧ j = d_{n,j}, hence we have

 f^{(d)}(n, j) = R(nμ_1 + d_{n,j}μ_2) − (P_{2,W} − P_{2,S})C_1 d_{n,j} − (nP_{1,W} + mP_{2,S})C_1 − [nC_2^{(1)} + jC_2^{(2)}] − nμ_1 C_3;  (14)

while if d_{n,j} > j, then d_{n,j} ∧ j = j, hence we have

 f^{(d)}(n, j) = R(nμ_1 + jμ_2) − (P_{2,W} − P_{2,S})C_1 d_{n,j} − (nP_{1,W} + mP_{2,S})C_1 − [nC_2^{(1)} + jC_2^{(2)}] − nμ_1 C_3.  (15)

Case (d): For i = n and j = m, and d_{n,m} ∈ {0, 1, …, m}, we obtain λ 1_{{i=n, j=m}} = λ, and since d_{n,m} ≤ m we have d_{n,m} ∧ m = d_{n,m}, so that

 f^{(d)}(n, m) = R(nμ_1 + d_{n,m}μ_2) − [nP_{1,W} + d_{n,m}P_{2,W} + (m − d_{n,m})P_{2,S}]C_1 − [nC_2^{(1)} + mC_2^{(2)}] − nμ_1 C_3 − λC_4
 = R(nμ_1 + d_{n,m}μ_2) − (P_{2,W} − P_{2,S})C_1 d_{n,m} − (nP_{1,W} + mP_{2,S})C_1 − [nC_2^{(1)} + mC_2^{(2)}] − nμ_1 C_3 − λC_4.  (16)

We define an (n + m + 1)-dimensional column vector composed of the elements f(i, 0) and f^{(d)}(n, j) as

 f^{(d)} = (f(0, 0), f(1, 0), …, f(n, 0), f^{(d)}(n, 1), …, f^{(d)}(n, m))^T,  (17)

where the superscript T denotes the transpose of a vector or matrix.

In the remainder of this section, the long-run average profit of the data center (or the policy-based continuous-time birth-death process ) under an energy-efficient policy is defined as

 η^{d} = lim_{T→+∞} E{ (1/T) ∫_0^T f^{(d)}((I(t), J(t))^{(d)}) dt } = lim_{T→+∞} E{ (1/T) ∫_0^T f^{(d)}(X^{(d)}(t)) dt } = π^{(d)} f^{(d)},  (18)

where π^{(d)} and f^{(d)} are given by (4) and (17), respectively.
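The average profit (18) is straightforward to evaluate numerically once π^{(d)} and f^{(d)} are in hand. The sketch below builds both from the closed forms (5)-(9) and the reward cases (10)-(16); the function name and every parameter value are our own illustrative assumptions:

```python
import math

def average_profit(n, m, lam, mu1, mu2, d,
                   R, C1, C2g1, C2g2, C3, C4, P1W, P2W, P2S):
    """eta^d = pi(d) f(d) from (18); C2g1, C2g2 stand for C_2^(1), C_2^(2)."""
    # stationary probabilities from (5)-(9)
    xi = [lam**i / (math.factorial(i) * mu1**i) for i in range(n + 1)]
    prod = 1.0
    for j in range(1, m + 1):
        prod *= n * mu1 + min(d[j - 1], j) * mu2
        xi.append(xi[n] * lam**j / prod)
    b = sum(xi)
    pi = [x / b for x in xi]
    # reward (11)-(12) at states (0,0), ..., (n,0): all of Group 2 asleep
    f = [R*i*mu1 - (n*P1W + m*P2S)*C1 - i*C2g1 for i in range(n + 1)]
    # reward (13)-(16) at states (n,1), ..., (n,m)
    for j in range(1, m + 1):
        dj = d[j - 1]
        rj = (R*(n*mu1 + min(dj, j)*mu2)
              - (n*P1W + dj*P2W + (m - dj)*P2S)*C1
              - (n*C2g1 + j*C2g2)
              - n*mu1*C3)                  # transfer cost rate
        if j == m:
            rj -= lam*C4                   # opportunity cost of lost jobs
        f.append(rj)
    return sum(p*x for p, x in zip(pi, f))
```

For a high service price, keeping the reserved servers on should pay off; the test below checks this on one illustrative parameter set.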

We observe that as the number of working servers in Group 2 decreases, the total revenue and the total cost of the data center decrease synchronously, and vice versa. Thus, there is a tradeoff between the total revenue and the total cost. This motivates us to study an optimal mechanism design for the energy-efficient data center. The objective is to find an optimal energy-efficient policy d* such that the long-run average profit η^{d} is maximized, that is,

 d* = arg max_{d ∈ D} η^{d}.  (19)

In fact, it is difficult and challenging to analyze the properties of the optimal energy-efficient policy d*, and to provide an effective algorithm for computing it. In the next section, we introduce the sensitivity-based optimization theory to study this energy-efficient optimization problem.
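For very small n and m, the optimization problem (19) can of course be solved by brute force over the policy space D, which is useful for sanity-checking the structural results proved later. The sketch below does exactly this enumeration; the helper reimplements (5)-(9) and (10)-(16) inline, and all parameter values are illustrative assumptions:

```python
import itertools
import math

def eta(n, m, lam, mu1, mu2, d, R, C1, C2g1, C2g2, C3, C4, P1W, P2W, P2S):
    """Long-run average profit (18) for policy d = (d_{n,1}, ..., d_{n,m})."""
    xi = [lam**i / (math.factorial(i) * mu1**i) for i in range(n + 1)]
    prod = 1.0
    for j in range(1, m + 1):
        prod *= n * mu1 + min(d[j - 1], j) * mu2
        xi.append(xi[n] * lam**j / prod)
    b = sum(xi)
    pi = [x / b for x in xi]
    f = [R*i*mu1 - (n*P1W + m*P2S)*C1 - i*C2g1 for i in range(n + 1)]
    for j in range(1, m + 1):
        dj = d[j - 1]
        f.append(R*(n*mu1 + min(dj, j)*mu2)
                 - (n*P1W + dj*P2W + (m - dj)*P2S)*C1
                 - (n*C2g1 + j*C2g2) - n*mu1*C3
                 - (lam*C4 if j == m else 0.0))
    return sum(p*x for p, x in zip(pi, f))

params = dict(n=2, m=2, lam=1.0, mu1=2.0, mu2=0.5, R=20.0, C1=1.0,
              C2g1=0.5, C2g2=1.0, C3=0.2, C4=5.0, P1W=2.0, P2W=1.5, P2S=0.3)
# Enumerate all d = (d_{n,1}, ..., d_{n,m}) with entries in {0, ..., m}
best_d = max(itertools.product(range(params['m'] + 1), repeat=params['m']),
             key=lambda d: eta(d=list(d), **params))
```

The search space has (m + 1)^m elements, which is exactly why the structural characterization (bang-bang and threshold policies) pursued in the following sections matters.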

## 4 The Poisson Equation and Its Explicit Solution

In this section, for the energy-efficient data center, we set up a Poisson equation which is derived by means of the law of total probability applied at certain stopping times. It is worth noting that the Poisson equation provides a useful relation between the sensitivity-based optimization and the Markov decision processes (MDPs). Also, we use the RG-factorization, given in Li and Cao [30] or Li [29], to solve the Poisson equation and provide an explicit expression for its unique solution.

For any policy $d \in \mathcal{D}$, it follows from Chapter 2 of Cao [5] that for the policy-based continuous-time Markov process $\{X^{(d)}(t): t \geq 0\}$, we define the performance potential as

$$g^{(d)}(i,j) = \mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big((I(t),J(t))^{(d)}\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (i,j)\right\}, \tag{20}$$

where $\eta^{d}$ is defined in (18). It is seen from Cao [5] that for any policy $d$, $g^{(d)}(i,j)$ quantifies the contribution of the initial state $(i,j)$ to the long-run average profit of the data center. Here $g^{(d)}(i,j)$ is also called the relative value function or the bias in the traditional MDP theory; see, e.g., Puterman [39]. We further define a column vector $g^{(d)}$ with elements $g^{(d)}(i,j)$ for all states $(i,j)$ as

$$g^{(d)} = \big(g^{(d)}(0,0), g^{(d)}(1,0), \ldots, g^{(d)}(n,0), g^{(d)}(n,1), \ldots, g^{(d)}(n,m)\big)^{T}. \tag{21}$$
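For a small instance, the whole vector (21) can be computed at once from the generator: the potential defined by (20) equals $D f^{(d)}$, where $D = (e\pi^{(d)} - B^{(d)})^{-1} - e\pi^{(d)}$ is the deviation matrix of the ergodic chain, and this choice automatically satisfies the normalization $\pi^{(d)} g^{(d)} = 0$ implied by (20). A sketch (Python/NumPy; the generator and reward vector below are an assumed 4-state toy, not the data-center model itself):

```python
import numpy as np

def potential(B, f):
    """Performance potential g^(d) of (20)-(21) via the deviation matrix:
    g = D f, with D = (e*pi - B)^{-1} - e*pi. Returns (g^(d), eta^d)."""
    N = B.shape[0]
    A = np.vstack([B.T, np.ones(N)])          # pi B = 0, pi e = 1
    b = np.zeros(N + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    epi = np.outer(np.ones(N), pi)            # rank-one matrix e*pi
    D = np.linalg.inv(epi - B) - epi          # deviation matrix of the chain
    return D @ f, pi @ f

# assumed toy 4-state birth-death generator and per-state rewards
B = np.array([[-1.0,  1.0,  0.0,  0.0],
              [ 1.0, -2.0,  1.0,  0.0],
              [ 0.0,  2.0, -3.0,  1.0],
              [ 0.0,  0.0,  3.0, -3.0]])
f = np.array([1.0, 2.0, 3.0, 4.0])
g, eta = potential(B, f)
print(g, eta)
```

The identity $B^{(d)}D = e\pi^{(d)} - I$ shows that this $g$ solves the Poisson equation derived below, so the block doubles as a numerical check on the derivation.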

We define the first departure time of the process from state $(i,j)$ as

$$\tau = \inf\big\{t \geq 0 : (I(t),J(t))^{(d)} \neq (i,j)\big\},$$

where $(i,j)$ is any state of the state space. Clearly, $\tau$ is a stopping time of the Markov process $\{X^{(d)}(t): t \geq 0\}$. Based on this, if $(I(0),J(0))^{(d)} = (n,m)$, then it is seen from (3) that state $(n,m)$ is the upper boundary state of the birth--death process, hence $\mathbb{E}\{\tau \mid (I(0),J(0))^{(d)} = (n,m)\} = 1/\nu(d_{n,m})$. Similarly, we obtain a basic relation for the stopping time $\tau$ as follows:

$$g^{(d)}(i,j) = \mathbb{E}\{\tau\}\big[f^{(d)}(i,j) - \eta^{d}\big] + \mathbb{E}\left\{\int_{\tau}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(\tau),J(\tau))^{(d)}\right\}. \tag{22}$$
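The mean $\mathbb{E}\{\tau\}$ entering this relation is the mean of the minimum of independent exponential clocks (one arrival clock plus one clock per busy server), so for a state $(i,0)$ it equals $1/(\lambda + i\mu_1)$. A quick Monte Carlo check with assumed toy rates:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, mu1, i = 1.0, 1.0, 3          # assumed toy rates and state (i,0) = (3,0)

K = 200_000
arrival = rng.exponential(1.0 / lam, size=K)         # one Exp(lam) clock
services = rng.exponential(1.0 / mu1, size=(K, i))   # i Exp(mu1) clocks
tau = np.minimum(arrival, services.min(axis=1))      # first departure time

print(tau.mean(), 1.0 / (lam + i * mu1))             # both close to 0.25
```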

Now, we derive a Poisson equation to compute the column vector in terms of the stop time and the basic relation (22). By a similar computation to that in Li and Cao [30] or Xia et al. [50], our analysis is decomposed into five parts as follows.

For $1 \leq i \leq n-1$ and $j = 0$, we have

$$\begin{aligned}
g^{(d)}(i,0) &= \mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (i,0)\right\} \\
&= \mathbb{E}\big\{\tau \mid (I(0),J(0))^{(d)} = (i,0)\big\}\big[f(i,0) - \eta^{d}\big] + \mathbb{E}\left\{\int_{\tau}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(\tau),J(\tau))^{(d)}\right\} \\
&= \frac{1}{\lambda + i\mu_1}\big[f(i,0) - \eta^{d}\big] + \frac{\lambda}{\lambda + i\mu_1}\,\mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (i+1,0)\right\} \\
&\quad + \frac{i\mu_1}{\lambda + i\mu_1}\,\mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (i-1,0)\right\} \\
&= \frac{1}{\lambda + i\mu_1}\big[f(i,0) - \eta^{d}\big] + \frac{\lambda}{\lambda + i\mu_1}\,g^{(d)}(i+1,0) + \frac{i\mu_1}{\lambda + i\mu_1}\,g^{(d)}(i-1,0),
\end{aligned} \tag{23}$$

where, for the birth--death process, it is easy to see that

$$\mathbb{E}\big\{\tau \mid (I(0),J(0))^{(d)} = (i,0)\big\} = \frac{1}{\lambda + i\mu_1}.$$

Thus, we obtain

$$i\mu_1\,g^{(d)}(i-1,0) - (\lambda + i\mu_1)\,g^{(d)}(i,0) + \lambda\,g^{(d)}(i+1,0) = \eta^{d} - f(i,0). \tag{24}$$

Based on (23), with a boundary consideration, for $i = 0$ and $j = 0$, we have

$$-\lambda\,g^{(d)}(0,0) + \lambda\,g^{(d)}(1,0) = \eta^{d} - f(0,0). \tag{25}$$

For $i = n$ and $j = 0$, we have

$$n\mu_1\,g^{(d)}(n-1,0) - (\lambda + n\mu_1)\,g^{(d)}(n,0) + \lambda\,g^{(d)}(n,1) = \eta^{d} - f(n,0). \tag{26}$$

For $i = n$ and $1 \leq j \leq m-1$, we have

$$\begin{aligned}
g^{(d)}(n,j) &= \mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (n,j)\right\} \\
&= \mathbb{E}\big\{\tau \mid (I(0),J(0))^{(d)} = (n,j)\big\}\big[f^{(d)}(n,j) - \eta^{d}\big] + \mathbb{E}\left\{\int_{\tau}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(\tau),J(\tau))^{(d)}\right\} \\
&= \frac{1}{\lambda + \nu(d_{n,j})}\big[f^{(d)}(n,j) - \eta^{d}\big] + \frac{\lambda}{\lambda + \nu(d_{n,j})}\,\mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (n,j+1)\right\} \\
&\quad + \frac{\nu(d_{n,j})}{\lambda + \nu(d_{n,j})}\,\mathbb{E}\left\{\int_{0}^{+\infty}\Big[f^{(d)}\big(X^{(d)}(t)\big) - \eta^{d}\Big]\mathrm{d}t \,\Big|\, (I(0),J(0))^{(d)} = (n,j-1)\right\} \\
&= \frac{1}{\lambda + \nu(d_{n,j})}\big[f^{(d)}(n,j) - \eta^{d}\big] + \frac{\lambda}{\lambda + \nu(d_{n,j})}\,g^{(d)}(n,j+1) + \frac{\nu(d_{n,j})}{\lambda + \nu(d_{n,j})}\,g^{(d)}(n,j-1),
\end{aligned} \tag{27}$$

where

$$\mathbb{E}\big\{\tau \mid (I(0),J(0))^{(d)} = (n,j)\big\} = \frac{1}{\lambda + \nu(d_{n,j})}. \tag{28}$$

It follows from (27) that

$$\nu(d_{n,j})\,g^{(d)}(n,j-1) - \big[\lambda + \nu(d_{n,j})\big]g^{(d)}(n,j) + \lambda\,g^{(d)}(n,j+1) = \eta^{d} - f^{(d)}(n,j). \tag{29}$$

For $i = n$ and $j = m$, with a boundary consideration, a similar analysis to (29) gives

$$\nu(d_{n,m})\,g^{(d)}(n,m-1) - \nu(d_{n,m})\,g^{(d)}(n,m) = \eta^{d} - f^{(d)}(n,m). \tag{30}$$

Note that no term $\lambda\,g^{(d)}(n,m+1)$ appears in (30), due to the fact that $(n,m)$ is the upper boundary state of the birth--death process, so no upward transition occurs at this state.

It follows from (24), (25), (26), (29) and (30) that

$$B^{(d)} g^{(d)} = \eta^{d} e - f^{(d)},$$

or

$$-B^{(d)} g^{(d)} = f^{(d)} - \eta^{d} e, \tag{31}$$

where $e$ is a column vector of ones of a suitable size, $f^{(d)}$ is given in (17), and $B^{(d)}$ is given in (3).

To solve the system of linear equations (31), we note that $\operatorname{rank}\big(B^{(d)}\big) = n+m$ and $\det\big(B^{(d)}\big) = 0$, while the size of the matrix $B^{(d)}$ is $(n+m+1) \times (n+m+1)$. Hence, the system of linear equations (31) has infinitely many solutions, any two of which differ by a constant additive term. Let $B$ be the matrix obtained by omitting the first row and the first column of the matrix $B^{(d)}$. Then,

$$B = \begin{pmatrix}
-(\lambda+\mu_1) & \lambda & & & & & & \\
2\mu_1 & -(\lambda+2\mu_1) & \lambda & & & & & \\
& \ddots & \ddots & \ddots & & & & \\
& & n\mu_1 & -(\lambda+n\mu_1) & \lambda & & & \\
& & & \nu(d_{n,1}) & -\big[\lambda+\nu(d_{n,1})\big] & \lambda & & \\
& & & & \ddots & \ddots & \ddots & \\
& & & & & \nu(d_{n,m-1}) & -\big[\lambda+\nu(d_{n,m-1})\big] & \lambda \\
& & & & & & \nu(d_{n,m}) & -\nu(d_{n,m})
\end{pmatrix}.$$

Obviously, $\operatorname{rank}(B) = n+m$, while the size of the matrix $B$ is $(n+m) \times (n+m)$. Hence, the matrix $B$ is invertible.
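This truncation also gives a direct numerical route to the potentials: fixing $g^{(d)}(0,0) = 0$ removes the free additive constant, and the remaining entries of $g^{(d)}$ solve the reduced system with the invertible matrix $B$. A sketch (Python/NumPy; the full generator, reward vector, and rates below are an assumed 4-state toy, not the data-center model itself):

```python
import numpy as np

def solve_poisson(B_full, f, eta):
    """Solve -B^(d) g = f - eta*e by fixing g(0,0) = 0 and solving the
    reduced system with the truncated matrix B (first row/column removed)."""
    h = (f - eta)[1:]                    # h^(d): drop the first entry of f - eta*e
    B = B_full[1:, 1:]                   # invertible truncated matrix
    g_rest = np.linalg.solve(-B, h)
    return np.concatenate(([0.0], g_rest))

# assumed toy 4-state birth-death generator and per-state rewards
B_full = np.array([[-1.0,  1.0,  0.0,  0.0],
                   [ 2.0, -3.0,  1.0,  0.0],
                   [ 0.0,  1.0, -2.0,  1.0],
                   [ 0.0,  0.0,  3.0, -3.0]])
f = np.array([1.0, 2.0, 3.0, 4.0])

# eta^d = pi f, with pi the stationary distribution: pi B = 0, pi e = 1
A = np.vstack([B_full.T, np.ones(4)])
b = np.zeros(5)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
eta = pi @ f

g = solve_poisson(B_full, f, eta)
print(np.allclose(-B_full @ g, f - eta))   # the full equation (31) holds
```

Although only the last $n+m$ rows of (31) are solved explicitly, the first row holds automatically because $f^{(d)} - \eta^{d} e$ lies in the range of $-B^{(d)}$ (it is orthogonal to $\pi^{(d)}$).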

Let $h^{(d)}$ and $\tilde{g}^{(d)}$ be the two column vectors of size $n+m$ obtained by omitting the first element of the column vectors $f^{(d)} - \eta^{d} e$ and $g^{(d)}$ of size $n+m+1$, respectively. Then,

$$h^{(d)} = \begin{pmatrix} f(1,0) - \eta^{d} \\ \vdots \\ f(n,0) - \eta^{d} \\ f^{(d)}(n,1) - \eta^{d} \\ \vdots \\ f^{(d)}(n,m) - \eta^{d} \end{pmatrix} \overset{\text{def}}{=} \begin{pmatrix} h^{(d)}_{1} \\ \vdots \\ h^{(d)}_{n} \\ h^{(d)}_{n+1} \\ \vdots \\ h^{(d)}_{n+m} \end{pmatrix}