# Joint Task Offloading and Resource Allocation for Multi-Server Mobile-Edge Computing Networks

The authors are with the Department of Electrical and Computer Engineering, Rutgers University–New Brunswick, NJ, USA (e-mail: tuyen.tran@rutgers.edu, pompili@cac.rutgers.edu). This work was supported by the National Science Foundation (NSF) Grant No. CNS-1319945.

## 1 Introduction

Motivation:

The rapid growth of mobile applications and the Internet of Things (IoT) has placed severe demands on cloud infrastructure and wireless access networks, such as ultra-low latency, user experience continuity, and high reliability. These stringent requirements are driving the need for highly localized services at the network edge in close proximity to the end users. In light of this, the Mobile-Edge Computing (MEC) [1] concept has emerged, which aims at uniting telco, IT, and cloud computing to deliver cloud services directly from the network edge. Unlike traditional cloud computing systems, where remote public clouds are utilized, MEC servers are owned by the network operator and are implemented directly at the cellular Base Stations (BSs) or at the local wireless Access Points (APs) using a generic-computing platform. With this position, MEC allows for the execution of applications in close proximity to end users, substantially reducing end-to-end (e2e) delay and relieving the burden on backhaul networks [2].

With the emergence of MEC, the ability of resource-constrained mobile devices to offload computation tasks to the MEC servers is expected to support a myriad of new services and applications such as augmented reality, IoT, autonomous vehicles, and image processing. For example, the face detection and recognition application for airport security and surveillance can benefit greatly from the collaboration between mobile devices and the MEC platform [3]. In this scenario, a central authority such as the FBI would extend its Amber alerts so that all available cell phones that opt in to the alert, in the area where a missing child was last seen, would actively capture images. Due to the significant amount of processing and the need for a large database of images, the captured images are then forwarded to the MEC layer to perform face recognition.

Our Vision:

Challenges and Contributions:

In this context, the main contributions of this article are summarized as follows.

• Given the NP-hardness of the JTORA problem, we propose to decompose the problem into (i) a Resource Allocation (RA) problem with fixed task offloading decision and (ii) a Task Offloading (TO) problem that optimizes the optimal-value function corresponding to the RA problem.

• We further show that the RA problem can be decoupled into two independent problems, namely the Uplink Power Allocation (UPA) problem and the Computing Resource Allocation (CRA) problem; the resulting UPA and CRA problems are addressed using quasi-convex and convex optimization techniques, respectively.

• We propose a novel low-complexity heuristic algorithm to tackle the TO problem and show that it achieves a suboptimal solution in polynomial time.

• We carry out extensive numerical simulations to evaluate the performance of the proposed solution, which is shown to be near-optimal and to improve significantly the users’ offloading utility over traditional approaches.

Article Organization:

The remainder of this article is organized as follows. In Section 2, we review the related works. In Section 3, we present the system model. The joint task offloading and resource allocation problem is formulated in Section 4, followed by the NP-hardness proof and decomposition of the problem itself. We present our proposed solution in Section 5 and numerical results in Section 6. Finally, in Section 7 we conclude the article.

## 2 Related Works

The MEC paradigm has attracted considerable attention in both academia and industry over the past several years. In 2013, Nokia Networks introduced the very first real-world MEC platform [13], in which the computing platform, Radio Applications Cloud Servers (RACS), is fully integrated with the Flexi Multiradio BS. Saguna also introduced their fully virtualized MEC platform, the so-called Open-RAN [14], which can provide an open environment for running third-party MEC applications. Recently, a MEC Industry Specifications Group (ISG) was formed to standardize and moderate the adoption of MEC within the RAN [1].

A number of solutions have also been proposed to exploit the potential benefits of MEC in the context of the IoT and 5G. For instance, our previous work in [2] proposed to explore the synergies among the connected entities in the MEC network and presented three representative use-cases to illustrate the benefits of MEC collaboration in 5G networks. In [15], we proposed a collaborative caching and processing framework in a MEC network whereby the MEC servers can perform both caching and transcoding so as to facilitate Adaptive Bit-Rate (ABR) video streaming. A similar approach was also considered in [16], which combined the traditional client-driven dynamic adaptation scheme, DASH, with network-assisted adaptation capabilities. In addition, MEC is also seen as a key enabling technique for connected vehicles by adding computation and geo-distributed services to the roadside BSs so as to analyze the data from proximate vehicles and roadside sensors and to propagate messages to the drivers with very low latency [17].

In summary, most of the existing works did not consider a holistic approach that jointly determines the task offloading decision and the radio and computing resource allocation in a multi-cell, multi-server system as considered in this article.

## 3 System Model

We consider a multi-cell, multi-server MEC system as illustrated in Figure 1, in which each BS is equipped with a MEC server to provide computation offloading services to the resource-constrained mobile users such as smartphones, tablets, and wearable devices. In general, each MEC server can be either a physical server or a virtual machine with moderate computing capabilities provisioned by the network operator and can communicate with the mobile devices through wireless channels provided by the corresponding BS. Each mobile user can choose to offload computation tasks to a MEC server from one of the nearby BSs it can connect to. We denote the set of users and MEC servers in the mobile system as and , respectively. For ease of presentation, we will refer to the MEC server , server , and BS interchangeably. The modeling of user computation tasks, task uploading transmissions, MEC computation resources, and offloading utility is presented below.

Let denote the local computing capability of user in terms of CPU cycles/s. Hence, if user executes its task locally, the task completion time is . To calculate the energy consumption of a user device when executing its task locally, we use the widely adopted model of the energy consumption per computing cycle as [6], where is the energy coefficient depending on the chip architecture and is the CPU frequency. Thus, the energy consumption, , of user when executing its task locally, is calculated as,
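As a hedged illustration of the local-execution model, the sketch below computes the completion time and energy of a task run on the device. It assumes the common quadratic per-cycle energy model (energy per cycle equal to a chip coefficient times the squared CPU frequency); the function name, argument names, and numeric values are illustrative assumptions and are not taken from the article.

```python
def local_cost(cycles, f_local, kappa):
    """Return (time_s, energy_J) for running a task on the user device.

    cycles  : CPU cycles required by the task (assumed name)
    f_local : device CPU frequency in cycles/s (assumed name)
    kappa   : chip-dependent energy coefficient (assumed name)
    """
    if f_local <= 0:
        raise ValueError("f_local must be positive")
    time_s = cycles / f_local
    # Assumption: energy per cycle follows the quadratic model kappa * f^2.
    energy_per_cycle = kappa * f_local ** 2
    return time_s, energy_per_cycle * cycles


# Illustrative values only: a 1-gigacycle task on a 1 GHz device.
t_local, e_local = local_cost(cycles=1e9, f_local=1e9, kappa=1e-27)
```

Under this model a faster local CPU shortens the completion time linearly but raises the energy quadratically, which is the trade-off that offloading tries to avoid.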

In case user offloads its task to one of the MEC servers, the incurred delay comprises: (i) the time to transmit the input to the MEC server on the uplink, (ii) the time to execute the task at the MEC server, and (iii) the time to transmit the output from the MEC server back to the user on the downlink. Since the size of the output is generally much smaller than that of the input, and the downlink data rate is much higher than that of the uplink, we omit the delay of transferring the output in our computation, as also considered in [11].

where is the background noise variance and the first term in the denominator is the accumulated inter-cell interference from all the users associated with other BSs on the same sub-band . Since each user only transmits on one sub-band, the achievable rate of user when sending data to BS is given as,

where . Moreover, let . Hence, the transmission time of user when sending its task input in the uplink can be calculated as,
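The uplink model described above can be sketched as follows, assuming a Shannon-rate expression over a single sub-band and an interference term that sums the received power of users served by other BSs on that sub-band. All function and parameter names are hypothetical, and the numbers are chosen only to make the arithmetic easy to follow.

```python
import math


def uplink_sinr(p_tx, gain, interferers, noise_var):
    """SINR of one user at its serving BS on one sub-band.

    interferers: list of (tx_power, gain_to_this_bs) pairs for users that are
    served by other BSs on the same sub-band (assumed representation).
    """
    interference = sum(p * g for p, g in interferers)
    return p_tx * gain / (interference + noise_var)


def uplink_rate(sinr, subband_hz):
    """Achievable rate in bit/s, assuming the Shannon formula on one sub-band."""
    return subband_hz * math.log2(1.0 + sinr)


def upload_time(input_bits, rate_bps):
    """Time needed to upload the task input at a constant rate."""
    return input_bits / rate_bps


sinr = uplink_sinr(p_tx=1.0, gain=6.0, interferers=[(1.0, 1.0)], noise_var=1.0)
rate = uplink_rate(sinr, subband_hz=1e6)
t_up = upload_time(input_bits=4e6, rate_bps=rate)
```

Because the interference depends on the transmit powers of users in other cells, the rate of one user is coupled to the power choices of the others, which is what makes the later power-allocation problem non-convex.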

### 3.3 MEC Computing Resources

The MEC server at each BS is able to provide computation offloading service to multiple users concurrently. The computing resources made available by each MEC server to be shared among the associated users are quantified by the computational rate , expressed in terms of number of CPU cycles/s. After receiving the offloaded task from a user, the server will execute the task on behalf of the user and, upon completion, will return the output result back to the user. We define the computing resource allocation policy as , in which is the amount of computing resource that BS allocates to task offloaded from user . Hence, clearly . In addition, a feasible computing resource allocation policy must satisfy the computing resource constraint, expressed as,

Given the computing resource assignment , the execution time of task at the MEC servers is,

Given the offloading policy , the transmission power , and the computing resource allocation ’s, the total delay experienced by user when offloading its task is given by,

The energy consumption of user , , due to uploading transmission is calculated as , where is the power amplifier efficiency of user . Without loss of generality, we assume that . Thus, the uplink energy consumption of user simplifies to,

In a mobile cloud computing system, the users’ QoE is mainly characterized by their task completion time and energy consumption. In the considered scenario, the relative improvements in task completion time and energy consumption are characterized by and , respectively [11]. Therefore, we define the offloading utility of user as,

in which , with , specify user ’s preference on task completion time and energy consumption, respectively. For example, a user with short battery life can increase and decrease so as to save more energy at the expense of longer task completion time. Note that offloading too many tasks to the MEC servers will cause excessive delay due to the limited bandwidth and computing resources at the MEC servers, and consequently degrade some users’ QoE compared to executing their tasks locally. Hence, clearly user should not offload its task to the MEC servers if .
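A minimal sketch of the offloading utility follows, assuming it is the preference-weighted sum of the relative improvements in completion time and energy over local execution. The weight names `beta_time` and `beta_energy` and the numeric values are illustrative assumptions.

```python
def offloading_utility(t_local, e_local, t_offload, e_offload,
                       beta_time, beta_energy):
    """Weighted relative improvement of offloading over local execution."""
    gain_time = (t_local - t_offload) / t_local
    gain_energy = (e_local - e_offload) / e_local
    return beta_time * gain_time + beta_energy * gain_energy


def should_offload(utility):
    """A user gains from offloading only if its utility is positive."""
    return utility > 0.0


# Offloading halves the delay and cuts the energy to a quarter.
good = offloading_utility(2.0, 4.0, 1.0, 1.0, beta_time=0.5, beta_energy=0.5)
# A congested server makes offloading slower and costlier than local execution.
bad = offloading_utility(2.0, 4.0, 3.0, 5.0, beta_time=0.5, beta_energy=0.5)
```

A user that values battery life can raise `beta_energy` and lower `beta_time`; with the second set of numbers the utility is negative for any such weighting, so that user would execute its task locally.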

The expressions of the task completion time and energy consumption in clearly show the interplay between radio access and computational aspects, which motivates a joint optimization of offloading scheduling, radio, and computing resources so as to optimize users’ offloading utility.

## 4 Problem Formulation

We formulate here the problem of joint task offloading and resource allocation, followed by the outline of our decomposition approach.

For a given offloading decision , uplink power allocation , and computing resource allocation , we define the system utility as the weighted-sum of all the users’ offloading utilities,

with given in and specifying the resource provider’s preference towards user , . For instance, depending on the payments offered by the users, the resource provider could prioritize users with higher revenues for offloading by increasing their corresponding preferences. On this basis, we formulate the Joint Task Offloading and Resource Allocation (JTORA) problem as a system utility maximization problem, i.e.,

The constraints in the formulation above can be explained as follows: constraints and imply that each task can be either executed locally or offloaded to at most one server on one sub-band; constraint implies that each BS can serve at most one user per sub-band; constraint specifies the transmission power budget of each user; finally, constraints and state that each MEC server must allocate a positive computing resource to each user associated with it and that the total computing resources allocated to all the associated users must not exceed the server’s computing capacity. The JTORA problem in is a Mixed Integer Nonlinear Program (MINLP), which can be shown to be NP-hard; hence, finding the optimal solution usually requires exponential time complexity [29]. Given the large number of variables that scale linearly with the number of users, MEC servers, and sub-bands, our goal is to design a low-complexity, suboptimal solution that achieves competitive performance while being practical to implement.
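The constraints described above can be checked with a small feasibility routine. The sketch below is an illustration under assumed data structures: an offloading decision is a dictionary mapping each offloading user to one `(server, band)` pair, so the "at most one server on one sub-band" rule holds by construction, and users absent from the dictionary execute locally. All names are hypothetical.

```python
from collections import defaultdict


def is_feasible(assign, power, p_max, cpu_alloc, cpu_cap):
    """Check a candidate offloading and resource-allocation solution.

    assign    : {user: (server, band)} for offloading users only
    power     : {user: uplink transmit power}
    p_max     : {user: power budget}
    cpu_alloc : {user: computing rate granted by the serving MEC server}
    cpu_cap   : {server: total computing rate of the server}
    """
    occupied = set()
    load = defaultdict(float)
    for user, (server, band) in assign.items():
        # Each BS serves at most one user per sub-band.
        if (server, band) in occupied:
            return False
        occupied.add((server, band))
        # Transmit power must be positive and within the user's budget.
        if not 0.0 < power[user] <= p_max[user]:
            return False
        # Every associated user must receive a positive computing share.
        if cpu_alloc[user] <= 0.0:
            return False
        load[server] += cpu_alloc[user]
    # Total allocation per server must not exceed its capacity.
    return all(load[s] <= cpu_cap[s] for s in load)


ok = is_feasible(
    assign={"u1": ("s1", 0), "u2": ("s1", 1)},
    power={"u1": 0.1, "u2": 0.2},
    p_max={"u1": 0.2, "u2": 0.2},
    cpu_alloc={"u1": 8e9, "u2": 12e9},
    cpu_cap={"s1": 20e9},
)
```

The binary part of the problem is the content of `assign`, while `power` and `cpu_alloc` are continuous; mixing the two is what makes the formulation a mixed-integer nonlinear program.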

### 4.2 Problem Decomposition

Note that the constraints on the offloading decision, , in , , , and the RA policies, , in , , , are decoupled from each other; therefore, solving the problem in is equivalent to solving the following Task Offloading (TO) problem,

in which is the optimal-value function corresponding to the RA problem, written as,

In the next section, we will present our solutions to both the RA problem and the TO problem so as to finally obtain the solution to the original JTORA problem.

We now present our low-complexity approach to solve the JTORA problem by first solving the RA problem in and then using its solution to derive the solution of the TO problem in .

Firstly, given a feasible task offloading decision that satisfies constraints , , and , and using the expression of in , the objective function in can be rewritten as,

Furthermore, from , , and , we have,

in which, for simplicity, , , and . Notice from and that the problem in has a separable structure, i.e., the objectives and constraints corresponding to the power allocation ’s and computing resource allocation ’s can be decoupled from each other. Leveraging this property, we can decouple problem into two independent problems, namely the Uplink Power Allocation (UPA) and the Computing Resource Allocation (CRA), and address them separately, as described in the following sections.

The UPA problem is decoupled from problem by considering the first term on the RHS of as the objective function. Specifically, the UPA problem is expressed as,

Problem is non-convex and difficult to solve because the uplink SINR corresponding to user depends on the transmit power of the other users associated with other BSs on the same sub-band through the inter-cell interference , as seen in . Our approach is to find an approximation for and thus for such that problem can be decomposed into sub-problems that, in turn, can be efficiently solved. The optimal uplink power allocation still generates small objective value for . Suppose each BS calculates its uplink power allocation independently, i.e., without mutual cooperation, and informs its associated users about the uplink transmit power; then, an achievable upper bound for is given by,

Similar to [30], we argue that is a good estimate of since our offloading decision is geared towards choosing the appropriate user-BS associations so that is small in the first place. This means that a small error in should not lead to a large bias in [30].

By replacing with , we get the approximation for the uplink SINR for user uploading to BS on sub-band as,

Let and . The objective function in can now be approximated by . With this approximation, it can be seen that the objective functions and the constraints corresponding to the individual users’ transmit powers are now decoupled from each other. Therefore, the UPA problem in can be approximated by sub-problems, each optimizing the transmit power of a user , and can be written as,

Problem is still non-convex as the second-order derivative of the objective function with respect to (w.r.t.) , i.e., , is not always positive. However, we can employ a quasi-convex optimization technique to address problem based on the following lemma.

See Appendix.

In general, a quasi-convex problem can be solved using the bisection method, which solves a convex feasibility problem in each iteration [31]. However, the popular interior cutting plane method for solving a convex feasibility problem requires iterations, where is the dimension of the problem. We now propose to further reduce the complexity of the bisection method.

Firstly, notice that a quasi-convex function achieves a local optimum at the point where its first-order derivative vanishes, and that any local optimum of a strictly quasi-convex function is the global optimum [32]. Therefore, based on the lemma above, we can confirm that the optimal solution of problem either lies at the constraint border, i.e., or satisfies . It can be verified that when,

Moreover, we have, , and . This implies that is a monotonically increasing function and is negative at the starting point . Therefore, we can design a low-complexity bisection method that evaluates in each iteration instead of solving a convex feasibility problem, so as to obtain the optimal solution , as presented in Algorithm ?.

In Algorithm ?, if , the algorithm will terminate in exactly iterations. Let denote the optimal uplink transmit power policy for a given task offloading policy . Denote now as the objective value of problem corresponding to .
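The low-complexity bisection idea can be illustrated as follows. This is a sketch under an assumed objective, not a reproduction of the article's algorithm: it supposes each per-user sub-problem minimizes a ratio of the form (phi + psi·p) / log2(1 + theta·p) over 0 < p ≤ p_max, for which the sign of the derivative equals the sign of a monotonically increasing function `omega` that is negative at p = 0. The symbols `phi`, `psi`, `theta` and all function names are hypothetical.

```python
import math


def objective(p, phi, psi, theta):
    """Assumed per-user cost: linear-in-power numerator over log-rate."""
    return (phi + psi * p) / math.log2(1.0 + theta * p)


def omega(p, phi, psi, theta):
    """Numerator of the derivative of `objective`; same sign as the derivative."""
    return (psi * math.log2(1.0 + theta * p)
            - theta * (phi + psi * p) / ((1.0 + theta * p) * math.log(2.0)))


def bisect_power(phi, psi, theta, p_max, tol=1e-9):
    """Minimize `objective` on (0, p_max] by bisecting on the sign of omega."""
    # If the cost is still decreasing at the border, the border is optimal.
    if omega(p_max, phi, psi, theta) <= 0.0:
        return p_max
    lo, hi = 0.0, p_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if omega(mid, phi, psi, theta) <= 0.0:
            lo = mid   # stationary point lies to the right
        else:
            hi = mid   # stationary point lies to the left
    return 0.5 * (lo + hi)


p_star = bisect_power(phi=1.0, psi=1.0, theta=10.0, p_max=10.0)
```

Each iteration costs one evaluation of `omega` rather than a convex feasibility solve, and the interval halves every time, so the iteration count grows only logarithmically in `p_max / tol`.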

### 5.2 Computing Resource Allocation (CRA)

The CRA problem optimizes the second term on the RHS of and is expressed as follows,

Notice that the constraint in is convex. Denote the objective function in as ; by calculating the second-order derivatives of w.r.t. , we have,

It can be seen that the Hessian matrix of the objective function in is diagonal with strictly positive elements; thus, it is positive-definite. Hence, is a convex optimization problem and can be solved using the Karush-Kuhn-Tucker (KKT) conditions. In particular, the optimal computing resource allocation is obtained as,

and the optimal objective function is calculated as,
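As a hedged illustration of how the KKT conditions yield a closed form, the sketch below assumes that each server minimizes a sum of terms `weight / allocation` over its associated users, subject to the allocations summing to at most the server capacity. Under that assumption, stationarity of the Lagrangian gives allocations proportional to the square roots of the weights. The weight symbol and function names are hypothetical.

```python
import math


def allocate_cpu(weights, capacity):
    """Closed-form allocation for: minimize sum(w_i / f_i), sum(f_i) <= capacity.

    Stationarity of the Lagrangian gives f_i proportional to sqrt(w_i), and the
    capacity constraint is tight at the optimum.
    """
    roots = [math.sqrt(w) for w in weights]
    total = sum(roots)
    return [capacity * r / total for r in roots]


def cra_cost(weights, alloc):
    """Objective value of the assumed per-server problem."""
    return sum(w / f for w, f in zip(weights, alloc))


weights = [1.0, 4.0, 9.0]          # illustrative per-user weights
alloc = allocate_cpu(weights, capacity=12.0)
best = cra_cost(weights, alloc)     # equals (sum of sqrt(w_i))^2 / capacity
```

With these numbers the allocation is 2, 4, and 6 units and the cost is 3, whereas an equal split of 4 units each would cost 3.5; users with heavier weights receive more computing resource, but only in proportion to the square root of the weight.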

In the previous sections, for a given task offloading decision , we obtained the solutions for the radio and computing resource allocation. In particular, according to , , , and , we have,

where can be obtained through Algorithm ? and can be calculated using the closed-form expression in . Now, using , we can rewrite the TO problem in as,

Problem consists in maximizing a set function w.r.t. over the ground set defined by , and the constraints in and define two matroids over . Due to the NP-hardness of such a problem [33], designing efficient algorithms that guarantee the optimal solution still remains an open issue. In general, a brute-force method using exhaustive search would require evaluating possible task offloading scheduling decisions, where , which is clearly not a practical approach.

To overcome the aforementioned drawback, we propose a low-complexity heuristic algorithm that can find a local optimum to problem in polynomial time. Specifically, our algorithm starts with an empty set and repeatedly performs one of the local operations, namely the remove operation or the exchange operation, as described in Routine ?, if it improves the set value . As we are dealing with two matroid constraints, the exchange operation involves adding one element from outside of the current set and dropping up to elements from the set, so as to comply with the constraints. In summary, our proposed heuristic algorithm for task offloading scheduling is presented in Algorithm ?.
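The local-search idea described above can be sketched as follows. This is an illustrative simplification rather than the article's routine: elements are assumed to be `(user, server, band)` triples, the exchange operation adds one element and drops only the (at most two) elements that conflict with it, and a move is accepted when it improves the set value by a small relative margin `eps`. The set function `value` is supplied by the caller; in the article's setting it would be the optimal-value function of the RA problem.

```python
def conflicts(chosen, elem):
    """Elements of `chosen` that share the user or the (server, band) slot."""
    user, server, band = elem
    return {e for e in chosen
            if e[0] == user or (e[1], e[2]) == (server, band)}


def local_search(ground, value, eps=1e-6):
    """Greedy local search with remove and exchange moves (assumed variant)."""
    chosen = frozenset()
    cur = value(chosen)
    improved = True
    while improved:
        improved = False
        # Remove operation: drop one element if that raises the value.
        for e in sorted(chosen):
            cand = chosen - {e}
            v = value(cand)
            if v > cur + eps * abs(cur):
                chosen, cur, improved = cand, v, True
                break
        if improved:
            continue
        # Exchange operation: add one element, drop whatever conflicts with it.
        for e in sorted(ground - chosen):
            cand = (chosen - conflicts(chosen, e)) | {e}
            v = value(cand)
            if v > cur + eps * abs(cur):
                chosen, cur, improved = cand, v, True
                break
    return chosen, cur


# Toy instance: two users, one server, two sub-bands, additive gains.
gains = {(0, 0, 0): 3.0, (0, 0, 1): 1.0, (1, 0, 0): 2.0, (1, 0, 1): 2.0}
decision, total = local_search(set(gains), lambda s: sum(gains[e] for e in s))
```

Every accepted move strictly increases the set value and the number of feasible sets is finite, so the loop terminates; the result is a local optimum with respect to these two moves, not necessarily the global one.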

: (Complexity Analysis of Algorithm ?) Parameter in Algorithm ? is any value such that is at most a polynomial in . Let be the optimal value of problem over the ground set . It is easy to see that where is the element with the maximum over all elements of . Let be the number of iterations for Algorithm ?. Since after each iteration the value of the function increases by a factor of at least , we have , and thus . Note that the number of queries needed to calculate the value of the objective function in each iteration is at most . Therefore, the running time of Algorithm ? is , which is polynomial in .

: (Solution of JTORA) Let be the output of Algorithm ?. The corresponding solutions for the uplink power allocation and for computing resource sharing can be obtained using Algorithm ? and the closed-form expression in , respectively, by setting . Thus, the local optimal solution for the JTORA problem is . While characterizing the degree of suboptimality of the proposed solution is a non-trivial task (mostly due to the combinatorial nature of the task offloading decision and the nonconvexity of the original UPA problem), in the next section we will show via numerical results that our heuristic algorithm performs close to the optimal solution obtained by exhaustive search.

## 6 Performance Evaluation

In this section, simulation results are presented to evaluate the performance of our proposed heuristic joint task offloading scheduling and resource allocation strategy, referred to as hJTORA. We consider a multi-cell cellular system consisting of multiple hexagonal cells with a BS in the center of each cell. The neighboring BSs are set apart from each other. We assume that both the users and BSs use a single antenna for uplink transmissions. The channel gains are generated using a distance-dependent path-loss model given as , and the log-normal shadowing variance is set to . In most simulations, if not stated otherwise, we consider cells and the users’ maximum transmit power set to . In addition, the system bandwidth is set to and the background noise power is assumed to be .

In terms of computing resources, we assume the CPU capability of each MEC server and of each user to be and , respectively. According to the realistic measurements in [24], we set the energy coefficient as . For the computation task, we consider the face detection and recognition application for airport security and surveillance [3], which can benefit greatly from the collaboration between mobile devices and the MEC platform. Unless otherwise stated, we choose the default setting values as , (following [3]), , , and , . In addition, the users are placed in random locations, with uniform distribution, within the coverage area of the network, and the number of sub-bands is set equal to the number of users per cell. We compare the system utility performance of our proposed hJTORA strategy against the following approaches.

• Exhaustive: This is a brute-force method that finds the optimal offloading scheduling solution via exhaustive search over possible decisions; since the computational complexity of this method is very high, we only evaluate its performance in a small network setting.

• GOJRA: All tasks (up to the maximum number that can be admitted by the BSs) are offloaded, as in [10]. In each cell, offloading users are greedily assigned to sub-bands that have the highest channel gains until all users are admitted or all the sub-bands are occupied; we then apply joint resource allocation across the BSs as proposed in Section 5-A, B.

• IOJRA: Each user is randomly assigned a sub-band from its home BS, then the users independently make offloading decisions [21]; joint resource allocation is employed.

• DORA: Each BS independently makes joint task offloading decisions and resource allocation for users within its cell [11].
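The greedy sub-band assignment used by the all-offload baseline could look like the sketch below. It is one plausible reading of "greedily assigned to sub-bands that have the highest channel gains", namely repeatedly taking the strongest remaining user/sub-band pair within a cell; the data layout and function name are assumptions.

```python
def greedy_assign(gains):
    """Assign users of one cell to sub-bands, strongest pairs first.

    gains: {(user, band): channel gain}. Returns {user: band}; users left
    out of the result could not be admitted because all bands were taken.
    """
    assignment = {}
    used_bands = set()
    ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
    for (user, band), _gain in ranked:
        if user not in assignment and band not in used_bands:
            assignment[user] = band
            used_bands.add(band)
    return assignment


cell_gains = {(0, 0): 0.9, (0, 1): 0.5, (1, 0): 0.8, (1, 1): 0.1}
print(greedy_assign(cell_gains))
```

In this toy cell, user 0 takes sub-band 0 and user 1 is left with sub-band 1 despite its weak gain there, which hints at why forcing every task to be offloaded can hurt the system utility.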

### 6.1 Suboptimality of Algorithm

Firstly, to characterize the suboptimality of our proposed hJTORA solution, we compare its performance with the optimal solution obtained by the Exhaustive method, and then with the three other described baselines. Since the Exhaustive method searches over all possible offloading scheduling decisions, its runtime is extremely long for a large number of variables; hence, we carry out the comparison in a small network setting with users uniformly placed in the area covered by cells, each having sub-bands. We randomly generate large-scale fading (shadowing) realizations and the average system utilities (with confidence interval) of different schemes are reported in Figure 2(a,b) when we set and , respectively. It can be seen that the proposed hJTORA performs very close to the optimal Exhaustive method, while significantly outperforming the other baselines. In both cases, the hJTORA algorithm achieves an average system utility within that of the Exhaustive algorithm, while providing up to , , and gains over the DORA, GOJRA, and IOJRA schemes, respectively. Additionally, in Table 1, we report the average runtime per simulation drop of different algorithms, running on a Windows 7 desktop with CPU and RAM. It can be seen that the Exhaustive method takes a very long time, about longer than the hJTORA algorithm for such a small network. The DORA algorithm runs slightly faster than hJTORA, while IOJRA and GOJRA require the lowest runtimes.

### 6.4 Effect of Users’ Preferences

Figure 5 shows the average time and energy consumption of all the users when we increase the users’ preference to time, ’s, between and while at the same time decreasing the users’ preference to energy as . It can be seen that the average time consumption decreases when increases, at the cost of higher energy consumption. In addition, when , the users experience a larger average time and energy consumption than in the case when . This is because when there are more users competing for the limited resources, the probability that a user can benefit from offloading its task is lower.

### 6.5 Effect of Inter-cell Interference Approximation

To test the effect of the approximation used to model the inter-cell interference as in in Section 5-A, we compare the system utility that the hJTORA solution obtains using the approximated expression versus using the exact expression of the inter-cell interference. Figure 6 shows the system utility when the users’ maximum transmit power ’s vary between and . It can be seen that the performance obtained using the approximation is almost identical to that of the exact expression when is below , while an increasing gap appears when . However, as specified in the LTE standard, 3GPP TS 36.101, Section 6.2.3 (see Footnote 1), the maximum UE transmit power is ; hence, we can argue that the proposed approximation can work well in practical systems.

## Appendix

Firstly, it is straightforward to verify that is twice differentiable on . We now check the second-order condition of a strictly quasi-convex function, which requires that a point satisfying also satisfies [31].

The first-order and second-order derivatives of can be calculated, respectively, as,

and

in which,

Suppose that ; to satisfy , it must hold that,

By substituting into , we obtain,

It can be easily verified that both and are strictly positive . Hence, , which confirms that is a strictly quasi-convex function in .

### Footnotes

1. Refer to: 3GPP TS36.101, V14.3.0, Mar. 2017

