A Quantum-Search-Aided Dynamic Programming Framework for Pareto Optimal Routing in Wireless Multihop Networks
Wireless Multihop Networks (WMHNs) have to strike a trade-off among diverse and often conflicting Quality-of-Service (QoS) requirements. The resultant solutions may be captured by the Pareto Front under the concept of Pareto Optimality. However, the problem of finding all the Pareto-optimal routes in WMHNs is classified as NP-hard, since the number of legitimate routes increases exponentially as the nodes proliferate. Quantum Computing offers an attractive framework for rendering the Pareto-optimal routing problem tractable. In this context, a pair of quantum-assisted algorithms have been proposed, namely the Non-Dominated Quantum Optimization (NDQO) and the Non-Dominated Quantum Iterative Optimization (NDQIO). However, their complexity is proportional to $\sqrt{N}$, where $N$ corresponds to the total number of legitimate routes, thus still failing to find the solutions in “polynomial time”. As a remedy, we devise a dynamic programming framework and propose the so-called Evolutionary Quantum Pareto Optimization (EQPO) algorithm. We analytically characterize the complexity imposed by the EQPO algorithm and demonstrate that it succeeds in solving the Pareto-optimal routing problem in polynomial time. Finally, we demonstrate by simulations that the EQPO algorithm achieves a complexity reduction of at least an order of magnitude when compared to its predecessors, albeit at the cost of a modest heuristic accuracy reduction.
List of Acronyms
|BBHT-QSA||Boyer-Brassard-Høyer-Tapp Quantum Search Algorithm|
|BER||Bit Error Ratio|
|CDP||Classical Dynamic Programming|
|CF(E)||Cost Function (Evaluation)|
|EQPO||Evolutionary Quantum Pareto Optimization|
|MODQO||Multi-Objective Decomposition Quantum Optimization|
|MO-ACO||Multi-Objective Ant Colony Optimization|
|NDQO||Non-Dominated Quantum Optimization|
|(P-)NDQIO||(Preinitialized) Non-Dominated Quantum Iterative Optimization|
|NSGA-II||Non-dominated Sorting Genetic Algorithm II|
|OPF-SR||Optimal Pareto Front Self-Repair|
|(O)PF||(Optimal) Pareto Front|
|WMHN||Wireless MultiHop Network|
I Introduction
The concept of Wireless Multihop Networks (WMHN)  enables the communication of remote nodes by forwarding the transmitted packets through a cloud of mobile relays. Naturally, the specific choice of the relays plays a significant role in the performance of WMHNs , thus bringing their routing optimization in the limelight. Explicitly, optimal routing relies on a fragile balance of diverse and often conflicting Quality-of-Service (QoS) requirements , such as the route’s overall Bit-Error-Ratio (BER) or Packet Loss Ratio (PLR), its total power consumption, its end-to-end delay, the route’s achievable rate, the entire system’s sum-rate and its “lifetime” .
For the sake of taking into account multiple QoS requirements, several studies consider single-component Objective Functions (OF) as their optimization objectives. In this context, the metric of Network Lifetime (NL) [5, 4] has been utilized, which involves the routes’ power consumption in conjunction with the nodes’ battery levels. Additionally, the so-called Network Utility (NU)  also constitutes a meritorious single-component optimization OF. Apart from the aforementioned QoS requirements, NU also takes into account the routes’ achievable rate . In conjunction with the construction of aggregate functions, the authors of [8, 9] also incorporate QoS as constraints, thus providing a more holistic view of the routing problem. In this context, Banirazi et al.  optimized an aggregate function of the Dirichlet routing cost as well as the average network delay at specific operating points that maximize the network throughput.
The beneficial properties of dynamic programming  have been exploited for the sake of identifying the optimal routes, while relying on single-component aggregate functions. In this context, Dijkstra’s algorithm [11, 12, 13] has been employed, since it is capable of approaching the optimal routes at the cost of imposing a complexity on the order of , where corresponds to the number of edges in the network’s graph. Additionally, the appropriately modified Viterbi decoding algorithm [14, 15] has also been utilized for solving single-component routing optimization problems, where the route exploration process can be viewed as a trellis graph and thus the routing problem is transformed into a decoding problem. Explicitly, this transformation is reminiscent of the famous Bellman-Ford algorithm .
The aforementioned approaches fail to identify the potential discrepancies among the QoS requirements, but they can be unified by the concept of Pareto Optimality. However, the search-space of multi-component optimization is inevitably expanded due to combining the single-component OFs. Furthermore, the complexity is on the order of $O(N^2)$, where $N$ corresponds to the total number of eligible routes. Additionally, since $N$ increases exponentially as the relay nodes proliferate, the Pareto-optimal routing problem is classified as Non-deterministic Polynomial hard (NP-hard). This escalating complexity can be partially mitigated by identifying a single Pareto-optimal solution. For instance, Gurakan et al. conceived an optimal iterative routing scheme for identifying a single Pareto-optimal solution in terms of the sum rate and the energy consumption of wireless energy-transfer-enabled networks. However, in our application we are primarily interested in identifying the entire set of Pareto-optimal solutions, since it provides fruitful insights into the underlying trade-offs. In this context, multi-objective evolutionary algorithms [18, 21, 22] have been employed for addressing the escalating complexity. In particular, Yetgin et al. used both the Non-dominated Sorting Genetic Algorithm II (NSGA-II) and the Multi-Objective Differential Evolution Algorithm (MODE) for optimizing the transmission routes in terms of their end-to-end delay and power dissipation. While considering a similar context, Camelo et al. invoked the NSGA-II for optimizing the same QoS requirements for both the ubiquitous Voice over Internet Protocol (VoIP) and for file transfer. Additionally, the so-called Multi-Objective Ant Colony Optimization (MO-ACO) algorithm has been employed in  for the sake of addressing the multi-objective routing problem in WMHNs.
Quantum computing provides a powerful framework [24, 25, 26] for the sake of rendering Pareto-optimal routing problems tractable by exploiting the so-called Quantum Parallelism (QP). Explicitly, in  Quantum Annealing has been invoked for the sake of optimizing the activation of the wireless links in wireless networks, while maintaining the maximum throughput and minimum interference as well as providing a substantial complexity reduction w.r.t. its classical counterpart, namely simulated annealing. In terms of Pareto optimal routing using universal quantum computing, the so-called Non-Dominated Quantum Optimization (NDQO) algorithm proposed in  succeeded in identifying the entire set of Pareto-optimal routes at the expense of a complexity, which is on the order of , relying on QP. As an improvement, the so-called Non-Dominated Quantum Iterative Optimization (NDQIO) algorithm was proposed in . Explicitly, the NDQIO algorithm is also capable of identifying the entire set of Pareto-optimal routes, while imposing a parallel complexity and a sequential complexity defined in , which is on the order of and , respectively, by relying on the beneficial synergy between QP and Hardware Parallelism (HP). (We define the parallel complexity as the complexity imposed while taking into account the degree of parallelism. By contrast, the sequential complexity does not consider any kind of parallelism. In , they are referred to as normalized execution time and normalized power consumption, respectively.) Note that corresponds to the number of Pareto-optimal routes.
Despite the substantial complexity reduction offered both by the NDQO and the NDQIO algorithms, the multi-objective problem still remains intractable when the network comprises an excessively high number of nodes, due to the escalating complexity. Explicitly, Zalka has demonstrated that the complexity order of $O(\sqrt{N})$ for searching a database of $N$ entries is the minimum possible, as long as the database values are uncorrelated. By contrast, when the formation of the Pareto-optimal route-combinations becomes correlated owing to socially-aware networking, a further complexity reduction can be achieved. Based on this specific observation, we will design a novel algorithm, namely the Evolutionary Quantum Pareto Optimization (EQPO), in order to exploit the correlations exhibited by the individual Pareto-optimal routes by appropriately constructing trellis graphs that guide the search process in the same fashion as in Viterbi decoding. Furthermore, we will also exploit the synergies between QP and HP for the sake of achieving an additional complexity reduction by considering as low a fraction of the database entries as possible, while still guaranteeing a near-full-search-based performance.
Our contributions are summarized as follows:
In Section III, we develop a novel multi-objective dynamic programming framework for generating potentially Pareto-optimal routes relying on the correlations of the specific links constituting the Pareto-optimal routes, hence substantially reducing the total number of routes considered. Explicitly, this framework is a multi-objective extension of the popular single-objective Bellman-Ford algorithm.
In Section IV, we propose a novel quantum-assisted algorithm, namely the Evolutionary Quantum Pareto Optimization algorithm, which jointly exploits our novel dynamic programming framework as well as the synergies between the QP and the HP for the sake of solving the multi-objective routing problem of WMHNs.
In Section V, we also characterize the performance versus complexity of the EQPO algorithm and demonstrate that it achieves both a parallel and a sequential complexity reduction of at least an order of magnitude for a 9-node WMHN, when compared to that of the NDQIO algorithm.
The rest of this paper is organized as follows. In Section II, we will briefly discuss the specifics of the network model considered in our case study. In Section III, we will present a dynamic programming framework, which is optimal in terms of its heuristic accuracy. In Section IV, we will relax the optimal framework of Section III for the sake of striking a better accuracy versus complexity trade-off with the aid of our EQPO algorithm. Subsequently, in Section V-A we will analytically characterize the EQPO algorithm’s complexity and in Section V-B we will evaluate its performance.
II Network Specifications
In the context of this treatise, the model of the networks considered both in  and in  has been adopted. To elaborate further, the WMHN considered is a fully connected network and it consists of a single Source Node (SN), a single Destination Node (DN) and a cloud of Relay Nodes (RN). The SN and the DN are located in the opposite corners of a (100 × 100) m² square-block area, which is the WMHN coverage area considered. By contrast, the RNs are considered to be roaming within the coverage area having random locations, which obey the uniform distribution within the WMHN coverage area. A WMHN topology is exemplified in Fig. 1 for a WMHN consisting of 5 nodes in total. Additionally, a cluster-head equipped with a quantum computer, which is responsible for collecting all the required WMHN information, such as the nodes’ geolocations and their interference levels, is considered to be present at the DN side. Therefore, we should point out that this treatise is focused on a centralized protocol.
Based on the network information gathered, the WMHN cluster-head has to identify the optimal routes emerging from the SN to the DN based on certain Utility Functions (UF). Similar to  and , we have jointly taken into account the route’s overall delay, its overall power consumption and its overall Bit Error Ratio (BER). Before delving into the UFs, let us define a legitimate route of our WMHN consisting of nodes, as , which contains each RN only once for the sake of limiting the total number of legitimate routes, while at the same time avoiding routes associated with excessive power consumption and delay. Note that we have associated the SN and the DN with the node indices 1 and , respectively, in the context of this treatise. Additionally, these legitimate routes are mapped to a specific index under lexicographic ordering using Lehmer Encoding, which maps a specific permutation to an index in the factoradic basis. The route’s overall delay is considered as one of our UFs, which is quantified in terms of the number of hops established by the route. This is formally formulated as follows:
where the operator corresponds to the number of nodes along the route including the SN and DN. Moving on to the -th route’s overall power consumption , it is proportional to the sum of path-losses incurred by each of the individual links constituting the route. Explicitly, the path-loss quantified in dB for a single link between the -th and the -th nodes is equal to :
where corresponds to the path-loss exponent, is the distance between the two nodes and denotes the carrier’s wavelength. In our case-study we have set and m corresponding to a frequency of GHz. Consequently, the second UF is formulated as follows:
Moving on to the final UF, namely the BER, let us first elaborate on the interference levels experienced by the nodes. In our specific scenario, there is only a single pair of source and destination nodes, resulting in a single route being active. Additionally, we have assumed that the WMHN has a sufficient number of orthogonal spreading codes and sub-carriers for the sake of efficiently separating the routes as in . In this context, there is no interference stemming from the WMHN itself; however, we have assumed that a sufficiently high number of users access the channel, hence the resultant interference can be treated as Additive White Gaussian Noise (AWGN), owing to the Central Limit Theorem (CLT) . Therefore, the interference is modeled by a random Gaussian process, with its mean set to -90 dBm and its standard deviation set to 10 dB, while the transmission power is set to dBm. Additionally, the nodes transmit their messages using the uncoded QPSK scheme  over uncorrelated Rayleigh fading channels and utilize Decode-and-Forward relaying  for forwarding the respective messages. Based on these assumptions, we can readily use the closed-form BER performance of the adopted scheme versus the received Signal-to-Noise Ratio (SNR), while the overall route’s BER can be calculated using the following recursive formula :
which corresponds to the output BER of a two-stage Binary Symmetric Channel (BSC) , where and represent the BER associated with the first and the second stage, respectively.
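The three UFs described above can be illustrated with a minimal Python sketch. Several details are assumptions made purely for illustration and are not taken from the original equations: the log-distance form `10*alpha*log10(4*pi*d/wavelength)` of the path-loss, the default values of `alpha` and `wavelength`, the hypothetical callback `link_ber(a, b)` that returns a per-link BER, and the ordering of the returned tuple. The cascade rule is the standard output error probability of two concatenated binary symmetric channels.

```python
import math

def path_loss_db(distance_m, alpha=3.0, wavelength_m=0.125):
    """Path-loss in dB of a single link (assumed log-distance form and default values)."""
    return 10.0 * alpha * math.log10(4.0 * math.pi * distance_m / wavelength_m)

def cascade_ber(p1, p2):
    """Output BER of a two-stage binary symmetric channel:
    an error occurs when exactly one of the two stages flips the bit."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)

def route_utilities(route, positions, link_ber):
    """Return (end-to-end BER, summed path-loss in dB, delay in hops) of a route.

    route     -- sequence of node indices, from the SN to the DN
    positions -- dict mapping node index to (x, y) coordinates in metres
    link_ber  -- hypothetical callback giving the BER of the link (a, b)
    """
    hops = len(route) - 1            # delay: number of hops
    total_loss = 0.0                 # power-consumption proxy: sum of link path-losses
    ber = 0.0                        # recursive BER accumulation, link by link
    for a, b in zip(route, route[1:]):
        total_loss += path_loss_db(math.dist(positions[a], positions[b]))
        ber = cascade_ber(ber, link_ber(a, b))
    return ber, total_loss, hops
```

The recursion starts from an error-free "zeroth" stage, so that after the first link the accumulated BER equals that link's BER.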
Having described the UFs considered, let us now proceed by defining our optimization problem. Explicitly, we will jointly consider the UFs in the form of a Utility Vector (UV) , which is defined as follows:
where and correspond to the -th route’s delay and power consumption defined in Eqs. (1) and (3), while denotes the -th route’s end-to-end BER, which is recursively evaluated using Eq. (4). Explicitly, we opt for jointly minimizing the entire set of UFs considered by the UV of Eq. (5). Therefore, for the evaluation of the fitness of the UVs we will utilize the concept of Pareto Optimality (the readers should refer to  for a more detailed tutorial on Pareto optimality), which is encapsulated by Definitions 1 and 2.
Definition 1 (Pareto Dominance): A particular route associated with the UV , where is the number of the UFs considered, is said to strongly dominate another route associated with the UV , denoted by , iff we have , . Similarly, the route is said to weakly dominate another route , denoted by , iff we have , and , so that we have .
Definition 2 (Pareto Optimality): A particular route associated with the UV is Pareto-optimal, iff there is no route that dominates , i.e. we have so that is satisfied. Similarly, the route is strongly Pareto-optimal iff there is no route that weakly dominates , i.e. we have , so that is satisfied.
Explicitly, Definition 1 provides us with the criterion for evaluating the fitness of a specific route with respect to another reference route, while Definition 2 outlines the condition of the specific route’s optimality. Based on the number of routes dominating a specific route, it is possible to group the routes into the so-called Pareto Fronts (PF). Explicitly, the PF comprising the Pareto-optimal routes, which are dominated by no other routes according to Definition 2, is often referred to as the Optimal Pareto Front (OPF).
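A minimal Python sketch of the two dominance relations and of an exhaustive OPF search is given below. It assumes that every UV is a tuple of values to be minimized, with "strong" dominance requiring a strict improvement in every UF and "weak" dominance requiring no degradation in any UF plus a strict improvement in at least one. The function names are illustrative only.

```python
def strongly_dominates(u, v):
    """u strongly dominates v: u is strictly better (lower) in every UF."""
    return all(a < b for a, b in zip(u, v))

def weakly_dominates(u, v):
    """u weakly dominates v: u is no worse in every UF and strictly better in at least one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def pareto_front(uvs, dominates=weakly_dominates):
    """Indices of the UVs that are not dominated by any other UV (exhaustive pairwise search)."""
    return [i for i, u in enumerate(uvs)
            if not any(dominates(v, u) for j, v in enumerate(uvs) if j != i)]
```

Passing `dominates=strongly_dominates` yields the larger front of routes that no other route beats in every single UF, whereas the default removes any route that is matched or beaten everywhere and strictly beaten somewhere.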
In our application, our aim is to identify the entire set of weakly Pareto-optimal routes for the sake of gaining insight into the routing trade-offs associated with the UFs considered. Naturally, for the sake of identifying a specific route as Pareto-optimal we have to perform precisely Pareto-dominance comparisons, where corresponds to the total number of legitimate routes. Therefore, the complexity imposed by the exhaustive search aiming for identifying the entire set of routes belonging to the OPF is on the order of . Explicitly, the total number of legitimate routes increases exponentially as the number of nodes increases , hence rendering the multi-objective routing problem as NP-hard. Thus sophisticated methods are required for finding all of the solutions.
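To give a feel for how quickly the search space grows, the following sketch enumerates the legitimate routes under the stated rule that each RN appears at most once, using node index 1 for the SN and the highest index for the DN. The closed-form count and the helper names are our own illustration rather than expressions quoted from the paper.

```python
from itertools import permutations
from math import factorial

def count_legitimate_routes(n_nodes):
    """Number of SN-to-DN routes that visit each of the n_nodes - 2 RNs at most once."""
    r = n_nodes - 2
    # Choose and order k of the r relays, for every k from 0 (direct route) to r.
    return sum(factorial(r) // factorial(r - k) for k in range(r + 1))

def enumerate_routes(n_nodes):
    """Yield every legitimate route as a tuple of node indices (SN = 1, DN = n_nodes)."""
    relays = range(2, n_nodes)
    for k in range(n_nodes - 1):
        for perm in permutations(relays, k):
            yield (1, *perm, n_nodes)
```

For the 5-node example this yields 16 routes, while a 9-node WMHN already has 13700; an exhaustive search that compares every route against every other route therefore scales quadratically in a number that itself grows factorially with the number of RNs.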
Let us now proceed by elaborating on our novel dynamic framework designed for efficiently exploring the search space.
III Multi-Objective Routing Dynamic Programming Framework
Before delving into the analysis of our multi-objective dynamic programming framework, which is specifically tailored for our routing problem, we will express each of the UFs considered in the UV of Eq. (5) as a weighted sum of the specific UFs associated with the individual links comprised by a particular route. Explicitly, the power consumption has already been expressed in this form based on Eq. (3). As for the delay, which we have defined as the number of hops, it may be redefined as follows:
where corresponds to the Kronecker delta function , while and represent the route and its associated index, respectively. As for the route’s overall BER, the recursive formula of Eq. (4) may be approximated as follows:
where represents the BER of the specific link established between the -th and the -th nodes, while is the approximation error, which is on the order of:
Since the sum of the products of all the links’ BER will be several orders of magnitude lower than their sum, the approximation error of Eq. (7) may be deemed to be negligible.
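The quality of this additive approximation can be checked numerically with a short sketch. The link BER values used in the usage note are arbitrary illustrative numbers, and the second-order expression is our own expansion of the two-stage recursion, so it may differ in form from the paper's error term.

```python
from functools import reduce
from itertools import combinations

def exact_route_ber(link_bers):
    """End-to-end BER obtained by applying the two-stage BSC rule link by link."""
    return reduce(lambda p, q: p + q - 2.0 * p * q, link_bers, 0.0)

def approx_route_ber(link_bers):
    """Additive approximation: the sum of the individual link BERs."""
    return sum(link_bers)

def second_order_error(link_bers):
    """Leading term of the approximation error: twice the sum of pairwise BER products."""
    return 2.0 * sum(p * q for p, q in combinations(link_bers, 2))
```

For example, with link BERs of `[1e-3, 2e-3, 5e-4]` the additive value exceeds the exact one by roughly `7e-6`, i.e. about three orders of magnitude below the BER itself, which is consistent with treating the error as negligible at low link BERs.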
Having expressed the UFs considered as a weighted sum of the UFs associated with their links, we may now proceed by exploiting this specific property for the sake of achieving a further complexity reduction. In fact, it is possible to transform our composite multi-objective routing problem into a series of smaller subproblems, thus arriving at a dynamic programming structure. This transformation is performed with the aid of Definition 3 in conjunction with Proposition 1.
Definition 3: A specific route is said to generate another route by inserting the single RN node between the previous RN and the DN. Explicitly, the resultant route is , .
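The generation rule of Definition 3 can be sketched in a few lines of Python. Routes are represented as tuples of node indices with the SN as node 1 and the DN as node `n_nodes`; the function name is illustrative.

```python
def generate_routes(route, n_nodes):
    """Routes generated from `route` by inserting one unused RN right before the DN."""
    unused = sorted(set(range(2, n_nodes)) - set(route))
    return [route[:-1] + (rn, route[-1]) for rn in unused]
```

For instance, in a 5-node WMHN the two-hop route `(1, 2, 5)` generates `(1, 2, 3, 5)` and `(1, 2, 4, 5)`, while a route that already contains every RN generates nothing.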
Proposition 1: Let us consider a specific route associated with the UV and its sub-route associated with the UV . Let us assume furthermore that each component of the UV associated with the route has a positive value and that it can be expressed as a sum of the respective UFs of its individual links , i.e. we have:
with , where and are the number of optimization objectives and the set of legitimate routes, respectively. The route cannot generate any Pareto-optimal routes using the rule of Definition 3 if there is a route from the SN to the DN associated with that weakly dominates the sub-route , i.e. if we have . The respective proof is presented in Appendix A.
Explicitly, Proposition 1 guarantees that a specific route comprised by the sub-route cannot generate Pareto-optimal routes by adding an intermediate RN to between its last RN and the DN, if the sub-route is weakly dominated by any of the legitimate routes. Explicitly, should its sub-route be sub-optimal, the respective route will be sub-optimal as well, since we have , based on Proposition 1. Note that the opposite of this statement does not apply, since there exist sub-optimal routes, whose sub-routes are indeed Pareto-optimal.
|Route||Route UV||Sub-route UV||Optimal Route||Optimal Sub-route|
This specific property can be exploited for the sake of reducing the search-space size required for identifying the entire OPF. To elaborate further, we can devise an irregular trellis graph for the sake of guiding the search space exploration, as portrayed in Fig. 2 for the 5-node WMHN of Fig. 1. Note however that this specific trellis graph is different from those used for channel coding in , since in the latter we only have as many legitimate paths as legitimate symbols. By contrast, here all transitions represent legitimate routes in our scenario. Additionally, we rely on Definition 3 for the sake of determining the possible trellis-node transitions. For instance, observe in Fig. 2 that a trellis-path emerging from the trellis-node associated with the generator route is only capable of visiting the nodes associated with the routes and , since a single RN is inserted before the DN into the generator route based on Definition 3. Moving on to the next trellis stages, during the -th trellis stage the following three steps are carried out:
III-1 Generated Routes
The set of generated routes are constructed based on the set of surviving routes of the previous stage and relying on Definition 3.
III-2 Pareto-Optimal Routes
The set of Pareto-optimal routes is identified based on the following optimization problem:
Note that the optimization problem of Eq. (10) considers the joint search space constituted by all the routes of the -th trellis stage as well as by the Pareto-optimal routes of the previous stage. Using recursion, we can readily observe that the Pareto-optimal routes of the previous stage contain the Pareto-optimal routes across all stages up to the -st stage. This property is beneficial for our dynamic programming framework, since it eliminates the need for backwards propagation, thus only requiring the employment of a feed-forward method for the identification of the entire OPF.
III-3 Surviving Routes
The set of surviving routes is identified based on the following optimization problem:
where corresponds to the particular sub-route of , having all the links of , except for the last hop, as detailed in Proposition 1.
The optimization process proceeds to the next trellis stage as long as there exist surviving routes, i.e. we have , and the maximum affordable number of trellis stages (which is equal to the maximum number of hops of the legitimate routes) has not been exhausted. Otherwise, the optimization process terminates by exporting the hitherto identified OPF.
Let us now proceed by elaborating on the route exploration process using the 5-node WMHN example of Fig. 1. Its respective trellis is portrayed in Fig. 2, while the routes’ and their respective sub-routes’ UVs are shown in Table I. Initially, the optimization process considers the set of routes, which is constituted by all the legitimate routes having a single and two hops, namely the routes , , and , as portrayed in the first trellis stage of Fig. 2. Based on Table I, all the routes considered are Pareto-optimal and thus the respective set is equal to . Subsequently, the set of surviving routes is constructed. Explicitly, the direct route is not considered in this case, since its inclusion leads to the generation of routes, which have already been processed. Observe in Table I that all the routes constituted by 2 hops have Pareto-optimal sub-routes and hence the set of surviving routes becomes .
After the identification of the set of surviving routes , the set of routes generated in the 2nd trellis stage is created by including an appropriate RN right before the DN, as annotated with the aid of black arrows in Fig. 2. Naturally, since all the routes constituted by two hops have been identified as being Pareto-optimal, the entire set of routes having three hops is visited by the trellis-paths in the 2nd trellis stage, as seen in Fig. 2. The set of Pareto-optimal routes of the first trellis stage is then concatenated to the set of the routes generated in the second trellis stage and the set of Pareto-optimal routes is identified. After this operation, the latter is set to , , hence including the route to the OPF, as denoted with the aid of the bold rectangle in Fig. 2. The surviving routes of the second trellis stage are then identified using the optimization problem of Eq. (11). Explicitly, they constitute the set , as it may be verified by Table I and denoted with the aid of the gray-filled nodes of Fig. 2.
In the presence of surviving routes, the optimization process proceeds with the final trellis stage; however, in this case the routes and are not considered, since their generators do not have Pareto-optimal sub-routes. This is portrayed in Fig. 2 with the aid both of the gray dashed arrows and of the gray dashed nodes. Hence, the set is generated. The set is then concatenated to that of the routes generated in the final trellis stage and the final set of Pareto-optimal routes is identified. Explicitly, the latter is identical to the respective set of the second trellis stage, since none of the routes generated in the final stage is Pareto-optimal, as verified by Table I. Additionally, since we have reached the final stage, the set of surviving routes is not identified and the process exits by exporting the hitherto observed OPF.
In a nutshell, this route exploration process succeeds in transforming the multi-objective routing problem into a series of significantly less complex sub-problems, each corresponding to a single trellis stage, hence inheriting the structure of dynamic programming problems . Note that the metric-accumulation, which is typical in dynamic programming problems, is constituted by the update of the Pareto-optimal routes. Note that this dynamic programming framework is optimal in terms of its efficacy in identifying the entire OPF, just like the exhaustive search method. Primarily, this is a benefit of Proposition 1, which excludes the routes that are incapable of generating Pareto-optimal routes during the next trellis stages.
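The stage-by-stage exploration described in this section can be summarized by the following classical Python sketch. It is an illustration under stated assumptions rather than the paper's exact formulation: `link_uv` is a hypothetical dictionary holding a positive utility vector for every directed link, route UVs are obtained by summing link UVs, and a route survives if its sub-route (the route without its final hop) is not weakly dominated by any route of the OPF identified so far.

```python
def weakly_dominates(u, v):
    """u is no worse than v in every UF and strictly better in at least one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def route_uv(route, link_uv):
    """UV of a (sub-)route as the component-wise sum of its links' UVs."""
    links = [link_uv[(a, b)] for a, b in zip(route, route[1:])]
    return tuple(sum(component) for component in zip(*links))

def non_dominated(routes, uv):
    """Routes that are not weakly dominated by any other route in the list."""
    return [x for x in routes
            if not any(weakly_dominates(uv(y), uv(x)) for y in routes if y != x)]

def dp_pareto_routes(n_nodes, link_uv):
    """Feed-forward trellis search for the OPF (SN = 1, DN = n_nodes)."""
    sn, dn = 1, n_nodes
    relays = range(2, n_nodes)
    uv = lambda r: route_uv(r, link_uv)
    # First trellis stage: the direct route plus all two-hop routes.
    generated = [(sn, dn)] + [(sn, r, dn) for r in relays]
    opf = []
    while generated:
        # Step 2: update the OPF over the new routes and the previous OPF.
        opf = non_dominated(opf + generated, uv)
        # Step 3: keep the routes whose sub-route is not dominated by the OPF
        # (the direct route is skipped, since its successors are already covered).
        survivors = [x for x in generated if len(x) > 2
                     and not any(weakly_dominates(uv(y), uv(x[:-1])) for y in opf)]
        # Step 1 of the next stage: insert one unused RN right before the DN.
        generated = [x[:-1] + (r, dn) for x in survivors for r in relays if r not in x]
    return opf
```

Because every link UV is assumed to be positive, any route extending a dominated sub-route is itself dominated, so the pruning in Step 3 does not discard Pareto-optimal routes and the loop ends once no route can be extended.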
IV Evolutionary Quantum Pareto Optimization
In Section III, we introduced a novel dynamic programming framework for the sake of guiding the search process in identifying the Pareto-optimal routes, thus effectively reducing the complexity. In this section, we exploit this framework and further improve it with the aid of our EQPO algorithm. More specifically, we have relaxed the dynamic programming framework of Section III for the sake of striking a better accuracy versus complexity trade-off. Additionally, we have improved the quantum-assisted process of  for identifying the Pareto-optimal routes, so that it becomes capable of “remembering” the OPF identified in the previous trellis stages. We will refer to this improved quantum-assisted process as the Preinitialized-NDQIO (P-NDQIO) algorithm. In this context, the P-NDQIO and the EQPO algorithms are presented in Sections IV-A and IV-B, respectively. Let us now proceed by presenting the P-NDQIO algorithm.
IV-A Preinitialized NDQIO Algorithm
The P-NDQIO algorithm, which is formally stated in Alg. 1, relies primarily on the technique of memorization, thus providing a significant complexity reduction by remembering and propagating the OPF identified across the previous trellis stages to the next ones. Its memorization is performed in Step 1 of Alg. 1, where the OPF of the current trellis stage is initialized to that of the previous stage. Subsequently, the P-NDQIO algorithm performs its iterations, looking for Pareto-optimal routes in Steps 2-14 of Alg. 1.
During each iteration, which results in identifying a single Pareto-optimal route, the P-NDQIO algorithm first invokes the so-called Boyer-Brassard-Høyer-Tapp Quantum Search Algorithm (BBHT-QSA) for the sake of identifying routes that are not dominated by any of the routes belonging to the hitherto identified OPF. We refer to this process as the Backward BBHT-QSA (BW-BBHT-QSA) process. If an invalid route-solution (i.e. a route that is indeed dominated by the OPF identified so far) is output by the BBHT-QSA, the P-NDQIO algorithm concludes that the entire OPF has been identified. However, since the BBHT-QSA exhibits a low probability of failing to identify a valid solution (where a valid route-solution is defined as a route that satisfies the condition in Step 5 of Alg. 1), the BW-BBHT-QSA step is repeated for an additional iteration in order to ensure the detection of the entire OPF, as seen in Steps 12 and 14 of Alg. 1. Otherwise, should a valid route-solution be identified by the BW-BBHT-QSA step, this specific route is classified as “potentially” being Pareto-optimal. Consequently, the P-NDQIO algorithm invokes the so-called BBHT-QSA chain process [19, 30] in Steps 6-9 of Alg. 1. Explicitly, the output of the BW-BBHT-QSA is set as the initial reference solution in Step 7 of Alg. 1 and a BBHT-QSA process is activated in Step 8 of Alg. 1, which searches for routes that dominate the reference one. If a route that dominates the reference one is found, the reference route is updated to the BBHT-QSA output and a new BBHT-QSA process is activated. Naturally, the activation of the BBHT-QSA process is repeated until a particular route is output by the BBHT-QSA that does not dominate the reference route, thus indicating that the reference route is Pareto-optimal.
Subsequently, the Pareto-optimal routes of the set are checked as to whether they are dominated by the reference route, so that they are removed and the reference route is then included in , as seen in Step 10 of Alg. 1. Explicitly, this check, which is referred to as the OPF Self-Repair (OPF-SR) process in , provides the EQPO algorithm with resilience against including sub-optimal routes in the early trellis stages due to the limited number of generated routes, hence preventing their propagation to the later stages.
Both the BW-BBHT-QSA process and the BBHT-QSA chains are parts of the original NDQIO algorithm; thus, the P-NDQIO algorithm employs quantum circuits that are identical to those of the NDQIO algorithm. Therefore, the motivated readers may refer to  for extended discussions.
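The control flow of the P-NDQIO iterations described above may be sketched classically as follows. This is emphatically not a quantum implementation: `search_stand_in` is a hypothetical classical placeholder for the BBHT-QSA that returns a valid candidate whenever one exists and an arbitrary candidate otherwise, so the sketch only mirrors the sequence of preinitialization, backward search, chain search and OPF self-repair. It also assumes that routes already contained in the OPF are excluded from the backward search, and the step numbers of Alg. 1 are not reproduced.

```python
import random

def weakly_dominates(u, v):
    """u is no worse than v in every UF and strictly better in at least one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def search_stand_in(candidates, is_valid, rng):
    """Classical placeholder for the BBHT-QSA (assumption, not a quantum search)."""
    valid = [c for c in candidates if is_valid(c)]
    return rng.choice(valid) if valid else rng.choice(candidates)

def p_ndqio_sketch(routes, uv, prev_opf, rng=None, extra_tries=1):
    """Return the OPF over `routes` and `prev_opf`; `uv` maps a route to its UV."""
    rng = rng or random.Random(0)
    opf = list(prev_opf)                          # preinitialization with the previous OPF
    pool = list(dict.fromkeys(list(prev_opf) + list(routes)))
    if not pool:
        return opf
    failures = 0
    while failures <= extra_tries:                # repeat once more after an invalid output
        def not_dominated(x):
            return x not in opf and not any(weakly_dominates(uv(y), uv(x)) for y in opf)
        ref = search_stand_in(pool, not_dominated, rng)   # backward search
        if not not_dominated(ref):
            failures += 1
            continue
        failures = 0
        while True:                               # chain: climb towards a Pareto-optimal route
            cand = search_stand_in(pool, lambda x: weakly_dominates(uv(x), uv(ref)), rng)
            if not weakly_dominates(uv(cand), uv(ref)):
                break
            ref = cand
        # OPF self-repair: drop the routes dominated by the new reference, then add it.
        opf = [y for y in opf if not weakly_dominates(uv(ref), uv(y))] + [ref]
    return opf
```

With a genuine BBHT-QSA each call may fail with a small probability, which is what the additional attempt is meant to guard against; in this placeholder the search never misses an existing solution, so the extra attempt is redundant and kept only to reflect the described flow.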
IV-B EQPO Algorithm
The dynamic framework introduced in Section III, albeit optimal in terms of its capability of identifying the entire OPF, may impose an excessive complexity quantified in terms of the number of dominance comparisons required for solving the optimization problem of Eq. (11). To elaborate further, as the number of UFs considered increases, the number of surviving routes is increased due to the differences among the UFs. This in turn leads to the proliferation of the number of routes generated per trellis stage. However, only a relatively small fraction of the surviving route-population leads eventually to generating Pareto-optimal routes in the next trellis stages. Therefore, the employment of the optimal dynamic framework presented in Section III imposes a significant complexity overhead for the sake of ensuring the detection of the entire set of Pareto-optimal routes. Consequently, a performance versus complexity trade-off has to be struck for the sake of mitigating this complexity overhead. In fact, this specific balance is struck in the context of the EQPO algorithm by jointly relying on Relaxations 1 and 2.
A route can only generate optimal routes based on Definition 3, if it is Pareto-optimal. This is formally formulated as follows:
Relaxation 1 restricts the set of the surviving routes at the end of the -th trellis stage to the set of the newly-discovered Pareto-optimal routes at this specific trellis stage. This relaxation provides beneficial complexity reduction, since it makes the identification both of the set of surviving routes and of the set of Pareto-optimal routes possible by simply solving the optimization problem of Eq. (10). Explicitly, Proposition 1 does not conflict with Relaxation 1, since the Pareto-optimal routes are guaranteed to have Pareto-optimal sub-routes. This is justified by the fact that the sub-routes dominate their routes due to the absence of the final hop, which results in increasing all the UFs considered. Thus, since there exists no route from the SN to the DN dominating the route identified, there exist no routes dominating the respective sub-route either. However, the complexity reduction offered by Relaxation 1 comes at the price of reduced accuracy, since sub-optimal routes having Pareto-optimal sub-routes do exist, which might potentially lead to the generation of Pareto-optimal routes in the next trellis stages. This specific limitation is mitigated with the aid of Relaxation 2.
For the sake of facilitating the identification of all Pareto-optimal routes, Definition 3 is relaxed as follows: a specific route is said to generate another route by inserting the single RN between the -th and the -st nodes.
Relaxation 2 extends the set of generated routes, which are created by the set of surviving routes of the previous trellis stage. This is realized by replacing a single direct link established either by two RNs or by an RN and the DN with an indirect link involving an appropriate RN as an intermediate relay. Naturally, this specific modification enhances the heuristic accuracy of the EQPO algorithm, since it allows the generation of additional routes, thus acting similarly to the mutation operation of genetic algorithms .
Let us now proceed by elaborating on the specifics of the EQPO algorithm, which is formally presented in Alg. 2. To elaborate further, in Step 1 of Alg. 2 the EQPO algorithm initializes the set of routes generated, the Pareto-optimal routes as well as the surviving routes to the direct route, i.e. to the route . It then proceeds with the trellis stages using Steps 2-8 of Alg. 2. During each trellis stage, the set of generated routes is constructed in Step 4 of Alg. 2 relying on Relaxation 2. Applying Relaxations 1 and 2 to the trellis of Fig. 2 results in the trellis of Fig. 3.
This set is then concatenated with the set of Pareto-optimal routes identified in the previous stage. Subsequently, the P-NDQIO algorithm is invoked in Step 6 of Alg. 2 for the sake of identifying the set of Pareto-optimal routes from the set . Then, the set of surviving routes is determined in Step 7 of Alg. 2, relying on Relaxation 1.
More specifically, the steps carried out as part of the EQPO algorithm’s dynamic programming framework during a single trellis stage are listed as follows:
IV-B1 Route Generation
EQPO creates the set of routes based on the set of surviving routes from the previous trellis stage using Relaxation 2, as seen in Step 4 of Alg. 2. For instance, observe in Fig. 3 that the route is capable of generating 4 routes, namely the routes , , , . By contrast, Definition 3 allows the generation of only the first two routes, as portrayed in Fig. 2. Additionally, in contrast to the optimal dynamic programming framework of Section III, each route of the current trellis stage in Fig. 3 can be generated by multiple surviving routes of the previous stage. This specific feature of Relaxation 2 enhances the heuristic accuracy, since it enables the generation of potentially Pareto-optimal routes, which have suboptimal constructors and hence would be disregarded based on Relaxation 1.
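The route generation step may be sketched as follows. This is a minimal illustration rather than Step 4 of Alg. 2 itself: routes are represented as tuples of hypothetical integer node labels, and we assume that a single unused RN may be inserted into any link of the route, including the one leaving the SN. The latter assumption is consistent with a two-hop route of the 5-node example generating 4 routes.

```python
from typing import List, Sequence, Tuple

Route = Tuple[int, ...]  # node labels from the SN to the DN (hypothetical)


def generate_routes(route: Route, relay_nodes: Sequence[int]) -> List[Route]:
    """Insert one unused RN between any two consecutive nodes of the route."""
    unused = [r for r in relay_nodes if r not in route]
    generated = []
    for pos in range(1, len(route)):  # pos is the index the new RN will take
        for r in unused:
            generated.append(route[:pos] + (r,) + route[pos:])
    return generated


# Node 1 is the SN, node 5 is the DN and nodes 2-4 are the RNs.
print(generate_routes((1, 2, 5), [2, 3, 4]))
```

For the two-hop route `(1, 2, 5)` the sketch returns the routes `(1, 3, 2, 5)`, `(1, 4, 2, 5)`, `(1, 2, 3, 5)` and `(1, 2, 4, 5)`. Since different surviving routes may return the same generated route, a caller would typically remove the duplicates before the dominance comparisons.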
IV-B2 Pareto-Optimal and Surviving Routes
Following the construction of the set of the routes generated, the EQPO algorithm invokes the P-NDQIO algorithm of Section IV-A in Step 6 of Alg. 2 in order to search for new Pareto-optimal routes belonging to the set . However, based on Definition 2, the optimality of the route depends on the set of eligible routes considered. Consequently, the OPF hitherto identified across all the previous trellis stages has to be concatenated with in Step 5 of Alg. 2, thus ensuring that the routes identified as optimal by the P-NDQIO algorithm are indeed Pareto-optimal with respect to the entire set of legitimate routes. Note that the set contains the Pareto-optimal routes across all trellis stages all the way up to the -th one, as in the optimal dynamic programming framework of Section III. Consequently, using Relaxation 1 the Pareto-optimal routes identified at the current trellis stage are considered as surviving routes. Note that the Pareto-optimal routes identified throughout the previous stages are not taken into account, since they would generate routes already processed during the previous trellis stages.
The EQPO algorithm continues processing the trellis stages either until it reaches a trellis stage having no surviving paths or when the maximum affordable number of trellis stages is exhausted, in a similar fashion to the optimal dynamic programming framework of Section III.
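The per-stage processing described above may be summarized by the following sketch of the trellis loop. It is only an illustration under stated assumptions: the P-NDQIO algorithm is replaced by an exhaustive classical search (`pareto_front`), the function and variable names are hypothetical, and `toy_uf` is an invented two-component UF vector (number of hops and sum of squared label differences) used purely for demonstration, where lower values are assumed to be better.

```python
from typing import Callable, List, Sequence, Tuple

Route = Tuple[int, ...]
UF = Tuple[float, ...]


def dominates(a: UF, b: UF) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def insert_relay(route: Route, relays: Sequence[int]) -> List[Route]:
    """Relaxation 2 (sketch): insert one unused RN into any link of the route."""
    unused = [r for r in relays if r not in route]
    return [route[:pos] + (r,) + route[pos:]
            for pos in range(1, len(route)) for r in unused]


def pareto_front(routes: Sequence[Route], uf: Callable[[Route], UF]) -> List[Route]:
    """Exhaustive classical stand-in for the P-NDQIO search."""
    vec = {r: uf(r) for r in routes}
    return [r for r in routes if not any(dominates(vec[q], vec[r]) for q in routes)]


def eqpo_trellis(source: int, dest: int, relays: Sequence[int],
                 uf: Callable[[Route], UF], max_stages: int) -> List[Route]:
    direct = (source, dest)
    opf = [direct]        # Pareto-optimal routes identified so far
    surviving = [direct]  # surviving routes of the latest trellis stage
    for _ in range(max_stages):
        if not surviving:  # no surviving paths: stop processing stages
            break
        generated = sorted({g for s in surviving for g in insert_relay(s, relays)})
        considered = generated + opf         # concatenate with the OPF so far
        opf = pareto_front(considered, uf)   # search for Pareto-optimal routes
        new = set(generated)
        surviving = [r for r in opf if r in new]  # Relaxation 1
    return opf


def toy_uf(route: Route) -> UF:
    """Invented UF vector: (number of hops, sum of squared label differences)."""
    hops = float(len(route) - 1)
    power = float(sum((a - b) ** 2 for a, b in zip(route, route[1:])))
    return (hops, power)
```

As a usage example, `eqpo_trellis(1, 4, [2, 3], toy_uf, max_stages=3)` visits the five legitimate routes of a 4-node toy network and discards only the route `(1, 3, 2, 4)`, which is dominated by the direct route.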
Let us now highlight the differences between the trellises of Figs. 2 and 3 considering the 5-node example of Fig. 1. Note that the same annotation is used in Fig. 3 as that of Fig. 2. Explicitly, based on Eq. (12), the EQPO algorithm classifies the specific routes, which are Pareto-optimal as being “Pareto-Optimal” and those that have been generated in the current stage as “Visited & Surviving”. Hence in contrast to Fig. 2, they are equivalent in Fig. 3. Similar to the optimal dynamic programming framework of Section III, the EQPO algorithm initializes the set of generated routes to the set of the legitimate routes having either single or two hops, as portrayed in the trellis stage of Fig. 3. Based on Table I, all the routes having two hops are Pareto-optimal and thus the EQPO algorithm classifies them as the surviving routes of the trellis stage, as seen in Fig. 3. Similar to Fig. 2, the EQPO algorithm’s trellis paths visit the entire set of routes having three hops and then the algorithm identifies the route as Pareto-optimal with the aid of the P-NDQIO algorithm. Consequently, this specific route is deemed to be the sole surviving route in Fig. 3. This is in contrast to Fig. 2, where three more routes have been identified as surviving ones. Recall from Fig. 2 that these routes do not lead to Pareto-optimal routes in the last trellis stage. This in turn results in the EQPO algorithm visiting one less route in the trellis stage, i.e. not considering the sub-optimal route as potentially Pareto-optimal.
V Complexity versus Accuracy Discussions
In this section, we will characterize the complexity imposed by the EQPO algorithm presented in Alg. 2 and evaluate its heuristic accuracy versus the complexity invested. Additionally, note that since we had no quantum computer at our disposal, the simulations of the QSAs were carried out using a classical cluster. Explicitly, since the quantum oracle gate calculates in parallel the UF vectors of all the legitimate routes in the QD, they were pre-calculated. We note that this results in an actual complexity higher than that of the full-search method. Therefore, the employment of the quantum algorithms in a quantum computer is essential for observing a complexity reduction as a benefit of the QP. Hence, in our simulations, we have made the assumption of employing a quantum computer and we count the total number of activations for quantifying the EQPO’s complexity. This number would be the same for both classical and quantum implementations. Note that in the following analysis we will use the notation , where corresponds to the cardinality of the set .
Furthermore, our simulation results have been averaged over runs. During each run we have randomly generated the node’s locations as well as the interference levels experienced by them with the aid of the respective distributions mentioned in Section II. We have ensured that each run is uncorrelated with the rest of the runs.
Let us now proceed by analytically characterizing the complexity imposed by our proposed algorithm.
We will first characterize the complexity imposed by the EQPO algorithm’s dynamic programming framework, when the exhaustive search is employed instead of the P-NDQIO algorithm in Step 6 of Alg. 2. We will refer to this method as the Classical Dynamic Programming (CDP) method and we will use it as a benchmark for assessing the complexity reduction offered by the QP.
Prior to characterizing the EQPO algorithm and the CDP method we will analyze the orders of the number of the surviving routes and of the number of the Pareto-optimal routes identified across the first trellis stages. As far as the number of the Pareto-optimal routes identified across the first trellis stages is concerned, the trellis graph guiding the search for Pareto-optimal routes identifies more Pareto-optimal routes, as it proceeds through more trellis stages. Explicitly, its order can be formally expressed as follows:
where corresponds to the fraction of the OPF identified by the first trellis stages. Naturally, this fraction approaches unity as the number of trellis stages moves closer to the maximum number of hops.
Moving on to the number of the surviving routes at the -stage, it is equal to the number of Pareto-optimal routes identified at the -th trellis stage, based on Relaxation 1. Explicitly, is a fraction of the total number of the Pareto-optimal routes identified across the first trellis stages. Hence, we have with , since the set of Pareto-optimal routes at the -th trellis stage is included in the set of Pareto-optimal routes identified at the first trellis stages. Therefore we can evaluate the order as follows:
Consequently, in Eqs. (13) and (14), we have upper bounded the order of the number of surviving routes at the -th stage as well as the order of the number of Pareto-optimal routes identified at the first stages by the order of the total number of Pareto-optimal routes, i.e. we have . Naturally, Eqs. (13) and (14) will facilitate the complexity analysis, since they render the aforementioned orders independent of the index of the trellis stages. Let us now proceed by characterizing the complexity imposed by the CDP method.
V-A1 CDP method’s complexity
Let us assume that there is a total of generated routes arriving at the -th trellis stage. These particular routes are generated by the specific Pareto-optimal routes identified at the previous trellis stage, which are in total. Based on the aforementioned assumptions, the number of generated routes arriving at the -th trellis stage is formulated as follows:
Since the set of Pareto-optimal routes of the previous trellis stage are concatenated to the set of generated routes in Step 5 of Alg. 2, the total number of routes considered at the -th trellis stage is given by:
Additionally, the CDP method performs dominance comparisons, which we will refer to as the Cost Function Evaluation (CFE), since each generated route has to be compared to all of the routes considered. Therefore, the total complexity imposed by the CDP method across all trellis stages may be quantified in terms of the number of dominance comparisons, which is formulated as follows:
where we have exploited the property of the sum of squared numbers , where we have .
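For the readers' convenience, the standard closed form of the sum of squared numbers is recalled below. The symbol K is a generic placeholder for the upper summation limit and is not a symbol taken from the analysis above; how exactly the identity enters the derivation is left to the original equations.

```latex
\sum_{n=1}^{K} n^{2} \;=\; \frac{K\,(K+1)\,(2K+1)}{6} \;=\; O\!\left(K^{3}\right)
```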
V-A2 EQPO algorithm’s complexity
Moving on to the EQPO algorithm’s complexity analysis, the P-NDQIO algorithm is activated once per trellis stage, based on Alg. 2. Note that we will classify the complexity imposed by the P-NDQIO into two different domains, namely that of the parallel and that of the sequential complexity. To elaborate further, the P-NDQIO algorithm also exploits the synergies between QP and HP, which was utilized by the NDQIO algorithm of . Explicitly, the parallel complexity, which is termed as “normalized execution time” in , is defined as the number of dominance comparisons, when taking into account the degree of HP. Therefore, it may be deemed to be commensurate with the algorithm’s actual normalized execution time. By contrast, the sequential complexity, which is termed as “normalized power consumption” in , is defined as the total number of dominance comparisons, without considering the potential degree of HP. Hence, this specific complexity may be deemed to be commensurate with the algorithm’s normalized power consumption, as elaborated in  as well.
Let us now proceed by characterizing the complexity of the individual sub-processes of the P-NDQIO process. During each trellis stage, the P-NDQIO algorithm activates its BW-BBHT-QSA step. This step invokes the BBHT-QSA once; however, since the quantum circuits of the original NDQIO algorithm are utilized, each activation of the quantum oracle, namely the operator in [30, Fig. 8], compares each of the generated routes to all the routes comprising the OPF identified so far. Since this set of comparisons is carried out in parallel, a single activation imposes a single CFE and CFEs in the parallel and sequential domains, respectively. Note that the BW-BBHT-QSA process will be activated times during a single trellis stage, since we opted for repeating this step for an additional iteration, when the BBHT-QSA fails to identify a valid route. Therefore, the parallel and sequential complexity imposed by the BW-BBHT-QSA process are quantified as follows:
Recall that the term in Eqs. (18) and (20) corresponds to the number of Pareto-optimal routes identified . Additionally, for the calculation of the orders of complexity we have relied on the fact that the BBHT-QSA has a complexity on the order of as demonstrated both in  and in . Moving on to the complexity imposed by the BBHT-QSA chains, it has been demonstrated in  that the complexity imposed by a single BBHT-QSA chain, which leads to the identification of a single Pareto-optimal route, is identical to that of the so-called Dürr-Høyer Algorithm (DHA) , namely on the order of in terms of the number of quantum oracle gate activations. As for the latter, the quantum operator of [30, Fig. 7] has been utilized, which implements a dominance comparison. Explicitly, each activation of this operator imposes a parallel complexity of CFEs and a sequential complexity of a single CFE, owing to the parallel implementation of the UF comparisons. Therefore, the parallel and sequential complexity imposed by the BBHT-QSA chains are quantified as follows:
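The square-root scaling of the BBHT-QSA may be illustrated by the following classical simulation of its measurement statistics. This is a sketch under stated assumptions rather than the quantum circuit used by the P-NDQIO algorithm: it relies on the well-known success probability of Grover's operator after j iterations, namely sin^2((2j+1)θ) with sin^2(θ) equal to the fraction of solutions, and on the growth factor of 6/5 of the original BBHT procedure. The timeout of 4.5 times the square root of the database size and the accounting of one extra evaluation per measurement are assumptions of this sketch.

```python
import math
import random
from typing import Tuple


def bbht_simulate(n_items: int, n_solutions: int, rng: random.Random,
                  lam: float = 6 / 5) -> Tuple[bool, int]:
    """Simulate one BBHT-QSA run; return (solution found, evaluations used)."""
    theta = math.asin(math.sqrt(n_solutions / n_items))
    timeout = 4.5 * math.sqrt(n_items)  # assumed evaluation budget
    m, used = 1.0, 0
    while used <= timeout:
        j = rng.randrange(math.ceil(m))  # Grover iterations, uniform in [0, m)
        used += j + 1                    # j oracle calls plus one check (assumption)
        p_success = math.sin((2 * j + 1) * theta) ** 2
        if rng.random() < p_success:
            return True, used            # the measurement yielded a solution
        m = min(lam * m, math.sqrt(n_items))
    return False, used                   # timeout: no valid solution reported


rng = random.Random(0)
runs = [bbht_simulate(1024, 4, rng) for _ in range(300)]
print(sum(found for found, _ in runs) / len(runs))
```

When the database contains no solution, the simulated run always ends by exhausting the assumed budget, which mirrors the case where the BW-BBHT-QSA step outputs an invalid route.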
Finally, as for the OPF-SR dominance comparisons of Step 10 of Alg. 1, the parallel and sequential complexity imposed by this process are quantified as follows:
Recall from Eqs. (19), (21), (23), (25), (26) and (27) that we used Eqs. (13) and (14), where we have with corresponding to the total number of Pareto-optimal routes. Let us now proceed with the evaluation of the total parallel and sequential complexities of the EQPO algorithm. In the worst-case scenario the EQPO algorithm will process trellis stages, corresponding to the maximum possible number of hops, whilst visiting each node at most once. Thus, the total parallel and sequential complexities imposed by the EQPO algorithm are quantified as follows:
Note that in Eqs. (29) and (31) we have exploited the specific property of the sum of square roots, where we have . Observe from Eqs. (17) and (29) that the EQPO algorithm achieves a parallel complexity reduction against the CDP method by a factor on the order of . Additionally, the respective sequential complexity reduction is by a factor on the order of , based on Eqs. (17) and (31). Hence, the EQPO imposes a lower sequential complexity than the CDP method, as long as we have . As far as the EQPO algorithm’s predecessors are concerned, it has been proven in  that the NDQO algorithm imposes identical parallel and sequential complexities, which are on the order of . By contrast, the NDQIO algorithm imposes a parallel and a sequential complexity, which are on the order of and , respectively, where corresponds to the total number of legitimate routes. Consequently, the complexity imposed by both the NDQO and the NDQIO algorithms is proportional to in both domains, yielding an exponential increase in the order of complexity as the number of nodes increases. By contrast, both the EQPO algorithm and the CDP method exhibit a complexity order similar to polynomial scaling, since it has been demonstrated in [30, Fig. 11] that the total number of Pareto-optimal routes increases at a significantly lower rate than that of the total number of routes.
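A standard way of bounding a sum of square roots, which may help in following the step mentioned above, is to compare it with the corresponding integral. The symbol K is again a generic placeholder for the upper summation limit rather than a symbol of the original equations.

```latex
\sum_{n=1}^{K} \sqrt{n} \;\le\; \int_{1}^{K+1} \sqrt{x}\,\mathrm{d}x
\;=\; \frac{2}{3}\left[(K+1)^{3/2} - 1\right] \;=\; O\!\left(K^{3/2}\right)
```

The inequality holds because the square root is an increasing function, so that each term obeys \sqrt{n} \le \int_{n}^{n+1} \sqrt{x}\,\mathrm{d}x.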
Let us now proceed by presenting the average parallel and the average sequential complexity imposed both by the EQPO algorithm and by the CDP method, which are shown in Figs. (a)a and (b)b, respectively. We will compare the complexities imposed by the aforementioned algorithms to those of the Brute-Force (BF) method as well as to those of the EQPO algorithm’s predecessors, namely the NDQO and the NDQIO algorithms. The aforementioned methods consider the entire set of legitimate routes, hence they have no database correlation exploitation capabilities. Additionally, the NDQO algorithm and the BF method do not employ any HP scheme, thus their respective parallel and sequential complexities are identical. As far as the average complexity of the CDP method is concerned, observe in Figs. (a)a and (b)b that it requires a higher number of CFEs than the BF method for WMHNs having less than 8 nodes. This parallel complexity overhead is justified by the fact that the number of Pareto-optimal routes w.r.t. the total number of legitimate routes is relatively high. This in turn yields an increase in the fraction of trellis nodes that are classified as survivors, hence leading to more dominance comparisons. However, this trend is reversed for WMHNs having more than 7 nodes, where the CDP method exhibits a complexity reduction compared to the BF method. More specifically, for WMHNs constituted by 9 nodes, this complexity reduction is close to an order of magnitude. Still referring to 9-node WMHNs, the CDP method imposes a slightly higher parallel complexity than that of the NDQO algorithm, while it matches the sequential complexity of the NDQIO algorithm for the same 9-node network, based on Figs. (a)a and (b)b, respectively.
Moving on to the average parallel complexity of the EQPO algorithm, observe in Fig. (a)a that the EQPO algorithm imposes fewer CFEs than the rest of the algorithms considered for WMHNs having more than 5 nodes. Explicitly, this complexity reduction becomes more substantial, as the number of nodes increases, reaching a parallel complexity reduction of almost an order of magnitude for 9-node WMHNs, when compared to the NDQIO algorithm, which is capable of exploiting the HP as well. As for its sequential complexity, observe in Fig. (b)b that the EQPO algorithm imposes more CFEs than the rest of the algorithms for WMHNs having fewer than 7 nodes. This may be justified by the relatively small number of surviving routes, which does not allow the QP to excel by providing beneficial complexity reduction. However, this trend is reversed for WMHNs having more than 6 nodes, where the number of surviving routes becomes higher. More specifically for 9-node WMHNs, the EQPO algorithm begins to impose a sequential complexity reduction w.r.t. all the remaining algorithms considered. Additionally, observe in Figs. (a)a and (b)b that the EQPO algorithm’s complexity increases with a much lower gradient, as the number of nodes increases, when compared to the full-search-based algorithms, namely to the BF method as well as to the NDQO and the NDQIO algorithms. Explicitly, this is justified by the “almost polynomial” order of complexity, as demonstrated in Eqs. (29) and (31).
Having elaborated on the complexity imposed by the EQPO let us now proceed by discussing its heuristic accuracy. Since our design target is to identify the entire set of Pareto-optimal routes, we will evaluate the EQPO algorithm’s accuracy versus the complexity imposed in terms of two metrics, namely that of the average Pareto distance and that of the average Pareto completion . The same set of metrics has been considered in  for the evaluation of NDQIO algorithm’s accuracy as well. To elaborate further, the Pareto distance of a particular route is defined as the probability of this specific route being dominated by the rest of the legitimate routes. Explicitly, given a set of Pareto-optimal routes identified by the EQPO algorithm, their average Pareto distance is a characteristic of the OPF, since it provides insights into the proximity of the exported OPF to the true OPF. Naturally, a Pareto distance having a value of zero implies that the OPF identified by the EQPO is fully constituted by true Pareto-optimal routes. By contrast, the average Pareto completion is defined as the specific fraction of the solutions on the true OPF identified by the EQPO. Therefore, our goal is to achieve a Pareto completion as close to unity as possible.
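The two accuracy metrics may be computed as in the following sketch, which assumes that the UF vectors of all the legitimate routes are available in a dictionary and that lower UF values are better. The function names and the toy UF values are hypothetical and serve only to illustrate the definitions given above.

```python
from typing import Dict, Sequence, Tuple

Route = Tuple[int, ...]
UF = Tuple[float, ...]


def dominates(a: UF, b: UF) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_distance(route: Route, ufs: Dict[Route, UF]) -> float:
    """Fraction of the legitimate routes that dominate the given route."""
    return sum(dominates(u, ufs[route]) for u in ufs.values()) / len(ufs)


def average_pareto_distance(found: Sequence[Route], ufs: Dict[Route, UF]) -> float:
    """Mean Pareto distance of the routes exported by the algorithm."""
    return sum(pareto_distance(r, ufs) for r in found) / len(found)


def pareto_completion(found: Sequence[Route], ufs: Dict[Route, UF]) -> float:
    """Fraction of the true OPF contained in the exported set of routes."""
    true_opf = {r for r in ufs if pareto_distance(r, ufs) == 0.0}
    return len(true_opf & set(found)) / len(true_opf)


# Toy UF vectors of a 4-node network (invented values, lower is better).
ufs = {
    (1, 4): (1.0, 9.0),
    (1, 2, 4): (2.0, 5.0),
    (1, 3, 4): (2.0, 5.0),
    (1, 2, 3, 4): (3.0, 3.0),
    (1, 3, 2, 4): (3.0, 9.0),
}
found = [(1, 4), (1, 2, 4), (1, 3, 2, 4)]
print(average_pareto_distance(found, ufs), pareto_completion(found, ufs))
```

In this toy example the route `(1, 3, 2, 4)` is dominated by four of the five legitimate routes, hence its Pareto distance is 0.8, while the exported set contains two of the four true Pareto-optimal routes, yielding a Pareto completion of 0.5.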
Having defined the performance metrics, let us now present the performance versus complexity results of the EQPO algorithm, which are shown in Fig. 5 for 7-node WMHNs. The reason we have evaluated the aforementioned metrics for 7-node WMHNs is for the sake of comparison to the methods analyzed in  as well as in . Apart from the NDQO and NDQIO algorithms, we have used as benchmarks two additional classical evolutionary algorithms, namely the NSGA-II and the MO-ACO (the readers should refer to  and to  for a detailed description of the MO-ACO and the NSGA-II, respectively). Using the same convention as in  and , we have set the number of individuals equal to the number of generations and we have matched the total parallel complexity imposed by these classical algorithms to that of the NDQO algorithm, since the NDQO algorithm appears to impose the highest parallel complexity, based on Fig. (a)a. As for their total sequential complexity we have set it to that of the NDQIO algorithm. Consequently, we have considered employing 19 individuals over 19 generations for the parallel complexity matching and 29 individuals over 29 generations for the sequential complexity matching for both the NSGA-II and the MO-ACO algorithm.
Let us now proceed by elaborating on the average Pareto distance exhibited by 7-node WMHNs versus the parallel complexity invested, as portrayed in Fig. (a)a. Observe in this figure that the EQPO algorithm performs optimally – in the sense that no suboptimal routes are included in the OPF – for about 130 CFEs and then exhibits an error floor around . Similar trends are observed for the classical NSGA-II and for MO-ACO algorithm as well as for the quantum-assisted NDQO algorithm; the classical benchmark algorithms both exhibit an error floor around , while the respective NDQO algorithm’s error floor is around . By contrast, the NDQIO algorithm initially has an error floor of about , which then decays to infinitesimally low levels, when more CFEs are invested owing to its OPF-SR process . This specific trend is visible in Fig. (a)a, where the NDQIO algorithm outperforms the NDQO technique in terms of their beyond 8842 CFEs in the sequential complexity domain. Additionally, the NDQIO algorithm begins to exhibit a lower than that of the EQPO algorithm after 498 and 2932 CFEs in the parallel and sequential domains, respectively.
Let us now provide some further insights into the significance of the aforementioned error floors. Explicitly, a particular route is considered suboptimal, if there exists even just a single route dominating it, i.e. if it has a Pareto distance higher than or equal to , where corresponds to the total number of legitimate routes. This threshold is visually portrayed with the aid of the dashed and dotted horizontal lines in Figs. (a)a and (b)b. Hence, we can normalize the results w.r.t. this threshold for exporting the probability of a specific route becoming suboptimal. Consequently, EQPO algorithm’s error floor is translated into a probability of a specific route being suboptimal, which is equal to 0.2%, while the respective probability of the NDQO algorithm is equal to . Additionally, the respective probabilities of the classical benchmark algorithms are about 33% and 3.3%, when parallel and sequential complexity are considered, respectively. Consequently, the EQPO algorithm’s probability of opting for a suboptimal route may be regarded as negligible.
The evaluation of the average Pareto completion probability versus the parallel and the sequential complexity are shown in Figs. (c)c and (d)d. Note that the subplots inside these figures portray the portion of unidentified true Pareto-optimal routes, as encapsulated by the expression of . Explicitly, we will utilize this metric for assessing the error floor w.r.t. the , which may not be visible from the main plots. Additionally, note that we examined both and versus the parallel and sequential complexity imposed up to the maximum value observed by the EQPO algorithm. As far as the EQPO algorithm’s average Pareto completion versus the parallel complexity is concerned, observe in Fig. (c)c that the EQPO is capable of identifying a higher portion of the true OPF, when compared to the rest of the algorithms examined, while considering the same number of CFEs in the parallel complexity domain. Explicitly, the EQPO algorithm succeeds in identifying almost the entire set of Pareto-optimal routes, since it is only incapable of identifying as few as 0.1% of the entire true OPF. This error floor is reached after 1301 and 14651 CFEs in the parallel and sequential complexity domains, respectively, as it can be verified by Figs. (c)c and (d)d.
By contrast, this trend is not echoed in the sequential complexity domain. To elaborate further, observe in Fig. (b)b that the EQPO algorithm remains more efficient than its classical counterparts. On the other hand, while it is indeed more efficient than the NDQO algorithm up to a complexity budget of 2147 sequential CFEs, it identifies fewer Pareto-optimal routes than the NDQO algorithm. The same trend is observed for the NDQIO algorithm as well for a complexity budget of 4794 sequential CFEs. Nevertheless, this discrepancy between the parallel and sequential complexity is expected to be decreased, as the number of nodes increases. This is justified by the fact that the EQPO algorithm imposes a lower sequential complexity as the nodes proliferate, as seen in Fig. (b)b.
Last but not least, the results portrayed in Fig. 5 rely on the intelligent central node having perfect knowledge both of the nodes’ geo-locations and of the interference power levels experienced by them. This fundamental assumption, albeit impractical, provides us with the upper bound of the achievable performance of the routing schemes considered. Explicitly, despite its impractical nature, it facilitates a fair comparison of the EQPO algorithm to its predecessors in terms of their complexity and heuristic accuracy, which is the main focus of this treatise. Intuitively, a practical network information update process would result in both approximated and outdated network information, thus degrading the results of Fig. 5, while maintaining the complexity per routing optimization at a similar order. Note that we plan on characterizing these imperfections and conceive a practical network information update scheme in our future work.
In this treatise we have exploited the correlations in the formation of the Pareto-optimal routes for the sake of achieving a routing complexity reduction. In this context, we have first developed an optimal dynamic programming framework, which transforms the multi-objective routing problem into a decoding problem. However, this optimal framework imposes a high complexity. For this reason, we relaxed the aforementioned framework and proposed the EQPO algorithm, which is empowered by the P-NDQIO algorithm and thus jointly exploits the synergies between the QP and the HP along with the potential correlation in the formation of the Pareto-optimal routes. We then analytically characterized the complexity imposed by the EQPO algorithm and showed that it is capable of solving the multi-objective routing problem in near-polynomial time. In fact, the EQPO achieved a parallel complexity reduction of almost an order of magnitude and a sequential complexity reduction by a factor of 3 for 9-node WMHNs. Finally, we demonstrated with the aid of simulations that this complexity reduction only imposes an almost negligible error, which was found to be 0.2% and 0.1% in terms of the average Pareto distance and the average Pareto completion probability for 7-node WMHNs.
Appendix A Proof of Proposition 1
Let us consider the route generated by the route . Based on Eq. (9), the UFs associated with this specific route are equal to: