Efficient partition of integer optimization problems with one-hot encoding
Quantum annealing is a heuristic algorithm for solving combinatorial optimization problems, and D-Wave Systems Inc. has developed hardware for implementing this algorithm. The current version of the D-Wave quantum annealer can solve unconstrained binary optimization problems with a limited number of binary variables, although cost functions of many practical problems are defined by a large number of integer variables. To solve these problems with the quantum annealer, the integer variables are generally binarized with one-hot encoding, and the binarized problem is partitioned into small subproblems. However, the entire search space of the binarized problem is considerably extended compared to that of the original integer problem and is dominated by unfeasible solutions. Therefore, to efficiently solve large optimization problems with one-hot encoding, partitioning methods that extract subproblems with as many feasible solutions as possible are required. We propose two partitioning methods and demonstrate that better solutions are obtained using the methods proposed in this study.
Combinatorial optimization problems, i.e., the minimization of cost functions with discrete variables, have significant real-world applications. Generally, the cost function of a combinatorial optimization problem can be mapped to the Hamiltonian of a classical Ising model [Ising_mapping]. Simulated annealing (SA) [SA_original] is a classical heuristic algorithm that searches the ground states of a Hamiltonian, exploiting thermal fluctuations to escape local minima. In contrast to SA, quantum annealing (QA) [QA_original], which is strongly related to the adiabatic quantum computation [AQC_original], escapes the local minima through the tunneling effects induced by quantum fluctuations. Whether the quantum effects accelerate the computation of searching ground states is one of the main topics of research, and numerous studies have been conducted on this topic [QA_SA_compare1, QA_SA_compare2, QA_SA_compare3, QA_SA_compare4, QA_SA_compare5, QA_SA_compare6]. Recently, D-Wave Systems Inc. developed a commercial QA machine based on superconducting flux qubits [D-Wave_machine]. Experimental studies using QA machines have been performed to compare the performance of QA with that of SA [D-Wave_compare1, D-Wave_compare2, D-Wave_compare3] and to demonstrate the applicability of QA machines to practical problems [D-Wave_application1, D-Wave_application2, D-Wave_application3, D-Wave_application4, D-Wave_application5, D-Wave_application6, D-Wave_application7, D-Wave_application8, D-Wave_application9, D-Wave_application10, D-Wave_application11, D-Wave_application12, D-Wave_application13, D-Wave_application14, D-Wave_application15, D-Wave_application16, D-Wave_application17].
The generic form of a time-dependent Hamiltonian in QA is as follows:
$$\hat{H}(t) = A(t)\,\hat{H}_0 + B(t)\,\hat{H}_q, \tag{1}$$
where $\hat{H}_0$ is the classical Hamiltonian that represents the cost function to be minimized, and $\hat{H}_q$ is the quantum fluctuation term, whose ground state is trivial. At the beginning of QA, the coefficients of the time-dependent Hamiltonian are set to $A(0) = 0$ and $B(0) = 1$, and the system is in the trivial ground state determined by $\hat{H}_q$. At the end of QA, the coefficients are set to $A(\tau) = 1$ and $B(\tau) = 0$, where $\tau$ is the annealing time. The system evolves according to the Schrödinger equation:
$$i\,\frac{d}{dt}\,|\psi(t)\rangle = \hat{H}(t)\,|\psi(t)\rangle, \tag{2}$$
where $|\psi(t)\rangle$ is the state vector of the system and $\hbar$ is set to $1$ for simplicity. According to the adiabatic theorem [adiabatic_condition], the system remains close to the instantaneous ground state of the time-dependent Hamiltonian if the Hamiltonian changes sufficiently slowly. Thus, by setting the annealing time $\tau$ large enough, we can obtain the ground state of the classical Hamiltonian $\hat{H}_0$, which represents the optimal solution.
The current version of the D-Wave quantum annealer (D-Wave 2000Q) implements QA with a transverse magnetic field:
$$\hat{H}_q = -\sum_{i=1}^{N} \hat{\sigma}_i^x, \tag{3}$$
where $N$ represents the total number of qubits. A cost function that can be handled by the D-Wave quantum annealer is as follows:
$$\hat{H}_0 = \sum_{i<j} J_{ij}\,\hat{\sigma}_i^z \hat{\sigma}_j^z + \sum_{i=1}^{N} h_i\,\hat{\sigma}_i^z, \tag{4}$$
where the interactions between qubits are restricted to the Chimera graph, which is constructed as an $M \times M$ grid of $K_{4,4}$ complete bipartite graphs [chimera_architecture]. Although the Chimera graph for D-Wave 2000Q is $16 \times 16$, the number of operable qubits is less than $2048$ because of defects in the qubits and connectivities.
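The interpolation between a transverse-field driver and an Ising cost function can be illustrated on a two-qubit toy example. The sketch below is only an illustration: the linear schedule, the coupling and field values, and all function names are assumptions introduced here, and the schedule of the actual hardware is more complicated.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli x
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli z
I2 = np.eye(2)

def site_operator(single, site, n_qubits):
    """Embed a single-qubit operator acting on `site` into the n-qubit space."""
    out = np.eye(1)
    for k in range(n_qubits):
        out = np.kron(out, single if k == site else I2)
    return out

n_qubits = 2
# Transverse-field driver: minus the sum of sigma^x over all qubits.
H_q = -sum(site_operator(SX, i, n_qubits) for i in range(n_qubits))
# Toy Ising cost with hypothetical values J_01 = 1 and h_0 = 0.5.
H_0 = (site_operator(SZ, 0, n_qubits) @ site_operator(SZ, 1, n_qubits)
       + 0.5 * site_operator(SZ, 0, n_qubits))

def hamiltonian(s):
    """Assumed linear schedule; s = t / tau runs from 0 to 1."""
    return (1.0 - s) * H_q + s * H_0

def ground_state(s):
    """Return ground energy, ground vector, and gap to the first excited level."""
    energies, vectors = np.linalg.eigh(hamiltonian(s))
    return energies[0], vectors[:, 0], energies[1] - energies[0]

for s in (0.0, 0.5, 1.0):
    e0, psi, gap = ground_state(s)
    print(f"s={s:.1f}  E0={e0:+.3f}  gap={gap:.3f}  probabilities={np.round(psi**2, 3)}")
```

At s = 0 the ground state is the uniform superposition, and at s = 1 it is the basis state that minimizes the toy cost function; the gap printed along the way is the quantity that the adiabatic theorem relates to the required annealing time.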
Because the number of available qubits is limited, large optimization problems cannot be solved directly using the D-Wave quantum annealer. In real settings, large problems are partitioned into subproblems that can be handled by the quantum annealer. The subproblems are iteratively optimized by the quantum annealer, and the optimization result is used to improve the current solution [qbsolv, hybrid1, hybrid2]. A cluster of spins in the subproblem is simultaneously updated in this scheme, and this iterative method can be regarded as a large-neighborhood local search algorithm [large-neighborhood]. Although such algorithms can be performed using classical computers, the subproblems are then essentially restricted to tree structures, which are solvable in polynomial time by belief propagation or dynamic programming [tree_partition1, tree_partition2, tree_partition3, tree_partition4]. Therefore, employing the quantum annealer is considered to be advantageous if it can solve subproblems with many closed loops more efficiently than classical algorithms. Furthermore, it is conjectured that solving subproblems that are as large as possible is essential for improving solution accuracy. The size of the subproblems that can be embedded into the quantum annealer strongly depends on the quality of the minor embedding, particularly for problems that have a small number of interactions. Because subproblems must be iteratively embedded, fast algorithms for embedding larger subproblems are required to exploit the potential of the quantum annealer. While it is reasonable to employ a complete-graph embedding [Comp_Embed1, Comp_Embed2, Comp_Embed3] for problems with dense interactions, the subproblem-embedding algorithm that we developed in a previous study [subproblem_embed] might be effective for improving the solution accuracy of sparse problems.
Moreover, the quantum annealer requires that the cost function is represented in the form of a quadratic unconstrained binary optimization problem (QUBO) or Ising model, although many cost functions in practical problems are defined by integer variables. Generally, the binarization of the integer variables is achieved using one-hot encoding [Ising_mapping]. For example, the following integer optimization problem with integer variables :
where , is the number of components, is an interaction between and , and denotes the Kronecker delta function, is rewritten as
by one-hot encoding. Here, is a binary variable that is assigned to the component of , indicates that the component is selected for , and feasible solutions are constrained to configurations in which exactly one component is selected for each . Subsequently, a penalty term is introduced to obtain the following unconstrained form:
where the second term is the penalty term introduced to extract feasible solutions that satisfy the constraint , which we call the "one-hot constraint", and the parameter controls the strength of the penalty term. By setting the parameter to a sufficiently large value, ground states of the original integer optimization problem [Eq. (5)] are correctly encoded. However, the performance of the D-Wave quantum annealer is significantly affected by noise and intrinsic control errors when a needlessly large is used. Hence, to obtain highly accurate solutions, we must explore an appropriate value of , which is one of the most tedious tasks in optimization under the one-hot constraint. Moreover, the entire search space of the binarized optimization problem [Eq. (7)] is dominated by unfeasible solutions. Figure 1(a) shows the problem graph of Eq. (7), whose vertices and edges represent binary variables and interactions between them, respectively. binary variables are assigned to each , and the total number of binary variables is . While the number of configurations of binary variables is , the number of feasible solutions is only . Therefore, to efficiently solve large optimization problems under the one-hot constraint using the quantum annealer, partitioning methods are required to extract subproblems with as many feasible solutions as possible. A simple example of an undesirable partition is depicted in Fig. 1(b). Assume that one hopes to improve the current solution shown in Fig. 1(b) and that the three binary variables enclosed by the green rectangle are extracted as the subproblem. In this case, better feasible solutions cannot be explored by optimizing the subproblem because only the current solution in the subproblem satisfies the one-hot constraint. To the best of our knowledge, the partitioning method proposed in the literature [D-Wave_application17] is the first to focus on the one-hot constraint.
This method is applicable to doubly constrained problems such as the assignment problem and the traveling salesman problem. However, the extracted subproblems still contain unfeasible solutions, for which the parameter has to be adjusted. In this study, we propose two partitioning methods applicable to problems whose cost function involves a single one-hot constraint, as shown in Eq. (7). The first method is similar to the previously developed method [D-Wave_application17], while the other method extracts subproblems comprising only feasible solutions and does not require adjusting the parameter . The performance of the proposed methods is assessed for several Potts models, which are generalized Ising models whose cost function is defined by integer variables [The_Potts_model]. We demonstrate that better solutions are efficiently obtained using the proposed methods.
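The effect of one-hot encoding on the search space can be checked by brute force on a tiny instance. The following sketch assumes a Potts-type cost that contributes -J_ij whenever two coupled integer variables share a component, plus a quadratic one-hot penalty of strength `lam`; this specific form, the three-variable instance, and all names are assumptions made for illustration rather than the exact cost function of this study.

```python
import itertools
import numpy as np

def one_hot_qubo(n_vars, n_comp, couplings, lam):
    """Upper-triangular QUBO for the assumed cost -sum J_ij delta(S_i, S_j)
    plus the penalty lam * sum_i (sum_q x_iq - 1)^2."""
    idx = lambda i, q: i * n_comp + q
    size = n_vars * n_comp
    qubo = np.zeros((size, size))
    for (i, j), J in couplings.items():
        for q in range(n_comp):
            a, b = sorted((idx(i, q), idx(j, q)))
            qubo[a, b] -= J
    # (sum_q x - 1)^2 = -sum_q x + 2 sum_{q<q'} x x' + 1 for binary x.
    for i in range(n_vars):
        for q in range(n_comp):
            qubo[idx(i, q), idx(i, q)] -= lam
            for q2 in range(q + 1, n_comp):
                qubo[idx(i, q), idx(i, q2)] += 2.0 * lam
    return qubo, lam * n_vars  # matrix and constant offset

def qubo_energy(qubo, offset, bits):
    x = np.asarray(bits)
    return float(x @ qubo @ x + offset)

n_vars, n_comp, lam = 3, 3, 3.0
couplings = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): -1.0}  # hypothetical frustrated triangle
qubo, offset = one_hot_qubo(n_vars, n_comp, couplings, lam)

total = feasible = 0
best_energy, best_x = float("inf"), None
for bits in itertools.product((0, 1), repeat=n_vars * n_comp):
    total += 1
    x = np.array(bits).reshape(n_vars, n_comp)
    if np.all(x.sum(axis=1) == 1):   # exactly one component per integer variable
        feasible += 1
    energy = qubo_energy(qubo, offset, bits)
    if energy < best_energy:
        best_energy, best_x = energy, x

print(total, feasible, best_energy)
```

Even for three integer variables with three components each, only 27 of the 512 binary configurations are feasible, and the fraction shrinks exponentially as the problem grows.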
In this section, we propose efficient partitioning methods for solving large optimization problems under the one-hot constraint and assess the performance of the proposed methods for several Potts models.
We propose two partitioning methods: one is the multivalued partition, and the other is the binary partition. These methods are summarized in Fig. 2. Both methods extract a subproblem that involves binary variables assigned to the tentatively selected components for each . The resulting subproblems include feasible solutions other than the current feasible solution.
The multivalued partition extracts a subproblem with two or more components for each , as shown in Fig. 2(a). In addition to the tentatively selected component, one or more components are randomly selected for each , and a subproblem that comprises the binary variables assigned to the selected components is extracted. The extracted subproblem involves feasible solutions other than the current solution, and the randomly selected components are explored for each by optimizing the subproblem. However, the extracted subproblem still contains unfeasible solutions, and the penalty term remains in the cost function of the subproblem. This partitioning method is similar to that developed in the literature [D-Wave_application17]. While the extracted subproblems are embedded using complete-graph embedding in the literature [D-Wave_application17], we employed the subproblem-embedding algorithm that we developed in a previous study [subproblem_embed]. Details on how to achieve a multivalued partition using the subproblem-embedding algorithm are explained in the Methods section.
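The selection step of the multivalued partition can be sketched as follows. This is a simplified illustration that ignores the embedding step: the function name, the fixed number of extra components, and the example solution are hypothetical, and the components are assumed to be drawn uniformly at random.

```python
import random

def multivalued_partition(current, n_comp, sites, n_extra, rng):
    """For each chosen integer variable, keep its tentatively selected component
    and add `n_extra` other components drawn uniformly at random."""
    sub = {}
    for i in sites:
        others = [q for q in range(n_comp) if q != current[i]]
        sub[i] = [current[i]] + rng.sample(others, n_extra)
    return sub

rng = random.Random(0)
current = [0, 2, 1, 3, 0, 1]   # hypothetical tentative solution with 4 components
sub = multivalued_partition(current, n_comp=4, sites=[0, 1, 2, 3], n_extra=2, rng=rng)

n_binary = sum(len(comps) for comps in sub.values())   # binary variables extracted
n_feasible = 1
for comps in sub.values():
    n_feasible *= len(comps)                            # one component per variable
print(sub)
print(n_binary, 2 ** n_binary, n_feasible)
```

In this example the subproblem has 12 binary variables, so 4096 binary configurations, of which 81 satisfy the one-hot constraint. The penalty term is therefore still needed inside the subproblem.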
The binary partition is summarized in Fig. 2(b). In addition to the tentatively selected component, the binary partition randomly selects exactly one component for each . Subsequently, new binary variables that represent "stay in the tentatively selected component ()" or "transit to the randomly selected component ()" are introduced for each , and a binary subproblem is constructed whose cost function is defined by . The cost function of the binary subproblem is derived in the Methods section. After that, a subproblem of the binary subproblem is embedded into the D-Wave quantum annealer. Here, the cost function of the binary subproblem does not involve the penalty term because all solutions in the binary subproblem are feasible. Therefore, the binary partition does not require adjusting the parameter . Moreover, a larger number of binary variables can be embedded into the D-Wave quantum annealer because the penalty term, which generates fully connected interactions between and , is not involved. Consequently, the number of feasible solutions involved in the embedded subproblem is considerably increased using the binary partition. The binary subproblem can be regarded as one of the simplest cases of the optimization under the half-hot constraint [half_hot]. The half-hot constraint is proposed to avoid the difficulty caused by the longitudinal magnetic field of the penalty term. This difficulty is avoidable using the binary partition, which might contribute to improving solution accuracy. A disadvantage of the binary partition is that only two components are considered for each integer variable. As shown in the next subsection, this leads to poor performance for the ferromagnetic Potts model.
The performance of the proposed methods is evaluated for four types of Potts models on the cubic lattice with integer variables, namely, the ferromagnetic, anti-ferromagnetic, Potts glass [Potts_glass], and Potts gauge glass [Potts_gauge_glass, chiral_Potts] models. The cost function is given by
where is set to , , , represents the interaction between the nearest neighbors on the cubic lattice, and denotes the Kronecker delta function. The cost function is represented as the QUBO form using the one-hot constraint as follows:
The parameters and in each model are shown in Table 1.
|Ferromagnetic Potts model|
|Anti-ferromagnetic Potts model|
|Potts glass model||or||0|
|Potts gauge glass model||or or|
While the ground states of the ferromagnetic and anti-ferromagnetic Potts models are trivial, it is generally difficult to obtain those of the Potts glass and Potts gauge glass models because of frustrations.
The optimization process demonstrated in this study is shown in Fig. 4. The original large problem is partitioned using three partitioning methods: the random, multivalued, and binary partitions. The random partition does not address whether or not an extracted subproblem contains feasible solutions for each . The subproblem-embedding algorithm proposed in the literature [subproblem_embed] is used for embedding a subproblem into the D-Wave quantum annealer. After the embedded subproblem is optimized using the D-Wave quantum annealer, the variables in the subproblem are replaced with the best of the solutions obtained using the quantum annealer. Subsequently, a greedy algorithm is executed on a conventional digital computer to reach an exact (local) minimum. In this greedy algorithm, an integer variable is randomly selected, and the tentatively selected component is replaced with the one that minimizes the local energy with respect to the selected integer variable . The refinement of the current solution is complete when all local energies are minimized. Finally, the best solution obtained in the procedure is updated. These processes are iterated, and we compare the solution accuracy for the three partitioning methods.
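The greedy refinement step can be sketched as below. This is a minimal illustration under assumptions: the local energy uses a Potts-type form that contributes -J_ij when two neighbors share a component, the variables are visited in random sweeps rather than drawn one at a time, and the ring instance and all names are hypothetical.

```python
import random

def local_energy(i, q, S, neighbors):
    """Energy of the bonds attached to variable i if it took component q
    (assumed Potts form: -J_ij when both ends share a component)."""
    return -sum(J for j, J in neighbors[i] if S[j] == q)

def total_energy(S, neighbors):
    # Each bond appears in both adjacency lists, hence the factor 1/2.
    return 0.5 * sum(local_energy(i, S[i], S, neighbors) for i in range(len(S)))

def greedy_refine(S, neighbors, n_comp, rng):
    """Sweep in random order until no single-variable move lowers the energy."""
    S = list(S)
    changed = True
    while changed:
        changed = False
        order = list(range(len(S)))
        rng.shuffle(order)
        for i in order:
            energies = [local_energy(i, q, S, neighbors) for q in range(n_comp)]
            best = min(range(n_comp), key=energies.__getitem__)
            if energies[best] < energies[S[i]]:   # accept strict improvements only
                S[i] = best
                changed = True
    return S

# Ferromagnetic ring of six variables with J = 1 (hypothetical test instance).
n, n_comp = 6, 3
neighbors = {i: [((i - 1) % n, 1.0), ((i + 1) % n, 1.0)] for i in range(n)}
rng = random.Random(1)
start = [rng.randrange(n_comp) for _ in range(n)]
refined = greedy_refine(start, neighbors, n_comp, rng)
print(start, total_energy(start, neighbors))
print(refined, total_energy(refined, neighbors))
```

Because only strict improvements are accepted, the total energy decreases with every move and the loop terminates in a state where every local energy is minimal. Such a state may still be a local minimum, for example a configuration with domain walls.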
The energies obtained by the three partitioning methods are shown in Fig. 5. The average, maximum, and minimum energies for 16 trials are plotted, and the same initial states are used for each partitioning method. The horizontal axis represents the number of iterations, which is the number of subproblem optimizations performed by the D-Wave quantum annealer. The plot for the multivalued partition is slightly shifted to the left to avoid overlap with the other plots. Figures 5(a) and (b) show the obtained energies for the ferromagnetic and anti-ferromagnetic Potts models, respectively. The ground states of these models are trivial, and the minimum energy is and for the ferromagnetic and anti-ferromagnetic Potts models, respectively. Although the multivalued partition is expected to solve large optimization problems more efficiently than the random partition, the performance of the random and multivalued partitions is almost the same. The performance of the binary partition differs from that of the other methods: it is the worst for the ferromagnetic Potts model but the best for the anti-ferromagnetic Potts model. Figures 5(c) and (d) show the obtained energies for the Potts glass and Potts gauge glass models, respectively. As expected, better solutions are obtained with a smaller number of iterations using the multivalued partition than using the random partition, particularly for the Potts gauge glass model. The binary partition shows the best performance among the three partitioning methods for both the Potts glass and Potts gauge glass models.
In this section, we discuss differences among the three partitioning methods. The following three questions arise from the results in the previous section.
Why is the multivalued partition not superior to the random partition for the ferromagnetic and anti-ferromagnetic Potts models?
Why is the performance of the binary partition the worst for the ferromagnetic Potts model?
Why does the binary partition exhibit the best performance except for the ferromagnetic Potts model?
One possible answer to the first question is that lower-energy unfeasible solutions exist in the neighborhood of the current feasible solution and that better feasible solutions can be reached via these unfeasible solutions. A simple example of the one-dimensional ferromagnetic Potts model is shown in Fig. 6(a). Assume that the binary variable enclosed by the green rectangle is extracted as a one-variable subproblem, which is one of the simplest cases of the random partition. The energy change caused by flipping the extracted binary variable is because two interactions are simultaneously recovered () and the one-hot constraint is violated (). If , flipping the binary variable decreases the energy despite violating the constraint. Note that is sufficient to correctly encode ground states of the one-dimensional ferromagnetic Potts model because the energy of the lowest-energy unfeasible states, where two components are commonly selected for each , is and must be larger than that of the ground states (). Consequently, if is appropriately tuned (), the current solution is updated to the unfeasible solution by optimizing the subproblem; moreover, better feasible solutions will be reached via the unfeasible solution. The number of simultaneously recovered interactions significantly contributes to the existence of such lower-energy unfeasible solutions, and it will increase as the number of frustrated interactions in ground states decreases. Therefore, the multivalued partition is not effective in improving solution accuracy for the ferromagnetic and anti-ferromagnetic Potts models without frustrations. By contrast, for the Potts glass and Potts gauge glass models, the performance of the multivalued partition is better than that of the random partition because many interactions are frustrated even in ground states.
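The argument above can be checked numerically on a short periodic chain. The sketch assumes a particular normalization, namely a bond energy of -J for each component shared by neighboring variables and a penalty `lam * (sum_q x_iq - 1)^2` per variable; under a different normalization the thresholds would shift accordingly. The chain length, component count, and function names are hypothetical.

```python
import numpy as np

def one_hot(S, n_comp):
    """Binary array x[i, q] = 1 if integer variable i takes component q."""
    x = np.zeros((len(S), n_comp), dtype=int)
    x[np.arange(len(S)), S] = 1
    return x

def chain_energy(x, J, lam):
    """Assumed cost on a periodic chain: ferromagnetic bonds plus one-hot penalty."""
    bond = -J * np.sum(x * np.roll(x, -1, axis=0))
    penalty = lam * np.sum((x.sum(axis=1) - 1) ** 2)
    return float(bond + penalty)

def analyse(lam, J=1.0, n_sites=6, n_comp=3):
    S = [0] * n_sites
    S[2] = 1                      # one variable misaligned with both neighbors
    x = one_hot(S, n_comp)
    x_flip = x.copy()
    x_flip[2, 0] = 1              # switch on component 0 as well (unfeasible)
    delta = chain_energy(x_flip, J, lam) - chain_energy(x, J, lam)
    ground = chain_energy(one_hot([0] * n_sites, n_comp), J, lam)
    doubled = np.zeros((n_sites, n_comp), dtype=int)
    doubled[:, :2] = 1            # two components selected for every variable
    return delta, ground, chain_energy(doubled, J, lam)

for lam in (0.5, 1.5, 2.5):
    delta, ground, doubled = analyse(lam)
    print(f"lam={lam}: flip changes energy by {delta:+.1f}, "
          f"ground={ground:.1f}, doubled unfeasible state={doubled:.1f}")
```

With J = 1 in this normalization, the flip lowers the energy when `lam` is below 2, while the doubled unfeasible state lies above the ground state only when `lam` exceeds 1. The intermediate value 1.5 therefore encodes the ground states correctly and still allows the move through the unfeasible solution.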
The answer to the second question is that subproblems that can eliminate domain walls are rarely extracted by the binary partition. Figure 6(b) shows one of the first excited states, which is commonly observed in the optimization of the ferromagnetic Potts model. The ten variables in Fig. 6(b) are divided into two domains: the five variables are aligned to in one domain, whereas the other variables are aligned to in the other domain. The boundary between the domains is referred to as a domain wall. To improve the current solution, an extracted subproblem must contain one of the ground states because the current solution is the first excited state. For example, to align all integer variables to , the component must be selected for the variables . The probability of the component being selected for is equal to because, in addition to the tentatively selected component, the binary partition randomly selects one component for each . This probability decreases exponentially with the number of variables, and the extraction of only two components is not suitable for the ferromagnetic Potts model. Furthermore, it is conjectured that the binary partition exhibits poor performance for optimization problems that contain partially ferromagnetically ordered domains, and the concomitant use of the binary and multivalued partitions might be preferred for such problems.
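The exponential decay mentioned above can be made concrete with a short calculation. The sketch assumes that the additional component is drawn uniformly from the components other than the tentatively selected one; the values of four components and a five-variable domain are illustrative choices, not values taken from Fig. 6(b).

```python
from fractions import Fraction

def prob_all_selected(n_comp, n_vars):
    """Probability that one specific non-current component is drawn for every
    one of `n_vars` variables, assuming independent uniform draws."""
    return Fraction(1, n_comp - 1) ** n_vars

for n_vars in (1, 5, 10, 20):
    p = prob_all_selected(n_comp=4, n_vars=n_vars)
    print(n_vars, p, float(p))
```

For a domain of five variables and four components, the probability is 1/243, so a subproblem that removes the whole domain in one step is extracted only rarely.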
The answer to the third question is that there exist several binary subproblems that can improve the current solution. Local interactions of the anti-ferromagnetic Potts model are shown in Fig. 6(c). The current solution is one of the first excited states, where the interaction between and is frustrated. Assume that the integer variable is updated to improve the current solution. Then, there are two binary subproblems that can improve the current solution, which are more likely to be extracted compared to other binary subproblems. Therefore, the disadvantage of the binary partition, which is that only two components are considered for each integer variable, is mitigated when optimizing the anti-ferromagnetic Potts model. Hence, we can exploit the advantages of the binary partition, i.e., the extracted subproblems contain a larger number of feasible solutions and the adjustment of the parameter is not required. This is also the case for the Potts glass and Potts gauge glass models, in which frustrated ground states generate several binary subproblems that improve the current solution. Figure 6(d) shows a simple example for the Potts gauge glass model. One of the ground states and one of the first excited states are shown at the top of Fig. 6(d), where the dashed line represents the frustrated interaction. One interaction is frustrated in the ground state, which is caused by the interaction between different components, and two interactions are frustrated in the first excited state. Assume that we update the integer variable to improve the current solution in the first excited state. Then, there are two binary subproblems that can improve the current solution, as shown at the bottom of Fig. 6(d): one recovers the interaction between and and the other recovers the interaction between and . The frustrated ground states generate two binary subproblems that improve the current solution, each of which recovers different interactions.
Thus, the disadvantage of the binary partition is mitigated as long as is not excessively large. Note that, while the number of binary subproblems that improve the current solution increases with increasing for the anti-ferromagnetic Potts model, it does not change for the Potts gauge glass model.
We proposed two partitioning methods to efficiently solve large optimization problems under the one-hot constraint using the D-Wave quantum annealer. The performance of the proposed methods was assessed for the ferromagnetic, anti-ferromagnetic, Potts glass, and Potts gauge glass models. The binary partition shows the best performance among the three partitioning methods except for the ferromagnetic Potts model. While the advantages of the binary partition are that it enables embedding a larger number of binary variables and does not require adjusting the parameter , the disadvantage is that only two components are considered for each integer variable. Although this disadvantage leads to poor performance for the ferromagnetic Potts model, it is mitigated for optimization problems that have many binary subproblems improving the current solution, such as the anti-ferromagnetic Potts model and optimization problems with frustrations. We did not identify problems for which the multivalued partition is most suitable, although the multivalued partition exhibits better performance than the random partition for problems with frustrations. Future studies should focus on constructing algorithms that can efficiently solve the ferromagnetic Potts model using the binary partition and on assessing the performance of the proposed methods for various optimization problems, such as the graph coloring problem, whose cost function is represented as the Hamiltonian of the anti-ferromagnetic Potts model.
Details on partitioning and embedding are described in this section.
We explain how to achieve the multivalued partition using the subproblem-embedding algorithm [subproblem_embed]. The key idea of the subproblem-embedding algorithm is to exclude binary variables that are not easily embedded from the subproblem and thereby reduce the computation time. Note that the multivalued partition requires that the binary variable assigned to the tentatively selected component be embedded into the D-Wave quantum annealer if the integer variable is included in the subproblem. To achieve the multivalued partition combined with the subproblem-embedding algorithm, the order of the binary variables embedded into the D-Wave quantum annealer is specified, as shown in Algorithm 1. We introduce two criteria to determine the order of the binary variables:
1. The binary variable is adjacent to the already embedded binary variables.
2. The binary variable is assigned to the tentatively selected component.
After selecting an integer variable adjacent to the already embedded integer variables, the embedding order of the binary variables assigned to is determined according to the above criteria. When selecting the binary variable to be embedded into the quantum annealer first, we regard Criterion 1 as more important than Criterion 2 because embedding independent binary variables wastes the hardware resources of the quantum annealer. For the remainder of the binary variables, we give weight to Criterion 2 to achieve the multivalued partition. Any integer variable for which only one component is embedded is excluded from the subproblem after the embedding of all binary variables is completed.
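One possible reading of the ordering rule described above is sketched below. It is not Algorithm 1 itself: the function name, the representation of the criteria as a set and an index, and the tie-breaking by component index are assumptions introduced for illustration.

```python
def embedding_order(components, adjacent_to_embedded, tentative):
    """Order the binary variables (components) of one integer variable.

    components           -- component indices of the integer variable
    adjacent_to_embedded -- components whose binary variable touches an
                            already embedded binary variable (Criterion 1)
    tentative            -- tentatively selected component (Criterion 2)
    """
    # First variable: Criterion 1 outranks Criterion 2.
    first = max(components,
                key=lambda q: (q in adjacent_to_embedded, q == tentative))
    # Remaining variables: Criterion 2 outranks Criterion 1.
    rest = sorted((q for q in components if q != first),
                  key=lambda q: (q != tentative, q not in adjacent_to_embedded))
    return [first] + rest

# Component 2 is adjacent to embedded variables; component 1 is the current one.
print(embedding_order([0, 1, 2, 3], adjacent_to_embedded={2}, tentative=1))
# Both criteria hold for component 1, so it is embedded first.
print(embedding_order([0, 1, 2, 3], adjacent_to_embedded={1, 2}, tentative=1))
```

In the first call the adjacent component is embedded first and the tentatively selected component second, so the tentative component is embedded early enough to keep the current solution inside the subproblem.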
The average number of embedded integer variables with components is shown in Table 2; it was assessed by embedding the Potts gauge glass model into D-Wave 2000Q_2 with defects and averaged over trials. Note that, to distinguish the multivalued and binary partitions, is required for most of the integer variables. All four components are embedded for of the integer variables in the subproblem, indicating that we can embed a multivalued subproblem that is distinct from the binary subproblem. The average number of binary variables embedded into the D-Wave quantum annealer is .
To solve large optimization problems using the binary partition, the cost function of the binary subproblem needs to be derived from that of the original large problem. The general form of the local energy between and is given by
where represents the interaction between and . The binary partition extracts a binary subproblem by randomly selecting one component in addition to the tentatively selected component for each integer variable. The local energy of the binary subproblem in the QUBO form is given as follows:
where , and denote the tentatively selected component and the randomly selected component for , respectively, and indicates "stay in the tentatively selected component " ("transit to the other component "). Note that the cost function of the binary subproblem does not contain the penalty term because all solutions in the binary subproblem satisfy the one-hot constraint.
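The construction of the binary subproblem can be sketched as follows. The code assumes a Potts-type pair energy of -J_ij when two coupled variables share a component, and the convention that the new binary variable equals 0 for "stay" and 1 for "transit"; both, together with the chain instance and all names, are assumptions for illustration. The same expansion applies to any pair energy by replacing the entries of the 2 x 2 table `e`.

```python
import itertools
import random

def potts_energy(S, couplings):
    """Assumed integer cost: -J_ij for every coupled pair sharing a component."""
    return -sum(J for (i, j), J in couplings.items() if S[i] == S[j])

def binary_subproblem(current, proposal, couplings):
    """Return constant, linear, and quadratic QUBO coefficients in y,
    where y_i = 0 keeps current[i] and y_i = 1 moves to proposal[i]."""
    const = 0.0
    linear = {i: 0.0 for i in range(len(current))}
    quadratic = {}
    for (i, j), J in couplings.items():
        ci = (current[i], proposal[i])
        cj = (current[j], proposal[j])
        # e[u][v]: pair energy when variable i takes ci[u] and variable j takes cj[v].
        e = [[-J if ci[u] == cj[v] else 0.0 for v in (0, 1)] for u in (0, 1)]
        const += e[0][0]
        linear[i] += e[1][0] - e[0][0]
        linear[j] += e[0][1] - e[0][0]
        quadratic[(i, j)] = e[0][0] - e[0][1] - e[1][0] + e[1][1]
    return const, linear, quadratic

def binary_energy(y, const, linear, quadratic):
    return (const + sum(h * y[i] for i, h in linear.items())
            + sum(w * y[i] * y[j] for (i, j), w in quadratic.items()))

rng = random.Random(0)
n, n_comp = 5, 4
couplings = {(i, i + 1): rng.choice([-1.0, 1.0]) for i in range(n - 1)}
current = [rng.randrange(n_comp) for _ in range(n)]
proposal = [rng.choice([q for q in range(n_comp) if q != current[i]]) for i in range(n)]
const, linear, quadratic = binary_subproblem(current, proposal, couplings)

# Every binary configuration maps to a feasible integer configuration,
# and the two energies agree, so no penalty term is needed.
max_error = 0.0
for y in itertools.product((0, 1), repeat=n):
    S = [proposal[i] if y[i] else current[i] for i in range(n)]
    diff = abs(binary_energy(y, const, linear, quadratic) - potts_energy(S, couplings))
    max_error = max(max_error, diff)
print(max_error)
```

Each coupled pair of integer variables contributes at most one quadratic term between the new binary variables, so the subproblem graph is never denser than the interaction graph of the original integer problem.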
The problem graph of the binary subproblem extracted from the three-dimensional Potts model is the cubic lattice with bond dilutions. The density of the interactions in the binary subproblem is lower than that of the multivalued subproblem because the cost function of the binary subproblem does not contain the penalty term, which generates partially fully connected interactions between and . The average number of embedded binary variables is when the binary partition is used, while it is only binary variables when the multivalued partition is used. Furthermore, all solutions in the binary subproblem satisfy the one-hot constraint, whereas those in the multivalued subproblem do not. Therefore, the average number of feasible solutions involved in the embedded subproblem is considerably increased using the binary partition. Table 3 shows in an embedded subproblem by using the multivalued and binary partitions combined with the complete graph embedding [Comp_Embed3] and subproblem-embedding algorithm [subproblem_embed].
|Complete graph embedding||Subproblem-embedding algorithm|
The authors are deeply grateful to Shu Tanaka, Masamichi J. Miyama and Tadashi Kadowaki for fruitful discussions. The author M. O. is grateful for the financial support provided by JSPS KAKENHI 19H01095 and 16H04382, Next Generation High-Performance Computing Infrastructures and Applications R&D Program by MEXT.
Author contributions statement
S. O. conceived and developed the concept, and carried out all the experiments. M. O. proposed the plan to evaluate the validity of the concept, discussed the details of the results, and reviewed the manuscript. S. T. directed the project in our study.
Competing interests: The authors declare no competing interests.