# Control of automated guided vehicles without collision by quantum annealer and digital devices

###### Abstract

We formulate an optimization problem to control a large number of automated guided vehicles in a plant without collision. The formulation consists of binary variables. A quadratic cost function over these variables enables us to utilize certain solvers on digital computers and recently developed purpose-specific hardware such as D-Wave 2000Q and the Fujitsu digital annealer. In the present study, we consider an actual plant in Japan, in which vehicles run, and assess efficiency of our formulation for optimizing the vehicles via several solvers. We confirm that our formulation can be a powerful approach for performing smooth control while avoiding collisions between vehicles, as compared to a conventional method. In addition, comparative experiments performed using several solvers reveal that D-Wave 2000Q can be useful as a rapid solver for generating a plan for controlling the vehicles in a short time although it deals only with a small number of vehicles, while a digital computer can rapidly solve the corresponding optimization problem even with a large number of binary variables.

## Introduction

An automated guided vehicle (AGV) is a portable robot for moving materials in manufacturing facilities and warehouses [1, 2]. It moves along markers or wires on floors or uses vision, magnets, or lasers for navigation in a few cases. Currently, in most of plants, transportations of materials relies on AGVs and their smooth control. However, in limited-size plants, AGVs are frequently involved in traffic congestion around intersections because a large number of AGVs cross them simultaneously. AGVs are controlled by following a predetermined rule to prevent collisions between AGVs. The simplest but most important rule prohibits AGVs from crossing intersections simultaneously. In other words, an intersection is not allowed to be occupied by multiple AGVs. One approach for designing a plant that is suitable for controlling AGVs might be to divide an area into several zones and prohibit simultaneous occupation by multiple AGVs in an event period [3, 4, 5]. In addition, the scheduling problem for controlling the AGVs is also considered [6].

In the present study, we consider a method for controlling AGVs without changing the layout of a plant. The method is based on the formulation of the control problem of AGVs as an optimization problem. We determine the routes of AGVs without collision by solving the optimization problem. In the optimization problem, we utilize binary variables, which represent the selection of routes from numerous candidates in a discretized map. We employ a quadratic cost function with the binary variables to formulate the optimization problem for simplicity. It nevertheless requires considerable time to solve the optimization problem with binary variables in most cases. We implement specialized solvers to provide meaningful solutions for controlling AGVs. One of the solvers is D-Wave 2000Q, which solves the unconstrained binary quadratic programming problem (recently also termed as the quadratic unconstrained binary optimization (QUBO) problem) using quantum annealing (QA) [7].

Quantum annealing utilizes the quantum effect for solving the QUBO problem [8]. The quantum effect introduces a driving force in searching for the minimum of a cost function. The mechanism is purely quantum, but it can be mapped onto a stochastic dynamical system in which binary variables fluctuate. The fluctuation is used to search for the ground state, which corresponds to the minimum of the cost function. Quantum annealing can find the ground state at the end of its protocol by gradually decreasing the strength of the fluctuation of binary variables. Theoretical assurance is established by following the quantum adiabatic theorem [9, 10, 11]. Surprisingly, QA is realized in an actual quantum device using present-day technology [12, 13, 14, 15]. Since then, QA has been developed rapidly and has attracted considerable attention. It has been tested for numerous applications such as portfolio optimization [16], protein folding [17], the molecular similarity problem [18], computational biology [19], job-shop scheduling [20], traffic optimization [21], election forecasting [22], and machine learning [23, 24, 25, 26, 27, 28]. In addition, its potential might be boosted by the nontrivial quantum fluctuation, referred to as the nonstoquastic Hamiltonian, for which efficient classical simulation is intractable [29, 30, 31, 32]. However, the solution is not always optimal owing to the limitations of devices and environmental effects [33]. Theoretical assurance is given under the assumption of the nonexistence of an environment. This is not a realistic situation in performing QA in quantum devices such as D-Wave 2000Q. Therefore, several protocols based on QA do not follow adiabatic quantum computation or maintain a system in the ground state to reach the optimal solution in the final stage of QA; rather, they employ a nonadiabatic counterpart [34, 35, 36, 37]. In addition, even though the outputs from D-Wave 2000Q are not optimal solutions, we may use these as approximate solutions. The approximate solutions can be at a satisfactory level depending on the purpose of the solution of the QUBO problem.

In the present study, we formulate the QUBO problem for controlling AGVs and solve it via D-Wave 2000Q. In the control of AGVs, rapid response is necessary for dealing with instantaneous changes in a system. Thus, it is expected that D-Wave 2000Q can provide a method for establishing the future infrastructure for controlling AGVs because it can output approximate solutions in a few tens of microseconds. We also utilize other solvers for attaining the approximate solutions of our QUBO problem and compare their performance with D-Wave 2000Q. In addition, we discuss the potential applications of D-Wave 2000Q and its future development.

The remaining part of the paper is organized as follows: In the next section, we formulate the control of AGVs as the QUBO problem, which can be solved using D-Wave 2000Q. The solution does not always satisfy certain constraints for controlling AGVs, and output solutions must be postprocessed. We explain how to attain reasonable solutions via the postprocessing. In the third section, we solve the QUBO problem via D-Wave 2000Q and its original linear programming problem via the Gurobi Optimizer [38]. In the following section, we compare the results obtained by D-Wave 2000Q and other solvers. In the last section, we summarize our study and discuss the direction of future work on the optimization problem in terms of the QUBO problem and its solvers.

## Methods

We formulate the optimization problem for controlling AGVs in this section. The formulation is generic and not specific to individual situations. We do not optimize the entire plan to control all AGVs simultaneously. We consider iterative optimization to provide an adequate route for each AGV during time period . At time , we gather information on the location, , and the task, , distributed to each AGV. We solve our optimization problem and employ its solution to control the AGVs during time period . After moving the AGVs at , we again gather information on the current situation and iterate the above procedure.

Let us formulate the optimization problem to be solved for a plan in time period . We define the binary variable for each AGV as , where is the index for a route and is that for an AGV. The index of the route is selected from a set of routes, , where is the given task for the -th AGV. The index of runs from to , which is the number of AGVs. The set of routes is constructed a priori following the tasks and the structure of the plant in which the AGVs run. Then, we define the cost function for controlling the AGVs as

(1) |

and the following constraints are satisfied:

(2) |

and

(3) |

where ranges from to and denotes all edges of the network along which the AGVs move in the plant. The cost function is optimized to maximize the length of the routes of the AGVs. The first constraint is to enforce each AGV to select a single route. The second constraint is to avoid collision. This constraint is created by defining a binary quantity for characterizing the -th route as with and . For each route, on the edge occupied by the selected route, , at time . On the contrary, on the edge unoccupied by the selected route, , at time . In a previous study [21], a similar formulation was proposed for the traffic optimization of moving taxis. However, the study did not consider the time dependence of . In the present study, the speed of the AGVs is almost constant. The AGVs can move as expected and can be predicted precisely.

The above optimization problem is a linear programming problem, and thus, it can be straightforwardly solved using a conventional method. However, by employing the penalty method, we now formulate the linear programming problem as

(4) |

where and are predetermined coefficients. The second and third terms represent the constraints of the original linear programming problem. Thus, we do not allow any solution with a nonzero value in the these terms. To attain solutions that satisfy these constraints, we must set relatively large values of coefficients and based on the penalty method. The cost function can be reduced into the following quadratic form with a matrix as

(5) |

where is the vector consisting of binary variables . Thus, the above optimization problem is transformed into the unconstrained binary quadratic programming or QUBO problem.

In order to shorten the time of the whole procedure to control the AGVs, we prepare a database that stores the set of routes when we create the QUBO during the time period . In advance, we generate the shortest paths from an origin to a destination for each task. We divide the shortest paths into sets of several vertices at the longest , where is the maximum speed of the AGVs, and store them. When we build the QUBO matrix, we only elucidate a vertex set included in a part of the shortest paths beginning at up to the reachable position at the end of period . For instance, let us consider the case that the part of the shortest path for the -th AGV at consists of the node set . The shortest path is determined by the initial point, , and the destination for accomplishing the task, . This results in . Then, we prepare the route sets as , , and , which indicate ”stop,” ” step ahead,” and ” steps ahead,” respectively. This approach is similar to a classical method for collision avoidance between AGVs in that it introduces delays and deviations in the shortest paths for each vehicle [39].

Next, we describe how to solve the QUBO problem efficiently. We use D-Wave 2000Q to solve the QUBO problem. A binary variable is regarded as a spin through an interpretation as . In QA, the spin represents quantum-mechanical operators known as the -component Pauli matrices as , where denotes the quantum-mechanical operators that form the quantum system. D-Wave 2000Q exploits quantum fluctuation, which is frequently termed as the transverse field, to attain the solution of the QUBO problem through the quantum dynamics governed by the following Hamiltonian [7]:

(6) |

where is the -component of the Pauli matrices to generate the superposition of the up () and down () spins. Starting from the trivial initial state with full superposition at , quantum dynamics leads to a nontrivial ground state, which corresponds to the minimum of the cost function at , where and are time-dependent parameters defined as and . D-Wave 2000Q performs QA on its superconducting qubits. D-Wave 2000Q does not always output the optimal solution at the end of its protocol because it is not the device for performing ideal QA[33]. The ideal QA is assumed to perform in an isolated quantum system. In fact, D-Wave 2000Q typically outputs low-energy states, namely, approximate solutions with relatively low-valued cost functions. In addition, annealing time is extremely short owing to its limitation caused by the coherence time of superconducting qubits. Therefore, we can attain numerous outputs from D-Wave 2000Q for the same QUBO problem in a short time. This is a weak but outstanding feature of the D-Wave 2000Q. The feature is weak because solutions do not satisfy the constraints as outputs are not always optimal. In other words, in a few cases, nonzero values remain in the second and third terms of Eq. 4 for each output obtained from D-Wave 2000Q. However, the solutions employed to control the AGVs must satisfy all constraints. We utilize the remarkable advantage of D-Wave 2000Q, which enables us to attain a large number of outputs as the candidates of good solutions and not necessarily optimal solutions. We may filter out the solutions that do not satisfy the constraints from the outputs of D-Wave 2000Q. As a result, we obtain a reasonable solution without collisions and the double selection of routes.

We may utilize other solvers for finding the minimum of the QUBO problem. In the present study, we test the Fujitsu digital annealer (DA) to solve the QUBO problem [40, 41]. Fujitsu Laboratories has recently developed purpose-specific CMOS hardware designed to solve fully-connected QUBO problems (i.e., on complete graphs). This is an advantage over D-Wave 2000Q, in which the chimera graph is employed. The DA hardware is currently able to solve Ising-type optimization problems with a size of up to variables, with and bit (fixed) precision for biases and variable couplers, respectively. In the DA hardware, simulated annealing (SA) [42] with several improvements is employed to solve the QUBO problem. The improved SA uses a parallel-trial scheme, in which a flip of all variables in parallel is attempted at each step, and seeks higher acceptance probability for updating one-spin flip compared to the ordinary process in SA. In addition, the DA utilizes an escape mechanism referred to as a dynamic offset to overcome short and narrow barriers.

As described earlier, our QUBO problem is originally the linear programming problem. In the present study, we test direct manipulation to solve the original linear programing problem using the branch and bound method via the Gurobi Optimizer [38]. In the Gurobi Optimizer, we first solve a relaxed optimization problem by mapping the binary variables to continuous variables. The solution determines the lower bound of the cost function in the original optimization problem. Then, we solve the original linear programing problem by the discretization of the solution with continuous variables while reaching the lower bound. When reaching the lower bound, the tentative solution of the original optimization problem is confirmed to be optimal. The attained solution is deterministic owing to lack of stochasticity. This is an essential difference from D-Wave 2000Q and the DA.

Below, we report the comparison among the results obtained by these methods for controlling the AGVs while avoiding collision.

## Results

In this section, we report the results attained by iteratively solving the QUBO problem at each time period for controlling the AGVs. For proving the efficiency of our method, we prepare a simulation environment for an actual plant as shown in Fig. 1.

The plant utilizes AGVs for product delivery, and the AGVs move simultaneously along four fixed routes according to predetermined tasks. The speed of each AGV is m/s. The distance between nodes is m.

We simulate the controlled AGV movement using different methods, namely, the conventional method, and the solution of the QUBO problem by D-Wave 2000Q and the DA, and the solution of the original linear programming problem by the Gurobi Optimizer. The results attained by the conventional method and D-Wave 2000Q are shown in Fig. 2. We simulate the AGVs in the plant for seconds and indicate the accumulated waiting time by circles, as shown in Fig. LABEL:map_comp. The waiting rate is calculated by the ratio of the number of stopping AGVs and the total number of AGVs. Several circles represent frequent traffic jams of the AGVs. The size of a circle is proportional to the accumulated waiting time of the AGVs at that point.

The conventional method for controlling the AGVs is a rule-based method at every intersection.

The rule is that when the AGVs require the same intersection route, only one AGV can move in and out at the intersection. For example, when two AGVs require the same intersection, one AGV waits until the other AGV leaves the intersection. The AGVs that move along the circumference of the plant have higher priority for entering an intersection for increasing the working rate. The time average of the waiting rate converges to percent.

We solve the QUBO problem via D-Wave 2000Q at each time period. The time period is set to be seconds, namely [s]. We set the parameters as and because D-Wave 2000Q does not deal with large elements of QUBO matrix . In addition, we rescale the QUBO matrix such that it is within the range of the available magnitude. D-Wave 2000Q solves the QUBO problem times for finding reasonable solutions. We filter the solutions that do not satisfy the constraints and select one of the reasonable solutions for moving the AGVs further. The AGVs move following the selected solution during the time period of seconds. The solution indicates the movement in the next seconds. Thus, the movement of the AGVs is updated before they reach the end of the given route. It can be seen from Fig. 2 that the number of circles, which represent the accumulated waiting time, is considerably reduced compared to the conventional method. The time average of the waiting rate converges to percentages . The actual movement of the AGVs from the initial condition is shown in the supplemental video files. Compared to the result of the conventional method, the AGVs move smoothly following the solution to the QUBO problem. The readers can find the smooth movements of the AGVs in the supplemental movies.

## Other solvers and comparison data

It is not necessary to solve the QUBO problem using D-Wave 2000Q; one can utilize other solvers. One method is the DA, which solves the QUBO problem using an improved version of SA. Notice that the DA can solve the QUBO problem with a large number of binary variable compared to D-Wave 2000Q. In the present study, the maximum number of binary variables in the QUBO problem is , which is the product of the number of the AGVs () and the number of candidates of routes (). Thus, the number of the binary variables is quite small. Even though the DA does not exhibit its potential efficiency in this case, we find that the time average of the waiting rate converges around percent as shown in Fig. 3.

In addition, we solve the original linear programming problem through the relaxation of the binary variables to continuous variables by utilizing the branch and bound method via Gurobi Optimizer version 8.01 on a -core Intel i7 4770K processor with GB RAM. In this case, we attain the optimal solution of the original linear programming problem in a short time and utilize the optimal solution to control the AGVs. However, the optimal solution is found through discretization performed using a deterministic rule. Therefore, the Gurobi Optimizer does not output multiple optimal solutions when the solutions are degenerate. This deterministic method leads to a biased solution. The solution attained by the Gurobi Optimizer is the optimal solution for the original linear programming problem. However, it is not always optimal for controlling the AGVs without collisions. The time average of the waiting rate converges to percent, which is slightly higher than the results of D-Wave 2000Q and DA. This is the advantage of the stochasticity of D-Wave 2000Q and the DA, which can search for the “optimal” solution from a different perspective. The cost function itself is not necessarily a direct indicator of performance. Thus the optimal solution for the cost function is not always optimal for the actual performance. Similar phenomena appear in machine learning. Generalization performance, which is the measure of potential power in machine learning but not directly related to the cost function to be optimized, can be enhanced via stochastic methods to optimize cost functions. In particular, QA actually leads to better generalization performance, as shown in literature [43].

We repeat the iterative optimization for controlling the AGVs at each time period starting from the same initial condition times and obtain the average and maximum performance, as shown in Table 1. We confirm that D-Wave 2000Q outperforms the other solvers in this problem setting. The Gurobi Optimizer always leads to the optimal solutions, but the waiting rates are not less than the results obtained by D-Wave 2000Q and DA. This is because the optimal solutions do not always lead to the best control of the AGVs. In addition, the biased solution determined by the rule used to find the optimal solution in the Gurobi Optimizer accidentally experiences the worst case scenario for controlling the AGVs. We can find a better solution in terms of controlling the AGVs by modifying the procedure of finding the optimal solution in the Gurobi Optimizer. However it is not trivial to find such a modification. In fact, we add another constraint for the AGVs such that if several AGVs reach the same intersection, the AGV with more following AGVs is preferentially allowed to enter the intersection. In the formulation of the original linear programming problem, we do not consider the priority with which the AGVs are allowed to enter an intersection in an explicit manner. We solve the linear programming problem with the additional constraint by employing the Gurobi Optimizer and show its efficiency in Table 1. We establish the better cost function and constraints for the Gurobi Optimizer by comparing the solutions obtained by the stochastic solvers and deterministic solver, at least for the present study. In this sense, the comparison can be beneficial for solving the optimization problem efficiently and formulating a better cost function for accomplishing the actual purpose. However, the stochastic search performed by D-Wave 2000Q and the DA is one of the approaches to avoid such intractable fine tuning of the cost function.

Conventional | D-Wave 2000Q | Fujitsu digital annealer | Gurobi | Gurobi + | |
---|---|---|---|---|---|

Average | |||||

Max |

Below, we discuss the efficiency of the solvers from another point of view, namely the computational time. We investigate the “actual” computational time, which is obtained in a standard-user environment, and the quality of the attained solutions against the increase in the number of the AGVs and candidate routes. We prepare a hundred of different initial locations of the AGVs such as each pair of the AGVs encounter at an intersection and solve the optimization problem. We report the comparison results in average and variance below.

The D-Wave 2000Q takes s, which is predetermined by users, to once solve the optimization problem in the quantum chip with superconducting qubits. However preprocessing and postprocessing for preparation to solve the optimization problem, the latency of the network when we utilize the D-Wave 2000Q via cloud service, and the queueing time can not be avoided. Thus the actual computational time takes a little bit longer. In addition, the D-Wave 2000Q outputs many samples of the solutions once. We set the number of samples as and measure the actual computational time. We then estimate the actual computational time per output sample as ms for spins, ms for spins, ms for spins, ms for spins, ms for spins, and ms for spins. These computational times per output sample are only to solve the optimization problems without any assurance of precision of the attained solutions. The probability for attaining the ground state gradually decreases as the number of spins increases. In fact, for spins, for spins, for spins, for spins, for spins and for spins.

The number of spins consists of the multiplication of that of the AGVs and the routes. The computational time drastically increases for the case of D-Wave 2000Q beyond spins. This is due to the limitation of the number of binary variables to be solved simultaneously. We solve the case with a larger number of binary variables by utilizing qbsolv, which divides the original problem into a number of small problems. To iteratively use D-Wave 2000Q, we must wait for several seconds owing to the job queue via the cloud service provided by the D-Wave systems Inc. at each iteration to solve the small problems. The actual computational time per output sample and iteration is ms for spins, ms for spins, and ms for spins. The iteration numbers become for spins, for spins, and for spins. The former number in the product is the number of division of the original large problem into small subproblems, and the latter one is that of repetition to solve the optimization problem. Thus, the actual computational time can be extremely long. In addition, the probabilities for attaining the ground state get worse as for spins for spins and for spins. This is a weak point to employ the D-Wave 2000Q to solve the optimization problem. Although it seems that the computational time does not depend on the number of spins, the probability for attaining the ground state gradually decreases as the number of spins increases. On the other hand, the Gurobi Optimizer leads to the optimal solutions for each case. Its computational time to attain the optimal solution depends on the number of spins. ms for spins, for spins for spins, and ms for spins.

For the DA, the machine time is set to be enough to solve the optimization problem about ms. The actual computational time per output sample takes a little bit longer than the machine time as s for spins, s for spins, s for spins, s for spins, and for spins, for spins, ms for spins, s for , and ms for spins. Up to spins, the current version of the DA can solve once the optimization problem without dividing it into small subproblems. This is an advantage point of the DA. In addition, the probability for attaining the ground state is relatively higher compared to that of the D-Wave 2000Q as for , , , and spins, fpr spins, for spins, for spins, and for spins. Notice that the higher value of the probability for attaining the ground state is obtained by tuning the annealing schedule. Instead, the actual computational time takes longer. Then we compute the time to solutions (TTS) defined as

(7) |

where is the actual computational time per output sample and is a predetermined precision to attain the ground state. The time to solution is an indicator of the performance of the solver in the stochastic way. We show the comparison data of TTS(0.99) of the D-Wave 2000Q and the DA and the actual computational time of the Gurobi Optimizer in Fig. 4. In the successful cases with , we plot the actual computational time instead of the TTS. The actual computational time per output sample can be upper bound for the TTS.

## Conclusions

In the setting of the present study, D-Wave 2000Q, the DA, and the Gurobi Optimizer can solve the QUBO problem or the original linear programming problem within the predetermined time period, s. The number of the binary variables that can be solved within s, which is determined by the product of the numbers of AGVs and routes, are up to for D-Wave 2000Q, for the DA, and over for the Gurobi Optimizer in terms of the TTS and the actual computational time to attain the optimal solution. Notice that, in order to control the AGVs, it is not necessarily to find the optimal solutions. The time to solution is just one of the indicators. As a consequence, for controlling the AGVs via optimization problems, a large number of degrees of freedom should be dealt with by a digital computer using a modern algorithm such as the branch and bound method. In contrast, D-Wave 2000Q can be used to obtain a rapid response for controlling AGVs but with small numbers. In the cases with less than the maximum number of available qubits, D-Wave 2000Q outputs reasonable solutions for controlling AGVs without collision in a short time period. The Fujitsu DA supports a wide range of applications via the QUBO problem in terms of speed and availability. In this sense, a digital computer with a modern algorithm and purpose-specific devices, such as D-Wave 2000Q and the DA, are not in conflict. They each have adequate applications depending on conditions of a problem. In the present study, we take the case with the AGV speed is relatively slow. Therefore the time period s is reasonable.

In the present study, we demonstrate the efficiency of the formulation of the optimization problem for controlling the AGVs. The digital computer with the modern algorithm is the best choice as the solver in terms of the availability and speed in the current situation. Notice that we employ the actual computational time basically to estimate the performance of the D-Wave 2000Q and the DA, not the machine time. Therefore, if we can avoid the latency of the communication and queuing time for dealing with the jobs to solve the optimization problem in both of the devices via cloud services, better efficiency can be achieved. In this sense, the computational time of the D-Wave 2000Q and the DA can be reduced significantly. For instance, the machine time for optimization problem by the D-Wave 2000Q can be set to be s and that of the DA is ms. The D-Wave 2000Q can be an alternative for solving the optimization problem for controlling the AGVs in real plants.

The time period was set in the present study following the current situation of the real plant, in which several workers walks, In the cases without any workers, the AGVs can move faster than the setting of the present study. Then shorter response time for controlling the AGVs is necessary. The next-generation quantum annealer beyond the D-Wave 2000Q is expected as a candidate for controlling the AGVs in such future plants. The D-Wave quantum processing units continues to steadly grow in number of qubits. The precision to find the ground state getting better, the TTS becomes shorter. Moreover the actual computational time can be also shorter. In this sense, the shorter response time can be achieved and such future plants can be created by the next-generation quantum annealer, although the current version, the D-Wave 2000, is just a proof of concept and has demonstrated the comparative performance with the modern algorithm on the digital computer. The present study is the first step toward the efficient control of AGVS in future plants as one of the candidates in the real-world application of the QA.

## References

- [1] Ullrich, G. Automated Guided Vehicle Systems: A Primer with Practical Applications (Springer Publishing Company, Incorporated, 2014).
- [2] Fazlollahtabar, H. & Saidi-Mehrabad, M. Autonomous Guided Vehicles: Methods and Models for Optimal Path Planning (Springer Publishing Company, Incorporated, 2016), 1st edn.
- [3] Li, Q., Adriaansen, A., Udding, J. & Pogromsky, A. Design and control of automated guided vehicle systems: A case study. \JournalTitleIFAC Proceedings Volumes 44, 13852 – 13857 (2011). DOI https://doi.org/10.3182/20110828-6-IT-1002.01232. 18th IFAC World Congress.
- [4] Li, Q., Udding, J. T. & Pogromsky, A. Zone-Control-Based Traffic Control of Automated Guided Vehicles, 53–60 (Springer International Publishing, Cham, 2015).
- [5] Li, Q., Pogromsky, A., Adriaansen, T. & Udding, J. T. A control of collision and deadlock avoidance for automated guided vehicles with a fault-tolerance capability. \JournalTitleInternational Journal of Advanced Robotic Systems 13, 64 (2016). DOI 10.5772/62685.
- [6] Fazlollahtabar, H., Saidi-Mehrabad, M. & Balakrishnan, J. Mathematical optimization for earliness/tardiness minimization in a multiple automated guided vehicle manufacturing system via integrated heuristic algorithms. \JournalTitleRobotics and Autonomous Systems 72, 131 – 138 (2015). URL http://www.sciencedirect.com/science/article/pii/S0921889015000949. DOI https://doi.org/10.1016/j.robot.2015.05.002.
- [7] Kadowaki, T. & Nishimori, H. Quantum annealing in the transverse ising model. \JournalTitlePhys. Rev. E 58, 5355–5363 (1998). DOI 10.1103/PhysRevE.58.5355.
- [8] Lucas, A. Ising formulations of many np problems. \JournalTitleFrontiers in Physics 2, 5 (2014). URL https://www.frontiersin.org/article/10.3389/fphy.2014.00005. DOI 10.3389/fphy.2014.00005.
- [9] Suzuki, S. & Okada, M. Residual energies after slow quantum annealing. \JournalTitleJournal of the Physical Society of Japan 74, 1649–1652 (2005). DOI 10.1143/JPSJ.74.1649.
- [10] Morita, S. & Nishimori, H. Mathematical foundation of quantum annealing. \JournalTitleJournal of Mathematical Physics 49 (2008). DOI http://dx.doi.org/10.1063/1.2995837.
- [11] Ohzeki, M. & Nishimori, H. Quantum annealing: An introduction and new developments. \JournalTitleJournal of Computational and Theoretical Nanoscience 8, 963–971 (2011-06-01T00:00:00). DOI doi:10.1166/jctn.2011.1776963.
- [12] Johnson, M. W. et al. A scalable control system for a superconducting adiabatic quantum optimization processor. \JournalTitleSuperconductor Science and Technology 23, 065004 (2010).
- [13] Berkley, A. J. et al. A scalable readout system for a superconducting adiabatic quantum optimization system. \JournalTitleSuperconductor Science and Technology 23, 105014 (2010).
- [14] Harris, R. et al. Experimental investigation of an eight-qubit unit cell in a superconducting optimization processor. \JournalTitlePhys. Rev. B 82, 024511 (2010). DOI 10.1103/PhysRevB.82.024511.
- [15] Bunyk, P. I. et al. Architectural considerations in the design of a superconducting quantum annealing processor. \JournalTitleIEEE Transactions on Applied Superconductivity 24, 1–10 (2014). DOI 10.1109/TASC.2014.2318294.
- [16] Rosenberg, G. et al. Solving the optimal trading trajectory problem using a quantum annealer. \JournalTitleIEEE Journal of Selected Topics in Signal Processing 10, 1053–1060 (2016). DOI 10.1109/JSTSP.2016.2574703.
- [17] Perdomo-Ortiz, A., Dickson, N., Drew-Brook, M., Rose, G. & Aspuru-Guzik, A. Finding low-energy conformations of lattice protein models by quantum annealing. \JournalTitleScientific Reports 2, 571 EP – (2012). URL https://doi.org/10.1038/srep00571.
- [18] Hernandez, M. & Aramon, M. Enhancing quantum annealing performance for the molecular similarity problem. \JournalTitleQuantum Information Processing 16, 133 (2017). URL https://doi.org/10.1007/s11128-017-1586-y. DOI 10.1007/s11128-017-1586-y.
- [19] Li, R. Y., Di Felice, R., Rohs, R. & Lidar, D. A. Quantum annealing versus classical machine learning applied to a simplified computational biology problem. \JournalTitlenpj Quantum Information 4, 14 (2018). URL https://doi.org/10.1038/s41534-018-0060-8. DOI 10.1038/s41534-018-0060-8.
- [20] Venturelli, D., Marchand, D. J. J. & Rojo, G. Quantum Annealing Implementation of Job-Shop Scheduling. \JournalTitleArXiv e-prints (2015). 1506.08479.
- [21] Neukart, F. et al. Traffic flow optimization using a quantum annealer. \JournalTitleFrontiers in ICT 4, 29 (2017).
- [22] Henderson, M., Novak, J. & Cook, T. Leveraging Adiabatic Quantum Computation for Election Forecasting. \JournalTitleArXiv e-prints (2018). 1802.00069.
- [23] Crawford, D., Levit, A., Ghadermarzy, N., Oberoi, J. S. & Ronagh, P. Reinforcement Learning Using Quantum Boltzmann Machines. \JournalTitleArXiv e-prints (2016). 1612.05695.
- [24] Arai, S., Ohzeki, M. & Tanaka, K. Deep neural network detects quantum phase transition. \JournalTitleJournal of the Physical Society of Japan 87, 033001 (2018). DOI 10.7566/JPSJ.87.033001. https://doi.org/10.7566/JPSJ.87.033001.
- [25] Takahashi, C. et al. Statistical-mechanical analysis of compressed sensing for hamiltonian estimation of ising spin glass. \JournalTitleJournal of the Physical Society of Japan 87, 074001 (2018). DOI 10.7566/JPSJ.87.074001. https://doi.org/10.7566/JPSJ.87.074001.
- [26] Ohzeki, M. et al. Quantum annealing: next-generation computation and how to implement it when information is missing. \JournalTitleNonlinear Theory and Its Applications, IEICE 9, 392–405 (2018). DOI 10.1587/nolta.9.392.
- [27] Neukart, F., Von Dollen, D., Seidel, C. & Compostella, G. Quantum-enhanced reinforcement learning for finite-episode games with discrete state spaces. \JournalTitleFrontiers in Physics 5, 71 (2018). DOI 10.3389/fphy.2017.00071.
- [28] Khoshaman, A., Vinci, W., Denis, B., Andriyash, E. & Amin, M. H. Quantum variational autoencoder. \JournalTitleQuantum Science and Technology 4, 014001 (2018). URL http://stacks.iop.org/2058-9565/4/i=1/a=014001.
- [29] Seki, Y. & Nishimori, H. Quantum annealing with antiferromagnetic fluctuations. \JournalTitlePhys. Rev. E 85, 051112 (2012). DOI 10.1103/PhysRevE.85.051112.
- [30] Seki, Y. & Nishimori, H. Quantum annealing with antiferromagnetic transverse interactions for the hopfield model. \JournalTitleJournal of Physics A: Mathematical and Theoretical 48, 335301 (2015).
- [31] Ohzeki, M. Quantum monte carlo simulation of a particular class of non-stoquastic hamiltonians in quantum annealing. \JournalTitleScientific Reports 7, 41186 (2017).
- [32] Arai, S., Ohzeki, M. & Tanaka, K. Dynamics of Order Parameters of Non-stoquastic Hamiltonians in the Adaptive Quantum Monte Carlo Method. \JournalTitleArXiv e-prints (2018). 1810.09943.
- [33] Amin, M. H. Searching for quantum speedup in quasistatic quantum annealers. \JournalTitlePhys. Rev. A 92, 052323 (2015).
- [34] Ohzeki, M. Quantum annealing with the jarzynski equality. \JournalTitlePhys. Rev. Lett. 105, 050401 (2010). DOI 10.1103/PhysRevLett.105.050401.
- [35] Ohzeki, M., Nishimori, H. & Katsuda, H. Nonequilibrium work on spin glasses in longitudinal and transverse fields. \JournalTitleJ. Phys. Soc. Jpn. 80, 084002 (2011). DOI 10.1143/JPSJ.80.084002.
- [36] Ohzeki, M. & Nishimori, H. Nonequilibrium work performed in quantum annealing. \JournalTitleJournal of Physics: Conference Series 302, 012047 (2011).
- [37] Somma, R. D., Nagaj, D. & Kieferová, M. Quantum speedup by quantum annealing. \JournalTitlePhys. Rev. Lett. 109, 050501 (2012).
- [38] Gurobi Optimization, L. Gurobi optimizer reference manual (2018). URL http://www.gurobi.com.
- [39] Dowsland, K. A. & Greaves, A. M. Collision avoidance in bi-directional agv systems. \JournalTitleThe Journal of the Operational Research Society 45, 817–826 (1994). URL http://www.jstor.org/stable/2584290.
- [40] Tsukamoto, S., Takatsu, M., Matsubara, S. & Tamura, H. An accelerator architecture for combinatorial optimization problems. In FUJITSU Sci. Tech. J., vol. 53, 8 (2017).
- [41] Aramon, M. et al. Physics-Inspired Optimization for Quadratic Unconstrained Problems Using a Digital Annealer. \JournalTitleArXiv e-prints (2018). 1806.08815.
- [42] Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. \JournalTitleScience 220, 671–680 (1983). DOI 10.1126/science.220.4598.671.
- [43] Ohzeki, M., Okada, S., Terabe, M. & Taguchi, S. Optimization of neural networks via finite-value quantum fluctuations. \JournalTitleScientific Reports 8, 9950 (2018). URL https://doi.org/10.1038/s41598-018-28212-4.

## Acknowledgements

The authors would like to thank Shu Tanaka for fruitful discussions, which contributed to the work. The present work is financially supported by MEXT KAKENHI Grant No. 15H03699 and 16H04382, and by JST START.

## Author contributions statement

M.O. conceived and conducted the experiment and analyzed the results. A. M. created the simulation program for AGVs and implemented it in conjunction with the optimization scheme via D-Wave 2000Q, Fujitsu digital annealer, and the classical solvers. M. J. M conducted the experiment using D-Wave 2000Q and the Fujitsu digital annealer. M. T. discussed the possibility of the other applications of our method to industry, directed the project in our study, and investigated the possible design of our method. All authors discussed the details of the results and reviewed the manuscript.