Impact of Power System Partitioning on the Efficiency of Distributed Multi-Step Optimization
Recent studies have shown that multi-step optimization based on Model Predictive Control (MPC) can effectively coordinate the increasing number of distributed renewable energy and storage resources in the power system. However, the computational complexity of MPC is usually high, which limits its use in practice. To improve the efficiency of MPC, in this paper we apply a distributed optimization method to MPC. The approach consists of a partitioning technique based on spectral clustering that determines the best system partition, and an improved Optimality Condition Decomposition method that solves the optimization problem in a distributed manner. Results of simulations conducted on the IEEE 14-bus and 118-bus systems show that the distributed MPC problem can be solved significantly faster by using a good partition of the system, and that this partition is applicable to multiple time steps without frequent changes.
Keywords: Power system partitioning, model predictive control, multi-step optimal power flow, decomposition method, renewable energy
1 Introduction
With an increasing number of intermittent energy resources and storage devices integrated into the power system, the question arises of how to optimally coordinate these resources to overcome the uncertainty introduced by the intermittent resources and the inter-temporal coupling of storages. Approaches based on Model Predictive Control (MPC) are able to address these challenges effectively, as they determine the current optimal states of the controllable devices with a look-ahead scheme that accounts for the temporal characteristics of both intermittent resources and storage devices. For example, the energy dispatch problem with intermittent resources is formulated as a multi-step optimization problem over a pre-defined time horizon, which treats the intermittent energy resources as negative loads that must be consumed when available (1). A similar formulation has also been proposed for the AC Optimal Power Flow (AC OPF) problem, which aims to minimize the generation cost of non-intermittent generation over a finite time horizon with the integration of wind generation and storages (2). Both studies have shown that the total generation cost can be reduced by using such an MPC-based multi-step optimization approach to effectively coordinate the intermittent resources and storages.
However, the MPC approach is computationally expensive since the size of the optimization problem grows drastically as the optimization horizon increases. This computational complexity restricts the practical use of MPC, because no control action can be taken if an MPC problem is not solved within the required amount of time due to a lack of computation capability or storage capacity at the central computation entity. To address this issue, distributed MPC has been studied and applied to various applications such as optimal power flow (3), dispatch of generation with emission limitation (4) and automatic generation control (5). Comprehensive surveys have presented different types of distributed MPC (6); (7), where one common approach reviewed in those surveys uses decomposition techniques to solve the MPC problem in a distributed fashion. Multiple decomposition techniques have been reported over the past decades based on Lagrangian (8); (9), Augmented Lagrangian (10); (11); (12) and Benders Decomposition (13), which are generally based on the principle of decomposing the optimization problem into subproblems that can be solved in parallel. For the coordination of wind generation and storages, the Optimality Condition Decomposition (OCD) (9); (14) has been applied successfully to solve the multi-step AC OPF problem by dividing the entire power system into several regions, each associated with a subproblem to solve (3). Other distributed MPC methods based on dual decomposition (15) and temporal decomposition (16) have also been proposed. Distributed MPC not only reduces the computational burden of the centralized approach, but also helps preserve data privacy among different control regions, as only a small amount of data needs to be shared among neighboring regions to achieve the overall optimality of the entire system.
However, while distributed approaches can alleviate the centralized computational burden, most distributed methods are iterative and generally take many iterations to converge, so the time available for solving a specific problem may still be exceeded. It has been observed that the number of iterations when using decomposition methods depends strongly on the system partitioning, i.e., which bus is assigned to which subsystem, if the overall system is decomposed geographically (17). Based on this observation, a partitioning method based on spectral clustering has recently been proposed that determines the best partition of a system such that the decomposition method converges in fewer iterations using the determined partition (18). To improve the efficiency of the MPC approach, in this paper we apply this partitioning method in conjunction with a decomposition technique to solve the MPC problem. Specifically, we consider optimizing the usage of the wind generation by using storage and employing a multi-step AC OPF problem that minimizes the total generation cost over a certain time horizon by optimally setting the charging/discharging status of the storages. Apart from the application considered in this paper, the proposed approach can also be applied to solve similar MPC-based multi-step optimization problems in a distributed fashion. The proposed distributed MPC consists of two steps: first, the partitioning method is applied to find the best partition of the test system; then, the multi-step optimization problem is solved using a decomposition method for a 24-hour time period. Through case studies, the time efficiency of the proposed distributed MPC approach is quantified and the impact of system partitioning on the speed of distributed MPC is highlighted.
In particular, we demonstrate that the same partition can be used for solving the MPC problem for multiple time steps, which eases the practical use of distributed MPC as the partition of the system does not need to be changed frequently.
The rest of the paper is organized as follows: In Section 2, the multi-step AC OPF problem is formulated and a distributed optimization problem formulation is also given. In Section 3, the partitioning method and the decomposition method used in this paper are presented. Section 4 quantifies the effectiveness and efficiency of the distributed MPC approach with a focus on the impact of system partitioning on the convergence speed through simulations using the IEEE 14-bus and 118-bus test systems. Finally, Section 5 concludes the paper and proposes possible future directions.
2 Problem Formulation
In this section, the centralized multi-step AC OPF problem including wind generation and storages is formulated first, and its formulation in the distributed form is then given. Figure 1 shows an example of the wind and load data for a 24-hour period obtained from the Bonneville Power Administration at a 10-minute resolution. As seen from Fig. 1, the wind generation varies randomly over the day, and it is possible that the wind generation is high while the demand of the system is low. Hence, to reduce the generation cost at peak demand, the procedure is to store the excess energy generated by the cheap generators during the time when the wind generation can serve most of the system load, and to use the stored energy when the demand is high. For this purpose, we formulate a multi-step AC OPF problem where the objective is to minimize the total generation cost of non-renewable generation over a finite time horizon that consists of multiple time steps. The OPF problems at the individual time steps are coupled by the charging and discharging of the storage devices. It is assumed that the wind generation must be consumed when available; hence, it can be treated as negative load at the bus where the wind generation is placed. The overall multi-step AC OPF problem at time step is formulated as follows:
for and . The notations used in the problem formulation are listed below.
|number of buses|
|number of generators|
|cost parameters of generator|
|active and reactive power output of generator|
|active and reactive load at bus|
|active power output of wind generator at bus|
|power injected into storage at bus|
|power drawn from storage at bus|
|voltage magnitude of bus|
|difference of voltage angles between bus and bus|
|set of buses connected to bus|
|set of generators connected to bus|
|energy level in the storage at bus|
|time between two consecutive time steps|
|charging/discharging efficiency of storage at bus|
|standby loss of storage|
|charging/discharging efficiency of the storage|
|current on line from bus to bus|
Equations (1b) and (1c) are the active and reactive power flow balances at each bus. Equation (1d) corresponds to the inter-temporal constraints on storages, (1j) reflects the line thermal limits and all other constraints denote the upper and lower limits on the variables. Apart from the constraints explicitly given above, the voltage angle at the slack bus is set to zero and the voltage magnitudes at generator buses are set to pre-determined values. As is standard in MPC, only the solution found for the first time step is applied once the overall problem is solved. Then the optimization time horizon is shifted by one time interval, and the optimization problem is formulated and solved for the next time step.
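Because the equations themselves are not reproduced in this text, the following LaTeX block is only a hedged sketch of what a multi-step AC OPF of this kind typically looks like. All symbols are assumptions chosen for illustration (they are not recovered from the original), as are the quadratic cost function and the way the standby loss enters the storage dynamics. Only the active power balance, the storage dynamics and the thermal limit are shown; the reactive power balance and the variable bounds are omitted.

```latex
\begin{align}
\min \quad & \sum_{t=k}^{k+N-1} \sum_{i=1}^{N_G}
   \left( a_i P_{G_i}(t)^2 + b_i P_{G_i}(t) + c_i \right)
   && \text{(generation cost, quadratic form assumed)} \\
\text{s.t.} \quad
 & \sum_{g \in \Omega_{G_n}} P_{G_g}(t) + P_{W_n}(t) - P_{L_n}(t)
   - P_{\mathrm{in},n}(t) + P_{\mathrm{out},n}(t)
   = \sum_{m \in \Omega_n} P_{nm}\big(V(t), \theta(t)\big)
   && \text{(active power balance at bus } n\text{)} \\
 & E_n(t+1) = E_n(t)
   + \Delta T \left( \eta_{c,n} P_{\mathrm{in},n}(t)
   - \frac{P_{\mathrm{out},n}(t)}{\eta_{d,n}} \right)
   - \epsilon_n \Delta T
   && \text{(storage dynamics, loss term assumed)} \\
 & \left| I_{nm}(t) \right| \le I_{nm}^{\max}
   && \text{(line thermal limit)}
\end{align}
```

In this sketch the wind output $P_{W_n}(t)$ appears in the balance with the opposite sign of the load, which is what treating wind as negative load amounts to, and the storage equation is the only constraint that links time step $t$ to $t+1$.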
In the following, we formulate the problem (1a) to (1j) in a distributed fashion by grouping the variables into sets that correspond to different subproblems. Such a formulation will facilitate the implementation of decomposition methods, which will be explained in Section 3.1. Note that the geographical decomposition of the problem is considered in this paper where there is a subproblem associated with each area. The reformulated optimization problem as a function of these sets of variables for a total of areas is given by
where includes the variables assigned to subproblem and denotes the objective function associated with the -th subproblem. Constraints (1b) to (1j) are represented in a compact form by constraints (2b) and (2c). Constraint (2b) denotes the coupling constraints, as it contains variables from multiple subproblems, and (2c) denotes the non-coupling constraints, as it only contains variables from one subproblem. In the considered OPF problem, the coupling constraints include the power flow balances at the buses located at the boundaries of the areas and the thermal limits on the tie lines connecting different areas, while all other constraints are non-coupling constraints. The inequality constraints in (2b) and (2c) are handled with an Interior Point method.
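For orientation, a problem with the structure described above can be written compactly as in the following LaTeX sketch. The symbols $x_a$, $f_a$, $g$, $h_a$ and $A$ are assumed names, and writing all constraints as inequalities is a simplification (equalities such as the power balances can be kept as separate equality constraints).

```latex
\begin{align}
\min_{x_1,\dots,x_A} \quad & \sum_{a=1}^{A} f_a(x_a)
   && \text{(sum of per-area objectives)} \\
\text{s.t.} \quad & g(x_1,\dots,x_A) \le 0
   && \text{(coupling constraints: boundary buses, tie lines)} \\
 & h_a(x_a) \le 0, \quad a = 1,\dots,A
   && \text{(non-coupling constraints of area } a\text{)}
\end{align}
```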
3 Partitioning and Decomposition
In this section, the two major methods used in our distributed optimization framework are introduced: 1) the partitioning method that determines the best geographical partition of the power system and 2) the decomposition method, namely, the Optimality Condition Decomposition method with Correction terms (OCD-C) that solves the OPF problem in a distributed fashion based on the partition determined using 1).
3.1 Decomposition Method
Optimality Condition Decomposition
Before introducing OCD-C, the general OCD method (9); (14) is presented. To solve (2a)-(2c), one general approach is to derive the Lagrangian function which is denoted as and find the solutions to satisfy the KKT conditions. Denoting all the variables that need to be determined (including the Lagrange multipliers) by and solving for the KKT conditions using the Newton-Raphson approach, the aforementioned procedure is equivalent to solving the following equations to get the update of variables at each iteration:
Here, the variables are grouped according to subproblems and the indices in indicate to which subproblem the variables belong. is the Hessian matrix of the Lagrangian function of the overall optimization function with the variables rearranged according to the subproblems they are assigned to. All the elements in the right-hand-side vector of (3) have to be equal to zero at optimality. Notice that the subproblems are coupled by the non-trivial off-diagonal blocks (where ) in , which does not allow independent solutions of to . Hence, to decouple the subproblems, OCD takes an approximate Newton step by setting the off-diagonal block elements in to zeros (14) and then solving (3). Consequently, each area can carry out the following Newton-Raphson step independently
and update its variables . Then the updated values of variables are shared among subproblems to enable the next iteration of calculation. Note that only a small number of the updated variables need to be exchanged between neighboring areas including the voltages of the buses placed at the boundaries, the Lagrange multipliers associated with the power flow balances at those buses and the multipliers associated with the tie line thermal limits.
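The decoupled step can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a small quadratic problem whose optimality conditions reduce to the linear system `H x = b`, so that the Hessian is constant, and the function name `ocd_solve`, the helper `coupled_system` and all numbers are hypothetical. Each area solves only with its own diagonal block, which is the approximate Newton step obtained by zeroing the off-diagonal blocks.

```python
import numpy as np

def ocd_solve(H, b, blocks, tol=1e-8, max_iter=500):
    """Approximate Newton iteration that ignores the off-diagonal blocks of H.

    H, b   : data of the system H x = b (optimality conditions of a toy
             quadratic problem, so H does not change between iterations).
    blocks : list of index arrays, one per area.
    Returns the solution estimate and the number of updates performed.
    """
    x = np.zeros_like(b, dtype=float)
    for k in range(max_iter):
        d = H @ x - b                       # residual of the optimality conditions
        if np.linalg.norm(d) < tol:
            return x, k
        dx = np.zeros_like(x)
        for idx in blocks:                  # each area uses only its own block
            H_aa = H[np.ix_(idx, idx)]
            dx[idx] = np.linalg.solve(H_aa, -d[idx])
        x = x + dx                          # updated values are then exchanged
    return x, max_iter

def coupled_system(c):
    """Two 2-variable areas; c sets the strength of the coupling between them."""
    return np.array([[4.0, 1.0, c, 0.0],
                     [1.0, 3.0, 0.0, c],
                     [c, 0.0, 4.0, 1.0],
                     [0.0, c, 1.0, 3.0]])

blocks = [np.array([0, 1]), np.array([2, 3])]
b = np.array([1.0, 2.0, 3.0, 4.0])

x_weak, it_weak = ocd_solve(coupled_system(0.1), b, blocks)
x_strong, it_strong = ocd_solve(coupled_system(1.0), b, blocks)
print("iterations with weak coupling:  ", it_weak)
print("iterations with strong coupling:", it_strong)
```

On this toy system the weakly coupled case needs fewer updates than the strongly coupled one, which is the same qualitative effect that motivates choosing a partition with weak coupling between areas.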
OCD with Correction Terms
Because the updates of variables in OCD neglect the coupling between subproblems, they deviate from the updates of the centralized approach, which results in more iterations until convergence. To alleviate this problem, an extended OCD with additional correction terms, namely OCD-C, has been proposed and applied to various case studies (17); (3); (18). In OCD-C, a correction term is added to the right-hand side of (7), which results in the following updates of variables:
Here, is the correction term and it can be calculated by
where each term in the summation can be calculated in one subproblem and sent to subproblem . The detailed derivation of (9) and the convergence criterion of OCD-C can be found in (18), which shows that OCD-C converges to the same solution as the centralized approach if the convergence criterion is fulfilled. As and are both sparse, the correction term contains only a few non-zero terms, so little additional information needs to be exchanged. Nevertheless, with the correction term added, OCD-C generally converges in notably fewer iterations than OCD; it is therefore used instead of OCD in the simulations of this paper.
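One way to see why a correction of this kind helps is a first-order Neumann-series argument: writing the Hessian as a block-diagonal part plus an off-diagonal part and expanding its inverse to first order adds, for each area, a sum of terms that the other areas can compute from their own residuals. The sketch below implements that idea on a toy system. Whether it coincides term by term with the correction in (9) is an assumption, since the formula is not reproduced in this text; the function name `block_newton` and the test data are hypothetical.

```python
import numpy as np

def block_newton(H, b, blocks, corrected, tol=1e-8, max_iter=500):
    """Block-diagonal Newton iteration, optionally with a first-order correction.

    With corrected=False each area solves H_aa dx_a = -d_a.
    With corrected=True the right-hand side of area a is augmented by
    sum over other areas c of H_ac @ (H_cc^{-1} d_c).
    """
    x = np.zeros_like(b, dtype=float)
    for k in range(max_iter):
        d = H @ x - b                            # residual of optimality conditions
        if np.linalg.norm(d) < tol:
            return x, k
        # y_c = H_cc^{-1} d_c is computed locally by every area c
        y = np.zeros_like(x)
        for idx in blocks:
            y[idx] = np.linalg.solve(H[np.ix_(idx, idx)], d[idx])
        rhs = -d.copy()
        if corrected:
            for a, idx_a in enumerate(blocks):
                for c, idx_c in enumerate(blocks):
                    if c != a:                   # term sent from area c to area a
                        rhs[idx_a] += H[np.ix_(idx_a, idx_c)] @ y[idx_c]
        dx = np.zeros_like(x)
        for idx in blocks:
            dx[idx] = np.linalg.solve(H[np.ix_(idx, idx)], rhs[idx])
        x = x + dx
    return x, max_iter

# Toy system with two strongly coupled 2-variable areas (hypothetical numbers).
H = np.array([[4.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0],
              [1.0, 0.0, 4.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
blocks = [np.array([0, 1]), np.array([2, 3])]

x_plain, it_plain = block_newton(H, b, blocks, corrected=False)
x_corr, it_corr = block_newton(H, b, blocks, corrected=True)
print("updates without correction:", it_plain)
print("updates with correction:   ", it_corr)
```

For a constant Hessian, the uncorrected iteration contracts the error by the matrix $\bar{H}^{-1}(H-\bar{H})$ per step, where $\bar{H}$ is the block-diagonal part, while the corrected one contracts it by the square of that matrix, so roughly half as many updates are needed on this example.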
3.2 Power System Partitioning
It has been observed that the number of iterations until convergence when using OCD and OCD-C highly depends on how the system is partitioned into subsystems (17). Hence, to improve the time efficiency of decomposition methods, a partitioning method based on spectral clustering is proposed to determine a good partition of the system (18). In essence, the developed partitioning method defines a metric for measuring the computational coupling between buses based on the formulation of the considered optimization problem and groups the strongly computationally coupled buses into one subsystem. This is based on the premise that weaker couplings lead to less mutual impact among the subsystems, thus leading to faster convergence of the decomposition methods. It has been validated that the proposed partitioning method is effective in minimizing the computational coupling between subproblems to speed up the convergence of the decomposition methods. In the following, we present the rationale and key steps and procedures of the partitioning method, while more details can be found in (18).
As mentioned before, we first define an affinity metric between any two buses that measures or represents their computational coupling. The notation is used to denote the Hessian matrix of the overall Lagrangian function with variables ungrouped. We take advantage of the fact that if any entry in is non-zero, this is an indication that the two variables with indices and are coupled; i.e., the updates of these two variables will appear in the same equation, hence, they directly affect each other. The larger the absolute value of , the stronger the coupling. Furthermore, it is assumed that the variables associated with one bus such as the voltage angle, the voltage magnitude and Lagrange multipliers should be assigned to the same subproblem. Hence, the affinity between any two buses is acquired based on the summation of all the absolute values of the elements in that are associated with these two buses. Again, a larger affinity denotes a stronger computational coupling. Specifically, the affinity metric between any bus and bus is calculated as follows:
where is the -th element in the admittance matrix, and and denote the sets of the indices of the variables associated with buses and , respectively. A more detailed explanation on the derivation of this affinity metric is given in (18).
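The summation described above can be sketched in a few lines. The example assumes a hypothetical mapping `bus_of_var` from each variable index to its bus and a made-up 6-by-6 Hessian for three buses with two variables each. How the admittance-matrix entry enters the original formula cannot be recovered from this text, so the sketch leaves it out.

```python
import numpy as np

def bus_affinity(H, bus_of_var, n_bus):
    """A[i, j] = sum of |H[m, n]| over variables m at bus i and n at bus j."""
    abs_H = np.abs(np.asarray(H, dtype=float))
    n_var = abs_H.shape[0]
    # Indicator matrix: S[m, i] = 1 if variable m is associated with bus i.
    S = np.zeros((n_var, n_bus))
    S[np.arange(n_var), bus_of_var] = 1.0
    A = S.T @ abs_H @ S
    np.fill_diagonal(A, 0.0)        # self-affinity is not used for clustering
    return A

# Hypothetical Hessian: buses 0 and 1 are strongly coupled, 1 and 2 weakly.
H = np.array([[ 5.0,  1.0, -2.0,  0.5,  0.0,  0.0],
              [ 1.0,  4.0,  0.5, -1.0,  0.0,  0.0],
              [-2.0,  0.5,  6.0,  1.0, -0.1,  0.0],
              [ 0.5, -1.0,  1.0,  5.0,  0.0, -0.2],
              [ 0.0,  0.0, -0.1,  0.0,  3.0,  1.0],
              [ 0.0,  0.0,  0.0, -0.2,  1.0,  3.0]])
bus_of_var = np.array([0, 0, 1, 1, 2, 2])

A = bus_affinity(H, bus_of_var, n_bus=3)
print(A)
```

Using an indicator matrix keeps the aggregation to a single matrix product, and the result is symmetric whenever the Hessian is.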
After the affinity metric is calculated, the spectral clustering technique (19) is applied, which groups the buses based on the affinity among them. In this work, we pre-define the number of areas and then determine which bus should be assigned to which area. Note that: 1) the partition of the system only affects the assignment of variables to subproblems in the computation, but does not affect the physical partition of the power system; 2) the partition of the system does not affect the exact solution of the optimization problem, but only the time that decomposition methods take to converge to the solution; 3) the Hessian matrix used in the calculation is evaluated at the optimal point, which can differ depending on the operating point. For the MPC problem, the operating point changes with the time step as the load and wind generation vary during a day. However, it will be shown in Section 4.3 that the partition of the system, once determined, is applicable to multiple time steps and hence does not need to be changed frequently. A simple explanation for this is that the affinity between buses is calculated mainly based on the line admittance, the voltage magnitude, the sine and cosine of the differences between two bus angles, and the Lagrange multipliers for power flow and line thermal constraints, which do not change dramatically as the operating point changes if there is no severe line congestion. Section 4.3 further discusses how to choose the operating point for applying the partitioning method in the MPC problem and how to handle scenarios in which different lines become congested.
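As a rough illustration of the clustering step, the sketch below follows the general recipe of the spectral clustering algorithm in (19): normalize the affinity matrix, take the leading eigenvectors, normalize their rows and cluster the rows with k-means. It is a minimal version under stated assumptions, not the partitioning code used in the paper: the function name `spectral_partition`, the deterministic farthest-point initialization of k-means and the 6-bus affinity matrix are all hypothetical.

```python
import numpy as np

def spectral_partition(A, k, n_iter=100):
    """Assign each bus to one of k areas from a symmetric affinity matrix A."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)          # assumes every bus has nonzero affinity
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    X = vecs[:, -k:]                         # k leading eigenvectors as columns
    X = X / np.linalg.norm(X, axis=1, keepdims=True)

    # k-means on the rows of X with a deterministic farthest-point start.
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical affinity: buses 0-2 and 3-5 are tightly coupled groups,
# joined only by a weak link between bus 2 and bus 3.
A = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0
A[2, 3] = A[3, 2] = 0.05

labels = spectral_partition(A, k=2)
print(labels)
```

On this example the cut falls on the weak link, so the two tightly coupled groups end up in different areas. The number of areas `k` is an input, consistent with the number of areas being pre-defined in this work.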
4 Case studies
The distributed MPC approach is tested on the IEEE 14-bus and 118-bus systems. In this section, we will present two sets of simulation results. First, the results using different time horizons are compared, which show the benefit of the MPC approach in terms of reducing the generation cost and the ramping of the generators. Next, the convergence speeds of the decomposition method for different partitions are compared, which demonstrates the importance of system partitioning and the fact that the previously developed partitioning method can be effectively applied to the MPC problem.
4.1 Simulation Setup
The simulations are run in Matlab on an iMac with a 3.2 GHz Intel Core i5 and 8 GB of memory. The storage device has a roundtrip efficiency of , standby loss of 0.005 10-minute and maximum capacity of 1.0 10-minute. The wind and load data in Fig. 1 are used. The simulations were run for a 24-hour period using the time horizons of N = 1, 3, 6 and 9 with the time interval of 10 min, which correspond to no horizon, 30-minute, 60-minute, and 90-minute horizon, respectively. As the longest time horizon is 90 minutes and there are in total 144 time intervals over the 24-hour period, a total of 135 time steps are simulated using the available data. A multi-step AC OPF problem is solved at each time step. For comparison, the centralized optimization, which uses the Newton-Raphson approach to update variables, is also simulated. Convergence is achieved if the norm of all the mismatches of the constraints is lower than . The same starting point and convergence criterion are used for the OCD-C method with different partitions and for the centralized approach. Note that OCD-C always converges to the same solution as the centralized approach regardless of which partition is used.
4.2 Impact of the Optimization Horizon
In this subsection, the results for the IEEE 14-bus system are given to evaluate the impact of the length of the optimization horizon. A wind generator is located at Bus 5 and a storage device is located at Bus 14. Figure 2 shows the optimal storage energy level for different time horizons. It is clear that the utilization of the storage increases as the time horizon increases. The benefit of optimizing the usage of the storage is demonstrated in Table 1, which shows the total generation cost and the total generator ramping over the simulated 24-hour period for different horizons. Again, as the time horizon increases, both the generator ramping and the generation cost decrease. Even though the generator ramping is not included as a hard constraint in the optimization problem, it is reduced because the utilization of the storage smooths out the fluctuations in the load. These results indicate that the MPC approach can effectively integrate the wind generation and storages, especially with a longer time horizon. However, this does not mean that one should extend the time horizon as much as possible, because the problem size increases with the length of the time horizon, which requires more computation time and resources. Besides, the forecasted wind and load data might not be available or accurate for a long time horizon. Overall, the choice of the length of the time horizon depends on the specific application and the available computation capability, and is beyond the scope of this paper.
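Table 1: Total generator ramping and total generation cost over the simulated 24-hour period for different time horizons.
|Time horizon||No horizon||30 min||60 min||90 min|
|Total Generator Ramping (p.u.)||4.1604||3.9598||3.7841||3.7708|
|Total Generation Cost ($)||679,145||679,144||678,985||678,874|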
4.3 Impact of Partitioning
In this section, we focus on the efficiency of the distributed MPC approach and show that a good partition of the system is the key to efficiently implementing decomposition methods. To evaluate the performance of decomposition methods, two metrics are used, namely, the number of iterations and the convergence time. The convergence time is an approximation of the time spent on solving the subproblems in parallel. Specifically, it is computed from the time spent on solving each subproblem at each iteration, which is assumed not to change much over iterations because the subproblem size stays the same. The time spent on information exchange is not accounted for in the current simulation, but will be investigated in future work. Smaller values of both metrics denote a better partitioning of the system, as the objective of the partitioning method is to reduce the iterations and computation time until convergence.
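The exact expression for the convergence time is not reproduced in this text. A natural reading, stated here as an assumption, is that with all areas computing in parallel each iteration takes as long as the slowest subproblem, so the estimate is the number of iterations times the largest per-iteration subproblem time. The function name and all numbers below are hypothetical and are not results from the paper.

```python
def parallel_convergence_time(n_iterations, subproblem_times):
    """Estimated time to convergence when all areas compute in parallel.

    n_iterations     : number of iterations until convergence.
    subproblem_times : per-iteration solve time of each area, in seconds,
                       assumed constant over the iterations.
    Communication time is not included.
    """
    return n_iterations * max(subproblem_times)

# Hypothetical comparison of two partitions of the same system.
t_balanced = parallel_convergence_time(12, [0.004, 0.005])
t_unbalanced = parallel_convergence_time(30, [0.002, 0.008])
print(t_balanced, t_unbalanced)
```

Under this reading a partition is penalized both for needing many iterations and for having one area that is much larger than the others.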
For the 14-bus system, two partitions are used for the decomposition of the problem, as shown in Fig. 3. “SP Partition” denotes the best partition determined by the spectral partitioning technique presented in Section 3.2, while “Arbitrary Partition” denotes an arbitrary geographical partition of the system. Note that the arbitrary partition is a likely choice if one determines the partition only by inspecting the diagram of the system. The best partition is found at the operating point of base load level with N=1 and applied to solving the MPC problem at all time steps; as will be shown later, it also works well for solving the MPC problem with an increased time horizon.
IEEE 14-bus system
The average, median and maximum number of iterations until convergence of OCD-C using two different partitions are shown in Table 2 and compared with the centralized approach. As shown in Table 2, using SP Partition will lead to significantly reduced iterations compared to the arbitrary partition. Note that decomposition methods would always take more iterations than the centralized method due to the fact that only partial information of the system is available at each subproblem and frequent information exchange needs to be made to achieve the overall optimality.
|Iterations||Centralized||SP Partition||Arbitrary Partition|
In terms of the actual computation time, the average convergence time is shown in Table 3. It can be seen that with SP Partition, the convergence time of OCD-C is only slightly higher than that of the centralized approach, while the convergence using the arbitrary partition is much slower. We note here that for the IEEE 14-bus system, the centralized approach converges the fastest for most time steps due to the small problem size. However, when deployed on larger systems, the time efficiency of the distributed approach will be superior to that of the centralized approach, which will be demonstrated in Section 4.3.2. Note that the sparsity of the matrices is exploited in the simulation to speed up the calculation, which works more to the advantage of the centralized approach, where the matrices involved in the calculations are relatively sparser. Hence, on this toy example, it is notable that the distributed MPC approach achieves a time efficiency comparable to that of the centralized approach if a good partition is used.
|Centralized||SP Partition||Arbitrary Partition|
Now, to evaluate the robustness of the partition with respect to multiple time steps, the convergence time with the time horizon N=9 over all time steps is shown in Fig. 4. In this 14-bus case, no line congestion is observed at any time step. As shown in Fig. 4, SP Partition, which is the best partition found by the partitioning method at base load level for N=1, leads to a reduced convergence time compared to Arbitrary Partition for all time steps. Hence, the best partition is fairly robust and there is no need to change the partition for different time steps in this case.
IEEE 118-bus system
To further test the efficiency of the distributed MPC approach, a larger system, namely the IEEE 118-bus system, is also used. A wind generator is located at Bus 19 and a storage device is located at Bus 70. Again, we use the partitioning method to find the best partition of the system and choose another arbitrary partition by inspecting the system diagram for comparison. Both partitions are shown in Fig. 5. The number of iterations and the time until convergence are shown in Tables 4 and 5, respectively. Similar to the 14-bus case, both the number of iterations and the convergence time are significantly smaller using the best partition compared to the arbitrary one. It is worth highlighting that the average convergence time using the best partition is lower than that of the centralized approach, which shows an increased benefit of implementing distributed approaches on larger systems.
|Iterations||Centralized||SP Partition||Arbitrary Partition|
|Centralized||SP Partition||Arbitrary Partition|
The convergence time with the time horizon N=9 over all time steps is shown in Fig. 6. Note that for time steps 1 to 13, a tie line associated with the best partition becomes congested, which, however, does not affect the convergence time of the OCD-C method much. In other words, the best partition remains robust across all time steps even when line congestion occurs. However, there could be cases where the convergence performance of the decomposition method is degraded once the tie line constraints become binding, due to the increased computational coupling between the two areas that the tie line connects.
Here, we provide insights into how one can determine the partition of a system for different time steps when the load levels are different. Since the best partition performs well for a fairly large range of load levels, one can find the best partition at the load level that occurs during most of the day. When the tie lines associated with the best partition are not severely congested as in the considered case, the best partition can be applied to all time steps. When the tie lines become severely congested, the partitioning method can be applied for that particular operating point to find a new partition. Overall, due to the robustness of the best partition, it can be expected that the computation effort spent on determining the partition of the system is quite low because the partition only needs to be computed for several operating points with different line congestion scenarios.
5 Conclusion
In this paper, we applied a partitioning method in conjunction with a decomposition technique to solve an MPC-based multi-step AC OPF problem in a distributed manner, which results in an effective integration of the wind generation and the storage device. Through simulation results, we showed that by determining a good partition of the system using the presented partitioning method, the efficiency of the decomposition method can be significantly improved. In particular, the computation time using the proposed distributed approach is shorter than that of the centralized approach when applied to large systems. Furthermore, the best partition of the system is applicable to a wide range of time steps. The proposed distributed optimization approach can also be used for solving other general multi-step optimization problems, which provides a useful tool in the planning and management of power systems.
For future work, we plan to investigate how the information exchange involved in the distributed optimization can be implemented in real systems and how the associated communications latency affects the overall efficiency of decomposition methods.
The authors would like to thank ABB for the financial support and particularly Dr. Xiaoming Feng for his invaluable inputs.
- journal: Electric Power Systems Research
(1) L. Xie, M. D. Ilić, Model predictive dispatch in electric energy systems with intermittent resources, in: IEEE International Conference on Systems, Man and Cybernetics, 2008.
(2) K. Baker, G. Hug, X. Li, Optimal integration of intermittent energy sources using distributed multi-step optimization, in: Power and Energy Society General Meeting, 2012.
(3) K. Baker, J. Guo, G. Hug, X. Li, Distributed MPC for efficient coordination of storage and renewable energy sources across control areas, IEEE Transactions on Smart Grid 7 (2) (2016) 992–1001.
(4) A. Elaiw, X. Xia, A. Shehata, Application of model predictive control to optimal dynamic dispatch of generation with emission limitations, Electric Power Systems Research 84 (1) (2012) 31–44.
(5) A. N. Venkat, I. A. Hiskens, J. B. Rawlings, S. J. Wright, Distributed MPC strategies with application to power system automatic generation control, IEEE Transactions on Control Systems Technology 16 (6) (2008) 1192–1206.
(6) P. D. Christofides, R. Scattolini, D. M. de la Pena, J. Liu, Distributed model predictive control: A tutorial review and future research directions, Computers & Chemical Engineering 51 (2013) 21–41.
(7) R. R. Negenborn, J. Maestre, Distributed model predictive control: An overview of features and research opportunities, in: IEEE 11th International Conference on Networking, Sensing and Control (ICNSC), 2014.
(8) A. M. Geoffrion, Lagrangean relaxation for integer programming, Springer, 1974.
(9) A. J. Conejo, F. J. Nogales, F. J. Prieto, A decomposition procedure based on approximate Newton directions, Mathematical Programming 93 (3) (2002) 495–515.
(10) B. H. Kim, R. Baldick, Coarse-grained distributed optimal power flow, IEEE Transactions on Power Systems 12 (2) (1997) 932–939.
(11) G. Cohen, Auxiliary problem principle and decomposition of optimization problems, Journal of Optimization Theory and Applications 32 (3) (1980) 277–305.
(12) B. H. Kim, R. Baldick, A comparison of distributed optimal power flow algorithms, IEEE Transactions on Power Systems 15 (2) (2000) 599–604.
(13) M. Shahidehpour, Y. Fu, Benders decomposition: applying Benders decomposition to power systems, IEEE Power and Energy Magazine 3 (2) (2005) 20–21.
(14) F. J. Nogales, F. J. Prieto, A. J. Conejo, A decomposition methodology applied to the multi-area optimal power flow problem, Annals of Operations Research 120 (1-4) (2003) 99–116.
(15) Y. Wakasa, M. Arakawa, K. Tanaka, T. Akashi, Decentralized model predictive control via dual decomposition, in: IEEE Conference on Decision and Control, 2008.
(16) A. G. Beccuti, T. Geyer, M. Morari, Temporal Lagrangian decomposition of model predictive control for hybrid systems, in: IEEE Conference on Decision and Control, 2004.
(17) J. Guo, G. Hug, O. Tonguz, Impact of partitioning on performance of decomposition methods for AC optimal power flow, in: IEEE Innovative Smart Grid Technologies (ISGT) Conference, 2015.
(18) J. Guo, G. Hug, O. K. Tonguz, Intelligent partitioning in distributed optimization of electric power systems, IEEE Transactions on Smart Grid 7 (3) (2016) 1249–1258.
(19) A. Y. Ng, M. I. Jordan, Y. Weiss, et al., On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems 2 (2002) 849–856.