Online Stochastic Control of Discrete Loads in Distribution Grids

Online Stochastic Control of Discrete Loads in Distribution Grids

Xinyang Zhou    Emiliano Dall’Anese    Lijun Chen X. Zhou and L. Chen are with the College of Engineering and Applied Science, University of Colorado, Boulder, CO 80309, USA (Emails: {xinyang.zhou, lijun.chen} Dall’Anese is with the National Renewable Energy Laboratory, Golden, CO 80401, USA (Email:

This paper considers power distribution grids with distributed energy resources (DERs) and flexible controllable loads, and designs an incentive-based algorithm to coordinate the network operator and customers to pursue personal and system-wide operational and economic objectives. Heterogeneous DERs and controllable loads with continuous and discrete control commands and various dynamics are considered. We first relax the discrete feasible sets into their convex hull and formulate a well-defined social-welfare optimization problem with general objective functions and constraints. We then propose an offline distributed stochastic dual algorithm for solving the relaxed problem while recovering original discrete set points at every iteration. We further implement our design in an online setting with time-varying optimal operation points, non-linear power flow equations and device dynamics, and asynchronous updates from devices. Stability and performance characterization of both offline and online algorithms are analytically established and numerically corroborated.


Discrete decision variable, real-time pricing, asynchronous updates, stochastic dual algorithm, distribution networks, time-varying optimization.

I Introduction

Power grids are enjoying an increasing control flexibility in both power supply and demand sides thanks to growing penetration of distributed energy resources (DERs) and controllable loads on the distribution level, including roof-top photovoltaic (PV) panels, electrical vehicle (EV) batteries, thermostatically controlled loads (TCLs, e.g., water heaters and air-conditioners (A/Cs)), etc. While these devices of flexibility may potentially provide ancillary services for the grids [dall2017unlocking], coordinating millions of prospective controllable devices of various dynamics and constraints with potential discrete decision variables to achieve network-oriented goals, e.g., voltage regulation, frequency control, and economic dispatch, is extremely challenging. Moreover, unlike traditional facilities owned by utility companies, the mass customer-owned devices are not naturally subject to control of network operators. This motivates us to design a distributed algorithm that can bring self-interested customers into the control loop such that network-oriented goals and constraints are guaranteed through inducing the desired customer behaviors by appropriate market mechanisms.

I-a Market-based Distribution Networks Control

Market-based algorithms have been recently developed to explore control flexibility of distributed energy assets owned by customers. The objective is to incentivize customers to provide ancillary services to the grids based on maximizing their own economic benefits and performance objectives [mohsenian2010autonomous, Maharjan13, Pedram14]. For example, customers may be incentivized to adjust the output powers of DERs or the power consumption of controllable loads in real time to aid voltage regulation [Liu08], control the aggregate network demand [li2016market], and follow regulating signals [vrettos2016robust].

In [zhou2017incentive2], we consider a social welfare optimization problem modeling operational objectives of both network operator and customers as well as voltage constraints. The formulated problem is non-convex due to its Stackelberg game structure, even though a linear approximation of the nonlinear power-flow equations is utilized. We reformulate the problem as a convex problem together with an optimal incentive signal design strategy, and prove that the reformulated problem is an exact relaxation of the original one, i.e., the solution of the relaxed problem is a global optima of the original non-convex problem. We further design a distributed algorithm based on primal-dual gradient descent for solving the relaxed problem, such that the network operator and the customers pursue given operational and economic objectives, while concurrently ensuring that voltages are within prescribed limits.

I-B Stochastic Dual Algorithm

While primal-dual gradient algorithms work well for devices with continuous decision variables and free from dynamics, like PV inverters, the design may be infeasible in the presence of more complex devices with discrete decision variables and dynamics, which are common in practice. For example, A/Cs have discrete power consumption rates (on/off) with future room temperatures to be controlled within acceptable ranges, and EV batteries pose requirement upon their state of charge (SOC) with preferred charging/discharging rates that preserve the battery lifespan.

In light of the market-based framework in [zhou2017incentive2], we seek a more practical algorithm to manage controllable devices with various dynamics by designing distributed algorithms based on dual ascent method. In our new design, the primal updates can be conducted through natural response expected from rational customers, i.e., to maximize individual utility subject to dynamics and constraints of devices given electricity price or ancillary service rewards, while the dual updates procedures, as part of the price/reward signal design, are carried out by network operator based on system-wide operational constraints. Our design presents a scalable way to handle devices of various dynamics by keeping the computation to their owners, which meanwhile also preserves customer privacy. Dual method is also consistent with smart home energy management system (HEMS), which usually achieve an optimal trade-off between comfort and electricity bill.

In addition, we propose a stochastic scheme to handle discrete decision variables. Particularly, we first relax the discrete feasible sets to their convex hull, and then apply aforementioned dual method to solve the relaxed problem while recovering feasible set points based on the solution of the relaxed problem at every iteration. The dual ascend method together with the stochastic scheme constitute the stochastic dual algorithm we propose in this paper.

The convergence of dual method is generally not guaranteed even for convex problems, and depends on the stepsize choice [molzahn2017survey]. The additional stochastic process we apply makes it even more challenging. In literature, diminishing stepsizes are usually necessary to guarantee the convergence [ram2009incremental, zhu2012distributed, polyak1987introduction, boyd2006subgradient, zhou2017stochastic], which is not practical especially in real-time settings. Nevertheless, we are able to leverage recent insights in dual method to provide sufficient conditions under which the dual algorithm with constant stepsize nevertheless converges to its optima, whilst the stochastic dual algorithm with constant stepsize converges to the optima on average with a radius depending on the variance of random process. To the best of our knowledge, this is leading convergence characterization for dual method applied in power system.

I-C Practical Implementation

This paper next considers implementation and analytical characterization of the designed algorithm under realistic surroundings, including time-varying optimal set points, nonlinear power flow model and device dynamics, and asynchronous updates of devices.

I-C1 Time-varying optimization problem

In the presence of intermittent renewable energy resources and fluctuation of uncontrollable loads, the optimal operating points varies constantly. This requires a real-time algorithm to track the optimal power set points trajectories of DERS, controllable loads, and system status like voltage levels and net powers at substation in a fast yet accurate way.

I-C2 Nonlinear dynamics

While the algorithm is designed based on linearized models of both power flow equations and device dynamics for computational tractability, the actual physical systems deviate from the approximated models. This motivates us to utilize feedback control where we feed the algorithm with data measured from real systems. This provides a good trade-off between computation complexity and performance guarantee.

I-C3 Asynchrony

Last but not least, we consider asynchronous updates from controllable devices with the following realistic motivation:

  • Customers are not obliged to synchronize their behaviors with each other; they instead make decisions at their convenience of arbitrary time.

  • Heterogeneous devices have different control granularity, e.g., PV inverters can change their set points in seconds, while A/Cs should not be switched on/off too often.

  • Synchronous updates from mass devices may cause detrimental crash and even outrage to the grids.

  • Synchrony can be seen as a special case of asynchrony.

We adapt our algorithm for realistic settings 1)–3), and deliver comprehensive performance analysis.

I-D Literature Review & Contributions of This Work

Related works can be categorized as follows:

I-D1 Discrete Decision Variables

In literature of power system, algorithms designed for discrete decision variables take either a deterministic way [bernstein2015design, liu1992discrete] which usually generates suboptimal solutions, or a stochastic way, e.g., [kim2013scalable, macfie2010proposed] which often lacks rigorous analytical performance characterization. Other works may rely on commercial solvers that may not be scalable: large number of devices and longer dispatch horizon require much more computation, and forbid real-time implementation.

I-D2 Market-based/Demand-response Design

Existing literature of market-based and demand-response problem formulation mostly focus on demand/supply balancing, without considering network structure [mohsenian2010autonomous, Pedram14, Maharjan13, Tushar14, li2016market, li2011optimal, tsui2012demand]. In those that are network-recognizant, [li2012demand] requires complex communication and constrained protocols, and [LinaCDC15] proposes a pricing strategy to internalize externalities of voltage constraints and line loss among customers but no feedback control is utilized and only real power is considered.

I-D3 Controlling Devices with Flexibility

Controllable devices with dynamics and discrete decision variables are usually studied in two ways: (1) heuristic control that is not designed based on optimization framework, e.g., [mathieu2013state] controls TCLs based on temperature status and analyze the control performance, and (2) optimization-based control that can integrate specific objective functions and constraints, e.g., [li2011optimal] considers device dynamics but does not involve discrete variables, and [tsui2012demand] considers device dynamics and discrete decision variables, but solves with commercial solvers.

I-D4 Distributed Optimization and Control for Distribution Networks

While our theoretic results apply to general operation constraints, one of the major application is voltage regulation. Decentralized voltage regulation [farivar2015local, zhou2015pseudo, zhou2016incremental, simpson2017voltage]; distributed control [baker2017network, dall2017optimal]. Online optimization with feedback control is studied in [hauswirth2017online, dallanese2016optimal, gan2016online, bolognani2015distributed], to deal with nonlinear power flow models. We also refer to [molzahn2017survey] for a recent comprehensive survey.

To sum up, this paper makes the following contributions:

  1. We design a market-based distributed stochastic dual algorithm to solve the social welfare optimization problem with discrete decision variables and various dynamics, and prove its stability.

  2. We adapt the algorithm for practical settings to handle time-varying optimization, nonlinear dynamics, and asynchronous updates, and characterize its performance.

The rest of this paper is organized as follows. Section II models the networked system and individual devices and formulates a social welfare optimization problem. Section III presents exact convex relaxation and an offline distributed algorithm for solving the problem. Section IV adapts the algorithm design into more practical online settings with nonlinear dynamics and asynchronous updates. Section V provides numerical examples to illustrate analytical results, and Section VI concludes this paper and discusses future works.

Set of nodes, excluding node ;
Set of distribution lines
Set of controllable devices of node
Set of controllable devices of node that update at
Non-controllable power injection of node at time
Power injection from device of node at time
(Convex hull of) Feasible set of device at node
Status of device from node at time
Aggregated power injections of node at time ,
Actual and linearized system states of node at time
Signal for real and reactive power of node at time
System states linearization parameters
Identity matrix
Identity vector
TABLE I: Notation. For notational simplicity, we apply bold symbols with superscript to represent the stacked variables within the timeslots starting from time , e.g., . Similarly, functions with bold symbols are vector-valued function over the window. Extra subscript in online algorithm denotes the time when the variables are computed. denotes the Frobenius norm, and without subscript denotes -norm.

Ii System Model and Problem Formulation

Ii-a Network Model

Consider a distribution network with nodes collected in the set with and distribution lines collected in the set . At time , denote by and the aggregated real and reactive power injections at node . Denote by the system-states, e.g., voltage magnitude, power at substation, etc. usually has a nonlinear relationship with and but can be approximated by with a linear function of and as


where are linearization parameters. We refer to [baran1989network, christakou2013efficient, guggilam2016scalable, bolognani2015fast, dall2017optimal] for multiple linearization methods for voltage magnitude and power at substation (and current magnitude?).


The linear model (1) is utilized to facilitate the design of computationally-affordable algorithms. Such approximation usually introduces a bounded error in system states (see e.g., [chiang1990existence, zhou2016vvac, swaroop2015linear, bolognani2015linear]), which further results in a bounded discrepancy between the generated and the optimal operating points. In Section IV, In Section V, we will utilize nonlinear power flow model for numerical examples.

Ii-B Node & Device Model

Ii-B1 Nodal Power Aggregation

At node , assume that the aggregated power injections and stem from two sources: non-controllable devices and controllable devices. Denote by and the overall real and reactive power injected by all the non-controllable devices of node . Meanwhile, assume a self-interested and rational [fudenberg1991game] customer in charge of all controllable devices collected in a set . Denote by and the real and reactive power injections of controllable device at time . Then the power aggregation at node is cast as:


Ii-B2 Controllable Device Model

At each node, we consider the following two typical categories of controllable devices: continuous-rate fast-updating devices (continuous devices or fast devices) whose power injections can be chosen from continuous feasible sets, e.g., PV inverters, collected in a subset , and discrete-rate slow-updating devices (discrete devices or slow devices) whose power injections are subject to discrete choices, e.g., TCLs, EV battery, etc., collected in a subset . We will use the terms fast/continuous devices, as well as slow/discrete devices, interchangeably for the rest of this paper. We also assume that discrete devices only inject real power for simplicity. Following is the modeling of the controllable devices in details.

(i) Fast devices: Let denote the feasible power injection set of a fast devices at time . is assumed to be convex and compact. For example, a PV inverter has in the form of:


where denotes the available real power from a PV system at time , and constant is the rated apparent capacity.

(ii) Slow devices: As to a discrete device , denote by its feasible set of discrete power rates together with its potential temporal dynamics and constraints, e.g., A/Cs control room temperatures to be within acceptable ranges, and batteries have required SOC. Generally, for a device , let collect its device-states from current time till future time . The relationship between and is modeled by an affine vector-valued function :


The requirement upon the device-states can be represented by a general convex vector-valued function as:


Substitute (4) into (5) to rewrite (5) as a convex function of :


Additionally, the power rates are confined as


where is a discrete set collecting the feasible power rates of device , introducing non-convexity. Therefore, the resultant is also discrete and non-convex. Following are two examples of slow devices:

a). For an A/C , let be the room temperature (device-state) to be controlled at time . Dynamics (4) can be represented for as:




are constants interpreted as conjectural room temperatures with zero power consumption, is the ambient temperature, and and are parameters specifying thermal characteristics of the A/C and its operating environment; see, e.g., [li2011optimal]. Constraint (5) can be specified, e.g., as


confining room temperatures to within acceptable ranges. We further assume this A/C can only be turned on/off, i.e.,


with working power rate . Then this A/C’s feasible set from to is , where the bold set notation represents the Cartesian product of the time-varying feasible sets over the windows starting from , e.g., .

b). For an EV battery , let be its SOC (device-state) at time , and (4) representing the evolution of SOC for is written as:


where is charging/discharging efficiency. And (5) can be required SOC by certain time to guarantee performance:


We assume a feasible power rates set for battery as


with preferred fixed charging power rate and discharging power rate that preserve battery lifespan. This battery’s feasible set from to is then .


While the actual devices dynamics are mostly non-convex (e.g., different charging/discharging efficiencies for batteries, dynamics of room temperature evolvement, etc.), we utilize the linear approximation (4) for computational affordability. As will be shown in the proposed algorithms, a feedback control is applied to make up for the discrepancy between linearized and actual dynamics.

Ii-C Problem Formulation

The overall objective is to design a strategy where the network operator and the customers pursue their own operational goals and economic objectives, while concurrently achieving global coordination to enforce system constraints. In our design, incentive signals are designed to achieve this goal.

For ease of expression, we consider the following notations:

as well as the succeeding notations for collected power injections:

1a) Customer Problem: Let and be the incentive signal vectors for time sent by the network operator to customer for real and reactive power injections.

Let collect the device-states of all devices of node at time , and consider a cost function that captures well-defined performance objectives of all devices at node within the coming timeslots. In light of being a function of , we utilize the form of for the objective function instead of for simplicity, keeping in mind that may characterize objectives directly related to the device-states as well.

Recalling that the slow devices have discrete decision variables, making the potential optimization problem non-convex, we next introduce convex relaxation for the slow devices by considering the convex hull of their discrete feasible sets, defined as


Apparently the relaxed sets is convex and compact, and so are their cartesian product. We first solve the problem within the relaxed feasible sets and use stochastic schemes to recover the original discrete feasible power rates later. Let be an identity matrix and we make the following assumption.


Functions are strongly convex in , i.e., there exists some such that . Moreover, the gradient of with respect to is Lipschitz continuous for any .


We assume strong convexity of for the sake of the existence and uniqueness of the solution. If in practice only a (not strongly) convex cost function is available, we can add a small regularization term to enforce strong convexity while introducing some tolerable discrepancy; see, e.g., [Koshal11].

Then for the customer at node , given signals , the following cost-minimization problem for the upcoming timeslots is solved at time :


where and represent (potential) payment to/reward from the network operator owing to power injection by device for time . Also notice that the device-states constraints are implicitly included in (16b) as well.

Since (16a) is strongly convex in and that the constraints are convex, given and , a unique solution exists and can be efficiently computed. We present with a vector-valued best-response function as


1b) Recovering Feasible Power Rates: For the discrete device, the solution to the relaxed problem is usually not implementable. We consider recovering and implementing its feasible value for current time with the following stochastic scheme: given , we randomly select a feasible power rate based on some corresponding probability distribution such that , where calculates the expectation.

While there can be multiple ways to determine the probability distribution of feasible set points based on , we exemplify the procedure as follows with a two-point distribution for illustrative purpose. We select two feasible power rates and denote them by and , such that . Then the related probability of the corresponding two-point distribution is calculated as:


where generates the probability of certain event.


While we recover the feasible power rates only for current time for immediate implementation, we leave the future arrangement unrecovered, since the values of future power rates are not required for now, and they will be re-calculated next time any ways.

2) Social-welfare Problem: Assume the following general network operational constraints for time :


where is affine. Based on (1)–(2) is an affine function of ; we define the vector-valued function that is still affine in for later use. We make the following assumption.


is an affine function of with full row rank.

Based on Assumption II-C, the Jacobian of is bounded on , i.e., there exists some constant such that


where denotes the Frobenius norm.

We formulate the following optimization problem to be solved by network operator, which captures both social welfare and system-states constraints of a distribution feeder, as well as the response function of all customers:


where the total payment from/reward to the customers cancels out the total payment received/reward paid by the network operator; thus, it is not explicitly shown in the social welfare objective function (21a).

Also note that (21g) is equivalent to embedding into , bringing non-convexity in the mean time. This is because the form of (21g) is usually not affine. For better illustration of the non-convexity of , consider the following simple example with single timeslot and single device with real power only. Assume a quadratic cost function and a box feasible set with upper and lower bounds for real power injections and , and we end up with a non-convex piece-wise linear function :

which will become more complex, or does not even have an explicit form, should one considers more complicated cost functions and feasible sets.

Iii Distributed Stochastic Dual Algorithm

Focusing on a particular problem instance at time , lends itself to a Stackelberg game interpretation where and are calculated via by the network operator (i.e., the leader) and broadcasted to all nodes ; subsequently, each consumer (i.e., the follower) computes the power set points from . By design, is an optimal point of .

While it is challenging for the network operator to solve because constraint (21g) introduces non-convexity, we will first in Section III-A introduce the exact convex relaxation of developed in our previous work [zhou2017incentive2]. Next in Section III-B and Section III-C we design a stochastic dual algorithm and its distributed implementation that is consistent with customers’ natural response towards incentive signals (i.e., solving ), to solve while recovering feasible power rates to implement for discrete devices at every iteration.

Iii-a Convex Reformulation & Signal Design

Consider the following reformulated convex optimization problem:


where the non-convex constraint (21g) in is replaced with convex constraints (22g). We assume is feasible.


satisfies Slater’s condition.

Given the strong convexity of the objective function (22a) in and the feasibility assumption, a unique optimal solution exists for problem . We next utilize the results in to show that together with a signal design strategy is an exact convex relaxation of , i.e., the solution of gives the solution of .

Substitute (22g)–(22g) into (22g), and denote by the dual variable associated with the constraints (22g) for time . Let be the optimal system-states produced by , and denote the optimal dual variables associated with (22g) as for time . Then, we design the signals for as:


The signals may have various realistic interpretation with different forms of . For example, when represents voltage regulation (e.g., (46)), the signals can be seen as prices of voltage violation; when is the supply/demand balance, the signals are market-clearing price; when is combination of the two aforementioned forms, the resultant signals are summation of both.

We show that the above signals are bounded, precluding the possibility of infinitely large signals.

Theorem (Theorem 1 of [zhou2017incentive2])

Under Assumptions II-CIII-A, the signals defined by (23) are bounded.

By examining the optimality conditions of and , we have the following result.

Theorem (Theorem 2 in [zhou2017incentive2])

The solution of problem along with the signals defined in (23) is a global optimal solution of problem , i.e., is an exact relaxation of .

From now on, we will use the optima of and interchangeably depending on the context. Next, we will first design a stochastic dual algorithm for solving while recovering feasible set points for discrete devices. Afterwards, based on Theorem III-A, we will develop an distributed algorithm without exposing any private information of the customers to the network operator.

Iii-B Stochastic Dual Algorithm

For the rest of this section, we focus on designing a stochastic dual algorithm and its distributed implementation for solving at given time , and analyzing its performance. For ease of exposition, we only consider single updating timescale for all devices, while extension to multiple updating timescales is mathematically straightforward [zhou2017stochastic]. Based on the results in this section, we will later in Section IV introduce an online algorithm implemented in more realistic settings with asynchronous updates, nonlinear power flow model, and time-varying optimization.

We first write the Lagrangian of as:


which is obtained by keeping the constraints (22g) and implicit.

We will implement a stochastic dual algorithm to solve the following minimax problem based on :


while recovering original discrete feasible power rates for slow devices. To this end, we define


and a resultant dual function:


with the corresponding dual problem:

Lemma (Theorem 2.1 of [necoara2014rate])

Under Assumption II-CII-C, the gradient of the dual function is given by , and is Lipschitz continuous with constant , i.e.,


for any feasible and .

We refer the proof to [necoara2014rate].

We consider the dual algorithm for solving (25) while recovering implementable feasible power rates for discrete devices at each iteration, and we have the following iterative stochastic dual algorithm:


where is a projection operator onto the nonnegative orthant, and the stepsize is some constant to be determined next to guarantee convergence. Also notice that in step (30c), is not the de facto gradient of (instead, is; see Lemma III-B); but as we will show next, the algorithm converges to bounded distance from the optimal solution on average.


Under Assumption II-CII-C, the dual function is strongly concave.


The proof follows [necoara2015linear]. By definition, the dual function is calculated by


Meanwhile, the conjugate function of is defined as


which gives us

Under Assumption II-C, is affine in . Therefore, . Moreover, under Assumption II-C, the conjugate function is strongly convex [rockafellar2009variational, Proposition 12.60]. Thus