Residential Energy Storage Management with Bidirectional Energy Control

Tianyi Li and Min Dong

Corresponding author: Min Dong. Tianyi Li and Min Dong are with the Department of Electrical, Computer and Software Engineering, University of Ontario Institute of Technology, Oshawa, Ontario, Canada (Email: {, min.dong}

We consider the residential energy storage management system with integrated renewable generation and capability of selling energy back to the power grid. We propose a real-time bidirectional energy control algorithm, aiming to minimize the net system cost of energy buying and selling and storage within a given time period, subject to the battery operational constraints and energy buying and selling constraints. We formulate the problem as a stochastic control optimization problem, which is then modified and transformed to allow us to apply Lyapunov optimization to develop the real-time energy control algorithm. Our algorithm is developed for arbitrary and unknown dynamics of renewable source, loads, and electricity prices. It provides a simple closed-form control solution based on current system states with minimum complexity for implementation. The proposed algorithm possesses a bounded performance guarantee to that of the optimal non-causal -slot look-ahead control policy. Simulation studies show the effectiveness of our proposed algorithm as compared with alternative real-time and non-causal algorithms.

Index Terms: Energy storage, renewable generation, energy selling, home energy management, Lyapunov optimization, real-time control

I Introduction

Energy storage and renewable energy integration are considered key solutions for future power grid infrastructure and services to meet the fast rising energy demand and maintain energy sustainability [1, 2]. For the grid operator, energy storage can be exploited to shift energy across time to meet the demand and counter the fluctuation of intermittent renewable generation to improve grid reliability [2, 3, 4, 5, 6, 7, 8]. For the electricity customer, local energy storage can provide a means for energy management to control energy flow in response to the demand-side management signal and to reduce electricity cost. For example, dynamic pricing is one of the main demand-side management techniques to relieve grid congestion [9, 10]. Its effectiveness relies on the customer-side energy management solution to effectively control energy flow and demand in response to the pricing change. With local renewable generation and energy storage introduced to residential and commercial customers, there is potentially greater flexibility in energy control to respond to dynamic pricing and demand fluctuation, as well as to maximally harness the energy from the renewable source to reduce electricity bills [11, 12, 13].

With more local renewable generation available at customers, the utility retailer now allows customers to sell energy back to the utility at a price dynamically controlled by the utility, in an attempt to harness energy from distributed renewable generation at customers and further improve stability and reliability [14, 15]. This means both renewable generation and previously stored energy, either purchased from the grid or harnessed from the renewable source, can be sold for profit by the customer. The ability to sell energy back enables bidirectional energy flow between the energy storage system and the grid. This also gives the customer a greater control capability to manage energy storage and usage based on the dynamic pricing for both buying and selling. The repayment provides a return on the storage investment and further reduces the net cost at the customer. An intelligent energy management solution exploring these features to effectively manage storage and control the energy flow, especially in a real-time manner, is crucially needed to maximize the benefits.

Developing an effective energy management system faces many challenges. For the energy storage system itself, the renewable source is intermittent and random, and its statistical characteristics over each day are often inherently time-varying, making it difficult to predict. The benefit of storage, either for electricity supply or for energy selling back, also comes at the cost of battery operation that should not be neglected. The bidirectional energy flow between the energy storage system and the grid under dynamic pricing complicates the energy control of the system when facing future uncertainty, and creates more challenges. More control decisions need to be made for energy flows among the storage battery, the grid, the renewable generation, and the load. The potential profit from energy selling under unpredictable pricing makes the control decisions on when and how much to sell, store, or use much more involved. Moreover, the battery capacity limitation further makes the control decisions coupled over time and difficult to optimize. In this paper, we aim to develop a real-time energy control solution that addresses these challenges and effectively reduces the system cost with minimal required knowledge of unpredictable system dynamics.

Related Works: Energy storage has been considered at the power grid operator or aggregator to combat the fluctuation of renewable generation, with many works in the literature on storage control and assessing its role in renewable generation [2, 3], for power balancing with fixed load [4, 5] or flexible load control [8, 7], and for phase balancing [6]. Residential energy storage systems to reduce electricity cost have been considered without renewable [16, 17] and with renewable integration [18, 19, 11, 20, 21, 22, 23, 24, 25, 12, 13]. Only energy purchasing was considered in these works. Among them, off-line storage control strategies for dynamic systems were proposed [19, 11, 18], where combined strategies of load prediction and day-ahead scheduling on respective large and small timescales are proposed, with the knowledge of load statistics and renewable energy arrivals ahead of time, while no battery operational cost is considered.

For real-time energy storage management, [20] formulated the storage control as a Markov Decision Process (MDP) and solved it by Dynamic Programming. Lyapunov optimization technique [26] has been employed to develop real-time control strategies in [12, 13, 21, 22, 23, 24]. For independent and identically distributed or stationary system dynamics (e.g., pricing, renewable, and load), energy control algorithms were proposed in [21, 22] without considering battery operational cost, and in [23] with battery charging and discharging operational cost considered. All the above works aim to minimize the long-term average system cost. A real-time energy control algorithm to minimize the system cost within a finite time period was designed in [12] for arbitrary system dynamics. Furthermore, joint storage control and flexible load scheduling was considered in [13] where the closed-form sequential solution was developed to minimize the system cost while meeting the load deadlines.

The idea of energy selling back or trading was considered in [27, 28, 29], where [27, 28] focused on demand-side management via pricing schemes using game approaches for load scheduling among customers, and [29] considered microgrid operation and supply. All these works considered grid-level operation and interaction and the cost associated with it, and assumed a simple storage model. Since consumers may prefer a cost-saving solution over a customer-defined time period, and system dynamics may not be stationary, it is important to provide a cost-minimizing solution to meet such a need. To the best of our knowledge, there is no existing bidirectional energy control solution of this kind with selling-back capability.

Contributions: In this paper, we consider a residential energy storage management system with integrated renewable generation and the capability to sell energy back to the grid. We develop a real-time energy storage control algorithm, aiming to minimize the net system cost within a finite time period subject to the battery operational constraints and energy buying and selling constraints. For the system cost consideration, we incorporate the storage cost by modeling the battery operational cost associated with charging/discharging activities. We consider arbitrary system dynamics, including renewable source, buying/selling electricity pricing, and the load, with no knowledge about their statistics.

We formulate the net system cost minimization as a stochastic optimization problem over a finite time horizon. The interaction of storage, renewable, and the grid, as well as the cost associated with energy buying and selling, the battery storage limit, and the finite time period for optimization complicate the energy control decision making over time. To tackle this difficult problem, we adopt special techniques to modify and transform the original problem into one to which we can apply Lyapunov optimization to develop a real-time control algorithm. Our developed real-time energy control algorithm has a simple closed-form solution, which only relies on the current battery level, pricing, load, and renewable generation, and thus is simple to implement. The closed-form expression also reveals how the battery energy level and pricing affect the decision on when to buy or sell energy, when to store or use energy from the battery, and the priority order among multiple sources for storing or selling energy. We show that our proposed real-time algorithm provides performance within a bounded gap to that of the optimal -slot look-ahead solution which has full information available beforehand. The proposed algorithm is also shown to be asymptotically optimal as the battery capacity and the time duration go to infinity. Simulation results demonstrate the effectiveness of the proposed energy storage control algorithm as compared with alternative real-time or non-causal solutions. Furthermore, simulation studies are provided to understand the effect of selling price on the energy storage behaviors.

Organization: The rest of this paper is organized as follows. In Section II, we describe the energy storage and management system model. In Section III, we formulate the ESM stochastic control optimization problem within a finite period. In Section IV, we develop our real-time energy control algorithm. In Section V, we analyze the performance of our algorithm. Simulation results are provided in Section VI, followed by the conclusion in Section VII.

II System Model

We consider a residential-side energy storage and management (ESM) system as shown in Fig. 1. The system contains an energy storage battery which is connected to an on-site renewable generator (RG) and the conventional grid (CG). Energy can be charged into the battery from both the RG and the CG, discharged from the battery to supply the customer’s electricity demand, or sold back to the CG. Both the RG and the CG can also directly supply energy to the customer. We assume the ESM system operates in discrete time slots with , and all energy control operations are performed per time slot .

Fig. 1: An ESM system with RG and bidirectional energy flow from/to CG.

II-1 RG

Let be the amount of energy harvested from the RG at time slot . Due to the uncertainty of the renewable source, is random, and we assume no prior knowledge of or its statistics. Let be the customer’s demand at time slot . We assume a priority of using to directly supply . Let be this portion of at time slot . We have . A controller will determine whether the remaining portion, if any, should be stored into the battery and/or sold back to the CG. We denote the stored amount and sold amount by and , respectively, satisfying


II-2 CG

The customer can buy energy from or sell energy to the CG at real-time unit buying price and selling price , respectively. Both and are known at time slot . To avoid energy arbitrage, the buying price is strictly greater than the selling price at any time, i.e.,  . Let denote the amount of energy bought from the CG at time slot , bounded by


where is the maximum amount of energy that can be bought per time slot. Let denote the portion of that is stored into the battery. The remaining portion directly supplies the customer’s demand. Let be the amount of energy from the battery that is sold back to the CG. The total energy sold back from the battery and the RG is bounded by


where is the maximum amount of energy that can be sold back to the CG. (Footnote 1: This amount may be regulated by the utility.)

Note that while from the RG can be sold back to the CG at any time, energy buying from or selling to the CG should not happen at the same time to avoid energy arbitrage, which is ensured by the constraint . With this constraint, we expect the following condition to be satisfied


We will verify that our proposed algorithm satisfies (4) in Section V.
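The renewable and grid constraints of this section amount to a handful of per-slot comparisons. The following Python sketch is illustrative only: the names (`s`, `w`, `s_c`, `s_s`, `e`, `f_s`, `e_max`, `u_max`) are placeholders introduced here, and the inequalities are written from the verbal description above rather than copied from the numbered constraints.

```python
def direct_renewable_supply(s: float, w: float) -> float:
    """Renewable energy used directly for the load (renewable is used first)."""
    return min(s, w)


def grid_and_renewable_feasible(s, w, s_c, s_s, e, f_s, e_max, u_max, tol=1e-9):
    """Check the per-slot renewable and grid constraints described in the text.

    s, w     : harvested renewable energy and customer demand (assumed names)
    s_c, s_s : renewable energy stored into the battery / sold to the grid
    e        : energy bought from the grid
    f_s      : battery energy sold back to the grid
    e_max    : per-slot buying limit; u_max: per-slot selling limit
    """
    surplus = s - direct_renewable_supply(s, w)
    checks = [
        # stored and sold renewable energy cannot exceed the remaining portion
        s_c >= 0 and s_s >= 0 and s_c + s_s <= surplus + tol,
        # buying limit
        0 <= e <= e_max + tol,
        # total energy sold from the battery and the RG is limited
        f_s >= 0 and f_s + s_s <= u_max + tol,
        # no simultaneous buying and selling from the battery
        e * f_s <= tol,
    ]
    return all(checks)
```

A decision that buys from the grid while also selling battery energy fails the last check, which is the arbitrage condition the text expects the algorithm to respect.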

II-3 Battery Storage Operation

Storage management: The battery can be charged from multiple sources (i.e., the CG and the RG) at the same time. The total charging amount at time slot should satisfy


where is the maximum charging amount per time slot. Similarly, energy stored in the battery can be used to either supply the customer’s demand and/or sell back to the CG. Let denote the discharging energy amount to supply the customer at time slot . The total discharging amount is bounded by


where is the maximum discharging amount per time slot. We assume that there is no simultaneous charging and discharging, i.e.,


Let be the battery energy level at time slot , bounded by


where and are the minimum energy required and maximum energy allowed in the battery, respectively, whose values depend on the battery type and capacity. Based on charging and discharging activities, evolves over time as


Finally, by the demand-and-supply balance requirement, we need to satisfy


Battery degradation cost: It is well known that frequent charging/discharging activities cause a battery to degrade [30]. We model two types of battery degradation cost: entry cost and usage cost. The entry cost is a fixed cost incurred due to each charging or discharging activity. Define two indicator functions to represent charging and discharging activities: and . Denote the entry cost for charging by and that for discharging by . The entry cost for battery usage at time slot is given by .

The battery usage cost is the cost associated with the charging/discharging amount. Denote the net change of battery energy level at time slot by . From (5) and (6), it follows that is bounded by


It is known that typically faster charging/discharging, i.e., larger , has a more detrimental effect on the life time of the battery. Thus, we assume the associated cost function for usage , denoted by , is a continuous, convex, non-decreasing function with maximum derivative .
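As a rough illustration of the battery model in this subsection, the sketch below applies one slot of charging or discharging and reports the entry cost and the magnitude of the net level change. All names (`BatteryParams`, `battery_step`, `c_rc`, `c_dc`, and so on) are hypothetical, and the net change is assumed to be the absolute difference between the total charging and discharging amounts.

```python
from dataclasses import dataclass


@dataclass
class BatteryParams:
    b_min: float   # minimum energy required in the battery
    b_max: float   # maximum energy allowed in the battery
    r_max: float   # maximum charging amount per slot
    d_max: float   # maximum discharging amount per slot
    c_rc: float    # entry cost of a charging activity
    c_dc: float    # entry cost of a discharging activity


def battery_step(b, charge, discharge, p: BatteryParams):
    """Apply one slot of battery operation; return (next_level, entry_cost, x).

    charge    : total energy charged (from the grid plus the renewable source)
    discharge : total energy discharged (to the load plus sold to the grid)
    x         : magnitude of the net change of the battery level (assumed form)
    """
    if not (0 <= charge <= p.r_max and 0 <= discharge <= p.d_max):
        raise ValueError("charging/discharging limit violated")
    if charge > 0 and discharge > 0:
        raise ValueError("simultaneous charging and discharging is not allowed")
    b_next = b + charge - discharge
    if not (p.b_min <= b_next <= p.b_max):
        raise ValueError("battery level out of range")
    # Indicator-based entry cost: paid once per slot with any activity.
    entry_cost = p.c_rc * (charge > 0) + p.c_dc * (discharge > 0)
    x = abs(charge - discharge)
    return b_next, entry_cost, x
```

The usage cost is deliberately left out of this function, since the text applies the convex usage cost function to a net-change quantity rather than fixing a particular functional form.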

III ESM with Bidirectional Energy Control: Problem Formulation

For the ESM system, the system cost includes the energy purchasing cost minus selling profit and the battery degradation cost. Within a pre-defined -slot time period, the average net cost of energy purchasing and selling over the CG is given by . For the battery operation, the average entry cost and average net change over the -slot period are respectively given by


where by (11), is bounded by


and the battery average usage cost is . Thus, the average battery degradation cost over the -slot period is .
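The three cost components above can be combined numerically as in the following sketch. It assumes the per-slot net energy cost is the purchase cost minus the revenue from all energy sold (battery plus renewable); the function and argument names are placeholders, and the usage cost function is supplied by the caller.

```python
def average_system_cost(e, sold, p_b, p_s, entry_costs, x, usage_cost):
    """Average net cost over the period: energy cost + entry cost + usage cost.

    e, sold     : per-slot energy bought / total energy sold (battery + renewable)
    p_b, p_s    : per-slot buying / selling prices
    entry_costs : per-slot battery entry costs
    x           : per-slot magnitudes of the net battery level change
    usage_cost  : convex, non-decreasing function applied to the AVERAGE of x
    """
    n = len(e)
    energy = sum(e[t] * p_b[t] - sold[t] * p_s[t] for t in range(n)) / n
    entry = sum(entry_costs) / n
    x_bar = sum(x) / n
    return energy + entry + usage_cost(x_bar)
```

Note that the usage cost is evaluated once on the time-averaged net change, not slot by slot; this is the feature that the later problem transformation addresses.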

Denote the system inputs by and the control actions for energy storage management by at time slot . With only current known, our goal is to determine a control policy for (i.e., a mapping ) to minimize the average system cost within the -slot period. This stochastic control optimization problem is formulated by


Note that constraints (14) and (15) are the derived results of constraints (5) – (9). P1 is a difficult stochastic control optimization problem due to the finite time period and the correlated control actions {} over time as a result of time-coupling dynamics of in (9). In the following, we develop special techniques to overcome these difficulties for a real-time control solution. Specifically, we first apply a sequence of modifications and reformulations to P1, and then we propose a real-time control algorithm to solve the resulting energy management optimization problem.

III-A Problem Modification

To make P1 tractable, we first remove the coupling of control actions over time by modifying the constraints (14) and (15) on the per-slot charging and discharging amounts. From (9), we set the change of over the -slot period to be a desired value as


where by (5), (6), and (8), the range of is given by , with . We point out that is only a desired value we set. It may not be achieved at the end of the -slot period by a control algorithm. In Section V, we will quantify the amount of mismatch with respect to under our proposed control algorithm. We now modify P1 to the following optimization problem


From P1 to P2, by imposing the new constraint (16), we remove the dependency of per-slot charging/discharging amount on in constraints (14) and (15), and replace them by (5) and (6), respectively.

III-B Problem Transformation

The objective of P2 contains which is a cost function of a time-averaged net change . Directly dealing with such function is difficult. Adopting the technique introduced in [31], we transform the problem to one that contains the time-averaged function. To do so, we introduce an auxiliary variable and its time average satisfying


These constraints ensure that the auxiliary variable and lie in the same range and have the same time-averaged behavior. Define as the time average of . By using instead of , we replace constraint (13) with (17) and (18), and transform P2 into the following problem in which we determine a control policy for that minimizes the -slot time average of system cost


It can be shown that P2 and P3 are equivalent problems (see Appendix A). The modification and transformation from P1 to P3 enable us to utilize Lyapunov optimization techniques [26] to design a real-time control policy to solve P3. We will then design the control parameters that we introduce in the policy to ensure that the solution for P3 is also feasible to the original problem P1.
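One ingredient behind replacing a function of a time average by a time average of the function is Jensen's inequality: for a convex cost, the cost of the average never exceeds the average of the cost, and a constant sequence attains equality. The check below is a numerical illustration only; the quadratic `usage_cost` and its coefficient are assumptions made for the example, not the cost function used in the paper.

```python
import random


def usage_cost(x: float, k: float = 0.2) -> float:
    """Hypothetical convex, non-decreasing usage cost on x >= 0."""
    return k * x * x


random.seed(0)
# One day of 5-minute slots with arbitrary auxiliary values in [0, 0.5].
gamma = [random.uniform(0.0, 0.5) for _ in range(288)]
gamma_bar = sum(gamma) / len(gamma)

cost_of_average = usage_cost(gamma_bar)
average_of_cost = sum(usage_cost(g) for g in gamma) / len(gamma)

# Jensen's inequality for a convex function.
assert cost_of_average <= average_of_cost

# A constant sequence equal to the average closes the gap.
constant = [gamma_bar] * len(gamma)
constant_avg_cost = sum(usage_cost(g) for g in constant) / len(constant)
assert abs(constant_avg_cost - cost_of_average) < 1e-12
```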

IV Real-Time ESM Control Algorithm

Based on the Lyapunov framework, for time-averaged constraints (16) and (18), we introduce two virtual queues and respectively as


From (9) and (19), and have the following relation


where in which is a constant shift and ensures that the left hand side equality in (16) is satisfied. We will revisit the value of to ensure a feasible solution for P1.
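The role of the two virtual queues can be sketched as follows. The update rules below are written in the standard virtual-queue form for time-averaged constraints and are an assumption of this example, as are the names `z`, `h`, `delta`, `t_o`, and `a_o` and the linear form of the time-varying shift; they are not copied from (19) to (21).

```python
def update_queues(z, h, charge, discharge, gamma, delta, t_o):
    """One-slot update of the two virtual queues (assumed standard form).

    z follows the constraint on the total battery change over the period:
      it moves by the net charge minus the per-slot target delta / t_o.
    h follows the constraint tying the auxiliary variable to the net change:
      it moves by gamma minus |charge - discharge|.
    """
    z_next = z + (charge - discharge) - delta / t_o
    h_next = h + gamma - abs(charge - discharge)
    return z_next, h_next


def lyapunov(z, h):
    """Quadratic Lyapunov function of the queue vector (assumed 1/2 scaling)."""
    return 0.5 * (z * z + h * h)


# Check that z stays a shifted copy of the battery level: z_t = b_t - a_t,
# with the assumed shift a_t = a_o + delta * t / t_o.
b, a_o, delta, t_o = 1.5, 1.0, 0.6, 4
z, h = b - a_o, 0.0
drifts = []
for t, (c, d) in enumerate([(0.3, 0.0), (0.0, 0.2), (0.4, 0.0), (0.0, 0.1)]):
    z_prev, h_prev = z, h
    z, h = update_queues(z, h, c, d, gamma=0.2, delta=delta, t_o=t_o)
    b = b + c - d
    a_next = a_o + delta * (t + 1) / t_o
    assert abs(z - (b - a_next)) < 1e-12
    drifts.append(lyapunov(z, h) - lyapunov(z_prev, h_prev))  # one-slot drift
```

Under these assumptions, keeping the queue `z` bounded is what keeps the battery level close to its shifted target over the period.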

Define . Define the quadratic Lyapunov function . Divide slots into sub-frames of -slot duration as , for . We define a one-slot sample path Lyapunov drift as , which only depends on the current system inputs . Instead of the system cost considered in the objective in P3, we define a drift-plus-cost metric which is a weighted sum of the drift and the system cost at current time slot , given by


where constant sets the relative weight between the drift and the system cost. Directly using the drift-plus-cost metric to design a control policy is still difficult. Instead, we present an upper bound on the drift , which will be used for designing our real-time control algorithm.

Lemma 1

Lyapunov drift is upper bounded by


where and


Proof: See Appendix B.

By Lemma 1, we have an upper bound on the per-slot drift-plus-cost metric in (22). In the following, we propose a real-time control algorithm that minimizes this upper bound on the drift-plus-cost metric per slot.

Removing all the constant terms in the upper bound independent of and , we have the equivalent optimization problem, which can be further separated into two subproblems for and , respectively, as follows

First, we solve to obtain the optimal solution . Note that is convex for being convex. Thus, we can directly solve it and obtain the optimal of .

Lemma 2

The optimal solution of is given by


where , is the first derivative of , and the inverse function of .


Proof: See Appendix C.
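A threshold structure of this kind can be illustrated with a small sketch. It assumes the auxiliary-variable subproblem has the form "minimize `h * g + v * C(g)` over `0 <= g <= gamma_max`", with `h` the corresponding virtual queue, `v` the weight on cost, and `C` strictly convex and non-decreasing; these names and the exact subproblem form are assumptions of the example rather than a restatement of the lemma.

```python
def optimal_gamma(h, v, gamma_max, c_prime, c_prime_inv):
    """Minimize h * g + v * C(g) over 0 <= g <= gamma_max for convex C.

    c_prime     : first derivative of C
    c_prime_inv : inverse function of c_prime
    """
    if h >= 0:
        # Derivative h + v * C'(g) is non-negative: the objective is non-decreasing.
        return 0.0
    if h < -v * c_prime(gamma_max):
        # Derivative is negative on the whole range: go to the upper limit.
        return gamma_max
    # Interior stationary point: h + v * C'(g) = 0.
    return c_prime_inv(-h / v)
```

For a hypothetical quadratic cost `C(g) = k * g**2`, one would pass `c_prime = lambda g: 2 * k * g` and `c_prime_inv = lambda y: y / (2 * k)`; the result can be cross-checked against a grid search over the interval.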

Next, we obtain the optimal of and provide the conditions under which is feasible to P1.

IV-A The Optimal for

Define the objective function of as . Define the idle state of the battery as the state where there is no charging or discharging activity. The control decision in the idle state is given by , where , , and . Then, in the idle state, we have . Define . We derive the optimal control decision in five cases, given in Proposition 1 below.

Proposition 1

Define as follows: If : , ; Otherwise, , .

Denote as the control decision in the charging or discharging state. The optimal control solution for is given in the following cases:

  • For : The battery is either in the charging state or the idle state. Let


    If , then ; Otherwise, .

  • For : The battery is either in the charging state (from the RG only) or the discharging state (to the customer’s load only). Let


    If , then ; Otherwise, .

  • For : The battery is either in the charging state (from the RG only) or the discharging state. Define in the discharging state as


    Define in the charging state as


    Then, . If , then ; Otherwise, .

  • For and : The battery is in the discharging state (to the customer’s load only). Let


    If , then ; Otherwise, .

  • For : The battery is either in the discharging state or the idle state. If , let


    Otherwise, let


    If , then ; Otherwise, .


Proof: See Appendix D.

Proposition 1 provides the closed-form control solution in five cases, depending on the battery energy level (via ), battery usage cost (via ), and the prices. In each case, in the charging (or discharging) state is compared with in the idle state, and the optimal is the control solution of the state with the minimum objective value.

Note that there are two sources to be controlled for selling energy back, from the battery and from the RG. Whether to sell energy from the battery back to the grid depends on the battery energy level. When the battery energy level is low (Case 1), energy is kept in the battery. When the battery has a moderately low energy level (Case 2), it may be in either the charging or discharging state. For the latter, the battery only supplies enough energy to the customer but does not sell energy back. When the battery energy level is higher but still moderate (Case 3), it may still be in either the charging or discharging state. For the latter, the battery may sell energy back to the grid. When the battery has just sufficient energy (Case 4), it may supply energy to the customer, but will not sell energy back to the grid. When the energy level in the battery is high (Case 5), it may supply energy to the customer and at the same time sell energy back. In contrast, the renewable energy can be sold to the grid regardless of the battery energy level, state (charging, discharging, or idle), and the price to make an additional profit. As a result, energy generated by the renewable source will be utilized as much as possible. However, when the system wants to sell energy from both the battery and the renewable source, the order in which and are determined depends on which results in the minimum cost in . In Case 5, for the control decision in (30), is determined after , while in (31), is determined after .

IV-B Feasible for P1

The optimal solution of provides a real-time solution for P3. However, it may not be feasible to P1, because the battery capacity constraint (8) on may be violated. By properly designing and , we can guarantee that satisfies constraint (8), and ensure the feasibility of the solution. Define . The result is stated below.

Proposition 2

Under the proposed real-time control algorithm, for in (21) with


and with


satisfies the battery capacity constraint (8), and control solution of , for any , is feasible to P1.


Proof: We provide a brief outline of our proof and leave the details in Appendix E. Using the solutions and of and , respectively, we can show that both and are upper and lower bounded. Then, by applying these bounds to (21) and using the battery capacity constraint , we obtain as the minimum value that can be achieved with a given value of . With obtained, we derive the upper bound of , i.e., , to ensure that is satisfied.

Note that in (33) is generally satisfied for practical battery storage capacity and being set relatively small. We should also point out that since is a desired value set by our proposed algorithm, the solution of may not necessarily satisfy constraint (16) at the end of the -slot period, and thus may not be feasible to P2. However, Proposition 2 provides the values of and that guarantee that the control solutions are feasible to P1.

We summarize the proposed real-time control algorithm in Algorithm 1. We emphasize that the proposed algorithm does not rely on any statistical assumption on the prices, demand, and renewable processes , and thus can be applied to general scenarios, especially when these processes are non-ergodic or difficult to predict in a highly dynamic environment.

Algorithm 1 Real-time battery management control algorithm

Initialize: .
Determine .
Set .
Set and as in (32) and (33), respectively.
At time slot :

1: Observe the system inputs and queues and .
2: Solve and obtain in (24); solve and obtain by following cases (25)-(31).
3: Use and to update and in (19) and (20), respectively.
4: Output control decision .
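The per-slot structure of the algorithm can be summarized in a short skeleton. This is a sketch under stated assumptions, not the authors' implementation: the two subproblem solvers are passed in as hypothetical callbacks (`solve_gamma`, `solve_actions`), the queue updates use the standard virtual-queue form, and the initial queue values `z0` and `h0` are left as parameters.

```python
def run_realtime_control(inputs, delta, t_o, solve_gamma, solve_actions,
                         z0=0.0, h0=0.0):
    """Skeleton of the per-slot loop; the two solvers are supplied by the caller.

    inputs        : iterable of per-slot observations (prices, load, renewable)
    solve_gamma   : (h) -> gamma, the auxiliary-variable subproblem
    solve_actions : (obs, z, h) -> (charge, discharge, decision),
                    the control-action subproblem
    """
    z, h = z0, h0
    decisions = []
    for obs in inputs:
        # Step 1: observe inputs and queues; Step 2: solve both subproblems.
        gamma = solve_gamma(h)
        charge, discharge, decision = solve_actions(obs, z, h)
        # Step 3: update the virtual queues (assumed standard form).
        z += (charge - discharge) - delta / t_o
        h += gamma - abs(charge - discharge)
        # Step 4: output the control decision for this slot.
        decisions.append(decision)
    return decisions, z, h
```

Only the current observation and the two queue values enter each iteration, which reflects the point made above that no statistics of the prices, demand, or renewable process are needed.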

V Performance Analysis

To analyze the performance of our real-time solution in Algorithm 1 with respect to P1, let denote the -slot average system cost objective of P1 achieved by Algorithm 1, which depends on the value of set by Algorithm 1. For comparison, we partition slots into frames with , for some integers . Within each frame , we consider a -slot look-ahead optimal control policy, where are known non-causally for the entire frame beforehand. Let denote the minimum -slot average cost for frame achieved by this optimal policy. We can view as the minimum objective value of P1 with under the optimal -slot look-ahead solution. The performance gap of our proposed real-time algorithm to the optimal -slot look-ahead policy is bounded in the following theorem.

Theorem 1

For any arbitrary system inputs , and any with , the -slot average system cost under Algorithm 1 to that under the optimal -slot look-ahead policy satisfies


with the bound at the right hand side being finite. Asymptotically as ,


Proof: See Appendix F.

By Theorem 1, the performance gap of Algorithm 1 to the -slot look-ahead optimal policy is upper bounded in (34), for any with . To minimize the gap, we should always set . From (35), as the duration goes to infinity, the asymptotic gap is in the order of . Since increases with , when , Algorithm 1 is asymptotically equivalent to the -slot look-ahead optimal policy as the battery capacity and time duration increase.

As discussed at the end of Section IV-B, constraint (16) in P2 sets a desired value for which may not be achieved by our proposed algorithm at the end of slots. Denote this mismatch under Algorithm 1 by . This mismatch is quantified below.

Proposition 3

For any arbitrary system inputs and any initial queue value , the mismatch for constraint (16) under Algorithm 1 is given by , and is bounded by


Proof: See Appendix G.

Finally, we expected constraint (4) to be satisfied by Algorithm 1, i.e., buying energy () and selling back from battery storage () should not occur at the same time. This is verified in the following result.

Proposition 4

For any system inputs , the optimal control solution under Algorithm 1 guarantees that constraint (4) is satisfied.


Proof: See Appendix H.

VI Simulation Results

We set the slot duration to be 5 minutes, and assume that the system input remains unchanged within each slot. We set the buying price using the data collected from the Ontario Energy Board [32]. As shown in Fig. 4 (top), follows a three-stage price pattern repeated each day as . We use solar energy for the RG to generate . As a result, is a non-ergodic process, with the mean changing periodically over hours. As shown in Fig. 4 (middle), we model by a three-stage pattern as kWh and set standard deviation , for . We also model the load as a non-ergodic process with mean following a three-stage pattern over each day as kWh, shown in Fig. 4 (bottom), and set standard deviation