Residential Energy Storage Management with Bidirectional Energy Control
Abstract
We consider a residential energy storage management system with integrated renewable generation and the capability of selling energy back to the power grid. We propose a real-time bidirectional energy control algorithm that aims to minimize the net system cost of energy buying, selling, and storage within a given time period, subject to the battery operational constraints and the energy buying and selling constraints. We formulate the problem as a stochastic control optimization problem, which is then modified and transformed so that Lyapunov optimization can be applied to develop the real-time energy control algorithm. Our algorithm is developed for arbitrary and unknown dynamics of the renewable source, loads, and electricity prices. It provides a simple closed-form control solution based on current system states, with minimal implementation complexity. The proposed algorithm has a bounded performance gap relative to the optimal non-causal slot look-ahead control policy. Simulation studies show the effectiveness of our proposed algorithm as compared with alternative real-time and non-causal algorithms.
I Introduction
Energy storage and renewable energy integration are considered key solutions for future power grid infrastructure and services to meet the fast-rising energy demand and maintain energy sustainability [1, 2]. For the grid operator, energy storage can be exploited to shift energy across time to meet the demand and counter the fluctuation of intermittent renewable generation to improve grid reliability [2, 3, 4, 5, 6, 7, 8]. For the electricity customer, local energy storage provides a means of energy management to control energy flow in response to the demand-side management signal and to reduce electricity cost. For example, dynamic pricing is one of the main demand-side management techniques to relieve grid congestion [9, 10]. Its effectiveness relies on the customer-side energy management solution to effectively control energy flow and demand in response to pricing changes. With local renewable generation and energy storage introduced to residential and commercial customers, there is potentially greater flexibility in energy control to respond to dynamic pricing and demand fluctuation, as well as to maximally harness the energy from the renewable source to reduce electricity bills [11, 12, 13].
With more local renewable generation available at customers, the utility retailer now allows customers to sell energy back to the utility at a price dynamically controlled by the utility, in an attempt to harness energy from distributed renewable generation at customers and further improve stability and reliability [14, 15]. This means that both renewable generation and previously stored energy, whether purchased from the grid or harnessed from the renewable source, can be sold for profit by the customer. The ability to sell energy back enables bidirectional energy flow between the energy storage system and the grid. This also gives the customer greater control capability to manage energy storage and usage based on the dynamic pricing for both buying and selling. The repayment provides a return on the storage investment and further reduces the net cost at the customer. An intelligent energy management solution that exploits these features to effectively manage storage and control the energy flow, especially in a real-time manner, is crucially needed to maximize the benefits.
Developing an effective energy management system faces many challenges. For the energy storage system itself, the renewable source is intermittent and random, and its statistical characteristics over each day are often inherently time-varying, making it difficult to predict. The benefit of storage, either for electricity supply or for selling energy back, also comes at the cost of battery operation, which should not be neglected. The bidirectional energy flow between the energy storage system and the grid under dynamic pricing complicates the energy control of the system in the face of future uncertainty. More control decisions need to be made for energy flows among the storage battery, the grid, the renewable generation, and the load. The potential profit from energy selling, combined with unpredictable pricing, makes the control decisions on when and how much to sell, store, or use much more involved. Moreover, the battery capacity limitation couples the control decisions over time and makes them difficult to optimize. In this paper, we aim to develop a real-time energy control solution that addresses these challenges and effectively reduces the system cost with minimal required knowledge of the unpredictable system dynamics.
Related Works: Energy storage has been considered at the power grid operator or aggregator level to combat the fluctuation of renewable generation, with many works in the literature on storage control and on assessing its role in renewable generation [2, 3], for power balancing with fixed load [4, 5] or flexible load control [8, 7], and for phase balancing [6]. Residential energy storage systems for reducing electricity cost have been considered without renewable generation [16, 17] and with renewable integration [18, 19, 11, 20, 21, 22, 23, 24, 25, 12, 13]. Only energy purchasing was considered in these works. Among them, offline storage control strategies for dynamic systems were proposed in [19, 11, 18], where combined strategies of load prediction and day-ahead scheduling on respective large and small timescales are proposed with knowledge of load statistics and renewable energy arrivals ahead of time, while no battery operational cost is considered.
For real-time energy storage management, [20] formulated the storage control as a Markov Decision Process (MDP) and solved it by Dynamic Programming. The Lyapunov optimization technique [26] has been employed to develop real-time control strategies in [12, 13, 21, 22, 23, 24]. For independent and identically distributed or stationary system dynamics (e.g., pricing, renewable, and load), energy control algorithms were proposed in [21, 22] without considering battery operational cost, and in [23] with battery charging and discharging operational cost considered. All the above works aim to minimize the long-term average system cost. A real-time energy control algorithm to minimize the system cost within a finite time period was designed in [12] for arbitrary system dynamics. Furthermore, joint storage control and flexible load scheduling was considered in [13], where a closed-form sequential solution was developed to minimize the system cost while meeting the load deadlines.
The idea of selling energy back or trading was considered in [27, 28, 29], where [27, 28] focused on demand-side management via pricing schemes using game approaches for load scheduling among customers, and [29] considered microgrid operation and supply. All these works considered grid-level operation and interaction and the associated cost, and assumed a simple storage model. Since consumers may prefer a cost-saving solution over a customer-defined time period, and system dynamics may not be stationary, it is important to provide a cost-minimizing solution that meets this need. To the best of our knowledge, there is no such existing bidirectional energy control solution with selling-back capability.
Contributions: In this paper, we consider a residential energy storage management system with integrated renewable generation and the capability to sell energy back to the grid. We develop a real-time energy storage control algorithm that aims to minimize the net system cost within a finite time period, subject to the battery operational constraints and the energy buying and selling constraints. For the system cost, we incorporate the storage cost by modeling the battery operational cost associated with charging/discharging activities. We consider arbitrary system dynamics, including the renewable source, buying/selling electricity pricing, and the load, with no knowledge of their statistics.
We formulate the net system cost minimization as a stochastic optimization problem over a finite time horizon. The interaction of storage, renewable generation, and the grid, together with the cost associated with energy buying and selling, the battery storage limit, and the finite time period for optimization, complicates the energy control decision making over time. To tackle this difficult problem, we adopt special techniques to modify and transform the original problem into one to which Lyapunov optimization can be applied to develop a real-time control algorithm. The resulting real-time energy control algorithm has a simple closed-form solution that relies only on the current battery level, pricing, load, and renewable generation, and thus is simple to implement. The closed-form expression also reveals how the battery energy level and pricing affect the decision on when to buy or sell energy, when to store or use energy from the battery, and the priority order among multiple sources for storing or selling energy. We show that the performance of our proposed real-time algorithm is within a bounded gap of that of the optimal slot look-ahead solution, which has full information available beforehand. The proposed algorithm is also shown to be asymptotically optimal as the battery capacity and the time duration go to infinity. Simulation results demonstrate the effectiveness of the proposed energy storage control algorithm as compared with alternative real-time or non-causal solutions. Furthermore, simulation studies are provided to understand the effect of the selling price on the energy storage behavior.
Organization: The rest of this paper is organized as follows. In Section II, we describe the energy storage and management system model. In Section III, we formulate the ESM stochastic control optimization problem within a finite period. In Section IV, we develop our real-time energy control algorithm. In Section V, we analyze the performance of our algorithm. Simulation results are provided in Section VI, followed by the conclusion in Section VII.
II System Model
We consider a residential-side energy storage and management (ESM) system as shown in Fig. 1. The system contains an energy storage battery which is connected to an on-site renewable generator (RG) and the conventional grid (CG). Energy can be charged into the battery from both the RG and the CG, discharged from the battery to supply the customer’s electricity demand, or sold back to the CG. Both the RG and the CG can also directly supply energy to the customer. We assume the ESM system operates in discrete time slots with , and all energy control operations are performed per time slot .
II-1 RG
Let be the amount of energy harvested from the RG at time slot . Due to the uncertainty of the renewable source, is random, and we assume no prior knowledge of or its statistics. Let be the customer’s demand at time slot . We assume a priority of using to directly supply . Let be this portion of at time slot . We have . A controller will determine whether the remaining portion, if any, should be stored into the battery and/or sold back to the CG. We denote the stored amount and sold amount by and , respectively, satisfying
(1) 
II-2 CG
The customer can buy energy from or sell energy to the CG at the real-time unit buying price and selling price , respectively. Both and are known at time slot . To avoid energy arbitrage, the buying price is strictly greater than the selling price at any time, i.e., . Let denote the amount of energy bought from the CG at time slot , bounded by
(2) 
where is the maximum amount of energy that can be bought per time slot. Let denote the portion of that is stored into the battery. The remaining portion directly supplies the customer’s demand. Let be the amount of energy from the battery that is sold back to the CG. The total energy sold back from the battery and the RG is bounded by
(3) 
where is the maximum amount of energy that can be sold back to the CG (this amount may be regulated by the utility).
Note that while from the RG can be sold back to the CG at any time, energy buying from or selling to the CG should not happen at the same time to avoid energy arbitrage, which is ensured by the constraint . With this constraint, we expect the following condition to be satisfied
(4) 
We will verify that our proposed algorithm satisfies (4) in Section V.
II-3 Battery Storage Operation
Storage management: The battery can be charged from multiple sources (i.e., the CG and the RG) at the same time. The total charging amount at time slot should satisfy
(5) 
where is the maximum charging amount per time slot. Similarly, energy stored in the battery can be used to either supply the customer’s demand and/or sell back to the CG. Let denote the discharging energy amount to supply the customer at time slot . The total discharging amount is bounded by
(6) 
where is the maximum discharging amount per time slot. We assume that there is no simultaneous charging and discharging, i.e.,
(7) 
Let be the battery energy level at time slot , bounded by
(8) 
where and are the minimum energy required and maximum energy allowed in the battery, respectively, whose values depend on the battery type and capacity. Based on charging and discharging activities, evolves over time as
(9) 
Finally, by the demand-and-supply balance requirement, we need to satisfy
(10) 
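As a rough illustration of the battery constraints (5)-(9), the following Python sketch checks a proposed set of per-slot charging and discharging amounts and returns the next battery level. All names (`BatteryLimits`, `step_battery`, and the argument names) are hypothetical, and the update rule assumes that the battery level changes by the total charged amount minus the total discharged amount, with no conversion losses.

```python
from dataclasses import dataclass


@dataclass
class BatteryLimits:
    b_min: float  # minimum energy required in the battery
    b_max: float  # maximum energy allowed in the battery
    r_max: float  # maximum charging amount per slot
    d_max: float  # maximum discharging amount per slot


def step_battery(level, charge_cg, charge_rg, discharge_load, discharge_sell,
                 lim, tol=1e-9):
    """Validate one slot of battery operation and return the next level.

    Assumption: next level = level + total charge - total discharge.
    """
    amounts = (charge_cg, charge_rg, discharge_load, discharge_sell)
    if min(amounts) < 0:
        raise ValueError("energy amounts must be non-negative")
    charge = charge_cg + charge_rg               # charged from the CG and the RG
    discharge = discharge_load + discharge_sell  # to the load and sold back
    if charge > lim.r_max + tol:
        raise ValueError("charging limit exceeded")
    if discharge > lim.d_max + tol:
        raise ValueError("discharging limit exceeded")
    if charge > tol and discharge > tol:
        raise ValueError("simultaneous charging and discharging")
    new_level = level + charge - discharge
    if not (lim.b_min - tol <= new_level <= lim.b_max + tol):
        raise ValueError("battery level out of range")
    return new_level
```

For instance, with hypothetical limits `BatteryLimits(1.0, 10.0, 2.0, 2.0)`, charging 1.0 from the CG and 0.5 from the RG moves a level of 5.0 to 6.5, while any attempt to charge and discharge in the same slot is rejected.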
Battery degradation cost: It is well known that frequent charging/discharging activities cause a battery to degrade [30]. We model two types of battery degradation cost: entry cost and usage cost. The entry cost is a fixed cost incurred due to each charging or discharging activity. Define two indicator functions to represent charging and discharging activities: and . Denote the entry cost for charging by and that for discharging by . The entry cost for battery usage at time slot is given by .
The battery usage cost is the cost associated with the charging/discharging amount. Denote the net change of battery energy level at time slot by . From (5) and (6), it follows that is bounded by
(11) 
It is known that faster charging/discharging, i.e., larger , typically has a more detrimental effect on the lifetime of the battery. Thus, we assume the associated cost function for usage , denoted by , is a continuous, convex, non-decreasing function with maximum derivative .
III ESM with Bidirectional Energy Control: Problem Formulation
For the ESM system, the system cost includes the energy purchasing cost minus selling profit and the battery degradation cost. Within a predefined slot time period, the average net cost of energy purchasing and selling over the CG is given by . For the battery operation, the average entry cost and average net change over the slot period are respectively given by
(12) 
where by (11), is bounded by
(13) 
and the battery average usage cost is . Thus, the average battery degradation cost over the slot period is .
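To make the cost structure concrete, the sketch below computes the average net system cost over a period from per-slot sequences: the average purchasing cost minus selling profit, the average entry cost, and the usage cost evaluated at the time-averaged net change. The function and argument names are hypothetical, the net change per slot is assumed to be the absolute difference between the charged and discharged amounts, and `usage_cost` is a placeholder for any convex, non-decreasing function.

```python
def average_system_cost(buy, sell, price_buy, price_sell,
                        charge, discharge,
                        entry_cost_charge, entry_cost_discharge,
                        usage_cost):
    """Average net system cost over the slots covered by the input sequences.

    buy, sell: energy bought from / sold to the grid per slot
    charge, discharge: total battery charging / discharging amount per slot
    usage_cost: callable applied to the time-averaged net change (assumed form)
    """
    n = len(buy)
    # Average cost of buying minus profit from selling.
    energy = sum(pb * e - ps * s for pb, e, ps, s
                 in zip(price_buy, buy, price_sell, sell)) / n
    # Average entry cost: a fixed cost whenever charging or discharging occurs.
    entry = sum(entry_cost_charge * (c > 0) + entry_cost_discharge * (d > 0)
                for c, d in zip(charge, discharge)) / n
    # Time-averaged net change of the battery level (assumed |charge - discharge|).
    avg_net_change = sum(abs(c - d) for c, d in zip(charge, discharge)) / n
    return energy + entry + usage_cost(avg_net_change)
```

Note that the usage cost here is applied once to the averaged net change rather than averaged over per-slot costs, which mirrors the average usage cost defined above.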
Denote the system inputs by and the control actions for energy storage management by at time slot . With only current known, our goal is to determine a control policy for (i.e., a mapping ) to minimize the average system cost within the slot period. This stochastic control optimization problem is formulated by
P1:  
(14)  
(15) 
Note that constraints (14) and (15) are derived from constraints (5)-(9). P1 is a difficult stochastic control optimization problem due to the finite time period and the correlated control actions over time, which result from the time-coupling dynamics of in (9). In the following, we develop special techniques to overcome these difficulties and obtain a real-time control solution. Specifically, we first apply a sequence of modifications and reformulations to P1, and then we propose a real-time control algorithm to solve the resulting energy management optimization problem.
III-A Problem Modification
To make P1 tractable, we first remove the coupling of control actions over time by modifying the constraints (14) and (15) on the per-slot charging and discharging amounts. From (9), we set the change of over the slot period to be a desired value as
(16) 
where by (5), (6), and (8), the range of is given by , with . We point out that is only a desired value that we set. It may not be achieved at the end of the slot period by a control algorithm. In Section V, we will quantify the amount of mismatch with respect to under our proposed control algorithm. We now modify P1 to the following optimization problem
P2:  
From P1 to P2, by imposing the new constraint (16), we remove the dependency of the per-slot charging/discharging amount on in constraints (14) and (15), and replace them by (5) and (6), respectively.
III-B Problem Transformation
The objective of P2 contains which is a cost function of a time-averaged net change . Directly dealing with such a function is difficult. Adopting the technique introduced in [31], we transform the problem to one that contains the time-averaged function. To do so, we introduce an auxiliary variable and its time average satisfying
(17)  
(18) 
These constraints ensure that the auxiliary variable and lie in the same range and have the same time-averaged behavior. Define as the time average of . By using instead of , we replace constraint (13) with (17) and (18), and transform P2 into the following problem in which we determine a control policy for that minimizes the slot time average of system cost
P3:  
It can be shown that P2 and P3 are equivalent problems (see Appendix A). The modification and transformation from P1 to P3 enable us to utilize Lyapunov optimization techniques [26] to design a real-time control policy to solve P3. We will then design the control parameters that we introduce in the policy to ensure that the solution for P3 is also feasible to the original problem P1.
IV Real-Time ESM Control Algorithm
Based on the Lyapunov framework, for time-averaged constraints (16) and (18), we introduce two virtual queues and respectively as
(19)  
(20) 
From (9) and (19), and have the following relation
(21) 
where in which is a constant shift and ensures that the left-hand-side equality in (16) is satisfied. We will revisit the value of to ensure a feasible solution for P1.
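The role of a virtual queue for a time-averaged constraint such as (16) can be illustrated with a short Python sketch. It assumes the standard virtual-queue form used in Lyapunov optimization, in which the queue accumulates the per-slot net battery change minus an equal per-slot share of the desired total change; this form and all names (`update_virtual_queue`, `delta`, `horizon`) are assumptions for illustration rather than the exact definitions in (19)-(21).

```python
def update_virtual_queue(z, charge, discharge, delta, horizon):
    """One-slot update of a virtual queue tracking the desired net change.

    Assumed form: z_next = z + charge - discharge - delta / horizon, where
    delta is the desired change in battery level over `horizon` slots.
    """
    return z + charge - discharge - delta / horizon


def run_queue(charges, discharges, delta, z0=0.0):
    """Run the queue over a full period and return its final value."""
    horizon = len(charges)
    z = z0
    for c, d in zip(charges, discharges):
        z = update_virtual_queue(z, c, d, delta, horizon)
    return z
```

Under this assumed update, the queue value after the full period equals its initial value plus the actual net battery change minus `delta`. A queue that stays bounded therefore indicates that the actual change tracks the desired one, and the queue differs from the battery level only by a time-dependent shift.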
Define . Define the quadratic Lyapunov function . Divide slots into subframes of slot duration as , for . We define a one-slot sample path Lyapunov drift as , which only depends on the current system inputs . Instead of the system cost considered in the objective in P3, we define a drift-plus-cost metric which is a weighted sum of the drift and the system cost at current time slot , given by
(22) 
where constant sets the relative weight between the drift and the system cost. Directly using the drift-plus-cost metric to design a control policy is still difficult. Instead, we present an upper bound on the drift , which will be used for designing our real-time control algorithm.
Lemma 1
Lyapunov drift is upper bounded by
(23) 
where and
Proof:
See Appendix B. \qed
By Lemma 1, we have an upper bound on the per-slot drift-plus-cost metric in (22). In the following, we propose a real-time control algorithm that minimizes this upper bound on the drift-plus-cost metric per slot.
Removing all the constant terms in the upper bound independent of and , we have the equivalent optimization problem which can be further separated into two subproblems for and , respectively, as follows
First, we solve to obtain the optimal solution . Note that is convex because is convex. Thus, we can directly solve it and obtain the optimal of .
Lemma 2
The optimal solution of is given by
(24) 
where , is the first derivative of , and the inverse function of .
Proof:
See Appendix C. \qed
Next, we obtain the optimal of and provide the conditions under which is feasible to P1.
IV-A The Optimal for
Define the objective function of as . Define the idle state of the battery as the state where there is no charging or discharging activity. The control decision in the idle state is given by , where , , and . Then, in the idle state, we have . Define . We derive the optimal control decision in five cases, given in Proposition 1 below.
Proposition 1
Define as follows: If : , ; Otherwise, , .
Denote as the control decision in the charging or discharging state. The optimal control solution for is given in the following cases:

For : The battery is either in the charging state or the idle state. Let
(25) If , then ; Otherwise, .

For : The battery is either in the charging state (from the RG only) or the discharging state (to the customer’s load only). Let
(26) If , then ; Otherwise, .

For : The battery is either in the charging state (from the RG only) or the discharging state. Define in the discharging state as
(27) Define in the charging state as
(28) Then, . If , then ; Otherwise, .

For and : The battery is in the discharging state (to the customer’s load only). Let
(29) If , then ; Otherwise, .

For : The battery is either in the discharging state or the idle state. If , let
(30) Otherwise, let
(31) If , then ; Otherwise, .
Proof:
See Appendix D. \qed
Proposition 1 provides the closed-form control solution in five cases, depending on the battery energy level (via ), battery usage cost (via ), and the prices. In each case, in the charging (or discharging) state is compared with in the idle state, and the optimal is the control solution of the state with the minimum objective value.
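The final selection step in each case of Proposition 1 can be sketched as follows. This only illustrates the comparison between the active (charging or discharging) candidate and the idle candidate; the candidate decisions themselves come from the closed-form expressions in the proposition, and `objective` is a hypothetical stand-in for the subproblem's objective function.

```python
def select_control(active, idle, objective):
    """Return the candidate control decision with the smaller objective value.

    active: candidate decision for the charging or discharging state
    idle: candidate decision for the idle state
    objective: callable mapping a decision to its objective value (assumed)
    The active candidate is chosen only if it is strictly better than idle.
    """
    return active if objective(active) < objective(idle) else idle
```

Choosing the idle state on ties is a design choice in this sketch: it avoids a charging or discharging activity, and hence its entry cost, when the activity brings no strict improvement.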
Note that there are two sources to be controlled for selling energy back: the battery and the RG. Whether to sell energy from the battery back to the grid depends on the battery energy level. When the battery energy level is low (Case 1), energy is kept in the battery. When the battery has a moderately low energy level (Case 2), it may be in either the charging or discharging state. For the latter, the battery only supplies energy to the customer and does not sell energy back. When the battery energy level is higher but still moderate (Case 3), it may still be in either the charging or discharging state. For the latter, the battery may sell energy back to the grid. When the battery has just sufficient energy (Case 4), it may supply energy to the customer, but will not sell energy back to the grid. When the energy level in the battery is high (Case 5), it may supply energy to the customer and at the same time sell energy back. In contrast, the renewable energy can be sold to the grid regardless of the battery energy level, the battery state (charging, discharging, or idle), and the price, to make an additional profit. As a result, energy generated by the renewable source will be utilized as much as possible. However, when the system wants to sell energy from both the battery and the renewable source, the order in which and are determined depends on which results in the minimum cost in . In Case 5, for the control decision in (30), is determined after , while in (31), is determined after .
IV-B Feasible for P1
The optimal solution of provides a real-time solution for P3. However, it may not be feasible to P1, because the battery capacity constraint (8) on may be violated. By properly designing and , we can guarantee that satisfies constraint (8), and ensure the feasibility of the solution. Define . The result is stated below.
Proposition 2
Proof:
We provide a brief outline of our proof and leave the details in Appendix E. Using the solutions and of and , respectively, we can show that both and are upper and lower bounded. Then, by applying these bounds to (21) and using the battery capacity constraint , we obtain as the minimum value that can be achieved with a given value of . With obtained, we derive the upper bound of , i.e., , to ensure that is satisfied. \qed
Note that in (33) is generally satisfied for practical battery storage capacity and being set relatively small. We should also point out that since is a desired value set by our proposed algorithm, the solution of may not necessarily satisfy constraint (16) at the end of the slot period, and thus may not be feasible to P2. However, Proposition 2 provides the values of and to guarantee the control solutions being feasible to P1.
We summarize the proposed real-time control algorithm in Algorithm 1. We emphasize that the proposed algorithm does not rely on any statistical assumption on the prices, demand, and renewable processes , and thus can be applied to general scenarios, especially when these processes are non-ergodic or difficult to predict in a highly dynamic environment.
V Performance Analysis
To analyze the performance of our real-time solution in Algorithm 1 with respect to P1, let denote the slot average system cost objective of P1 achieved by Algorithm 1, which depends on the value of set by Algorithm 1. For comparison, we partition slots into frames with , for some integers . Within each frame , we consider a slot look-ahead optimal control policy, where are known non-causally for the entire frame beforehand. Let denote the minimum slot average cost for frame achieved by this optimal policy. We can view as the minimum objective value of P1 with under the optimal slot look-ahead solution. The performance gap of our proposed real-time algorithm to the optimal slot look-ahead policy is bounded in the following theorem.
Theorem 1
For any arbitrary system inputs , and any with , the slot average system cost under Algorithm 1 to that under the optimal slot lookahead policy satisfies
(34) 
with the bound on the right-hand side being finite.
Asymptotically as ,
(35) 
Proof:
See Appendix F. \qed
By Theorem 1, the performance gap of Algorithm 1 to the slot look-ahead optimal policy is upper bounded in (34), for any with . To minimize the gap, we should always set . From (35), as the duration goes to infinity, the asymptotic gap is on the order of . Since increases with , when , Algorithm 1 is asymptotically equivalent to the slot look-ahead optimal policy as the battery capacity and time duration increase.
As discussed at the end of Section IV-B, constraint (16) in P2 sets a desired value for which may not be achieved by our proposed algorithm at the end of slots. Denote this mismatch under Algorithm 1 by . This mismatch is quantified below.
Proposition 3
Proof:
See Appendix G. \qed
Finally, we expect constraint (4) to be satisfied by Algorithm 1, i.e., buying energy () and selling back from battery storage () should not occur at the same time. This is verified in the following result.
Proposition 4
Proof:
See Appendix H. \qed
VI Simulation Results
We set the slot duration to be 5 minutes, and assume that system input remains unchanged within each slot. We set the buying price using the data collected from Ontario Energy Board [32]. As shown Fig. 4 top, follows a threestage price pattern repeated each day as . We use solar energy for the RG to generate . As a result, is a nonergodic process, with the mean changing periodically over hours. As shown in Fig. 4 middle, we model by a threestage pattern as kWh and set standard deviation , for . We also model the load as a nonergodic process with mean following a threestage pattern over each day as kWh, shown in Fig. 4 bottom, and set standard deviation