Optimal control of storage for arbitrage, with applications to energy systems
Abstract
We study the optimal control of storage which is used for arbitrage, i.e. for buying a commodity when it is cheap and selling it when it is expensive. Our particular concern is with the management of energy systems, although the results are generally applicable. We consider a model which may account for nonlinear cost functions, market impact, input and output rate constraints and inefficiencies or losses in the storage process. We develop an algorithm which is maximally efficient in the sense that it incorporates the result that, at each point in time, the optimal management decision depends only on a finite, and typically short, time horizon. We give examples related to the management of a real-world system.
1 Introduction
How should one optimally control a store which is used to make money by buying a commodity when it is cheap, and selling it when it is expensive? We are interested in this question primarily in the context of electrical energy systems, where a store may, for example, take in energy at night, when there may be a surplus of supply over demand rendering the excess energy cheap, and release that energy during the day. However, the mathematics we develop is of course more generally applicable.
A major constraint on the operation of electrical energy systems (for example, the UK national grid or similar systems in other countries or continents) is that supply and demand need to be kept in very close balance at all times. It has always been the case that electricity demand is highly variable, notably on daily, weekly and annual cycles, although this variation is in general at least predictable. However, the increasing reliance on renewable sources of generation such as wind and solar power is now introducing both variability and unpredictability in electricity supplies. In order to assist in keeping supply and demand well balanced it is useful to be able to shift electrical energy through time. The most obvious way to do this is through storage, which rearranges the profile in time of energy supply. However, the profile in time of demand may also be rearranged through what is generally referred to as demand-side management, and it should be noted that the postponement of demand is mathematically equivalent to the use of negative storage, although the practical difficulties with demand-side management are somewhat different.
Storage may assist in a large number of ways, most notably:

in shifting energy from times of low demand, when its generation is typically cheap, to times of high demand, when its generation is typically expensive;

in stabilising the system with respect to small and transient imbalances;

in reacting to major disturbances, such as sudden loss of generation, transmission failures, or sudden surges in demand.
Our interest here is primarily in the first of these—see, for example, [1, 2, 3, 4] for a broader discussion of storage. We take an economic view, and investigate the value of energy storage for arbitrage, that is smoothing price fluctuations over time. Thus it is assumed that energy is always available from somewhere, at a sufficiently high price, and that the value of storage consists of its ability to buy energy when it is cheap and release it when it is expensive. We work here in a deterministic setting in which we assume that all relevant buying and selling prices are known in advance.
We think of the available storage as a single store. Its value is equal to the profit which can be made by a notional store “owner” buying and selling as above. In the case where the activities of the store are sufficiently significant as to have a market impact (the store becomes a “price maker”), the system or societal value of the store may be similarly calculated by adjusting the notional buying and selling prices so that the store “owner” is required to bear also the external costs of the store’s activities. Thus our framework is in this respect completely general. We also allow for nonlinear cost functions, for differences in buying and selling prices, for inefficiencies in the storage process, and for input and output rate constraints.
In Section 2 we formally define the relevant mathematical problem and characterise mathematically its optimal solution. In Section 3 we provide an algorithm for its solution, which is efficient in the sense we there explain. In particular the decisions to be made at each point in time typically depend only on a very short future horizon—which is identifiable, but not determined in advance. In Section 4 we give examples based on real data and a real pumped storage facility. Finally, in Section 5 we outline some extensions and give concluding remarks.
2 Problem formulation and characterisation of solution
We work in discrete time, which we take to be integer. We assume that the store has total capacity of $E$ units of energy, and input and output rate constraints of $P_i$ and $P_o$ units of power respectively. We consider also a time-independent (in)efficiency $\eta$ associated with the store. This may be defined as the fraction of energy input which is available for output. It may be captured in our model either by adjusting buy prices by a factor $1/\eta$ or by multiplying sell prices by $\eta$ (the values of $E$ in the two cases differing by a factor of $\eta$). Hence, without loss of generality, we take $\eta = 1$ throughout in our mathematical formulation of the problem and its solution. (A further type of (in)efficiency, which may be regarded as leakage over time, is considered in Section 5.)
Both buying and selling prices at time $t$ may conveniently be represented by a function $C_t(x)$ with $C_t(0) = 0$, which is increasing and convex in $x$ and which, for positive $x$, is the price of buying $x$ units of energy, and, for negative $x$, is the negative of the price for selling $-x$ units of energy. Thus the cost of increasing the level of energy in the store by $x$, positive or negative, is always $C_t(x)$. The convexity assumption corresponds, for each time $t$, to an increasing cost to the store of buying each additional unit of energy, a decreasing revenue obtained for selling each additional unit of energy, and every unit buying price being at least as great as every unit selling price.
As indicated above, if the problem is to determine the value of the store to the entire energy system, or to society, then these prices are taken to be those appropriate to the system or societal costs. Thus, for example, for $x$ positive, $C_t(x)$ is the price paid by the store at time $t$ for $x$ units of energy plus the increased cost paid by other energy users at that time as a result of the store's purchase increasing market prices.
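A special case is that of a "small" store, whose operations do not influence the market (the store is a "price-taker" rather than a "price-maker"), and which at time $t$ buys and sells energy at given prices per unit of $c_t^{(b)}$ and $c_t^{(s)}$ respectively, where we assume that $c_t^{(s)} \le c_t^{(b)}$. Here the function $C_t$ is given by
$$C_t(x) = \begin{cases} c_t^{(b)} x & \text{if } x \ge 0, \\ c_t^{(s)} x & \text{if } x < 0. \end{cases} \qquad (1)$$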
Finally, we assume that all prices are known in advance, so that the problem of controlling the store is deterministic.
Consider the problem of controlling the store so as to maximise profit over a time interval $[0, T]$. Note that the rate constraints may, if we choose, be absorbed into the cost function, for example by defining, for each $t$,
$$C_t(x) = \infty \quad \text{for } x > P_i \text{ and for } x < -P_o$$
(formally this defines an extension of the range of the cost functions to include the point at $\infty$ but it is readily verified that this causes no problems). For simplicity in the presentation of the theory below we assume this to have been done; thus see Figure 1 for an illustration of a typical cost function.
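As an illustration of absorbing the rate constraints into the cost, the following minimal sketch implements the piecewise-linear "small store" cost with an infinite penalty outside the admissible input and output rates. The function name `small_store_cost` and its parameter names are hypothetical and chosen here only for illustration.

```python
import math


def small_store_cost(x, buy_price, sell_price, p_in=math.inf, p_out=math.inf):
    """Cost of changing the store level by x (x > 0 buys, x < 0 sells).

    buy_price and sell_price are per-unit prices at one time step; p_in and
    p_out are the input and output rate limits (hypothetical parameter names).
    """
    if sell_price > buy_price:
        raise ValueError("convexity requires sell_price <= buy_price")
    if x > p_in or x < -p_out:
        return math.inf  # rate constraint absorbed into the cost function
    return buy_price * x if x >= 0 else sell_price * x


print(small_store_cost(2.0, 50.0, 40.0))   # 100.0: pay to buy 2 units
print(small_store_cost(-2.0, 50.0, 40.0))  # -80.0: revenue from selling 2 units
```

Because the sell price does not exceed the buy price, the resulting function is convex, which is the property the theory below relies on.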
Denote the successive levels of the store by a vector $S = (S_0, \dots, S_T)$ where $S_t$ is the level of the store at each successive time $t$. Define also the vector $x(S) = (x_1(S), \dots, x_T(S))$ by $x_t(S) = S_t - S_{t-1}$ for each $t \ge 1$, so that $x_t(S)$ represents the energy added to the store at time $t$. It is convenient to assume that both the initial level and the final level of the store are fixed in advance at $S_0 = S^*_0$ and $S_T = S^*_T$. (If the final level is not fixed and the cost function is strictly increasing, then, for an optimal control, we may take $S_T$ to be minimised; however, we might, for example, require $S_T = S_0$ as a contribution to a toroidal solution.)
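The problem thus becomes:

$\mathbf{P}$: choose $S$ so as to minimise
$$\sum_{t=1}^{T} C_t(x_t(S)) \qquad (2)$$
with $S_0 = S^*_0$ and $S_T = S^*_T$, and subject to the capacity constraints
$$0 \le S_t \le E, \qquad 1 \le t \le T - 1. \qquad (3)$$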
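For very small instances the problem can be checked by brute force. The sketch below is not the algorithm developed in this paper: it is a plain dynamic programme over integer store levels, included only as a reference against which other solutions can be compared. The names `make_cost` and `solve_by_dp`, the integer grid, and the example prices are all assumptions made for illustration.

```python
import math


def make_cost(buy, sell, rate):
    """Small-store cost for one time step, with a common rate limit absorbed."""
    def cost(x):
        if abs(x) > rate:
            return math.inf
        return buy * x if x >= 0 else sell * x
    return cost


def solve_by_dp(costs, capacity, s0, s_final):
    """Minimise the summed cost over integer levels 0..capacity.

    costs[t - 1] is the cost function for time t. Returns (total, levels).
    """
    T = len(costs)
    best = {s0: (0.0, [s0])}  # level -> (cost so far, path of levels)
    for t in range(1, T + 1):
        nxt = {}
        targets = [s_final] if t == T else range(capacity + 1)
        for s_new in targets:
            for s_old, (val, path) in best.items():
                c = val + costs[t - 1](s_new - s_old)
                if c < nxt.get(s_new, (math.inf,))[0]:
                    nxt[s_new] = (c, path + [s_new])
        best = nxt
    return best[s_final]


buy = [10, 30, 10, 40]
sell = [8, 24, 8, 32]
costs = [make_cost(b, s, 1) for b, s in zip(buy, sell)]
total, levels = solve_by_dp(costs, capacity=1, s0=0, s_final=0)
print(total, levels)  # -36.0 [0, 1, 0, 1, 0]
```

The work here grows with the square of the number of grid levels at every time step, which is one motivation for the local-in-time construction that follows.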
In the case where the cost functions are linear, or piecewise linear, as in the “small store” case given by (1), and in which rate constraints may be present, the problem may be reformulated as a linear programming problem, and solved by, for example, the use of the minimum cost circulation algorithm (see, for example, [5, 7]). Our aim in the present paper is to deal with the general case, and to develop an algorithm which proceeds locally in time, providing a solution which is efficient both in the general and in the linear case.
In Theorem 1 below we use strong Lagrangian theory [5, 6] to give sufficient conditions for a value of $S$ to solve the problem $\mathbf{P}$. We then discuss briefly how Lagrangian theory may be used to show that a solution of the form given always exists. The truth of this latter assertion is also demonstrated in Section 3 when we consider an algorithm for the determination of the solution.
Theorem 1.
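Suppose that there exists a vector $\mu^* = (\mu^*_1, \dots, \mu^*_T)$ and a value $S^* = (S^*_0, \dots, S^*_T)$ of $S$ such that

(i) $S^*$ is feasible for the stated problem,

(ii) for each $t$ with $1 \le t \le T$, $x_t(S^*)$ minimises $C_t(x) - \mu^*_t x$ over all $x$,

(iii) the pair $(S^*, \mu^*)$ satisfies the complementary slackness conditions, for $1 \le t \le T - 1$,
$$\mu^*_{t+1} \begin{cases} = \mu^*_t & \text{if } 0 < S^*_t < E, \\ \le \mu^*_t & \text{if } S^*_t = 0, \\ \ge \mu^*_t & \text{if } S^*_t = E. \end{cases} \qquad (4)$$
Then $S^*$ solves the stated problem $\mathbf{P}$.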
Proof.
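Let $S$ be any vector which is feasible for the problem $\mathbf{P}$ (with $S_0 = S^*_0$ and $S_T = S^*_T$). Then, from the condition (ii),
$$\sum_{t=1}^{T} \left[ C_t(x_t(S^*)) - \mu^*_t x_t(S^*) \right] \le \sum_{t=1}^{T} \left[ C_t(x_t(S)) - \mu^*_t x_t(S) \right].$$
Rearranging and recalling that $S$ and $S^*$ agree at $0$ and at $T$, we have
$$\sum_{t=1}^{T} C_t(x_t(S^*)) - \sum_{t=1}^{T} C_t(x_t(S)) \le \sum_{t=1}^{T} \mu^*_t \left( x_t(S^*) - x_t(S) \right) = \sum_{t=1}^{T-1} (\mu^*_t - \mu^*_{t+1})(S^*_t - S_t) \le 0, \qquad (5)$$
by the condition (iii), so that the result follows. ∎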
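The three conditions of Theorem 1 are easy to test numerically for a candidate solution. The following is a minimal sketch for the piecewise-linear "small store" cost with a common rate limit; the function name `satisfies_theorem1`, the tolerance, and the example data are assumptions for illustration. It relies on the fact that, for this cost, the per-step objective is convex and piecewise linear with kinks only at zero and at the rate limits, so its minimum over the admissible range is attained at one of those three points.

```python
def satisfies_theorem1(S, mu, buy, sell, capacity, rate, tol=1e-9):
    """Check conditions (i)-(iii) for levels S[0..T] and values mu[0..T-1].

    mu[t - 1], buy[t - 1], sell[t - 1] refer to time t (hypothetical layout).
    """
    T = len(S) - 1

    def objective(t, x):
        cost = buy[t - 1] * x if x >= 0 else sell[t - 1] * x
        return cost - mu[t - 1] * x

    # (i) feasibility: capacity and rate constraints
    if any(s < -tol or s > capacity + tol for s in S):
        return False
    for t in range(1, T + 1):
        x = S[t] - S[t - 1]
        if abs(x) > rate + tol:
            return False
        # (ii) x minimises cost minus mu * x; the minimum is at -rate, 0 or rate
        if objective(t, x) > min(objective(t, c) for c in (-rate, 0.0, rate)) + tol:
            return False
    # (iii) complementary slackness at times 1..T-1
    for t in range(1, T):
        change = mu[t] - mu[t - 1]  # value at time t+1 minus value at time t
        empty, full = S[t] <= tol, S[t] >= capacity - tol
        if not empty and not full and abs(change) > tol:
            return False
        if empty and not full and change > tol:
            return False
        if full and not empty and change < -tol:
            return False
    return True


buy = [10, 30, 10, 40]
sell = [8, 24, 8, 32]
print(satisfies_theorem1([0, 1, 0, 1, 0], [20, 20, 20, 20], buy, sell, 1, 1))  # True
print(satisfies_theorem1([0, 1, 1, 1, 0], [20, 20, 20, 20], buy, sell, 1, 1))  # False
```

In the second call the store holds its level at time 2 even though the reference value of 20 lies below the sell price of 24, so condition (ii) fails there.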
As previously discussed, Theorem 1 gives a sufficient condition for a pair $(S^*, \mu^*)$ to solve the problem $\mathbf{P}$. As is clear from the condition (ii) of the theorem, the vector $\mu^*$ has the interpretation that, at each time $t$, the quantity $\mu^*_t$ may be regarded as a notional value per unit of energy in store, and may be used to determine how much further energy to buy or sell at that time. An optimal solution to the problem is given by keeping this reference value as constant as possible over time (for otherwise a "solution" may be improved by using a more consistent value of the vector $\mu^*$); the exceptions occur at the boundaries $0$ and $E$ of the capacity constraint region, where the above improvements may not be possible and where $\mu^*_t$ is allowed to decrease immediately subsequent to those times when the store is empty and to increase immediately subsequent to those times when it is full.
An examination of the relevant strong Lagrangian theory (again see [5, 6]) shows that the vector has a representation as
where each and are (strong) Lagrange multipliers associated with respectively the lower bound and upper bound of the capacity constraint at time . The standard convexity condition of the supporting hyperplane theorem shows that here a sufficient condition for the existence of such Lagrange multipliers is given by the assumed convexity of the cost functions , and this in its turn is sufficient for the existence of a pair as in Theorem 1. We do not give a formal proof of this assertion here; rather the algorithm given in the following section constructs such a pair directly.
3 Algorithm
We now give an explicit construction of a pair $(S^*, \mu^*)$ as in Theorem 1, and hence also an algorithm for the solution of the problem. The algorithm below may briefly be described as that of attempting to choose $(S^*, \mu^*)$ so as to satisfy the conditions of Theorem 1, by choosing the components of these vectors successively in time and by keeping $\mu^*_t$ as constant as possible over $t$, changes only being allowed at times when the store is either empty or full.
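For further simplicity, we suppose first that the cost functions $C_t$ are all strictly convex. For any $t$ such that $1 \le t \le T$ and any (scalar) $\mu$, define $x^*_t(\mu)$ to be the unique value of $x$ which minimises $C_t(x) - \mu x$. Note that $x^*_t(\mu)$ is then continuous and increasing (though not necessarily strictly so) in $\mu$. Define a sequence of times $0 = T_0 < T_1 < \dots < T_k = T$ and the pair $(S^*, \mu^*)$ inductively as follows. Suppose that $T_i < T$ is such that $T_i$, together with $S^*_t$ and $\mu^*_t$ for all $t \le T_i$, are all defined. For each (scalar) $\mu$, define a vector $S(\mu) = (S_{T_i}(\mu), \dots, S_T(\mu))$ by
$$S_{T_i}(\mu) = S^*_{T_i}, \qquad S_t(\mu) = S_{t-1}(\mu) + x^*_t(\mu), \quad T_i < t \le T. \qquad (6)$$
Define the sets
$$M_- = \{\mu : \text{there exists } \tau \text{ with } T_i < \tau \le T \text{ such that } a_t \le S_t(\mu) \le b_t \text{ for } T_i < t < \tau \text{ and } S_\tau(\mu) < a_\tau\}$$
and
$$M_+ = \{\mu : \text{there exists } \tau \text{ with } T_i < \tau \le T \text{ such that } a_t \le S_t(\mu) \le b_t \text{ for } T_i < t < \tau \text{ and } S_\tau(\mu) > b_\tau\},$$
where $(a_t, b_t) = (0, E)$ for $t < T$ and $a_T = b_T = S^*_T$, so that the required final level is treated as the constraint at time $T$.
Thus $M_-$ and $M_+$ are the sets of $\mu$ for which $S(\mu)$ violates one of the capacity constraints and first does so respectively below or above, in either case at a time which we denote by $\tau(\mu)$. Note that, since each $x^*_t(\mu)$ is increasing in $\mu$, we have $\mu < \mu'$ for all $\mu \in M_-$, $\mu' \in M_+$. In particular the sets $M_-$ and $M_+$ are disjoint. Note also that since, for all $t$, $x^*_t(\mu)$ decreases to its least possible value as $\mu \to -\infty$, the set $M_-$ is nonempty. Let $\bar\mu = \sup M_-$. We now consider the behaviour of $S(\bar\mu)$, for which there are three possibilities:

(a) the vector $S(\bar\mu)$ is feasible; in this case we take $T_{i+1} = T$ and $S^*_t = S_t(\bar\mu)$ with $\mu^*_t = \bar\mu$ for $T_i < t \le T$ (thus also $x_t(S^*) = x^*_t(\bar\mu)$ for all $T_i < t \le T$);

(b) the value $\bar\mu$ belongs to the set $M_-$; here there necessarily exists at least one $t$ with $T_i < t < \tau(\bar\mu)$ such that $S_t(\bar\mu) = E$ (for otherwise, by the continuity of each $x^*_t(\mu)$ in $\mu$, $\mu$ could be increased above $\bar\mu$ while still belonging to the set $M_-$); define $T_{i+1}$ to be any such $t$, say the largest, and (again) take $S^*_t = S_t(\bar\mu)$ and $\mu^*_t = \bar\mu$ for all $t$ such that $T_i < t \le T_{i+1}$; note also that we then have $S^*_{T_{i+1}} = E$ so that we shall necessarily have $\mu^*_{T_{i+1}+1} \ge \mu^*_{T_{i+1}}$;

(c) the value $\bar\mu$ belongs to the set $M_+$; here, similarly to the case (b), there necessarily exists at least one $t$ with $T_i < t < \tau(\bar\mu)$ such that $S_t(\bar\mu) = 0$; define $T_{i+1}$ to be any such $t$, again say the largest, and again take $S^*_t = S_t(\bar\mu)$ and $\mu^*_t = \bar\mu$ for all $t$ such that $T_i < t \le T_{i+1}$; further, in this case we have $S^*_{T_{i+1}} = 0$ so that we shall necessarily have $\mu^*_{T_{i+1}+1} \le \mu^*_{T_{i+1}}$.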
In the case where the cost functions $C_t$ are all strictly convex, it now follows immediately from the above construction of the pair $(S^*, \mu^*)$ that this pair satisfies the conditions (i)–(iii) of Theorem 1.
In the case where, for at least some $t$, the cost function $C_t$ is convex, but not necessarily strictly convex, a little extra care is required. Here, for such $t$, the function $x^*_t(\mu)$ is not in general uniquely defined, and, for any given choice, this function is not in general continuous. However, in essence, the above construction of $(S^*, \mu^*)$ continues to hold: it is simply a matter, where necessary, of choosing the right value of $x^*_t(\mu)$.
We summarise our results in Theorem 2 below.
Theorem 2.
The pair $(S^*, \mu^*)$ given by the above recursive construction satisfies the conditions (i)–(iii) of Theorem 1.
The above algorithm requires the determination, at each of the successive times $T_i$, $i \ge 0$, of the succeeding time $T_{i+1}$ and of the common value of $\mu^*_t$ for $T_i < t \le T_{i+1}$. This is done by looking ahead for the minimum time horizon necessary for the above determination; the process then restarts at the time $T_{i+1}$. A lengthening of the total time $T$ over which the optimization is to be performed does not in general change the values of the times $T_i$, but rather simply creates more of them. In this sense both the solution to the problem and the above algorithm are local in time, so that the solution to $\mathbf{P}$ involves computation which grows essentially linearly in $T$. The typical length of the intervals between the successive times $T_i$ depends on the shape of the cost functions $C_t$ (notably the difference between buying and selling prices), together with the rate at which these functions fluctuate in time. This is to be expected as the store operates by selling at prices above those at which it bought, and what is important is the frequency with which such events can occur. For example, such fluctuations may occur in a 24-hour cycle, and, depending on the shape of the cost functions, the typical length of the intervals between the successive times $T_i$ may then be of the order of around 12 hours. These points are illustrated further in the examples of the following section.
We observe also that, for each time $T_i$ as above, the determination of the succeeding time $T_{i+1}$ and of $\bar\mu$ involves some form of search over an interval of the real line and as such may typically only be carried out to a specified degree of precision. This is inevitable given general convex cost functions.
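The forward construction described in this section can be sketched in code for one concrete strictly convex family. The sketch below is an illustration under stated assumptions rather than a definitive implementation: it assumes quadratic costs of the form price times x plus half of k times x squared (a simple market-impact model chosen here for convenience), a common rate limit, and a feasible problem. It treats the required final level as the constraint at the last time step, and locates the supremum of the "first violates below" set by bisection down to adjacent floating-point values. The names `x_star`, `simulate` and `optimal_store` are hypothetical.

```python
def x_star(mu, price, k, rate):
    """Minimiser of price*x + 0.5*k*x**2 - mu*x over -rate <= x <= rate."""
    return max(-rate, min(rate, (mu - price) / k))


def simulate(mu, level, t0, prices, k, rate, cap, s_final):
    """Run the store from time t0 with a constant reference value mu.

    Returns (kind, tau, levels): kind is 'below' or 'above' if a constraint
    is first violated at time tau, else 'ok'. levels[j] is the level at t0 + j.
    """
    T = len(prices)
    levels = [level]
    for t in range(t0 + 1, T + 1):
        level += x_star(mu, prices[t - 1], k, rate)
        levels.append(level)
        low, high = (s_final, s_final) if t == T else (0.0, cap)
        if level < low:
            return "below", t, levels
        if level > high:
            return "above", t, levels
    return "ok", T, levels


def optimal_store(prices, k, rate, cap, s0, s_final):
    """Forward construction of levels and reference values (assumes feasibility)."""
    T = len(prices)
    S, mus, t0 = [float(s0)], [], 0
    while t0 < T:
        args = (S[-1], t0, prices, k, rate, cap, s_final)
        lo = min(prices) - k * rate - 1.0  # every step sells at the full rate
        hi = max(prices) + k * rate + 1.0  # every step buys at the full rate
        for _ in range(200):               # bisection for the sup of the 'below' set
            mid = 0.5 * (lo + hi)
            if simulate(mid, *args)[0] == "below":
                lo = mid
            else:
                hi = mid
        kind_lo, tau_lo, lev_lo = simulate(lo, *args)
        kind_hi, tau_hi, lev_hi = simulate(hi, *args)
        if kind_hi == "ok" or tau_hi == tau_lo:  # feasible through to T
            nxt, mu, lev, snap = T, hi, lev_hi, s_final
        elif tau_hi < tau_lo:                    # store is full at time tau_hi
            nxt, mu, lev, snap = tau_hi, lo, lev_lo, cap
        else:                                    # store is empty at time tau_lo
            nxt, mu, lev, snap = tau_lo, hi, lev_hi, 0.0
        S.extend(lev[1:nxt - t0 + 1])
        S[-1] = float(snap)                      # remove rounding error at the boundary
        mus.extend([mu] * (nxt - t0))
        t0 = nxt
    return S, mus


S, mus = optimal_store([0.0, 4.0, 0.0, 4.0], k=1.0, rate=10.0, cap=1.0, s0=0.0, s_final=0.0)
print([round(s, 6) for s in S])    # [0.0, 1.0, 0.0, 1.0, 0.0]
print([round(m, 6) for m in mus])  # [1.0, 3.0, 1.0, 3.0]
```

In this example the reference value rises after each time at which the store is full and falls after each time at which it is empty, as the complementary slackness conditions allow. The sketch picks the first time at which the boundary is touched rather than the largest, which the construction permits since any such time may be used.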
4 The “small” store
In this section we look further at the case of a "small" store, whose operations do not influence the market, and which at time $t$ buys and sells energy at given prices per unit of $c_t^{(b)}$ and $c_t^{(s)}$ respectively (with $c_t^{(s)} \le c_t^{(b)}$), so that each of the functions $C_t$ is as given by (1). We continue to assume the existence of a rate constraint $P$, which, for mathematical purposes may, as previously observed, be absorbed into the cost functions by appropriately modifying them. We give a number of results for this case, illustrating them with examples based on real-world data.
It follows from the results of the previous section that, given an initial level $S^*_0$ and a final level $S^*_T$ of the store, there exists a pair $(S^*, \mu^*)$ as in Theorem 1 and such that $S^*$ defines the optimal control of the store over the time interval $[0, T]$. One immediate consequence of this is that the optimal control is here bang-bang in the sense that, at each time $t$, the store should either buy as much as possible (subject to the capacity and rate constraints), do nothing, or sell as much as possible, according to whether the current "reference value" of $\mu^*_t$ is above the buy price $c_t^{(b)}$, between $c_t^{(b)}$ and the (lower) sell price $c_t^{(s)}$, or below $c_t^{(s)}$.
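The bang-bang rule just described can be written as a short decision function. This is a minimal sketch in which the function name `bang_bang_action` and its arguments are hypothetical; it leaves the action at zero when the reference value coincides exactly with a price, a case in which the optimal quantity is not determined by the reference value alone.

```python
def bang_bang_action(ref_value, buy_price, sell_price, level, capacity, rate):
    """Energy to add to the store at one time step (negative means sell)."""
    if ref_value > buy_price:
        # buy as much as the rate and remaining capacity allow
        return min(rate, capacity - level)
    if ref_value < sell_price:
        # sell as much as the rate and current level allow
        return -min(rate, level)
    return 0.0  # reference value between the two prices: do nothing


print(bang_bang_action(20, 10, 8, level=0, capacity=1, rate=1))   # 1
print(bang_bang_action(20, 30, 24, level=1, capacity=1, rate=1))  # -1
print(bang_bang_action(9, 10, 8, level=1, capacity=1, rate=1))    # 0.0
```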
Typically we may have $c_t^{(s)} = \eta c_t^{(b)}$ for some factor $\eta \le 1$ which may be interpreted as representing the efficiency of the store. As $\eta$ is decreased below $1$ the set of times at which buying or selling actually takes place is correspondingly reduced, as in the example below.
Now note that, apart from the obvious scale factor, the solution to the problem $\mathbf{P}$ depends on the capacity constraint $E$ and the rate constraint $P$ only through the ratio $E/P$, which has the dimension of time. As the store capacity $E$ is increased (with $P$ held fixed), the time horizon required for the determination of each optimal action becomes longer and the corresponding optimal solution more global in character. When the capacity constraint is never active there is some scalar $\mu$ such that $\mu^*_t = \mu$ for all $t$, so that in an optimal solution, at each time $t$, the store buys if and only if $c_t^{(b)} < \mu$ and sells if and only if $c_t^{(s)} > \mu$. The scalar $\mu$ is such that the final level of the store is as required. In contrast, as the rate constraint $P$ is increased (with $E$ held fixed), the time horizon required for the determination of each optimal action becomes shorter and the corresponding optimal solution more local in character. These results are illustrated in the examples that follow.
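The dependence on the ratio of capacity to rate alone can be seen by a change of units. The following short derivation is a sketch for the piecewise-linear small-store cost, writing $E$ for the capacity and $P$ for the common rate constraint, and introducing the rescaled level $s_t = S_t / P$ (a symbol introduced here for illustration only). It uses the fact that a piecewise-linear cost through the origin satisfies $C_t(Px) = P\,C_t(x)$ for $P > 0$.

```latex
\begin{align*}
&\text{minimise } \sum_{t=1}^{T} C_t(S_t - S_{t-1})
  \quad \text{subject to } 0 \le S_t \le E,\ \ |S_t - S_{t-1}| \le P; \\
&\text{with } S_t = P s_t: \quad
  \sum_{t=1}^{T} C_t\bigl(P (s_t - s_{t-1})\bigr)
  = P \sum_{t=1}^{T} C_t(s_t - s_{t-1}), \\
&\text{subject to } 0 \le s_t \le E/P,\ \ |s_t - s_{t-1}| \le 1 .
\end{align*}
```

The rescaled problem involves $E$ and $P$ only through $E/P$ (with the initial and final levels rescaled in the same way), and the factor $P$ multiplying the objective is the scale factor referred to above.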
We illustrate our methodology with an example storage facility using parameters motivated by the Dinorwig pumped-storage power station in Snowdonia, north Wales (see [8] for a good description of this power station and its uses). (Note, however, that Dinorwig is not currently primarily used for price arbitrage, but rather for the provision of fast response services to the GB energy network.) We use the "small store" cost structure (1), with $c_t^{(s)} = \eta c_t^{(b)}$ for all $t$. A typical value for $\eta$ would be one reflecting the approximate efficiency of the Dinorwig plant. We assume also a common input and output rate constraint $P$, say. The cost series are proportional to the real half-hourly spot market wholesale electricity prices during the period corresponding to the example. As might be expected these prices show a strong daily cyclical behaviour. As already observed, for the "small store" essentially linear cost structure (1), the optimal control is bang-bang in the sense already described above.
In the first of our examples we take the ratio $E/P$ to correspond to 10 half-hourly periods, the total length of time which the Dinorwig facility takes to either fill or empty. Specifically, we considered the choice $E = 10$ energy units and $P = 1$ energy unit per half-hour. Figures 2, 3 and 4 show the two price series $c_t^{(b)}$ and $c_t^{(s)}$ for the 7-day period Sunday 9 January 2011 to Saturday 15 January 2011 inclusive for three different values of the efficiency $\eta$, respectively. The decisions to buy, sell or keep the level of the store unchanged are indicated by the red, blue and black line segments, respectively. In the lower panel we show the series of storage values over this one-week period. Each day storage is emptied when prices are sufficiently high and filled when prices are low. Notice that as the efficiency, $\eta$, is reduced the number of periods at which it is economic to either buy or sell (as opposed to doing nothing) is similarly reduced.
In our second example we investigate the operation of the storage plant with increasing storage capacity $E$ while keeping the rate constraint $P$ and the two price series as before. Figure 5 corresponds to the situation that arises when $E$ has increased to the extent that the capacity constraint is no longer active provided only that the initial level $S^*_0$ and final level $S^*_T$ of the store are taken sufficiently large. Here the storage facility remains nonempty over long periods of time and may take advantage of the price difference between, for example, different seasons of the year.
Our third and final example shows in Figure 6 the complementary circumstance when there is effectively no rate constraint, that is we hold $E$ fixed at one energy unit and increase $P$ until the rate constraint is no longer active. Accordingly the finite capacity store is always able to fill entirely and empty completely within a single half-hour period.
5 Commentary and conclusions
In the preceding sections we have developed the optimization theory associated with the use of storage for arbitrage, and given an algorithm for determining the optimal control policy for, and hence the value of, storage when used for this purpose. In particular our algorithm captures the fact that the control policy is essentially local in time, in that, for a given system subject to given capacity and rate constraints, at each time optimal decisions are dependent only on the relevant cost functions for what is typically a very short time horizon.
Our model accounts for nonlinear cost functions, rate constraints, storage inefficiencies, and the effect of externalities caused by the activities of the store impacting the market. What we have not done in the present paper is to consider the use of storage for providing a reserve in case of unexpected system shocks, such as sudden surges in demand or shortfalls in supply. This problem is considered by other authors, who control the probabilities of storage underflows or overflows to fixed levels. However, we believe that a further approach here would be to attach economic values to such underflows or overflows, translating to attaching an economic worth to the absolute level of the store (as opposed to attaching a worth to a change in the level of the store as in the present paper). Since in practice storage is used both for arbitrage and for buffering or control as described above, this would provide a more integrated approach to the full economic valuation of such storage.
Acknowledgements
The authors wish to thank their coworkers Andrei Bejan, Janusz Bialek, Chris Dent and Frank Kelly for very helpful discussions during the preliminary part of this work. They are also most grateful to the Isaac Newton Institute for Mathematical Sciences in Cambridge for their funding and hosting of a number of most useful workshops to discuss this and other mathematical problems arising in particular in the consideration of the management of complex energy systems. They are further grateful to National Grid plc for additional discussion and the provision of data, and finally to the Engineering and Physical Sciences Research Council for the support of the research programme under which the present research is carried out. (The EPSRC grant references are as follows: EP/I017054/1 and EP/I016023/1.)
References
[1] A. Iu. Bejan, R. J. Gibbens and F. P. Kelly. Statistical Aspects of Storage Systems Modelling in Energy Networks. 46th Annual Conference on Information Sciences and Systems (invited session on Optimization of Communication Networks), March 21–23, 2012, Princeton University, USA.
[2] N. G. Gast, D. C. Tomozei and J. Y. Le Boudec. Optimal Storage Policies with Wind Forecast Uncertainties. Greenmetrics 2012, Imperial College, London, UK, 2012.
[3] Y. Huang, S. Mao and R. M. Nelms. Adaptive electricity scheduling in microgrids. Proc. IEEE INFOCOM, Turin, Italy, 2013.
[4] J. C. Williams and B. D. Wright. Storage and commodity markets. Cambridge University Press, 2005.
[5] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[6] P. Whittle. Optimization under constraints: theory and applications of nonlinear programming. Wiley, 1971.
[7] R. K. Ahuja, T. L. Magnanti and J. B. Orlin. Network Flows: Theory, Algorithms and Applications. Prentice Hall, 1993.
[8] https://en.wikipedia.org/wiki/Dinorwig_Power_Station