Stability and Performance of Coalitions of Prosumers Through Diversification in the Smart Grid
Abstract
Achieving a successful energy transition through a smarter and greener electricity grid is a major goal for the 21st century. It is assumed that such smart grids will be characterized by bidirectional electricity flows coupled with the use of small renewable generators and an efficient information system. Together, these building blocks might enable end users to contribute to grid stability by injecting power, or by shaping their consumption against financial compensation. In this paper, we propose an algorithm that forms coalitions of agents, called prosumers, that both produce and consume. It is designed to be used by aggregators that aim at selling the aggregated surplus of production of the prosumers they control. We rely on real weather data sampled across stations of a given territory in order to simulate realistic production and consumption patterns for each prosumer. This approach enables us to capture geographical correlations among the agents while preserving the diversity due to different behaviors. As aggregators are bound to the grid operator by a contract, they seek to maximize their offer while minimizing their risk. The proposed graph-based algorithm takes the underlying correlation structure of the agents into account and outputs coalitions with both high productivity and low variability. We then show that the resulting diversified coalitions are able to generate higher benefits on a constrained energy market, and are more resilient to random failures of the agents.
I Introduction
Designing stable power systems is a classical engineering challenge since blackouts can have catastrophic consequences. One obvious condition for stability is to sustain at all times a power generation that meets the demand. If one is larger than the other, the system deviates from its synchronous state. If nothing is done to return to the synchronized equilibrium, this can lead to catastrophic cascades of failures [1] [2]. With traditional power plants based on fossil energies, the production can be scheduled in order to sustain the predicted consumption. Deviations from this schedule can then be compensated almost in real time by using fast response power plants, interconnections with neighboring countries, as well as regulating entities. Even if most individual consumers may not realize it, electricity prices are not constant rates and evolve with the production/consumption conditions on the grid. It is often assumed that the granularity of these prices is meant to increase in the future as a way to pass on the production conditions to the end users [3]. The pricing of electricity and the necessity for the grid operator to have different emergency reserves lead to an economic setting for electricity, where operations and various kinds of contracts are decided on a market [4]. Such a market environment clearly requires communication in order to monitor and obtain information from the grid as well as to exchange between participants. These kinds of settings are already being used at the transmission level of the current power grid, with large renewable plants in the production portfolio. Using electricity markets thus appears as a way to manage the reliability of the whole system, especially when the share of renewables is large.
The smart grid vision, however, goes further and revolutionizes this top-down centralized architecture by assuming that bidirectional power flows are allowed. This would completely change the nature of the traditional grid since production could be located down to the very end of the distribution networks. Nodes that were pure loads yesterday could behave tomorrow as generators or loads [5]. On the other hand, the use of renewables in the production is continuously growing, and is expected to become the majority in the near future. These plants completely rely on the presence and the intensity of their respective resources (wind, sun, tides…). Balancing production and demand in such a scenario appears much more challenging, and it seems inevitable that the grid should modernize its infrastructure to sustain this transition [4]. The key to safely driving such a system is assumed to be information and, more precisely, the capacity to measure, communicate, and analyze data in real time.
In this paper, we focus on these "nodes" in the distribution network that can produce and consume electricity. More precisely, we consider a set of agents that own both renewable distributed energy resources (DER) and electrical loads. The production of an agent can, of course, be used to meet its own demand, but when it is overproducing, we consider that it has the possibility to inject its extraproduction in the grid against financial compensation. Such an agent model is known as a "prosumer" [6] (and will be called accordingly in this paper). A key point of this work is to merge the interests of the grid operator with those of the prosumers in a single utility function. While the latter intend to maximize their benefits with higher production contracts, the operator's primary concerns are related to the quality and stability of the injections. The coalitions that we wish to form should thus be both stable and productive. This compromise can be difficult to obtain in real situations, especially when renewable generation and complex spatiotemporal correlations are involved.
More precisely, in our model, any entity that wishes to sell its production on some energy market has to estimate and announce a production capacity for an upcoming period of time. If a contract is bought, the entity commits to injecting exactly, at any time of the contracted period, the announced amount of power, under financial penalties if it fails. Since prosumers use exclusively renewables, contracts come naturally with some amount of risk due to the intermittent nature of most renewables. Using storage as buffers in order to reduce risks is a popular approach. Actually, a whole branch of the smart grid literature is even studying the possibility to use electric vehicles as moving storage capacities for stabilizing the grid. Without storage and proper control, the overproducing state of the agents, and thus the power they inject in the grid, might be rather unstable. This is clearly unacceptable for the grid operator that cannot ensure system stability if it has to deal with numerous small unpredictable entities.
Forming coalitions/aggregations is an envisaged solution to both the number of entities soliciting the operator, and their high variability. Indeed, it has been argued that using a multilevel aggregator architecture for the control could lead to better performance on the communication side [7], since an aggregator can be considered as a single node by the level above it. On the other hand, it is well-known that diversification of the assets is a way of minimizing risks when constructing a portfolio. One thus expects a more stable and predictable energy production for a coalition than for a single agent. Nevertheless, not all coalitions are equally stable or productive, which means that special attention should be paid to the aggregation step. Recall that prosumers have both a consumption and a production component, each depending on location and time, meaning that there are complex underlying correlations between the agents. This is a central topic of the present paper: given N prosumers, what coalitions should be formed so that the compromise between expected production and variability is optimized?
We will see that the variability in the aggregated productions can be quantified to a certain extent by the correlation among the agents forming the coalitions. Understanding the correlation relationships among the agents can thus give an indication about what coalitions to form and how much they should sell. More precisely, we use a graph representation of the correlation structure to gain insight about the expected production to risk ratios of different coalitions. We build a framework in which the system operator specifies both the minimum production acceptable to enter the market and the amount of risk it is willing to take. We then propose a graph heuristic that uses decorrelated cliques in order to form diversified, productive, and stable coalitions, and we compare the results with other formation strategies (see section VII).
Rather than maximizing the profit of each agent in the grid (prosumers and grid operator), our algorithm tries to form the most productive coalitions given a maximum amount of acceptable risk. In other words, it tries to maximize the profits of the coalitions without considering individual retributions to single agents (a task often studied through game theory). Furthermore, the pricing strategies for both the prosumers and the grid are beyond the scope of this paper.
Because agents are liable to fail for diverse reasons, the ability of the system to withstand these failures is critical [8]. We therefore investigate in this paper the resilience of the coalitions when prosumers fail. Although losing agents is usually detrimental to the coalitions, we will see that the coalitions formed with our algorithm tend to be less impacted by random failures.
The paper is organized as follows: section II gives a brief overview of the related literature, and section III clarifies how we generated realistic prosumer production traces based on weather data. In section IV, we define most of the notations and explain why correlation between prosumers is a quantity of interest for our objective. Based on the conclusions of section IV, we develop in section V a utility function quantifying how much power a coalition can announce on the market given an accepted risk level. We then develop, in section VI, a greedy optimization algorithm that uses decorrelated cliques as inputs and improves their utility over a correlation-constrained environment. Finally, section VII provides some results both on the performance of the method and on the resilience of the coalitions formed.
II Related Work
As explained in the section above, allowing entities using mainly renewables to inject power in the grid is a difficult challenge. Indeed, contrary to fossil plants, whose production can be scheduled in advance to meet the expected consumption, renewables by definition only produce when the resource is present. Unfortunately, these moments do not necessarily coincide with the consumption peak hours [9]. One possible solution consists in shifting some of the loads towards high productivity periods. Demand side management techniques implemented in the end users' smart meters can mitigate the consumption peaks and waste less production [9] [10]. Since end users tend to seek a maximum utility at a minimum cost, dynamic pricing is believed to be a good way to give them incentives. By carefully scheduling the prices, it is assumed that the load curve can be shaped to some extent.
Dynamic pricing is an interesting and useful tool, but it also has some limitations. It is likely that dynamic pricing will serve as a mass shaping tool while finer techniques will be needed locally. A popular approach in this direction consists in deploying storage devices in the network and using them as electricity buffers. Basically, these storage devices would be charged when there is a surplus of production, and discharged when the consumption exceeds the production. Although quite simple, this idea raises numerous challenges when it comes to its real implementation. As long as the system considered remains small, a centralized control of these buffers could be envisioned. However, for real large systems it is likely that more sophisticated decentralized algorithms will be necessary. In [11], the authors introduce a distributed energy management system with a high use of renewables such that power is scheduled in a distributed fashion. In [12], the optimal storage capacity problem is addressed. There is indeed an interesting tradeoff between the cost of the equipment and the expected availability of power. The authors develop a framework that enables them to exhibit a Pareto front of efficient solutions.
These are only a few possibilities for balancing production and demand in the smart grid. Most of the time, these techniques will be coupled with predictions of the upcoming load curves and weather conditions. Combining all these technologies enables aggregators to quantify their expected production and the inherent risk that comes with it. The optimization of expected returns to risk is a traditional goal in finance, and a wide literature exists on this topic. It is well-known, for instance, that the more risk one is willing to take, the higher the potential gains. On the contrary, when investing exclusively in low-risk assets, one should expect relatively small gains. This tradeoff is formalized in Markowitz's portfolio theory [13]. More precisely, given a set of assets for which we have some historical data of returns, the objective is to find a linear combination of these assets (the so-called portfolio) which maximizes the expected value while minimizing the variance of the portfolio's return. Markowitz's answer is a set of efficient portfolios that all optimize this tradeoff in some sense. If one is able to put a number on one's risk acceptance or on the target expected return, the corresponding efficient portfolio is a priori the best option. One of the most controversial assumptions in portfolio theory is that returns are jointly normally distributed (or, at least, that the distribution of returns is jointly elliptical). Some economists have pointed out that this assumption might not capture well the reality of financial markets [14].
Nevertheless, one of the key points in the Markowitz theory is to consider explicitly the correlations between the assets since they directly impact the variances of the portfolios. Since the work of [15], an interesting approach consists in computing a distance metric based on the correlation coefficients in order to organize the series in a correlation graph. Nodes represent the series considered while the edges are weighted by the metric. Because the metric can be computed for all pairs, these graphs are complete and of little use as is. Historically, the approach used by [15] was to compute a minimum spanning tree so as to obtain a hierarchical clustering of the series. Later on, it was pointed out that, by definition, a spanning tree could not capture the underlying clustering structure hidden in the correlation graph. In this paper, we use another classical filtering technique [16]. It consists in selecting a threshold and filtering out edges with smaller weights. As we will see further in this paper, this approach has the advantage of preserving clusters of correlated series.
III Generating realistic prosumer patterns
An essential component of the smart grid is the smart meter, which provides the interface between the end user and the rest of the system. Smart meters coupled with sensors measure quantities of interest (like instantaneous consumption), receive information from the grid (electricity prices for instance), and take actions accordingly (demand side management programs). Smart meters are currently being deployed gradually, and will probably provide interesting datasets to work on. Unfortunately, at the time this paper was written, production and consumption data for prosumers over a large region were not yet available to our knowledge. Some interesting experiments are nonetheless being conducted and data are progressively made public [17].
In this paper, we use weather quantities like wind speed or solar radiance as alternative data for generating realistic production and consumption series. Fortunately, these kinds of data are easier to find, and since the development of small personal weather stations, their geographical granularity keeps increasing. Since these quantities depend both on time and location, we discretize time into slots and space into zones in the following. A zone is simply a portion of the considered region of study for which we sampled data. Therefore, if prosumers i and j are positioned on the same zone, they are exposed to the same weather. Adding some intrazone noise can easily be done though not considered in this paper.
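More formally, we denote by $P_i(t)$ the instantaneous extraproduction of agent i at time t:
(1) $P_i(t) = P_i^{\mathrm{prod}}(t) - P_i^{\mathrm{cons}}(t)$
where $P_i^{\mathrm{prod}}(t)$ represents the total production of agent i at time t and $P_i^{\mathrm{cons}}(t)$ its consumption at time t. In other words, $P_i(t)$ represents the instantaneous surplus of power that agent i is willing to sell at time t. As explained above, since large datasets containing this quantity over time are not yet available, we simulated these traces by considering $P_i^{\mathrm{prod}}(t)$ and $P_i^{\mathrm{cons}}(t)$ separately.
For a prosumer i, it is possible to write both quantities as sums over the distributed energy resources ($\mathcal{D}_i$) and loads ($\mathcal{L}_i$) of i, with $p_k(t)$ the power produced or consumed by device k at time t:
(2) $P_i^{\mathrm{prod}}(t) = \sum_{k \in \mathcal{D}_i} p_k(t)$
(3) $P_i^{\mathrm{cons}}(t) = \sum_{l \in \mathcal{L}_i} p_l(t)$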
For simplicity, in this paper we only consider wind turbines (WT) and photovoltaic panels (PV) as possible DERs for the agents, so that the set of DERs of agent i splits as $\mathcal{D}_i = \mathcal{D}_i^{WT} \cup \mathcal{D}_i^{PV}$ and its production $P_i^{\mathrm{prod}}(t)$ reads:
(4) $P_i^{\mathrm{prod}}(t) = \sum_{k \in \mathcal{D}_i^{WT}} p_k(t) + \sum_{k \in \mathcal{D}_i^{PV}} p_k(t)$
We denote by $v_i(t)$ and $r_i(t)$ the wind speed and the solar radiance at the location of agent i and at time t, so that:
(5) $p_k(t) = F_{WT}(v_i(t))$ for $k \in \mathcal{D}_i^{WT}$, and $p_k(t) = F_{PV}(r_i(t))$ for $k \in \mathcal{D}_i^{PV}$
where $F_{WT}$ (resp. $F_{PV}$) is the power curve for the wind turbines (resp. photovoltaic panels). We made here the implicit assumption that all wind turbines (resp. photovoltaic panels) have the same power curve. The model can easily be extended to multiple power curves accounting for different types of generators. More details about power curves and their approximations can be found in appendix B and in [18]. The process for generating the series is pictured in the first block of the process diagram (see figure 1).
Note that a prosumer i is defined by its zone as well as by the sets of DERs and loads it owns. That is, a prosumer can be configured to represent anything from a single wind turbine (one DER and no load) to a pure load (no DER and at least one load), through more complex combinations. In practice, we use random configurations for the agents.
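The generation process described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the cubic wind power curve, the linear photovoltaic curve, all numerical parameters, the kW unit, and the names `wind_power`, `pv_power`, and `extra_production` are hypothetical and are not the curves of appendix B.

```python
def wind_power(v, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_power=2.0):
    """Hypothetical wind turbine power curve (kW) for a wind speed v."""
    if v < cut_in or v >= cut_out:
        return 0.0
    if v >= rated_speed:
        return rated_power
    # Cubic ramp between the cut-in speed and the rated speed.
    return rated_power * (v**3 - cut_in**3) / (rated_speed**3 - cut_in**3)


def pv_power(r, efficiency=0.15, area=10.0):
    """Hypothetical photovoltaic power curve (kW) for a radiance r in W/m^2."""
    return efficiency * area * max(r, 0.0) / 1000.0


def extra_production(n_wt, n_pv, loads, wind, radiance):
    """Extraproduction series: production minus consumption at each time slot.

    n_wt and n_pv are the numbers of wind turbines and PV panels of the agent,
    loads is a list of consumption series (kW), one per load of the agent,
    wind and radiance are the weather series of the agent's zone.
    """
    series = []
    for t, (v, r) in enumerate(zip(wind, radiance)):
        production = n_wt * wind_power(v) + n_pv * pv_power(r)
        consumption = sum(load[t] for load in loads)
        series.append(production - consumption)
    return series
```

Since all generators of a given type share the same power curve and the same zone weather, the production of an agent reduces to a count of devices multiplied by the corresponding curve value, which is what the sketch exploits.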
In the rest of the paper, we use French weather data [19] starting in January 2006 and ending in December 2012, with a sampling period of three hours, and generate N time series of extraproduction over this date range.
IV Notations
This section provides most of the notations and introduces important concepts for the rest of the paper. As explained in section III, we consider a set of N prosumers configured randomly, and for each agent, we simulate its extraproduction from 2006 to 2012. Based on these historical values, our objective is now to form groups of prosumers (the so-called coalitions) so that the global power production resulting from the superposition of the individual extraproductions is both sufficiently high and predictable. Let $P_S(t) = \sum_{i \in S} P_i(t)$ be the extraproduction of coalition S at time t, where $P_i(t)$ is the extraproduction of agent i.
Suppose now that coalition S has to suggest a production value $P_S^{\mathrm{CRCT}}$ (its contract value) to enter the market. This means that, during the time S is on the market, it will have to inject in the grid exactly $P_S^{\mathrm{CRCT}}$ at any time t and will be rewarded proportionally to this amount, with penalties if it deviates. Obviously, the actual extraproduction will not be constant at this value and will oscillate due to the intermittency of the production and consumption. If S always produces more than $P_S^{\mathrm{CRCT}}$, it will never have to pay penalties, but it loses some gains since it could have announced a higher contract value. If the production oscillates around $P_S^{\mathrm{CRCT}}$, S could be able to maintain its production at the contract value at any time by using batteries or demand side management techniques (see section II). Nevertheless, if the oscillations are too large compared to the available storage capacity, S will probably break the contract and pay penalties. We can see that there is a return over risk tradeoff here, meaning that coalitions should find the right balance between announcing too low and losing some potential gains, and claiming too high and paying penalties.
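Let us illustrate the rest of the notations and concepts with a simple example. We consider only two agents i and j such that the distributions of their extraproductions can be approximated by normal distributions: $P_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ and $P_j \sim \mathcal{N}(\mu_j, \sigma_j^2)$. This is only for explanation purposes as it is of course rather unrealistic in real situations where the distributions are skewed. Using simple statistics, we can write the distribution of the coalition $S = \{i, j\}$ as $P_S \sim \mathcal{N}(\mu_S, \sigma_S^2)$, where:
(6) $\mu_S = \mu_i + \mu_j, \qquad \sigma_S^2 = \sigma_i^2 + \sigma_j^2 + 2 \rho_{ij} \sigma_i \sigma_j$
$\rho_{ij}$ being the Pearson correlation coefficient between $P_i$ and $P_j$. If the coalition proposes a contract value $P_S^{\mathrm{CRCT}}$, all instants when S produces less than $P_S^{\mathrm{CRCT}}$ are critical. Indeed, in this kind of situation, S will either have to discharge batteries to keep up with its contract, or pay penalties to the grid. The probability that S is underproducing compared to the contract, $\Pr[P_S < P_S^{\mathrm{CRCT}}]$, is thus an important indicator of the coalition's quality. A well-known result for normal distributions is that the cumulative distribution function can be written as:
(7) $\Pr[P_S \le x] = \frac{1}{2} \left[ 1 + \mathrm{erf}\left( \frac{x - \mu_S}{\sigma_S \sqrt{2}} \right) \right]$
where $\mathrm{erf}$ is the error function: $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-u^2} \, du$.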
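The two-agent Gaussian example can be checked numerically with the short sketch below. It assumes jointly normal extraproductions; the function names `coalition_gaussian` and `underproduction_probability` are hypothetical and only serve the illustration.

```python
import math


def coalition_gaussian(mu_i, sigma_i, mu_j, sigma_j, rho):
    """Mean and standard deviation of the sum of two jointly normal series."""
    mu_s = mu_i + mu_j
    var_s = sigma_i**2 + sigma_j**2 + 2.0 * rho * sigma_i * sigma_j
    # Guard against tiny negative values caused by rounding when rho is near -1.
    return mu_s, math.sqrt(max(var_s, 0.0))


def underproduction_probability(contract, mu_s, sigma_s):
    """Probability that a N(mu_s, sigma_s^2) production falls below contract.

    Requires sigma_s > 0.
    """
    z = (contract - mu_s) / (sigma_s * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))
```

For two agents with unit standard deviations, a correlation of -0.5 gives a coalition standard deviation of 1, whereas a correlation of 1 gives 2: for the same expected production, the anti-correlated pair is twice as stable.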
The contract a given coalition is willing to take depends on its capacity to compensate for underproducing (using batteries, backup generators…), and its risk acceptance. Selecting the right contract value thus appears as an interesting problem on its own that we plan to investigate in future work. In order to keep the present paper to a reasonable length, we simplify the contract value selection problem by giving some responsibilities to a third party named the grid operator. The role of the grid operator is to constrain the market entry to coalitions able to propose both sufficiently high and sufficiently credible contract values. More formally, let $\phi$ be the reliability threshold fixed by the grid operator as a maximum value for the probability of underproducing. The highest contract value $P_S^{\mathrm{CRCT}}$ that a coalition S can propose is thus such that $\Pr[P_S < P_S^{\mathrm{CRCT}}] = \phi$. In the Gaussian example, where the extraproduction of S has mean $\mu_S$ and standard deviation $\sigma_S$, it implies that coalition S is announcing:
(8) $P_S^{\mathrm{CRCT}} = \mu_S + \sqrt{2} \, \sigma_S \, \mathrm{erf}^{-1}(2\phi - 1)$
This is the best contract value that the coalition S can afford given the stability policy of the grid operator. Figure 2 shows how $P_S^{\mathrm{CRCT}}$ evolves according to the reliability parameter $\phi$. For illustration, the range of values is shown from 0 to 1, but in practice, only small values of $\phi$ really make sense: for instance, $\phi = 1$ means that coalitions can announce absolutely anything since the probability of producing less than any contract value is necessarily at most one, by definition of a probability. As visible on figure 2, coalitions with high expected productions but presenting a high unpredictability are penalized and can only afford small contracts.
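In the Gaussian case, the highest admissible contract value is simply the quantile of order equal to the reliability threshold of the coalition's production distribution. The sketch below uses the standard library class `statistics.NormalDist`, whose `inv_cdf` method returns that quantile; the function name `gaussian_contract_value` and the argument name `phi` (standing for the reliability threshold) are hypothetical.

```python
from statistics import NormalDist


def gaussian_contract_value(mu_s, sigma_s, phi):
    """Contract value whose under-production probability equals phi.

    Assumes a normal production with mean mu_s and standard deviation
    sigma_s > 0, and a reliability threshold 0 < phi < 1.
    """
    return NormalDist(mu_s, sigma_s).inv_cdf(phi)
```

With a mean of 6 units, a threshold of 0.05 yields a contract value of about 2.71 for a standard deviation of 2, but about 4.36 for a standard deviation of 1, which illustrates how unpredictability is penalized.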
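In order not to overload the market with unrealistically small coalitions, the grid operator also specifies a lower bound $P^{\mathrm{MIN}}$ on the contract values. We thus characterize a valid coalition S as one whose contract value $P_S^{\mathrm{CRCT}}$ satisfies the two conditions, $\phi$ being the reliability threshold:
(9) $P_S^{\mathrm{CRCT}} \ge P^{\mathrm{MIN}}$ and $\Pr[P_S < P_S^{\mathrm{CRCT}}] \le \phi$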
On figure 2, is fixed to 2 units for illustration purpose. For , only blue triangles and cyan circles coalitions are valid while red diamonds coalition is not.
The Gaussian assumption of this small example is convenient as it allows us to write the contract value analytically. Nevertheless, such an assumption is rather unrealistic in practice. In the following, we keep the same framework but relax this Gaussian assumption unless the contrary is specified (see eq. 14). This assumption will indeed be convenient for computing some parameter estimates.
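Without the Gaussian assumption, one possible way to obtain a contract value is to read it directly from the historical series of the coalition as an empirical quantile. The sketch below is an assumption about how this could be done, not the estimator used in the paper; the names `coalition_series`, `empirical_contract_value`, `is_valid`, `phi`, and `p_min` are hypothetical.

```python
import math


def coalition_series(member_series):
    """Extraproduction of a coalition: slot-wise sum over its members."""
    return [sum(values) for values in zip(*member_series)]


def empirical_contract_value(series, phi):
    """Largest historical value that is under-produced in at most a
    fraction phi of the time slots (strictly below counts as failure)."""
    ordered = sorted(series)
    # Number of slots allowed to fall strictly below the contract value.
    allowed = math.floor(phi * len(ordered))
    return ordered[min(allowed, len(ordered) - 1)]


def is_valid(series, phi, p_min):
    """Market-entry check: the affordable contract value must reach p_min."""
    return empirical_contract_value(series, phi) >= p_min
```

For example, with ten historical slots and a threshold of 0.2, at most two slots may fall below the contract, so the contract value is the third smallest observation.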
V Utility Function
In this section, we use the notions of contract values and valid coalitions developed in section IV in order to design a proper utility function. The contract basically indicates the rate at which a coalition has to inject power in the grid. It seems then natural that coalitions are remunerated proportionally to their contract values . More precisely, if is the unitary price rate for electricity, a coalition S injecting in the grid during a period earns :
(10) 
(since is supposed to be a constant rate over the contracted period). Using directly as a utility function suffers a major drawback. It is indeed not a concave function of the coalitions’ sizes, meaning that coalitions can grow as large as the number of agents allows it, without any counterbalance effects.
Such a model, which virtually allows infinitely large coalitions and contract values, is not realistic in practice. There are indeed costs (communication costs for instance) that increase with the sizes of the coalitions. We take this observation into account by rescaling the utility of a coalition S by its size in terms of number of agents ($|S|$):
(11) 
where parameter controls to what extent the size of a coalition impacts its utility, and is a normalizing factor. can be seen as the maximum production which can be injected in the grid.
Based on , the marginal contribution of an agent i can be expressed as . A coalition S has thus an interest in adding an additional agent i if this marginal contribution is positive :
(12) 
If is set to zero, agents are added as long as they increase the contract value of the coalition. If is greater than zero, additional agents have to increase the contract value by some factor. The utility function with is not necessarily convenient, here we relate to the mean sizes of the coalitions . otherwise tends to form coalitions of size approximately in the order of , then :
(13) 
In order to get an estimator for , we solve equation 13 in a Gaussian case (as in section IV). Furthermore, since considering all the possible interactions between agents is analytically intractable, we use here a mean approximation. Any quantity x that varies over the agent set is thus simplified in its mean value . Solving equation 13 for in these conditions leads to:
(14) 
Figure 3 shows how and the utility function evolves according to the mean size of the coalitions . These curves are only valid in the simplified Gaussian example considered here. Nonetheless, they will provide some guidance when using real data.
As can be pointed out, the purpose of the utility function is neither to study the stability of coalitions against player defection, which could be done through game theory, nor to redistribute the coalition's utility in terms of individual payoffs. Rather, we design it as a measure of how good a given coalition is according to our criteria, in other words, of whether a given coalition has a good production to risk ratio.
VI Coalition Formation
Section IV explained how contract values for the coalitions are computed, and in section V we related this quantity to the utility and gains of a coalition. Since the computation time of this quantity is not negligible, we derive in this section the heuristic we used to form the coalition structure.
VI-A Representing the correlation structure
As seen in section IV, the variance of the aggregated production directly impacts the contract values, and depends on the covariances between the agents' productions. We argue here that, by having some representation of the correlation structure between the agents, the search landscape for high utility coalitions could be reduced, such that good coalitions are more likely to be found quickly. Usually, this correlation structure is formalized with a covariance matrix or a correlation matrix that contains all the correlation coefficients between the agents. By using a metric to map this matrix to a weighted adjacency matrix (see section II), it is possible to obtain a graph representation of the correlation relationships between the agents.
In the following, we use two opposite distance metrics for this mapping :
(15) 
Clearly, (resp. ) maps two correlated series as close points (resp. distant) while two uncorrelated series are distant (resp. close). These metrics enable us to compute a correlation graph and a "decorrelation" graph . For any i and j, the weight of the edge is in and in .
In both cases, we want to keep only the edges whose weights are located in the lower tail of the distance distributions. In other words, we want to compute filtered versions of the correlation and decorrelation graphs such that only meaningful edges remain. Selecting the right threshold $\epsilon$ is thus an important point since it affects the search landscape for the coalition formation. Unfortunately, there seems to be no clear consensus in the literature on how to select such a threshold. We will see later in this section that cliques in the filtered decorrelation graph are potential seeds for the coalitions. Since we want to generate $N_{COAL}$ coalitions, we need at least $N_{COAL}$ cliques of a given size k to start. Besides, since we consider coalitions as disjoint, the starting cliques should be non-overlapping. We thus select our optimal threshold $\epsilon^*$ for the decorrelation graph as:
(16) $\epsilon^* = \min \{ \epsilon : |\mathcal{C}_k(G^{\mathrm{dec}}_{\epsilon})| \ge N_{COAL} \}$
where $G^{\mathrm{dec}}_{\epsilon}$ is the decorrelation graph filtered by $\epsilon$, and $\mathcal{C}_k(G)$ is the set of non-overlapping cliques of size k in a given graph G. In other words, we select $\epsilon^*$ as the smallest threshold possible such that the filtered decorrelation graph contains at least $N_{COAL}$ non-overlapping cliques of size k. The existence of $\epsilon^*$ as defined in equation 16 is not guaranteed: the user has to provide values of $N_{COAL}$ and k that are consistent with the size N of the agent population.
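The threshold selection can be illustrated by the brute-force sketch below. It rests on several assumptions that are not taken from the paper: the decorrelation distances are given as a symmetric matrix `dist`, cliques are enumerated exhaustively (only practical for small examples), and non-overlapping cliques are selected greedily rather than by solving the underlying packing problem exactly, so the returned value may be larger than the true minimum. All function names are hypothetical.

```python
from itertools import combinations


def k_cliques(dist, epsilon, k):
    """All k-cliques of the graph keeping edges (i, j) with dist[i][j] <= epsilon."""
    n = len(dist)
    return [
        nodes
        for nodes in combinations(range(n), k)
        if all(dist[i][j] <= epsilon for i, j in combinations(nodes, 2))
    ]


def disjoint_cliques(cliques):
    """Greedy (not necessarily maximum) selection of non-overlapping cliques."""
    used, selected = set(), []
    for clique in cliques:
        if used.isdisjoint(clique):
            selected.append(clique)
            used.update(clique)
    return selected


def smallest_threshold(dist, k, n_coal):
    """Smallest observed distance for which the greedy selection finds at
    least n_coal disjoint k-cliques, or None if no threshold works."""
    n = len(dist)
    candidates = sorted({dist[i][j] for i, j in combinations(range(n), 2)})
    for epsilon in candidates:
        if len(disjoint_cliques(k_cliques(dist, epsilon, k))) >= n_coal:
            return epsilon
    return None
```

Only the observed pairwise distances need to be tried as candidate thresholds, since the filtered graph does not change between two consecutive observed values. A return value of `None` corresponds to the case where the requested number of coalitions and clique size are inconsistent with the population size.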
VI-B Cliques
In [16], the structural roles of weak and strong links on financial correlation graphs are investigated. The author shows that strong links, accounting for strong correlation relationships, are responsible for the clustering, while weak links provide the connectivity between clusters. Indeed, if we consider three items, say a, b, and c, such that a and b are strongly correlated and b and c are also strongly correlated, then it is likely that a and c are also strongly correlated. It can easily be shown using the cosine addition formula that if $\rho_{ab} = \cos \theta_{ab}$ and $\rho_{bc} = \cos \theta_{bc}$ with $\theta_{ab} + \theta_{bc} \le \pi$, then $\rho_{ac} \ge \cos(\theta_{ab} + \theta_{bc})$. Correlation graphs capture this weak transitivity notion through clusters of correlated series.
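The weak transitivity of correlation can be made concrete with a small numerical sketch. For any valid (positive semi-definite) correlation matrix, the correlation between a and c is bounded from below by the cosine of the sum of the two angles, which the cosine addition formula expands in terms of the two known correlations. The function name `correlation_lower_bound` is hypothetical.

```python
import math


def correlation_lower_bound(rho_ab, rho_bc):
    """Lower bound on rho_ac given rho_ab and rho_bc.

    Expands cos(theta_ab + theta_bc) with the cosine addition formula,
    where rho = cos(theta) and theta lies in [0, pi].
    """
    return rho_ab * rho_bc - math.sqrt((1.0 - rho_ab**2) * (1.0 - rho_bc**2))
```

Two correlations of 0.9 force the third one to be at least 0.62, whereas two correlations of 0 give a bound of -1, that is, no constraint at all: knowing that a and b, and b and c, are decorrelated says nothing about a and c.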
Nevertheless, decorrelation seems like a more complex concept than correlation in the sense that there is not even a partial notion of transitivity when it comes to it. Therefore, the clustering coefficient of the correlation graph is much higher than that of the decorrelation graph. This can be seen as another formulation of the result of [16] on the structural roles of weak and strong links on financial correlation graphs: strong links, accounting for strong correlation relationships, are responsible for the clustering, while weak links provide the connectivity between clusters. Searching for clusters in the decorrelation graph and hoping that this strategy will provide a nice coalition structure of internally uncorrelated coalitions thus seems pointless.
Consider now a clique in the filtered decorrelation graph, that is, a complete subgraph of it. This is indeed a structure of interest for our purpose. Since there is a link for every pair of nodes, we know, by construction, that a clique has a mean correlation and a maximum correlation below the level set by the filtering threshold.
Figure 4 shows the distributions of the utility values for cliques of size 3 (triangles) in the filtered decorrelation graph and for all the other possible triplets of agents. It is clearly visible that cliques tend to exhibit higher utilities because of their decorrelation property. Choosing cliques in the filtered decorrelation graph as coalitions therefore seems appealing. Nevertheless, the quality of the results seems to decrease as the sizes of the cliques increase. Indeed, the larger the desired cliques, the denser the filtered graph becomes (see equation 16). There is a point where cliques result more from noisy edges than from true decorrelation, which decreases the quality of the results.
Directly mapping cliques to coalitions by this decorrelation-oriented approach is thus not sufficient. It is indeed possible that adding agents to these cliques has the combined effect of increasing the expected production while decreasing its stability. The question revolves around measuring the benefits of this production surplus compared to the disadvantage of having a coalition with high volatility. This can be quantified by the marginal benefit in equation 12.
VI-C Algorithm
The algorithm takes inputs from:

The agents: historical series of available productions,

The grid operator: market entrance policy (minimum contract value and reliability threshold),

The "user": number of desired coalitions and size k of the starting cliques.
The first step consists in computing the decorrelation graph as well as the optimal threshold. Cliques of size k in the filtered decorrelation graph are considered as coalition seeds. The next step is a local greedy improvement over the landscape represented by this filtered graph. Cliques take turns adding the node of their neighborhood in the filtered graph that yields the best marginal benefit. This addition occurs only if the node is not already involved in another coalition, and if its marginal benefit is positive, meaning that utilities are increasing. The algorithm stops when all nodes are distributed in a coalition or when the global utility stops increasing. See the details in algorithm 1 in the appendix.
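The greedy growth step can be sketched as follows. This is a simplified reading of the description above and not a transcription of algorithm 1: the utility is passed as an arbitrary callable, the neighborhood of a coalition is taken as the union of the neighbors of its members, and the loop stops as soon as no coalition can add a node with a positive marginal benefit. The names `grow_coalitions`, `seeds`, `neighbors`, and `utility` are hypothetical.

```python
def grow_coalitions(seeds, neighbors, utility):
    """Greedy growth of disjoint seed cliques.

    seeds: list of disjoint node collections (the starting cliques).
    neighbors: dict mapping each node to the set of its neighbors in the
        filtered decorrelation graph.
    utility: callable taking a frozenset of nodes and returning a number.
    """
    coalitions = [set(seed) for seed in seeds]
    assigned = set().union(*coalitions)
    improved = True
    while improved:
        improved = False
        # Coalitions take turns, each adding at most one node per round.
        for coalition in coalitions:
            frontier = set().union(*(neighbors[n] for n in coalition)) - assigned
            if not frontier:
                continue
            base = utility(frozenset(coalition))
            # Sorting makes tie-breaking deterministic.
            best = max(
                sorted(frontier),
                key=lambda n: utility(frozenset(coalition | {n})),
            )
            # Add the best candidate only if its marginal benefit is positive.
            if utility(frozenset(coalition | {best})) > base:
                coalition.add(best)
                assigned.add(best)
                improved = True
    return coalitions
```

Restricting the candidates to the neighborhood of each coalition is what keeps the search local: the utility is only evaluated for nodes that are decorrelated from at least one current member, which limits the number of costly utility evaluations.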
VII Results
The algorithm presented in the previous section is supposed to generate a given number of coalitions that have good utilities. As it consists mainly of a greedy optimization based on local improvements, there is no guarantee that the algorithm finds the global optimum. Since there is, to our knowledge, no state of the art algorithm that aggregates uncorrelated agents in an optimal way (see section II for related problems), we compare the results with :

Random sampling of coalitions : Coalitions are formed randomly without any constraint other than the desired size. This gives us an idea of the distributions of utility values for coalitions of a given size.

Random sampling of coalition structures : Coalition structures are sampled randomly by shuffling and random divisions of the agents. Algorithm 2 uses such a sampling and returns the highest utility coalition structure sampled. This algorithm will be referred to as "random" in the following.

Correlated : This is the complete opposite of our algorithm. It uses the correlation graph and performs a community detection. The resulting coalitions thus have very high internal correlations, and we expect this algorithm to perform very poorly compared to the others. See algorithm 3.
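As an illustration of the "random" baseline, the shuffling and random division of the agents could look like the sketch below. It is not algorithm 2 itself: the function names are hypothetical, the `utility` callable is supplied by the caller, and taking the global utility as the sum of the coalition utilities is an assumption.

```python
import random


def random_structure(agents, n_coalitions, rng):
    """Shuffle the agents, then cut the shuffled list at random positions."""
    order = list(agents)
    rng.shuffle(order)
    # n_coalitions - 1 distinct cut points give n_coalitions non-empty parts.
    cuts = sorted(rng.sample(range(1, len(order)), n_coalitions - 1))
    bounds = [0] + cuts + [len(order)]
    return [order[a:b] for a, b in zip(bounds, bounds[1:])]


def best_random_structure(agents, n_coalitions, utility, n_samples=1000, seed=0):
    """Return the sampled structure with the highest total utility."""
    rng = random.Random(seed)
    best, best_value = None, float("-inf")
    for _ in range(n_samples):
        structure = random_structure(agents, n_coalitions, rng)
        value = sum(utility(c) for c in structure)  # assumed global utility
        if value > best_value:
            best, best_value = structure, value
    return best, best_value
```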
Before running the algorithms, we need to calibrate the utility function by choosing the value of the parameter. Recall that the purpose of this parameter is to take into account some constraints on the coalition sizes if needed. In this paper, neither the communication network nor the electrical grid is explicitly considered. Thus, we do not have any technical constraints on coalition sizes, even though we designed the utility such that these could be taken into account. We select the desired size as being (where means floor). Figure 5 shows how the mean utility of a coalition evolves with its size when the optimum size is set to 40 agents. Using equation 14 to estimate based on the mean quantities and Gaussian approximations seems to give acceptable results for the utility function behavior.
Figure 6 displays the evolution of the global utility and the number of involved agents during the course of the greedy algorithm 1. The transition from invalid to valid coalitions is clearly visible on the blue diamond curve and occurs between iterations 10 and 15. After this transition, the utilities of the coalitions improve slowly up to a maximum point.
Figure 7 shows the coalitions formed with the considered algorithms in the contract value / volatility space. The color map in the background indicates regions where we expect high utilities (red) and the ones where we expect very poor utility values (blue). The bottom right corner, with high contract values and low volatilities, is therefore the region where we wish to form our coalitions. A single coalition is represented by a marker, and the color and shape of a marker indicate which algorithm formed the coalition. Besides, the sizes of the coalitions are indicated on the markers, and the marker size is also proportional to the coalition size. We can see that the utility function results in approximately balanced coalitions. Small yellow markers indicate the gravity centers of their respective coalition structures. The coalitions of correlated agents (green squares) are clearly of poor quality according to our criteria since they can only afford small production contracts, and with a very high volatility.
In figure 7, the decorrelated coalitions (blue dots) are closer to the bottom right corner, indicating a much better quality in terms of productivity over volatility ratio. The black dotted line indicates the mean values for the random coalitions sampling technique. Each small dot stands for the mean position of all sampled coalitions of a given size. Variances are not indicated for readability, but are usually quite large since this sampling only takes the size as a constraint. We can see that as coalitions get larger, they tend to increase their contract values on average, but at the price of a higher volatility.
On figure 7, the results of the random coalition structure sampling are shown with the red ellipses that represent the distribution of the gravity centers of the sampled structures. Since the center of the ellipses stands for the mean and each ellipse adds one standard deviation, more than of the sampled gravity centers are within the largest ellipse. The small yellow dot below the ellipses indicates the gravity center of our solution. It is thus visible that our greedy graph based algorithm is able to find a quite good coalition structure in terms of volatility and contract values.
A key point for the coalitions, besides stability and productivity, is their resilience. The resilience of a system can be roughly described as its ability to perform its tasks when subject to failures of its components. Therefore, the notion of resilience we will use in the following can be seen as the ability of the coalition structures to inject stable power in the grid when node failures occur. According to our model, the grid operator specified two thresholds ( and ) such that the power injected by every coalition is constrained : . As long as a coalition can propose a contract value higher than , it is valid and allowed to enter the energy market. We define the resilience of a coalition S as the probability that S produces more than the threshold :
(17) 
And we extend this measure to the coalition structures :
(18) 
We consider that prosumers fail randomly, and we look at the fraction of agents that have failed. Figure 8 shows how the resilience of the coalition structures evolves with this fraction. In the top subplot, the threshold was deliberately set relatively low so that the resiliences of the three structures fit on the same figure. When the requirement increases, the differences between the algorithms also increase, as visible in the bottom subplot of figure 8. The decorrelated coalitions seem to achieve a more resilient production on the market in the sense that they are able to sustain a higher fraction of node failures.
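A minimal Monte Carlo sketch of this failure experiment is given below. It rests on assumptions that are not taken from the paper: the probability in the resilience definition is estimated as the share of historical time steps where the surviving members produce at least the threshold, the resilience of a structure is taken as the average over its coalitions, and all function names are hypothetical.

```python
import random


def coalition_resilience(series, coalition, threshold, failed=frozenset()):
    """Share of time steps where the surviving members reach the threshold."""
    alive = [i for i in coalition if i not in failed]
    horizon = len(series[0])
    hits = sum(1 for t in range(horizon)
               if sum(series[i][t] for i in alive) >= threshold)
    return hits / horizon


def structure_resilience(series, structure, threshold, fraction,
                         n_trials=200, seed=0):
    """Average coalition resilience when a fraction of all agents fails at random."""
    rng = random.Random(seed)
    agents = [i for coalition in structure for i in coalition]
    n_failed = round(fraction * len(agents))
    total = 0.0
    for _ in range(n_trials):
        failed = set(rng.sample(agents, n_failed))  # one random failure scenario
        total += sum(coalition_resilience(series, c, threshold, failed)
                     for c in structure) / len(structure)
    return total / n_trials
```

Sweeping `fraction` from 0 to 1 for each coalition structure would give curves comparable in spirit to those of figure 8.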
VIII Conclusion
In this paper we studied how aggregations of prosumers could be authorized to sell their surplus of production to the grid operator. By relying on the past values of the agents, we constrained the market entry to both sufficiently productive and stable coalitions. The power that a coalition is able to propose on the market is therefore related to production and stability. As the correlations between the prosumers that form these coalitions directly impact their volatilities, we seek uncorrelated aggregations of agents. We used a graph representation of the correlation relationships between the agents as a reduced landscape for the coalition formation. A greedy algorithm that starts with cliques of the "decorrelation" graph of the agents and makes local improvements offers a good compromise between speed and quality of the results. We compared these results with random samplings and with an opposite strategy that clusters correlated agents together. We showed that the coalitions resulting from our algorithm are able to provide more power to the grid with a lower volatility. Because they tend to have better production over volatility ratios overall, these coalitions will tend to use less storage and waste less energy than more unstable coalitions. We plan to study these benefits for the control of the aggregations in future work.
Because in real situations, agents are prone to failure, resilience is also an important criterion for the quality of the aggregations. We therefore studied how the coalitions are able to remain on the market when their agents fail randomly one by one. We showed that, in this situation, the coalitions resulting from our algorithm better withstand losses of agents.
References
 [1] C. D. Brummitt, R. M. D’Souza, and E. A. Leicht, “Suppressing cascades of load in interdependent networks,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 12, pp. E680–E689, Mar. 2012.
 [2] J.-W. Wang and L.-L. Rong, “Cascade-based attack vulnerability on the US power grid,” Safety Science, vol. 47, no. 10, pp. 1332–1336, Dec. 2009.
 [3] T. Jiang, Y. Cao, L. Yu, and Z. Wang, “Load shaping strategy based on energy storage and dynamic pricing in smart grid,” IEEE Transactions on Smart Grid, vol. 5, no. 6, pp. 2868–2876, Nov. 2014.
 [4] F. Van Hulle, “Integrating Wind,” 2009.
 [5] S. D. Ramchurn, P. Vytelingum, A. Rogers, and N. R. Jennings, “Putting the ’smarts’ into the smart grid: A grand challenge for artificial intelligence,” Commun. ACM, vol. 55, no. 4, pp. 86–97, Apr. 2012.
 [6] A. J. D. Rathnayaka et al., “Prosumer management in socio-technical smart grid,” in Proceedings of the CUBE International Information Technology Conference, 2012, pp. 483–489.
 [7] E. Negeri and N. Baken, “Architecting the Smart Grid As a Holarchy,” Proceedings of the 1st International Conference on Smart Grids and Green IT Systems, pp. 73–78, 2012.
 [8] S. Pahwa, C. Scoglio, S. Das, and N. Schulz, “Load-shedding strategies for preventing cascading failures in power grid,” Electric Power Components and Systems, vol. 41, no. 9, pp. 879–895, 2013.
 [9] M. Milligan and B. Kirby, “Utilizing Load Response for Wind and Solar Integration and Power System Reliability,” in Proc. of WindPower, 2010.
 [10] T. Logenthiran, D. Srinivasan, and T. Z. Shun, “Demand side management in smart grid using heuristic optimization,” IEEE Transactions on Smart Grid, vol. 3, no. 3, pp. 1244–1252, 2012.
 [11] Y. Zhang, N. Gatsis, and G. B. Giannakis, “Robust energy management for microgrids with high-penetration renewables,” IEEE Transactions on Sustainable Energy, vol. 4, no. 4, pp. 944–953, 2013.
 [12] M. Shadmand and R. Balog, “Multi-objective optimization and design of photovoltaic-wind hybrid system for community smart DC microgrid,” IEEE Transactions on Smart Grid, vol. 5, no. 5, pp. 2635–2643, Sep. 2014.
 [13] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investments. Yale University Press, 1959.
 [14] R. Chicheportiche and J.-P. Bouchaud, “The joint distribution of stock returns is not elliptical,” International Journal of Theoretical and Applied Finance, vol. 15, no. 03, p. 1250019, 2012.
 [15] R. Mantegna, “Hierarchical structure in financial markets,” The European Physical Journal B, vol. 11, no. 1, pp. 193–197, 1999.
 [16] A. Garas et al., “The structural role of weak and strong links in a financial market network,” The European Physical Journal B, vol. 63, no. 2, pp. 265–271, 2008.
 [17] “Irish Social Science Data Archive.” [Online]. Available: http://www.ucd.ie/issda/data/commissionforenergyregulationcer/
 [18] M. Lydia, S. S. Kumar, A. I. Selvakumar, and G. E. Prem Kumar, “A comprehensive review on wind turbine power curve modeling techniques,” Renewable and Sustainable Energy Reviews, vol. 30, pp. 452–460, 2014.
 [19] “Infoclimat.” [Online]. Available: http://www.infoclimat.fr
 [20] “National Centers for Environmental Information.” [Online]. Available: http://www.ncdc.noaa.gov/
 [21] C. Piedallu and J.-C. Gégout, “Multiscale computation of solar radiation for predictive vegetation modelling,” Annals of Forest Science, vol. 64, no. 8, pp. 899–909, 2007.
 [22] C. Piedallu and J.-C. Gégout, “Efficient assessment of topographic solar radiation to improve plant distribution models,” Agricultural and Forest Meteorology, vol. 148, no. 11, pp. 1696–1706, 2008.
A Algorithms
B Net production series
Data were collected from [19] (similar data can be found at [20] for the United States). The variables used in the simulation are :

Average wind speed (in )

Nebulosity (integer in )

Temperature (in degree Celsius)
B1 Wind power curve
Power curves are functions that, for a given type of generator, map some input quantity to the output power produced. For wind turbines and solar arrays these functions are well studied and approximations have been proposed [18] [21, 22]. For the wind turbines, the power curve can be specified by 4 values :

Cut-in speed : The wind speed at which the turbine first starts to rotate and generate power.

Rated output power : The maximum power that the turbine can generate.

Rated output speed : The wind speed at which the turbine attains its rated output power.

Cut-out speed : The wind speed at which the turbine is turned off so as not to damage the rotor.
The most interesting part is the increase of output power when the wind speed lies between the cut-in speed and the rated output speed. Even if a simple linear model is sometimes used, the increase has been shown to be nonlinear, and more complex exponential fits can be found in the literature [18].
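The four values listed above are enough to write a piecewise power curve. The sketch below uses the simple linear ramp mentioned in the text between the cut-in speed and the rated output speed; the function name, the parameter names and the choice to shut down exactly at the cut-out speed are assumptions made for illustration, and a nonlinear fit from [18] could replace the ramp.

```python
def wind_power(speed, cut_in, rated_speed, cut_out, rated_power):
    """Piecewise wind power curve with a linear ramp (illustrative sketch)."""
    if speed < cut_in or speed >= cut_out:
        return 0.0  # too little wind, or turbine stopped to protect the rotor
    if speed >= rated_speed:
        return float(rated_power)  # plateau at the rated output power
    # Linear increase between the cut-in speed and the rated output speed.
    return rated_power * (speed - cut_in) / (rated_speed - cut_in)
```

For instance, with a hypothetical turbine described by `cut_in=3`, `rated_speed=12`, `cut_out=25` and `rated_power=2000`, a wind speed of 7.5 lies halfway up the ramp.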
B2 Solar power curve
The input quantity desired for our power curve model for solar arrays is a radiance in , which can be difficult to find in weather station available data. As we mainly collected nebulosity series, we used the Helios model described in [21, 22]. This model enabled us to compute perfect (clear blue sky situation) solar radiances at some specific locations on earth and at given timestamps. As nebulosity is a measure of the sky cloudiness, we can use the nebulosity series as degradation factors on the clear blue sky model (see [21, 22] for more details) :
(19) 
where and are respectively the clear blue sky and real radiances at time t, is the degradation factor at time t, and is the nebulosity index at time t.
Once we have input data in the forms of radiances, we compute the production of a solar array with the following simplified power curve :
(20) 
where is the surface of the array, and is its efficiency. The very simple form of this power curve is due to some simplifications in order not to overload the model. For instance, it does not take into account angles and orientations degradations. These could be incorporated if needed by changing the power curve in the simulations.
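The two steps of the solar model can be sketched as follows, under explicit assumptions: the degradation factor is taken as a multiplicative coefficient between 0 and 1 applied to the clear-sky radiance, it is passed in directly because the mapping from the nebulosity index to this factor is not reproduced here, and the production is taken as the product of surface, efficiency and radiance. The function names are hypothetical.

```python
def real_radiance(clear_sky_radiance, degradation):
    """Clear-sky radiance reduced by a degradation factor in [0, 1] (assumed form)."""
    if not 0.0 <= degradation <= 1.0:
        raise ValueError("degradation factor must lie in [0, 1]")
    return degradation * clear_sky_radiance


def solar_power(radiance, surface, efficiency):
    """Simplified power curve: surface times efficiency times radiance."""
    return surface * efficiency * radiance
```

Angle and orientation effects would enter as additional factors in `solar_power` if they were needed.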
B3 Consumption
Modeling electric consumption has already been widely tackled in the literature. Models can be basically divided into two main categories : top-down and bottom-up approaches. Top-down techniques take aggregated consumption data as inputs and try to estimate individual consumption patterns, while bottom-up methods use a fine modeling of user consumption so as to obtain realistic aggregated consumption curves. In this paper, we used a bottom-up model since the end users, or relatively small aggregations of end users, are our main interest. The main objective was to capture both daily patterns and seasonal variations of the consumptions. We assumed an additive model where the consumption of an agent is the sum of a seasonal heating term that depends on the outside temperature and an electronic consumption term that only depends on the hour of the day. By denoting the outside temperature at timestamp t, we can express the consumption of agent i at time t :
(21) 
where is the power curve that maps the temperature to a heating consumption, and computes the consumption of agent i (other than heating) at a given hour of the day. In the simulation, all agents have a desired inside temperature , supposed to be a constant for simplification. By using thermodynamic laws can be approximated by :
(22) 
where is the surface of thermal exchanges for agent i and is their thermal resistance.
We denote by the maximum consumption possible for agent i, which is basically the sum of the powers of all its appliances. We also denote by the vector of the average fraction of used for each hour. We can therefore write :
(23) 
where is a noise term. The vector enables us to easily differentiate agent consumption behaviors. Business or residential areas for instance can be easily distinguished with this kind of model.
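The additive consumption model can be sketched as below. Several choices are assumptions made for illustration rather than the paper's exact equations: the heating term is the steady-state heat loss surface × (inside temperature − outside temperature) / thermal resistance, clipped at zero when it is warmer outside; the noise is Gaussian; and the dictionary keys and function names are hypothetical.

```python
import random


def heating_power(outside_temp, inside_temp, surface, resistance):
    """Heat loss through the envelope; no heating when it is warm outside."""
    return max(0.0, surface * (inside_temp - outside_temp) / resistance)


def appliance_power(hour, max_power, hourly_fraction, noise=0.0):
    """Fraction of the maximum consumption used at this hour, plus noise."""
    return max(0.0, max_power * hourly_fraction[hour % 24] + noise)


def consumption(outside_temp, hour, agent, rng=None):
    """Total consumption of one agent: heating term plus hourly term."""
    # Gaussian noise is an assumption; without rng the model is deterministic.
    noise = rng.gauss(0.0, agent["noise_std"]) if rng else 0.0
    heating = heating_power(outside_temp, agent["inside_temp"],
                            agent["surface"], agent["resistance"])
    return heating + appliance_power(hour, agent["max_power"],
                                     agent["hourly_fraction"], noise)
```

A residential agent and a business agent would differ mainly through their `hourly_fraction` vectors, for example an evening peak for the former and a daytime plateau for the latter.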