Synthetic Generation of Solar States for Smart Grid: A Multiple Segment Markov Chain Approach

Wayes Tushar1, Shisheng Huang1, Chau Yuen1, Jian (Andrew) Zhang2, and David B. Smith4
1Singapore University of Technology and Design, Singapore 138682.
2CSIRO Computational Informatics, Marsfield, NSW, Australia.
4National ICT Australia (NICTA), Canberra 2601, Australia.
Email: {wayes_tushar,yuenchau,shisheng_huang};;
This work is supported by the Singapore University of Technology and Design (SUTD) under the Energy Innovation Research Program (EIRP) Singapore NRF2012EWT-EIRP002-045. David Smith is also with the Australian National University (ANU), and his work is supported by NICTA. NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.

The use of photovoltaic (PV) sources is becoming very popular in smart grid for their ecological benefits, with higher scalability and utilization for local generation and delivery. PV can also potentially avoid the energy losses that are normally associated with long-range grid distribution. The increased penetration of solar panels, however, has introduced a need for solar energy models that are capable of producing realistic synthetic data with small error margins. Such models, for instance, can be used to design the appropriate size of energy storage devices or to determine the maximum charging rate of a PV-powered electric vehicle (EV) charging station. In this regard, this paper proposes a stochastic model for solar generation using a Markov chain approach. Based on real data, it is first shown that the solar states are inter-dependent, and thus suitable for modeling using a Markov model. Then, the probabilities of transition between states are shown to be heterogeneous over different time segments. A model is proposed that captures the inter-temporal dependency of solar irradiance through segmentation of the Markov chain across different times of the day. In the studied model, different state transition matrices are constructed for different time segments, which the proposed algorithm then uses to generate the solar states for different times of the day. Numerical examples are provided to show the effectiveness of the proposed synthetic generator.


Multiple segment, Markov chains, photovoltaic, smart grid, real data.

I Introduction

Significant concerns of today’s energy sector include the continuous increase in energy demand, the fast depletion of conventional energy resources, and the effect on the environment [1, 2, 3, 4, 5, 6, 7, 8, 9]. Solar energy is a renewable and clean energy source that has the capability to address these problems. For example, most of the renewable energy from solar photovoltaics (PVs) at present is delivered either to the grid or to isolated loads, e.g., islanded micro-grids [10], to help meet demand. In fact, various forms of solar energy such as solar heat, solar PV, solar thermal, and solar fuels offer abundant, clean and environmentally friendly energy resources that have the potential to address the most compelling problems faced by the energy sector. Furthermore, the continuing decrease in the cost of PV arrays and the increase in their efficiency mean there is an even more promising role for PV generating systems in the near future [2].

Due to the potential benefits of solar energy in future smart grids, problems related to the deployment of solar energy sources have been explored extensively in the literature, as surveyed in [10]. For instance, the concept of PV generation and its feasibility for practical application have been studied in [11], whereas [12] investigates the optimal sizing of stand-alone photovoltaic systems using genetic algorithms. Modeling of photovoltaic cells is studied in, e.g., [13], and their use as a distributed energy source has been explored in, e.g., [14, 15], in terms of both experiment and simulation. In [16], the authors explore the utility level benefits of distributed PV systems that are coupled with electricity storage. The possibility of using a solar power system in combination with other forms of generation, e.g., wind turbines, is studied in [17]. Finally, studies such as [18] have focused on designing schemes/protocols that maximize the power output of PV generators.

Apart from the above, another important issue is modeling the synthetic generation of renewable energy, which has recently gained considerable attention. Modeling the synthetic generation of solar energy is particularly important: depending on the local irradiance states, the instantaneous power output of a solar generator can vary significantly. Therefore, it raises a concern as to the power quality of the electricity grid as of installed PV systems worldwide are grid connected [19]. Furthermore, with the advancement of smart grids, solar panels have started to be installed as distributed generators not just for household use, but also for heavy electric loads such as charging of EVs in vehicle charging stations and residential premises [20]. To this end, there is a need for energy consumers to understand the output of their solar irradiance states based on their solar energy usage. In fact, devising accurate models that can generate solar irradiance states over time can significantly help electricity users to decide when to use the generated solar energy, and for what type of work. For instance, if the intensity of the solar irradiance is very low, then, depending on the type of solar panels, the amount of energy generated might be too low to accomplish any meaningful work. On the contrary, when the solar irradiance state is very high, the likelihood of very high generation of solar energy will be large, which enables users to use the generated energy for heavy loads such as EV charging.
In this regard, there is a need for accurate synthetic data generator models that can generate different solar states, e.g., low, medium or high, over time to not only allow the grid operator to implement on-the-fly grid management systems [19], but also to help consumers in making energy usage decisions, such as what is the appropriate size of storage device to store energy, or what is the maximum EV charging rate at any particular time of the day.

Due to the fact that the states of solar irradiance can randomly change from low to high or vice-versa during a day, we take the first step in this paper to use a Markov model to design a synthetic solar data generator. We classify the solar states into four categories including low, medium, high and very high depending on their application, as described in the next section, and propose a segmented first order Markov chain model to capture the transition between states. We note that Markov chain models have been used extensively to generate synthetic wind data [21]. However, their use is not well explored for generating synthetic solar data due to the fact, as we will see later, that the transition probability of solar states is not stationary over time. In this paper, we effectively capture this temporal variation by designing different state transition matrices [21], for different time segments of the day. Using real solar data, we differentiate between the four solar states by including a separate range of values of solar irradiation in each state. Depending on the range of values in each state, such a general classification would facilitate decision making of users as to when to use their PV energy for different purposes, irrespective of location and time.

To this end, we first show that the solar states are interdependent, and therefore the Markov model is a reasonable model to generate synthetic solar states. Then, we demonstrate that the transition probabilities of solar states are not stationary over time. With a view to capturing the time dependency, we divide the entire solar irradiance time duration of interest into multiple segments, and model a different state transition matrix for each time segment. We propose an algorithm to generate the synthetic solar state at each time instant of interest based on the state transition probability. To show the effectiveness of the proposed generator, we compare the result with real measured solar states, and compute the affinity towards the real measured states in terms of their standard deviation. The proposed synthetic solar state generator can be used in any area of interest by simply computing the segmented transition probability matrices for that area’s specific time series data set.

II Solar Generation as Markov Chains

A Markov chain represents a stochastic process in which a state changes at discrete time steps. Mathematically, a first order Markov chain is a sequence of random variables $X_1, X_2, \ldots$ such that a future state $X_{t+1}$ depends conditionally upon the current state $X_t$ only, and is independent of all past states $X_1, \ldots, X_{t-1}$. A Markov chain is modeled in terms of its transition probabilities $p_{ij}$, and a first order transition probability matrix $P = [p_{ij}]$ determines the probability of transitioning from state $i$ to state $j$ regardless of the previous states that were visited [22].

We note that the solar irradiance in a particular area is highly dependent on its geographic location and is subject to change from one time to the next based on weather conditions at different times of the day. For example, on a normal sunny day, the solar irradiance is low in the early morning, increases with time to reach its peak at noon, and then gradually diminishes as the afternoon progresses. Since the change of solar states from one time instant to the next is gradual rather than abrupt, it is reasonable to model the random generation of solar power in terms of a Markov chain to capture the time correlation between states.

For that purpose, we represent solar data with a first order Markov chain approach. We define a solar irradiance intensity state space $\mathcal{S}$, where each state of the chain refers to a range of solar radiation intensity in watts per square meter (W/m$^2$). For any given state there is a probability of what the next state will be, as noted above. We stress that, for different case studies, the discretization of irradiance states can be different in accordance with the chosen method and data sets. However, in this paper, we are more interested in observing the solar irradiance from a general point of view, such as high or low irradiance, based on some predefined threshold ranges. In this regard, we consider that the state space of the stochastic process modeling the solar intensity consists of four discrete states

$$\mathcal{S} = \{s_1, s_2, s_3, s_4\}, \tag{1}$$

State | Range of irradiance | Suitable applications
$s_1$ (low) | below $\tau_1$ | Not suitable for any meaningful application
$s_2$ (medium) | between $\tau_1$ and $\tau_2$ | Light energy consuming loads
$s_3$ (high) | between $\tau_2$ and $\tau_3$ | High household loads
$s_4$ (very high) | above $\tau_3$ | Very high loads (e.g., EV charging)
TABLE I: Interpretation of different solar states in the Markov chain.

where each state refers to a range of solar intensities in W/m$^2$. To differentiate between states, we consider three threshold values $\tau_1$, $\tau_2$, and $\tau_3$. We assume that when the irradiance is below $\tau_1$, the solar irradiance is very low, and hence is not suitable for generating enough solar energy to do any meaningful task. When the irradiance lies between $\tau_1$ and $\tau_2$, the generated solar energy has enough strength to perform light household tasks such as charging batteries, running televisions, and operating energy-saving bulbs and fans. The range between $\tau_2$ and $\tau_3$ and the range above $\tau_3$ refer to the high and very high states respectively. When the solar state is high, it is assumed that users can use the produced energy for high energy consuming household appliances such as washing machines and water heaters. Finally, when the solar irradiance state is at $s_4$, the produced solar energy is enough to be used for very high loads such as charging electric vehicles or running air conditioners. We summarize the irradiance range of each defined state, and their suitability for different applications, in Table I. It is important to note that the number of states in Table I can be increased by assigning a narrower range of values from the time series data to each state, although this increases computational complexity. However, to generalize the model irrespective of the location, we assume that $\tau_1$, $\tau_2$, and $\tau_3$ are always fixed for the defined states in (1); that is, irrespective of the location, e.g., whether in the USA or in Australia, the thresholds have the same value.
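The state classification described above can be sketched in a few lines of Python. This is a minimal illustration only: the threshold values (100, 400 and 700 W/m$^2$), the function names, and the choice to assign a reading that equals a threshold to the higher state are all assumptions made for the example and are not taken from the paper.

```python
from bisect import bisect_right

# Hypothetical thresholds in W/m^2 (illustrative only, not the paper's values).
THRESHOLDS = (100.0, 400.0, 700.0)
STATE_NAMES = ("low", "medium", "high", "very high")


def irradiance_to_state(irradiance, thresholds=THRESHOLDS):
    """Return the 1-based state index (1..4) for one irradiance reading.

    Assumption: a reading exactly equal to a threshold falls in the
    higher of the two neighbouring states.
    """
    return bisect_right(thresholds, irradiance) + 1


def discretize(series, thresholds=THRESHOLDS):
    """Convert a sequence of irradiance readings into a sequence of states."""
    return [irradiance_to_state(x, thresholds) for x in series]


readings = [20.0, 150.0, 450.0, 820.0, 390.0]
states = discretize(readings)
for value, state in zip(readings, states):
    print(f"{value:7.1f} W/m^2 -> state {state} ({STATE_NAMES[state - 1]})")
```

Keeping the thresholds in a single tuple makes it easy to swap in location-independent values once they are chosen, which matches the fixed-threshold assumption stated above.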

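Now, for a considered sequence of time instants $t = 1, 2, \ldots, T$, where $T$ is the total time of solar measurement and $t$ is the index for each time, the conditional probability that the solar state $s_i$ at time $t$ will switch to state $s_j$ at $t+1$ is

$$p_{ij} = \Pr\left(X_{t+1} = s_j \mid X_t = s_i\right), \tag{2}$$

where $X_t$ denotes the solar state at time $t$. To this end, for the considered four states in (1), we define a first order state transition probability matrix $P$ of size $4 \times 4$ as follows:

$$P = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \\ p_{41} & p_{42} & p_{43} & p_{44} \end{bmatrix}. \tag{3}$$

In (3), each $p_{ij}$ is the probability of transition of a state $s_i$ at time slot $t$ to a state $s_j$ in the next time slot $t+1$, where each index $i, j \in \{1, 2, 3, 4\}$ is assumed to have a one-to-one relationship with the states $s_1, \ldots, s_4$. We graphically explain the probabilities of transitions between states in Fig. 1.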

Fig. 1: Probability of transition between different states of Markov chain. The transition probability from state $s_i$ to $s_j$ for each pair $(s_i, s_j)$ is written just above the transition line between the pair of states.

II-A Transition Probability Matrix

Fig. 2: Solar irradiance data for a typical day, both in summer and in winter, in Colorado, USA during 2013.

The analysis of solar irradiance is carried out over a time series data set that contains the values of solar irradiance for every five minutes during one month of summer in Colorado, USA in 2013 [23]; a similar analysis is equally applicable to time series data sets for any other season. (An example of solar irradiance for a typical day, both in summer and in winter, is shown in Fig. 2.) Now for analysis, first the set of data is converted into four solar irradiance states according to Table I, using three fixed threshold values in W/m$^2$. Then, from the state transition matrix,


of the given data set, the first order transition probability matrix,


is constructed. According to (5), the highest probabilities occur on the diagonal of the matrix. Hence, if the current solar irradiance is known, it is most likely that the solar intensity of the next time instant would be in a similar range [24]. Now, before applying a Markov model to generate the synthetic data, we need to investigate the state dependency of the chain, i.e., whether or not the successive solar states are dependent upon each other.
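The construction of a first order transition probability matrix from a state sequence can be sketched as follows. This is an illustrative sketch under stated assumptions: the short state sequence is made up for the example (it is not the Colorado data), the function names are hypothetical, and a state that is never visited is given an all-zero row.

```python
def count_transitions(states, n_states=4):
    """Count how often state i is followed by state j (states are 1-based)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current - 1][following - 1] += 1
    return counts


def transition_matrix(counts):
    """Normalize each row of the count matrix into transition probabilities.

    Assumption: a state with no outgoing transitions gets an all-zero row.
    """
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Made-up state sequence for illustration only.
states = [1, 1, 2, 2, 2, 3, 3, 4, 4, 3, 2, 1]
counts = count_transitions(states)
P = transition_matrix(counts)
for row in P:
    print(["%.2f" % p for p in row])
```

Each row of the result sums to one whenever the corresponding state appears at least once before the end of the sequence.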

II-B State dependency test

Theorem 1.

For the given time series data set, the successive states are dependent upon each other, and thus a Markov chain can be constructed from it.


Proof. First, we assume that the null hypothesis, i.e., that successive transitions are independent, holds for the given observed data set. Therefore, the statistic $\alpha$, where

$$\alpha = 2 \sum_{i=1}^{m} \sum_{j=1}^{m} n_{ij} \ln\left(\frac{p_{ij}}{p_j}\right), \tag{6}$$

of the measured data set can be assumed to be distributed asymptotically as $\chi^2$ with $(m-1)^2$ degrees of freedom [24]. In (6), $m$ is the total number of states, $n_{ij}$ is the frequency of state $i$ followed by state $j$, $p_{ij}$ is the corresponding transition probability, and $p_j$ is the marginal probability for the $j$th column of the transition probability matrix, defined as

$$p_j = \frac{\sum_{i=1}^{m} n_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{m} n_{ij}}. \tag{7}$$

The value of $\alpha$ is determined considering the average of the whole time-series data set in summer over the daytime duration of interest. The value is found to be significantly higher than the $\chi^2$ value at the chosen significance level with $(m-1)^2 = 9$ degrees of freedom (this $\chi^2$ value is calculated via a JavaScript program available online). Hence, the null hypothesis that the successive transitions are independent is rejected. Therefore, the state transition of the observed data depends on previous states, and thus possesses the property of a first order Markov chain. Thus, Theorem 1 is proved. ∎
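As a rough illustration of the dependency test, the sketch below computes the statistic $\alpha = 2\sum_{i,j} n_{ij}\ln(p_{ij}/p_j)$ from a transition count matrix, following the form of the test in [24]. The count matrix is made up for the example, the function name is hypothetical, and the 5% significance level is an assumption (the level used in the paper is not reproduced here). The critical value 16.919 is the standard $\chi^2$ table entry for 9 degrees of freedom at the 5% level.

```python
import math


def independence_statistic(counts):
    """Return (alpha, degrees_of_freedom) for a square transition count matrix."""
    m = len(counts)
    total = sum(sum(row) for row in counts)
    # Marginal probability of each column j: share of all transitions ending in j.
    col_marginal = [sum(counts[i][j] for i in range(m)) / total for j in range(m)]
    alpha = 0.0
    for i in range(m):
        row_total = sum(counts[i])
        for j in range(m):
            n_ij = counts[i][j]
            if n_ij == 0:
                continue  # a zero count contributes nothing to the sum
            p_ij = n_ij / row_total
            alpha += n_ij * math.log(p_ij / col_marginal[j])
    return 2.0 * alpha, (m - 1) ** 2


# Made-up, diagonal-heavy counts for four states (illustration only).
counts = [
    [40, 5, 0, 0],
    [5, 30, 5, 0],
    [0, 5, 30, 5],
    [0, 0, 5, 40],
]
alpha, dof = independence_statistic(counts)
CRITICAL_5_PERCENT_9_DOF = 16.919  # chi-square table value, assumed level
print(f"alpha = {alpha:.2f}, dof = {dof}")
print("reject independence:", alpha > CRITICAL_5_PERCENT_9_DOF)
```

When the rows of the count matrix are proportional to each other, every ratio $p_{ij}/p_j$ equals one and the statistic collapses to zero, which is the behavior expected under the null hypothesis.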

However, in general, solar irradiance changes significantly with time. For instance, solar irradiance is higher around noon than in the early morning and late afternoon. In this regard, we now investigate whether state transition probabilities are affected by the change of time across a day.

II-C Temporal stationarity test

Property 1.

The Markov chain constructed from the observed data is heterogeneous over time, and thus the probability of state transition changes from one time segment to the next.

To prove this property, we first divide the whole time series data set into five different intervals, where each interval comprises three hours. Now, the Markov chain will possess the temporal stationarity property if the transition probability matrices for the solar data of each time interval are approximately equal to each other [24]. To that end, the five computed transition probability matrices for the considered five segments of time are shown in Table II. (Footnote 4: We assume in designing all transition probability matrices.)

TABLE II: Table of different transition probability matrices for different intervals of the time series solar irradiance data.

As can be seen from the table, the transition matrices differ considerably from one another. Hence, the probability of transition between states of the Markov chain changes over time.
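The comparison of per-segment transition matrices can be sketched as below. This is an illustrative sketch, not the procedure used to produce Table II: the state sequence is made up, the function names are hypothetical, segments are assumed to be of equal length, transitions that cross a segment boundary are ignored, and the largest element-wise difference is just one simple way to quantify how unequal two matrices are. With five-minute data and three-hour segments, each segment would contain 36 readings; the toy example uses 6.

```python
def transition_matrix(states, n_states=4):
    """Estimate a row-normalized transition matrix from 1-based states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a - 1][b - 1] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]


def segment_matrices(states, slots_per_segment, n_states=4):
    """Split the sequence into equal-length segments and estimate one matrix each.

    Assumption: transitions across segment boundaries are not counted.
    """
    segments = [states[i:i + slots_per_segment]
                for i in range(0, len(states), slots_per_segment)]
    return [transition_matrix(seg, n_states) for seg in segments]


def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Made-up sequence: irradiance rising in the first segment, falling in the second.
states = [1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 1, 1]
mats = segment_matrices(states, slots_per_segment=6)
print("max difference between segments:", max_abs_difference(mats[0], mats[1]))
```

A difference close to zero for every pair of segments would suggest temporal stationarity; large differences, as in this toy example, point to a chain that is heterogeneous over time.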

II-D Multiple segment Markov Chain

In this section, we propose a multiple segment Markov chain approach to capture the variability of the transition probability matrix between different times. First, the total time duration of interest is divided into $K$ time segments with the same number of states, where the duration of each segment does not need to be equal to that of the other segments. Second, the transition probability matrix $P_k$ is calculated for each segment $k$ from the time series data. It is assumed that the initial state of the chain is known for the first time segment. For the rest of the segments, the final state of the previous segment will be considered as the initial state of the next.

Now, for a given state transition matrix $P_k$ of time segment $k$, the distribution over states at different time instants of the segment can be expressed as a stochastic row vector $\pi_t$ with the relation [25]

$$\pi_{t+1} = \pi_t P_k. \tag{8}$$

For instance, if at time $t$ of segment $k$ the solar irradiance state is $s_i$ (from (1)), the probability of the irradiance state at time $t+1$ can be defined as

$$\pi_{t+1} = e_i P_k, \tag{9}$$

where $e_i$ is the row vector with a one in position $i$ and zeros elsewhere, so that $\pi_{t+1}$ is the $i$th row of $P_k$.

After determining the probability distribution over states at any time $t$, the actual state is assumed to be the state with the highest probability at that time. That is,

$$s_t = \arg\max_{s_j \in \mathcal{S}} \pi_t(j). \tag{10}$$

The steps of determining solar irradiance states using a multiple segment Markov chain process are detailed in Algorithm 1.

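Initialization: Number of segments $K$.
Initialization: Time slots in each segment $T_1, \ldots, T_K$.
Initialization: Vector of transition probability matrices $[P_1, \ldots, P_K]$.
Initialization: Vector of states $[s_1, s_2, s_3, s_4]$.
Initialization: Initial state vector $\pi_0$.
for each segment $k = 1, \ldots, K$ do
   if $k = 1$ then
      Set the starting distribution of the segment to the known initial state vector $\pi_0$.
   else
      Set the starting distribution of the segment to the final state of segment $k-1$.
   end if
   for each time $t = 1, \ldots, T_k$ do
      Update the distribution over states: $\pi_t = \pi_{t-1} P_k$.
      Select the state with the highest probability: $s_t = \arg\max_j \pi_t(j)$.
   end for
end for
Algorithm 1 Algorithm to determine solar irradiance state via a multiple segment Markov chain.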
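One possible reading of Algorithm 1 is sketched below in Python. Several details are assumptions rather than statements from the paper: the distribution over states is propagated through a segment without being reset after each state selection, ties are broken in favor of the lower state index, each new segment starts from a one-hot distribution on the final state of the previous segment, and the two-state matrices are made up to keep the example short (the paper uses four states).

```python
def generate_states(matrices, slots, initial_state):
    """Generate a state sequence with a multiple segment Markov chain.

    matrices      -- list of K row-stochastic transition matrices, one per segment
    slots         -- list of K integers, number of time slots in each segment
    initial_state -- known 1-based initial state of the first segment
    """
    n = len(matrices[0])
    generated = []
    state = initial_state
    for P, n_slots in zip(matrices, slots):
        # Segment start: all probability mass on the known (or inherited) state.
        pi = [1.0 if j == state - 1 else 0.0 for j in range(n)]
        for _ in range(n_slots):
            # Row vector times matrix: pi_next[j] = sum_i pi[i] * P[i][j].
            pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
            # Pick the most probable state (first index wins on ties).
            state = max(range(n), key=lambda j: pi[j]) + 1
            generated.append(state)
    return generated


# Made-up two-state example: a "rising" segment followed by a "falling" one.
P_rising = [[0.6, 0.4], [0.0, 1.0]]
P_falling = [[1.0, 0.0], [0.4, 0.6]]
sequence = generate_states([P_rising, P_falling], slots=[3, 3], initial_state=1)
print(sequence)
```

Because the selection step takes the most probable state rather than sampling, this reading is deterministic for a given set of matrices and initial state; replacing the selection with a random draw from the distribution would be a different design and is not what the text above describes.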

III Case Study

To show the effectiveness of the proposed scheme to generate synthetic states of solar irradiance at different times of the day, we conduct numerical experiments for two cases: 1) for average time series solar data in summer and 2) for average time series solar data in winter. Both sets of data are collected from South Park, Colorado, and are accessible online at [23]. Based on the time series data, a total of four irradiance states are considered as in (1). In Table III, we show different threshold values chosen to define different states.

Thresholds in W/m$^2$
TABLE III: Threshold values for defining different solar irradiance state according to (1) and Table I.

We choose the time duration 4.30 am to 7.30 pm for summer, and 6.00 am to 6.00 pm for winter, as the total period of time over which to generate the synthetic data using the proposed scheme. We note that the solar irradiance data beyond these two intervals is too small to produce any meaningful state transition matrix. We divide the total time interval into five different segments for summer and four segments for winter according to Table IV.

Time segment | Summer | Winter
1 | 4.30 am to 7.30 am | 6.00 am to 9.00 am
2 | 7.30 am to 10.30 am | 9.00 am to 12.00 pm
3 | 10.30 am to 1.30 pm | 12.00 pm to 3.00 pm
4 | 1.30 pm to 4.30 pm | 3.00 pm to 6.00 pm
5 | 4.30 pm to 7.30 pm | -
TABLE IV: Duration of different time segments in different seasons.

The state transition matrices for different segments of time are computed, and solar irradiance states are determined using Algorithm 1. The experiments are conducted for the proposed scheme to generate the synthetic states, and are compared to the real measured states in Fig. 3 and Fig. 4.

Fig. 3: Comparison of real solar states and synthetic solar states generated by the proposed multiple segment Markov chain model in summer.
Fig. 4: Comparison of real solar states and synthetic solar states generated by the proposed multiple segment Markov chain model in winter.

In Fig. 3, we show the relation between the real observed solar states and the synthetic solar states generated via the proposed multiple segment Markov approach in summer. It can be seen that the solar state gradually increases from state one to state four as the day progresses, and eventually moves back to state one, which is due to the sunrise and sunset times of the location. According to Fig. 3, the synthetic states generated via the proposed scheme closely resemble the real states, apart from minor deviations during early morning and early afternoon. According to the data set for summer, the standard deviation of the generated synthetic states is almost equal to the standard deviation of the real observed set of states. Hence, according to the performance in Fig. 3, the proposed multiple segment Markov chain can be successfully used to produce synthetic solar data for summer-time.

To demonstrate how effectively the proposed scheme can capture solar variation in winter, we show the same comparison for the time series data set of winter in Fig. 4. We note that in calculating the states in winter we reduce the number of segments of the total duration to four, as the changes in solar irradiance are not significant enough to build transition probability matrices for more than four segments. As can be seen from Fig. 4, the performance of the proposed multiple segment Markov chain is affected by this change in solar irradiance, and the generated states match the real states less closely than in summer. In fact, in winter, the solar conditions change more abruptly than in summer, including late sunrise and early sunset, so generating states that match the real states well is very challenging. However, except for a few time steps around noon, the proposed generator produces solar states for winter that closely resemble the real states, as can be seen from Fig. 4. For winter, the standard deviation of the real data set is close to the standard deviation of the generated synthetic data. Therefore, considering the performance based on the considered data set, it can be concluded that the proposed multiple segment Markov chain approach is applicable to generating solar states in both summer and winter seasons.
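The standard deviation comparison used in both case studies can be sketched as follows. This is a minimal sketch under stated assumptions: the two state sequences are made up, the function name is hypothetical, the population standard deviation is used (the paper does not say whether the population or sample form was applied), and the fraction of mismatched time slots is an extra diagnostic added for illustration that does not appear in the paper.

```python
from statistics import pstdev


def compare_states(real, synthetic):
    """Summarize how closely a synthetic state sequence tracks the real one."""
    if len(real) != len(synthetic):
        raise ValueError("sequences must have the same length")
    mismatches = sum(1 for r, s in zip(real, synthetic) if r != s)
    return {
        "std_real": pstdev(real),            # population standard deviation
        "std_synthetic": pstdev(synthetic),
        "mismatch_fraction": mismatches / len(real),  # extra diagnostic
    }


# Made-up sequences for illustration only.
real_states = [1, 2, 3, 4, 4, 3, 2, 1]
synthetic_states = [1, 1, 3, 4, 4, 3, 2, 1]
summary = compare_states(real_states, synthetic_states)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Two sequences can share the same standard deviation while disagreeing at many time slots, so a slot-by-slot measure such as the mismatch fraction is a useful companion to the standard deviation when judging a generator.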

IV Conclusion

In this paper, a multiple segment Markov chain is studied to produce synthetic solar states at different times of the day for both summer and winter seasons. With a real time series data set, it has been shown theoretically that solar states are interdependent, and thus can be modeled by a Markov approach. It has further been proved that the probability of transition of one solar state to another state depends on time, and therefore the total time of interest is divided into multiple segments to capture this temporal dependency. Based on real solar data sets from Colorado, a probability transition matrix has been calculated for each of the segments, and an algorithm has been proposed to determine the state for each time instant of the day. The effectiveness of the proposed scheme has been demonstrated via numerical simulation, with noticeably good performance in terms of the resemblance of generated synthetic solar states with the real data set.

The proposed scheme can be extended and improved in various aspects. The number of states in the model can be better calibrated using real data from different locations, and thus can be used to propose a generic state derivation model. The theoretical relationship between the number of time segments and solar irradiance at different time slots is also worthy of further investigation. Further, by introducing a learning capability, the proposed scheme can be extended to efficiently characterize the inter-temporal behavior of the solar irradiance.


  • [1] Y. Liu, C. Yuen, S. Huang, N. U. Hassan, X. Wang, and S. Xie, “Peak-to-average ratio constrained demand-side management with consumer’s preference in residential smart grid,” IEEE Journal of Selected Topics in Signal Processing, vol. PP, no. 99, pp. 1–14, Jun 2014.
  • [2] W. Tushar, B. Chai, C. Yuen, D. B. Smith, K. L. Wood, Z. Yang, and H. V. Poor, “Three-party energy management with distributed energy resources in smart grid,” IEEE Transactions on Industrial Electronics, 2014, (To appear).
  • [3] N. U. Hassan, M. A. Pasha, C. Yuen, S. Huang, and X. Wang, “Impact of scheduling flexibility on demand profile flatness and user inconvenience in residential smart grid system,” Energies, vol. 6, no. 12, pp. 6608–6635, Dec 2013.
  • [4] Y. I. Khalid, N. U. Hassan, C. Yuen, and S. Huang, “Demand Response Management For Power Throttling Air Conditioning Loads In Residential Smart Grids,” in Proc. of IEEE International Conference on Smart Grid Communications (SmartGridComm), Venice, Italy, Nov. 2014, pp. 1–6.
  • [5] B. Chai and Z. Yang, “Impacts of unreliable communication and modified regret matching based anti-jamming approach in smart microgrid,” Elsevier Ad Hoc Networks, 2014, (to appear).
  • [6] B. Chai, J. Chen, Z. Yang, and Y. Zhang, “Demand response management with multiple utility companies: A two-level game approach,” IEEE Transactions on Smart Grid, vol. 5, no. 2, pp. 722–731, March 2014.
  • [7] R. Khan, J. Brown, and J. Khan, “Pilot protection schemes over a multi-service WiMAX network in the smart grid,” in Proc. of IEEE International Conference on Communications Workshops (ICC), Budapest, Hungary, June 2013, pp. 994–999.
  • [8] W. Tushar, C. Yuen, B. Chai, D. B. Smith, and H. V. Poor, “Feasibility of using discriminate pricing schemes for energy trading in smart grid,” in Proc. of IEEE Global Communications Conference (GLOBECOM), Austin, TX, Dec. 2014, pp. 1–7.
  • [9] Y. Liu, N. Hassan, S. Huang, and C. Yuen, “Electricity cost minimization for a residential smart grid with distributed generation and bidirectional power transactions,” in Proc. of IEEE PES Innovative Smart Grid Technologies (ISGT), Washington, DC, Feb. 2013, pp. 1–6.
  • [10] X. Fang, S. Misra, G. Xue, and D. Yang, “Smart grid - The new and improved power grid: A survey,” IEEE Communications Surveys Tutorials, vol. 14, no. 4, pp. 944–980, Oct. 2012.
  • [11] J. E. Burns and J.-S. Kang, “Comparative economic analysis of supporting policies for residential solar PV in the United States: Solar renewable energy credit (SREC) potential,” Energy Policy, vol. 44, pp. 217–225, May 2012.
  • [12] E. Koutroulis, D. Kolokotsa, A. Potirakis, and K. Kalaitzakis, “Methodology for optimal sizing of stand-alone photovoltaic/wind-generator systems using genetic algorithms,” Solar Energy, vol. 80, no. 9, pp. 1072–1088, Sep. 2006.
  • [13] A. Chatterjee, A. Keyhani, and D. Kapoor, “Identification of photovoltaic source models,” IEEE Transactions on Energy Conversion, vol. 26, no. 3, pp. 883–889, Sept 2011.
  • [14] R. Wies, R. Johnson, A. Agrawal, and T. Chubb, “Simulink model for economic analysis and environmental impacts of a PV with diesel-battery system for remote villages,” IEEE Transactions on Power Systems, vol. 20, no. 2, pp. 692–700, May 2005.
  • [15] S. K. N. Phuangpornpitak, “PV hybrid systems for rural electrification in thailand,” Renewable and Sustainable Energy Reviews, vol. 11, no. 7, pp. 1530–1543, Sep. 2007.
  • [16] S. Huang, J. Xiao, J. Pekny, G. Reklaitis, and A. Liu, “Quantifying system-level benefits from distributed solar and energy storage,” Journal of Energy Engineering, vol. 138, no. 2, pp. 33–42, June 2012.
  • [17] N. A. Ahmed, M. Miyatake, and A. K. Al-Othman, “Power fluctuations suppression of stand-alone hybrid generation combining solar photovoltaic/wind turbine and fuel cell systems,” Energy Conversion and Management, vol. 49, no. 10, pp. 2711–2719, Oct. 2008.
  • [18] T. Kottas, Y. Boutalis, and A. Karlis, “New maximum power point tracker for PV arrays using fuzzy controller in close cooperation with fuzzy cognitive networks,” IEEE Transactions on Energy Conversion, vol. 21, no. 3, pp. 793–803, Sept 2006.
  • [19] Z. Dong, D. Yang, T. Reindl, and W. M. Walsh, “Short-term solar irradiance forecasting using exponential smoothing state space model,” Energy, vol. 55, pp. 1104–1113, May 2013.
  • [20] A. Y. Saber and G. K. Venayagamoorthy, “Plug-in vehicles and renewable energy sources for cost and emission reductions,” IEEE Transactions on Industrial Electronics, vol. 58, no. 4, pp. 1229–1238, Apr. 2011.
  • [21] F. Y. Ettoumi, H. Sauvageot, and A. E. H. Adane, “Statistical bivariate modelling of wind using first-order Markov chain and Weibull distribution,” Renewable Energy, vol. 28, pp. 1787–1802, Dec. 2003.
  • [22] K. Brokish and J. Kirtley, “Pitfalls of modeling wind power using markov chains,” in IEEE PES Power Systems Conference and Exposition, Seattle, WA, March 2009, pp. 1–6.
  • [23] “South park mountain data,” website,$&$, (accessed on April 7, 2014).
  • [24] A. Shamshad, M. A. Bawadi, W. M. A. W. Hussin, T. A. Majid, and S. A. M. Sanusi, “First and second order Markov chain models for synthetic generation of wind data time series,” Energy, vol. 30, no. 5, pp. 693–708, Apr 2005.
  • [25] S. P. Meyn and R. L. Tweedie, Markov chains and stochastic stability.   Springer, London, 1993.