Reliability measures for indexed semi-Markov chains applied to wind energy production

# Reliability measures for indexed semi-Markov chains applied to wind energy production

## Abstract

The computation of the dependability measures is a crucial point in the planning and development of a wind farm. In this paper we address the issue of energy production by wind turbine by using an indexed semi-Markov chain as a model of wind speed. We present the mathematical model, we describe the data and technical characteristics of a commercial wind turbine (Aircon HAWT-10kW). We show how to compute some of the main dependability measures such as reliability, availability and maintainability functions. We compare the results of the model with real energy production obtained from data available in the Lastem station (Italy) and sampled every 10 minutes.

###### keywords:
semi-Markov chains; synthetic time series; dependability analysis
1

## 1 Introduction

Wind is one of the most important renewable energy sources. Wind energy is produced by converting the kinetic energy of wind into electrical energy by means of a generator. For this reason it is important to dispose of an efficient stochastic model for wind speed changes. In a recent paper dami13b () the authors showed that an indexed semi-Markov chain (ISMC) reproduces almost exactly the statistical features of wind speed. In particular, it was shown that density and autocorrelation functions of a time series of real wind speed and those obtained by the ISMC model through Monte Carlo simulation were almost indistinguishable.

In this work we use the ISMC model to compute dependability measures as availability, reliability and maintainability functions for the index semi-Markov process. These indicators give important information on the feasibility of the investment in a wind farm by giving the possibility to quantify the uncertainty in the wind energy production.

Another important aspect is related to the location of the wind farm. In fact, today many wind farm are built offshore for different reasons: the wind speed is more powerful and constant due to the absence of obstacles, the visual, environmental and acoustic impact is cut down. The maintenance cost, instead, is higher than the onshore wind farm. A good stochastic model can help the planning of preventive maintenance suggesting when it is suitable to do the maintenance operation.

The results presented here are new and generalize some of the results obtained for semi-Markov chain(see barb04 (); Bla04 (); limn03 (); dami13a (); dami09 (); dami11 (); dami11b ()). The model generalizes also Markov chains and renewal models.

We apply our model to a real case of energy production. For this reason we choose a commercial wind turbine, a 10 kW Aricon HAWT assumed to be installed at the station of L.S.I -Lastem which is situated in Italy.

The paper is organized as follows. Section 2 presents some definitions and notation on the indexed semi-Markov chain. Section 3 describes the database and the technical characteristics of the commercial wind turbine. Section 4 shows the way in which it is possible to compute the dependability measures via kernel transformations and the value computed on the real data and on the synthetic data are compared. In the last section some concluding remarks and possible extensions are presented.

## 2 The indexed semi-Markov process

In this section, we give a short review of a particular indexed semi-Markov model advanced in dami13b () as a novel synthetic time series generation method for wind speed data which is able to capture the persistence in the wind speed data and we give new probabilistic result about the transition probability function.
Let us consider the stochastic process

 J−(m+1),J−m,J−(m−1),...,J−1,J0,J1,... (1)

with a finite state space . In our framework the random variable describes the wind speed at the -th transition, that is at the -th change of speed value and, the state space , describes the discretized wind speeds, see Section 3 for a specific choice of .
Let us also consider the stochastic process

 T−(m+1),T−m,T−(m−1),...,T−1,T0,T1,... (2)

with values in . The random variable describes the time in which the -th change of value of the wind speed occurs. We denote the stochastic process of the sojourn times in wind speed state before the th jump. Thus we have for all .
Let us consider another stochastic process

 V−(m+1),V−m,V−(m−1),...,V−1,V0,V1,... (3)

with values in . The random variable describes the value of the index process at the th transition:

 Vmn=1Tn−Tn−(m+1)m∑k=0Xn−1−k∑s=1f(Jn−1−k,s), (4)

where is a generic function and are known and non-random.
The function depends on the state of the system and on the time .
The process can be interpreted as a moving average of the accumulated reward process with the function as a measure of the permanence reward per unit time.
It should be noted that the order of the moving average is on the number of transitions which is fixed. Anyway, the moving average is executed on time windows of variable length because each transition has a random sojourn time of permanence in state before the next jump.
The indexed model is fully specified once the dependence structure between the variables is assumed. Toward this end we adopt the following assumption:

 P[Jn+1=j,Tn+1−Tn≤t|σ(Jh,Th),h=−m,...,0,...,n,Jn=i,Vmn=v] (5) =P[Jn+1=j,Tn+1−Tn≤t|Jn=i,Vmn=v]=:Qmij(v;t),

where represents the set of past values, of the processes . The matrix of functions is called - .
The joint process , which is embedded in the indexed semi-Markov kernel, depends on the moving average process , the latter acts as a stochastic index. Moreover, the index process depends on through the functional relationship .
Observe that if

 P[Jn+1=j,Tn+1−Tn≤t|Jn=i,Vλn=v]=P[Jn+1=j,Tn+1−Tn≤t|Jn=i], (6)

for all values of the index process, then the indexed semi-Markov kernel degenerates in an ordinary semi-Markov kernel and the model becomes equivalent to classical semi-Markov chain models as presented for example in jans06 (); barb08 (); dami12c (); ciardo90 (); dami09b (); csenk95 (). The dependence of the process on the new variable is introduced in order to capture the effect of the past transitions on the future ones for those processes which are strongly autocorrelated.
One of the main problem is the proposal of a particular choice of the index process which is useful in describing the wind speed process. To this end we need to choose a specific form of the function .
The choice is motivated by some physical reasons and by model simplicity.

Let us briefly remind that wind speed data are long range positively autocorrelated. This implies that there are periods of high and low speed. Motivated by the empirical facts in dami13b () we supposed that also the transition probabilities from current wind speed to the next one depends on whether the wind is, on average, in a high speed period or in a low one. We then fixed the function to be the wind speed itself, i.e.

 f(Jn−1−k,s)=Jn−1−k, (7)

for all . Consequently, substituting (7) in equation (4) and considering that is constant in we obtain

 Vmn=1Tn−Tn−(m+1)m∑k=0Jn−1−k⋅Xn−1−k=m∑k=0Jn−1−k⋅(Tn−k−Tn−1−kTn−Tn−(m+1)). (8)

In this simple case the index process expresses a moving average of order executed on the series of the wind speed values with weights given by the fractions of sojourn times in that wind speed , with respect to the interval time on which the average is executed .
The consequence of the choice (7) is that the assumption (5) now states that:

 Qmij(v;t)=P[Jn+1=j,Tn+1−Tn≤t|Jn=i,Vmn=m∑k=0Jn−1−k(Tn−k−Tn−1−kTn−Tn−(m+1))=v]. (9)

Relation asserts that the knowledge of the last wind speed value and of the weighted moving average of order of past wind speeds suffices to give the conditional distribution of the couple , whatever the values of the past variables might be. Essentially we consider that the average of the past speeds contains most of the information needed to establish the probability of the next wind speed transition. We will show later that, with this assumption, the model is able to capture the temporal dependence structure of real data.
We introduce now auxiliary probabilities which are helpful in the sequel of the analysis. Denote by

 pmij(v):=P[Jn+1=j|Jn=i,Vmn=v].

They represent the transition probabilities of the embedded indexed Markov chain. More precisely denotes the probability that the next transition is into wind speed given that at current time the wind speed process entered state and the index process had value . It is simple to realize that

 pmij(v)=limt→∞Qmij(v;t). (10)

Let be the sojourn time cumulative distribution in wind speed state :

 Hmi(v;t):=P[Tn+1−Tn≤t|Jn=i,Vmn=v]=∑j∈EQmij(v;t). (11)

It expresses the probability to change the actual wind speed in a time less or equal to given the indexed process has value .
It is useful to consider also the conditional waiting time distribution function which expresses the following probability:

 Gmij(v;t):=P[Tn+1−Tn≤t∣Jn=i,Jn+1=j,Vmn=v]. (12)

It is simple to establish that

 Gmij(v;t)=⎧⎪⎨⎪⎩ Qmij(v;t)pmij(v)if pmij(v)≠01if pmij(v)=0. (13)

To describe the behavior of our model at whatever time we need to define additional stochastic processes.
Given the three-dimensional process and the indexed semi-Markov kernel , we define by

 N(t)=sup{n∈N:Tn≤t}; (14) Z(t)=JN(t); Vm(t)=1t−T(N(t)−θ)−mm∑k=0J(N(t)−θ)−k⋅(t∧T(N(t)−θ)+1−k−T(N(t)−θ)−k)

where and .
The stochastic processes defined in represent the number of transitions up to time , the state of the system (wind speed) at time and the value of the index process (moving average of function of wind speed) up to , respectively. We refer to as an indexed semi-Markov chain.
The process extends the process because the time can be a transition or a waiting time. It is simple to realize that, , if we have that .
Let us introduce the stochastic process

 B(t)=t−TN(t), (15)

which is called backward recurrence time process and denotes the time since the last transition.
This process is very important in a semi-Markovian framework and it is well known that the transition probabilities of a semi-Markov process depend on the value of the recurrence time process, see e.g. dami09 (); dami11c ().
This is due to the fact that the conditional waiting time distribution functions can be of any type and then, also no memoryless distributions can be used. In this case, the time since the last transition of the system (backward value) influences the system’s transition probabilities; this is the so called duration effect.
To properly assess the probabilistic behavior of the system, we need to introduce the transition probability function. To this end, it is useful to introduce the following notation:

 J0−m−1=(J0,J−1,...,J−m−1),
 T0−m−1=(T0,T−1,...,T−m−1).

With the equality we denote the fact that

 (J0=j0,J−1=j1,...,J−m−1=j−m−1),

and similarly with we indicate that

 (T0=t0,T−1=t1,...,T−m−1=t−m−1).

Finally by we denote the couple of ordered vectors .

###### Definition 1

The transition probability function of the indexed semi-Markov chain is the function defined by

 bϕ(i0−m−1;j)(t0−m−1;u,t):=bϕ(i−m−1,i−m,...i0;j)(t−m−1,t−m,...,t0;u,t), (16)

where, and .

The transition probability function expresses the probability the ISMC occupies state at time given that at current time it is in state where it entered with last transition at time having previously visited the states at times .
We call the transition probabilities transition probabilities with initial backward times for the ISMC model. In fact, at the current time the backward recurrence time process assumes value:

 B(t0+u)=t0+u−TN(t0+u)=t0+u−T0=t0+u−t0=u.

If we set in we have the probability

 ϕ(i0−m−1;j)(t0−m−1;t):=P[Z(t)=j|J0=i0,...,J−m−1=i−m−1,T0=t0,...,T−m−1=t−m−1]. (17)

which denotes the probability that the ISMC occupies state at time given that at current time it is entered into state having previously visited the states at times .
The following result consists in a recursive formula for computing the transition function of the ISMC .

###### Proposition 2

The probabilities verify the following equation:

 bϕ(i0−m−1;j)(t0−m−1;u,t) (18) +∑i1∈Et∑t1=t0+u+1qi0i1(∑mk=0i−k−1⋅(t−k−t−k−1t0−t−m−1);t1−t0)(1−Hmi0(∑mk=0i−k−1⋅(t−k−t−k−1t0−t−m−1);u))bϕ(i1−m;j)(t1−m;t−t1),

where is the Kronecker delta and

 qir(v;s)=Qir(v;s)−Qir(v;s−1).

Proof First of all let us compute the value of the index process given the information set . Because is a transition time, we have that . Moreover and . Then, we have that

 Vm(t0)=m∑k=0m∑k=0i−k−1t−k−t−k−1t0−t−m−1. (19)

Now, being the events and disjoint, it follows that

 P[Z(t)=j|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] (20) =P[Z(t)=j,T1>t|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] +P[Z(t)=j,T1≤t|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1].

Observe that

 P[Z(t)=j,T1>t|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] (21) =P[T1>t|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] ⋅P[Z(t)=j|T1>t,T1>t0+u,(J,T)0−m−1=(i,t)0−m−1].

The first factor on the right hand side of is

 P[T1>t|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] (22) =1−Hmi0(∑mk=0i−k−1t−k−t−k−1t0−t−m−1;t−t0)1−Hmi0(∑mk=0i−k−1t−k−t−k−1t0−t−m−1;u).

The second factor on the right hand side of is simply

 P[Z(t)=j|T1>t,T1>t0+u,(J,T)0−m−1=(i,t)0−m−1]=δi0j, (23)

because being the time of next transition exceeds and, therefore, up to the process remains in state . Consequently the probability is one if and only if otherwise it will be equal to zero.
Now let us consider the computation of the second addend on the right hand side of formula :

 P[Z(t)=j,T1≤t|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] (24) =∑i1∈Et∑t1=t0+u+1P[Z(t)=j,J1=i1,T1=t1|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] =∑i1∈Et∑t1=t0+u+1P[Z(t)=j|J1=i1,T1=t1,T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] ×P[J1=i1,T1=t1|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1] =∑i1∈Et∑t1=t0+u+1bϕ(i1−m;j)(t1−m;t−t1) ×P[J1=i1,T1=t1|(J,T)0−m−1=(i,t)0−m−1]P[T1>t0+u|(J,T)0−m−1=(i,t)0−m−1]
 =∑i1∈Et∑t1=t0+u+1qi0i1(∑mk=0i−k−1⋅(t−k−t−k−1t0−t−m−1);t1−t0)(1−Hmi0(∑mk=0i−k−1⋅(t−k−t−k−1t0−t−m−1);u))bϕ(i1−m;j)(t1−m;t−t1). (25)

If we set in equation we obtain the evolution equation for the transition probabilities :

 ϕ(i0−m−1;j)(t0−m−1;t) (26) =δi0j(1−Hmi0(m∑k=0i−k−1⋅(t−k−t−k−1t0−t−m−1);t−t0))

As it is possible to see from the previous propositions, the transition probabilities are function of the last states and times of transition, then our ISMC model can also be considered as a type of order semi-Markov chain model. Anyway higher order semi-Markov models are associated with a dramatic increase in the model parameters that rarely is possible to estimate from real data. The ISMC model overcomes this problem because the influence of the past states and times is summarized efficiently with the introduction of the index process that computes a weighted average of the past states. This is particularly evident once we remark that the are the unknown functions of formula whereas the remaining quantities are known once we dispose of the indexed semi-Markov kernel which is the only quantity to be estimated from data. From formula we see that the influence of past states and times is summarized by the index process . Having estimated it is possible, first to compute the sojourn time cumulative distribution through formula and then, by solving equation the transition probability are obtained.

## 3 Database and commercial wind turbine

As in our previous works dami13b (); dami13a (); dami13c () we used a free database of wind speed sampled in a weather station situated in Italy at N 45 28’ 14,9” E 9 22’ 19,9” and at 107 of altitude. The station uses a combined speed-direction anemometer at 22 above the ground. It has a measurement range that goes from 0 to 60 , a threshold of 0,38 and a resolution of 0,05 . The station processes the speed every 10 minute in a time interval ranging from 25/10/2006 to 28/06/2011. During the 10 minutes are performed 31 sampling which are then averaged in the time interval. In this work, we use the sampled data that represents the average of the modulus of the wind speed () without a considered specific direction. This database is then composed of about 230thousands wind speed measures ranging from 0 to 16 .

To be able to model the wind speed as a semi-Markov process, the state space of wind speed has been discretized. In the example shown in this work we discretize wind speed into 8 states (see Table 1) chosen to cover all the wind speed distribution. This choice is done by considering a trade off between accuracy of the description of the wind speed distribution and number of parameters to be estimated. An increase of the number of states better describes the process but requires a larger dataset to get reliable estimates and it could also be not necessary for the accuracy needed in forecasting future wind speeds. Note also that, in the database used in this work, there are very few cases where the wind speed exceeds . We stress that the discretization should be chosen according to the database to be used.

We apply our model to a real case of energy production. For this reason we choose a commercial wind turbine, a 10 kW Aircon HAWT with a power curve given in Figure 1. The power curve of a wind turbine represents how much energy it produces as a function of the wind speed. In this case, see Figure 1, there is a cut in speed at 2 , instead the wind turbine produces energy almost linearly from 3 to 10 , then, with increasing wind speed the energy production remains constant until the cut out speed at 32 , in which the wind turbine is stopped for structural reason. Then the power curve acts as a filter for the wind speed. In the database used for our analysis the wind speed does never exceed 16 and it is seldom over 8 , this is why the discretization is performed according to Table 1 and the analysis never reached the cut out speed.

Through this power curve we can know how much energy is produced as a function of the wind speed at a given time.

## 4 Reliability theory for the ISMC model of wind speed

In this section we define and compute reliability measures for the ISMC model.
Let be partitioned into sets and , so that:

 E=U∪D,∅=U∩D,U≠∅,U≠E.

The subset contains all ”Up” states in which the system is working and subset all ”Down” states in which the system is not working well or has failed. In the wind speed model the Up states are those for which the wind speed is sufficiently high to allow the production of energy or not excessive high such that the turbine should be turned off.
In the following we present both the typical indicators used in reliability theory and also their application.
The three indicators that we evaluate are the availability, reliability and maintainability functions, they were extensively studied among others by barl75 (); lisn03 () and related to semi-Markov models by barb04 (); Bla04 (); dami09b (); dami13a (); pere02 (); limn99 ().

(i) the point wise availability function with backward time giving the probability that the system is working on at time whatever happens on .
In our model the availability is defined as follows:

 bAi0−m−1(t0−m−1;u,t) (27) =P[Z(t)∈U|T1>t0+u,(J,T)0−m−1=((i,t)0−m−1)],

and gives the probability that at time the wind turbine produces energy conditional on the last wind speed values registered at the times and on the no occurrence of next transition up to time .

(ii) the reliability function with backward time giving the probability that the system was always working from time to time : In our model the availability is defined as follows:

 bRi0−m−1(t0−m−1;u,t) (28) =P[Z(u)∈U,∀u∈(t0,t]|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1]

and gives the probability that the wind turbine will always produce energy from time up to time conditional on the last wind speed values registered at the times and on the no occurrence of next transition up to time .

(iii) the maintainability function with backward time giving the probability that the system will leave the set within the time being in at time :

 bMi0−m−1(t0−m−1;u,t) (29) =P[∃u∈(t0,t]:Z(u)∈U|T1>t0+u,(J,T)0−m−1=(i,t)0−m−1]

and gives the probability that the turbine will produce energy at least once from time up to time conditional on the last wind speed values registered at the times and on the no occurrence of any transition up to time .

These three probabilities can be computed in the following way if the process is an indexed semi-Markov chain of kernel .

(i) the point wise availability function
to compute these probabilities it is sufficient to use the following formula:

 bAi0−m−1(t0−m−1;u,t)=∑j∈Ubϕ(i0−m−1;j)(t0−m−1;u,t). (30)

(ii) the reliability function

to compute these probabilities, we will now work with another cumulated kernel for which all the states of the subset are changed into absorbing states by considering the following transformation:

 ^pmi,j(v)= ⎧⎪ ⎪⎨⎪ ⎪⎩1ifi∈D,j=i0ifi∈D,j∉Dpmi,j(v)otherwise. (31)

is given by solving the evolution equation of the indexed semi-Markov chain but now with the kernel .
The related formula will be:

 bR(i0−m−1;j)(t0−m−1;u,t)=∑j∈Ub^ϕi0−m−1(t0−m−1;u,t) (32)

where are the transition probabilities of the process with all the states in that are absorbing, i.e. with cumulated kernel .

(iii) the maintainability function :
to compute these probabilities we will now work with another cumulated kernel for which all the states of the subset are changed into absorbing states by considering the following transformation:

 ~pmij(v;t)= ⎧⎪ ⎪⎨⎪ ⎪⎩1ifi∈U,j=i0ifi∈U,j≠ipmij(v;t)otherwise. (33)

is given by solving the evolution equation of an indexed semi-Markov chain but now with the cumulated kernel .
The related formula for the maintainability function will be:

 bMi0−m−1(t0−m−1;u,t)=∑j∈Ub~ϕ(i0−m−1;j)(t0−m−1;u,t) (34)

where are the transition probability of the process with all the states in that are absorbing, i.e. with cumulated kernel .

Unfortunately it is impossible to give a graphical representation of these three indicators because they depends on too many parameters ( states , times and one backward value ), then in order to show the capacity of the model to correctly reproduce the reliability measures we define unconditional reliability measures and we compute them by evaluating the discrepancies between real data and model results.
Let us define the set

 Hx0,t0(v)={(i0−m−1,t0−m−1)∣i0=x0,t0=y0,Vm(t0)=v},

then we define the unconditional availability function

 bAHx0,t0(v)(u,t) (35) =P[Z(t)∈U|T1>t0+u,(J,T)0−m−1=((a,s)0−m−1)∈Hx0,t0(v)].

The unconditional availability gives the probability that at time the wind turbine produces energy conditional on the occupancy at the current time of the state with the index process having value and the backward value equal to . By using properties of the conditional probabilities it is easy to realize that

 bAHx0,t0(v)(u,t)=∑((a,s)0−m−1)∈Hx0,t0(v)P[(J,T)0−m−1=((a,s)0−m−1)] (36) ×P[Z(t)∈U|T1>t0+u,(J,T)0−m−1=((a,s)0−m−1)] =∑((a,s)0−m−1)∈Hx0,t0(v)μ((a,s)0−m−1)bAa0−m−1(s0−m−1;u,t).

where is the initial distribution of the states occupied at times .
The same definitions apply for the reliability and the maintainability functions.
We define the unconditional reliability function

 bRHx0,t0(v)(u,t) (37) =P[Z(u)∈U,∀