A Generalization of the Exponential-Logarithmic Distribution for Reliability and Life Data Analysis
In this paper, we introduce a new two-parameter lifetime distribution, called the exponential-generalized truncated logarithmic (EGTL) distribution, by compounding the exponential and generalized truncated logarithmic distributions. Our procedure generalizes the exponential-logarithmic (EL) distribution of Tahmasbi & Rezaei (2008), which models the reliability of systems through first-order concepts, i.e. the minimum lifetime. In our approach, we assume that a system fails once a given number $k$ of its components have failed, and we therefore consider the $k$th-smallest lifetime instead of the minimum lifetime.
The reliability and failure rate functions, as well as their properties, are presented for some special cases. The parameters are estimated by maximum likelihood, the expectation-maximization algorithm, the method of moments and the Bayesian approach, and a simulation study illustrates the different estimation methods. The application study is based on two real data sets used in many applications of reliability.
Keywords: Lifetime distributions, reliability, failure rate, order statistics, exponential distribution, truncated logarithmic distribution.
1 Introduction
Lifetime distributions are often used in reliability theory and survival analysis for modelling real data. They play a fundamental role in reliability in diverse disciplines such as finance, manufacturing, biological sciences, physics and engineering. The exponential distribution is a basic model in reliability theory and survival analysis. It is often used to model system reliability at a component level, assuming the failure rate is constant (Balakrishnan & Basu, 1995; Barlow & Proschan, 1975; Sinha & Kale, 1980). In recent years, a growing number of papers have been devoted to lifetime distributions that accommodate increasing or decreasing failure rate functions. The motivation is to give a parametric fit for real data sets where the underlying failure rates, arising from a latent competing risks problem, are monotone but nonconstant. These distributions are introduced as extensions of the exponential distribution, following Adamidis & Loukas (1998) and Kuş (2007), by compounding some useful lifetime and truncated discrete distributions (for a review, see Barreto-Souza & Cribari-Neto (2009); Chahkandi & Ganjali (2009); Silva et al. (2010); Barreto-Souza et al. (2011); Cancho et al. (2011); Louzada-Neto et al. (2011); Morais & Barreto-Souza (2011); Hemmati et al. (2011); Nadarajah et al. (2013); Bakouch et al. (2014), and others). Their genesis lies in competing risks scenarios with latent risks, i.e. there is no information about the cause of a component's failure (Basu & Klein, 1982). In fact, a system may experience multiple failure processes that compete against each other, and whichever occurs first can cause the system to fail (Rafiee et al., 2017; Kalbfleisch & Prentice, 2002; Andersen et al., 2002; Tsiatis, 1998). The term competing risks refers to duration data where two or more causes compete to determine the observed time-to-failure. The potential multiple causes of failure are not mutually exclusive, but the interest lies in the time to the first one to occur (Putter et al., 2007; Bakoyannis & Touloumi, 2012). For further details see Basu (1981).
In the same way, the exponential-logarithmic (EL) distribution was proposed by Tahmasbi & Rezaei (2008) as a log-series mixture of exponential random variables. This two-parameter distribution with decreasing failure rate (DFR) is obtained by mixing the exponential and logarithmic distributions. It models the reliability of a system whose time-to-failure is driven by an unknown number of initial defects in some components. Suppose the breakdown (failure) of a system of components occurs due to the presence of a non-observable number, $N$, of initial defects of the same kind, that can be identified only after causing failure and are repaired perfectly (Adamidis & Loukas, 1998; Kuş, 2007). Let $T_i$ be the failure time of the system due to the $i$th defect, for $i = 1, \dots, N$. If we assume that $T_1, \dots, T_N$ are iid exponential random variables independent of $N$, which follows a truncated logarithmic distribution, then the time to the first failure, $\min(T_1, \dots, T_N)$, is adequately modelled by the EL distribution (Barreto-Souza & Silva, 2015; Bourguignon et al., 2014; Ross, 1976). For reliability studies, $\min(T_1, \dots, T_N)$ and $\max(T_1, \dots, T_N)$ are used respectively in serial and parallel systems with identical components (Chahkandi & Ganjali, 2009; Ramos et al., 2015). However, one may determine the distribution of the $k$th-smallest value of the time-to-failure ($k$th order statistic), instead of the minimum lifetime (first order statistic).
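The compounding construction described above can be checked by simulation. The following is a minimal sketch, not the authors' code: it assumes the Tahmasbi & Rezaei (2008) parameterization, with shape $p \in (0,1)$, scale $\beta > 0$, log-series pmf $P(N=n) = (1-p)^n / (-n \ln p)$ and EL cdf $F(x) = 1 - \ln(1-(1-p)e^{-\beta x}) / \ln p$. The function names are hypothetical.

```python
import math
import random

def sample_logseries(p, rng):
    """Draw N with P(N = n) = (1 - p)**n / (-n * ln p), n = 1, 2, ..., by inversion."""
    u = rng.random()
    n = 1
    prob = (1.0 - p) / (-math.log(p))      # P(N = 1)
    cum = prob
    while u > cum and n < 10_000:
        # P(N = n + 1) = P(N = n) * (1 - p) * n / (n + 1)
        prob *= (1.0 - p) * n / (n + 1)
        n += 1
        cum += prob
    return n

def el_cdf(x, p, beta):
    """EL cdf in the (p, beta) parameterization assumed in the lead-in."""
    return 1.0 - math.log(1.0 - (1.0 - p) * math.exp(-beta * x)) / math.log(p)

def simulate_first_failure(p, beta, size, seed=0):
    """Time to first failure: minimum of N iid exponential(beta) lifetimes."""
    rng = random.Random(seed)
    out = []
    for _ in range(size):
        n = sample_logseries(p, rng)
        out.append(min(rng.expovariate(beta) for _ in range(n)))
    return out

sample = simulate_first_failure(p=0.5, beta=1.5, size=20000)
x0 = 0.4
empirical = sum(t <= x0 for t in sample) / len(sample)
print(round(empirical, 3), round(el_cdf(x0, 0.5, 1.5), 3))
```

The two printed numbers should agree up to Monte Carlo error, which is what the mixture representation predicts.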
There is a large literature on order statistics in reliability engineering (for a review, see Barlow & Proschan (1975); Sarhan & Greenberg (1962); Barlow & Proschan (1965); Pyke (1965); Gnedenko et al. (1969); Pledger & Proschan (1971); Barlow & Proschan (1981); David (1981); Bain & Englehardt (1991), and references contained therein). The motivation arises in reliability theory, where the so-called $k$-out-of-$n$ systems are studied (Xie & Wang, 2008; Xie et al., 2005; Proschan & Sethuraman, 1976; Kim et al., 1988). An engineering system consisting of $n$ components is working if at least $k$ out of the total $n$ components are operating, and it breaks down if $n-k+1$ or more components fail. Hence, a $k$-out-of-$n$ system fails at the time of the $(n-k+1)$th component failure (Barlow & Proschan, 1975; Kamps, 1995; Cramer & Kamps, 2001). This binary-state context is based on the assumption that a system or its components can be either fully working or completely failed. However, in reality, a system may provide its specified function at less than full capacity when some of its components operate in a degraded state (Ramirez-Marquez & Coit, 2005). The binary $k$-out-of-$n$ system reliability models have been extended to multi-state $k$-out-of-$n$ system reliability models by allowing more than two performance levels for the system and its components (Eryilmaz, 2014). Multi-state systems contain units presenting two or more failed states with multiple modes of failure and one working state (Anzanello, 2009). Multi-state reliability models provide more realistic representations of engineering systems (Yingkui & Jing, 2012). Many authors have contributed reliability estimation approaches for multi-state systems (Jenney & Sherwin, 1986; Page & Perry, 1988; Rocco et al., 2005; Ramirez-Marquez & Coit, 2004; Ramirez-Marquez & Levitin, 2008; Levitin, 2007; Levitin & Amari, 2008).
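As a small illustration of the binary $k$-out-of-$n$ rule described above, the sketch below returns the failure time of such a system from the component lifetimes. The function name and the example lifetimes are hypothetical.

```python
def k_out_of_n_failure_time(lifetimes, k):
    """Failure time of a binary k-out-of-n system.

    The system works while at least k of its n components work, so it
    fails at the (n - k + 1)th component failure.
    """
    n = len(lifetimes)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Index n - k (0-based) of the sorted list is the (n - k + 1)th smallest lifetime.
    return sorted(lifetimes)[n - k]

lifetimes = [5.0, 1.0, 3.0, 4.0, 2.0]
series = k_out_of_n_failure_time(lifetimes, k=5)       # n-out-of-n: fails at the first failure
parallel = k_out_of_n_failure_time(lifetimes, k=1)     # 1-out-of-n: fails at the last failure
two_of_five = k_out_of_n_failure_time(lifetimes, k=2)  # fails at the 4th failure
```

The two extreme cases recover the serial (minimum) and parallel (maximum) systems mentioned earlier.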
In this paper, we generalize the EL distribution (Tahmasbi & Rezaei, 2008), which models the time to the first failure, to a distribution more appropriate for modelling any order statistic (second, third, or any $k$th lifetime). For instance, suppose a machine produces a random number, $N$, of light bulbs or wire fuses which are put through a life test. Each item has a random lifetime $T_i$, $i = 1, \dots, N$. The EL distribution (Tahmasbi & Rezaei, 2008) is focused only on the minimum time-to-failure, i.e. the first of the functioning components to fail. However, we may be interested in the duration until the $k$th failure and then determine the lifetime distribution for the $k$th order statistic, assuming the system will fail if $k$ of the $N$ units fail. We may let $T_{(1)} \leq T_{(2)} \leq \dots \leq T_{(N)}$ be the order statistics of $N$ independent observations of the time $T$ and then consider the $k$th-smallest value of lifetime instead of the minimum lifetime. We assume that the $T_i$'s are not observable, but that $T_{(k)}$ is. We would like to estimate the lifetime distribution given the observations on $T_{(k)}$. The proposed new family of lifetime distributions is obtained by compounding the exponential and generalized truncated logarithmic distributions, and is named the exponential-generalized truncated logarithmic (EGTL) distribution.
The paper is organized as follows. In section 2, we present the new family of lifetime distributions and the probability density function (pdf) for some special cases. The moment generating function, the $r$th moment, the reliability, the failure rate function and the random number generation are also discussed in this section. The estimation of the parameters of this new family of distributions is discussed in section 3. It is carried out by maximum likelihood, with the estimates (MLEs) computed through the expectation-maximization (EM) algorithm. The method of moments and the Bayesian approach are also presented as possible alternatives to maximum likelihood. As an illustration of these three methods of estimation, numerical computations are performed in section 4. The application study, based on two real data sets, is presented in section 5. The last section concludes the paper.
2 Properties of the distribution
The new family of lifetime distributions is derived by generalizing the compounding of the exponential and truncated logarithmic distributions as follows. Let $T_1, \dots, T_N$ be iid exponential random variables with scale parameter $\beta > 0$ and pdf $f(t; \beta) = \beta e^{-\beta t}$, for $t > 0$, where $N$ is a discrete random variable following a logarithmic-series distribution with parameter $p \in (0, 1)$ and probability mass function (pmf), $P(N = n)$, given by:
$$P(N = n) = \frac{(1-p)^n}{-n \ln p}, \qquad n = 1, 2, \dots$$
If $N$ is a logarithmic random variable with parameter $p$ truncated at $k-1$, then its probability function is given by:
$$P(N = n) = \frac{(1-p)^n}{n \left[ -\ln p - \sum_{j=1}^{k-1} \frac{(1-p)^j}{j} \right]}, \qquad n = k, k+1, \dots$$
The pdf of the $k$th order statistic (the $k$th-smallest lifetime) of $n$ iid exponential random variables is given by equation (5) (see David & Nagaraja (1970); Balakrishnan & Cohen (1991); Balakrishnan (1996)):
$$f_{k:n}(x) = \frac{n!}{(k-1)!\,(n-k)!}\, \beta e^{-\beta x (n-k+1)} \left(1 - e^{-\beta x}\right)^{k-1}, \qquad x > 0, \tag{5}$$
where $X = T_{(k)}$ is the lifetime of the system and $T_{(N)}$ is the last order statistic. In equation (6) we consider the ascending order $T_{(1)} \leq T_{(2)} \leq \dots \leq T_{(N)}$. The joint probability density is determined by compounding a logarithmic series distribution truncated at $k-1$ and the pdf of the $k$th order statistic ($k \leq N$). The logarithmic distribution is truncated at $k-1$ because we are interested in the $k$th order statistic, which exists only when $N \geq k$. There is a left-truncation scheme, where only ($N \geq k$) individuals who survive a sufficient time are included, i.e. we observe only individuals or units with lifetimes exceeding the time of the truncating event. In comparison with the formulation of Tahmasbi & Rezaei (2008) and Adamidis & Loukas (1998), we consider the $k$th-smallest value of lifetime instead of the minimum lifetime $T_{(1)}$. So, our proposed new lifetime distribution, named the exponential-generalized truncated logarithmic (EGTL) distribution, is the marginal density of $X$ given by:
$$f(x; p, \beta) = \frac{\beta (1-p)^k e^{-\beta x} \left(1 - e^{-\beta x}\right)^{k-1}}{\left[ -\ln p - \sum_{j=1}^{k-1} \frac{(1-p)^j}{j} \right] \left(1 - (1-p) e^{-\beta x}\right)^{k}}, \qquad x > 0, \tag{7}$$
where $p \in (0, 1)$ is the shape parameter and $\beta > 0$ is the scale parameter. This distribution is more appropriate for modelling any order statistic ($k = 2, 3$, or any $k$th lifetime). The particular case of the EGTL density function, for $k = 1$, is the EL distribution modelling the time of the first failure, $T_{(1)}$, given by Tahmasbi & Rezaei (2008):
$$f(x; p, \beta) = \frac{1}{-\ln p}\, \frac{\beta (1-p) e^{-\beta x}}{1 - (1-p) e^{-\beta x}}, \qquad x > 0.$$
For $k = 1$, this pdf decreases strictly in $x$ and tends to zero as $x \to \infty$. The modal value of the density of the EL distribution, at $x = 0$, is given by $\beta (1-p) / (-p \ln p)$, and its median is $\ln(1 + \sqrt{p}) / \beta$. The EL distribution tends to an exponential distribution with rate parameter $\beta$, as $p \to 1$. The function is concave upward on $[0, \infty)$. The graphs of the density resemble those of the exponential and Pareto II distributions (see Figure 1).
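The density obtained from the compounding construction can be sanity-checked numerically. The sketch below is a reconstruction under stated assumptions rather than the authors' implementation: it assumes the log-series pmf $(1-p)^n / (-n \ln p)$ truncated to $n \geq k$ and sums the $k$th order-statistic densities over $n$, which gives the closed form coded in `egtl_pdf`. It then checks that the density integrates to one and reduces to the EL density for $k = 1$. All function names are hypothetical.

```python
import math

def egtl_pdf(x, p, beta, k):
    """Marginal density of the kth order statistic under the assumed compounding."""
    theta = 1.0 - p
    # Normalizing constant of the log-series distribution truncated to n >= k.
    norm = -math.log(p) - sum(theta**j / j for j in range(1, k))
    e = math.exp(-beta * x)
    return beta * theta**k * e * (1.0 - e) ** (k - 1) / (norm * (1.0 - theta * e) ** k)

def el_pdf(x, p, beta):
    """EL density of Tahmasbi & Rezaei (2008), the k = 1 special case."""
    e = math.exp(-beta * x)
    return beta * (1.0 - p) * e / (-math.log(p) * (1.0 - (1.0 - p) * e))

def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Total probability mass for k = 1, 2, 3 (the upper limit 40 stands in for infinity).
total_mass = {k: integrate(lambda x: egtl_pdf(x, 0.5, 1.5, k), 0.0, 40.0) for k in (1, 2, 3)}
print(total_mass)
```

Each entry of `total_mass` should be close to one, which is a necessary condition for the reconstructed density to be a valid pdf.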
Also, the cumulative distribution function (cdf) of $X$ corresponding to the pdf in equation (7) is given by:
Then, the final cdf can be reduced to:
2.2 Moment generating function and Moment
Suppose $X$ has the pdf in equation (7); then the moment generating function (mgf) is given by:
and hence, we can write the mgf as:
The $r$th moment is given by:
2.3 Reliability and failure rate functions
It is well known that the reliability (survival) function is the probability of being alive just before duration $x$, given by $S(x) = P(X > x) = 1 - F(x)$, which is the probability that the event of interest has not occurred by duration $x$. So, the reliability is the probability that a system will be successful in the interval from time $0$ to time $x$, where $X$ is a random variable denoting the time-to-failure or failure time. One may refer to the literature on reliability theory (Barlow & Proschan, 1975, 1981; Basu, 1988). The survival function, corresponding to the pdf in equation (7), is given by equation (13). Table 1 presents the reliability function for some special cases.
The failure rate, known as the hazard rate function $h(x)$, is the instantaneous rate of occurrence of the event of interest at duration $x$ (i.e. the rate of event occurrence per unit of time). Mathematically, it is equal to the pdf of events at $x$, divided by the probability of surviving to that duration without experiencing the event. Thus, we define the failure rate function as in Barlow et al. (1963) by $h(x) = f(x) / S(x)$. The hazard function for some special cases is given in Table 2.
The failure rate function is analytically related to the failure's probability distribution. It leads to the examination of the increasing (IFR) or decreasing failure rate (DFR) properties of life-length distributions. $F$ is an IFR distribution if $h(x)$ increases for all $x$ such that $F(x) < 1$. The EGTL lifetime distribution is motivated by the fact that, in many real-life physical and non-physical systems, the hazard rate is not restricted to a single shape such as constant or monotonically decreasing. If $k = 1$, the hazard rate function is decreasing, following Tahmasbi & Rezaei (2008). In fact, if $x \to 0$ then $h(x) \to \beta (1-p) / (-p \ln p)$ and if $x \to \infty$ then $h(x) \to \beta$. For $k \geq 2$, there is an increasing failure rate. Indeed, if $x \to 0$ then $h(x) \to 0$. If $x \to \infty$ then $h(x) \to \beta$ (see Figure 2).
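The hazard shapes discussed above can be explored numerically. The following sketch is illustrative only: it assumes the compound density with shape $p$ and scale $\beta$ coded in `egtl_pdf` (a reconstruction, not taken verbatim from the paper), computes the survival function by numerical integration, and evaluates $h(x) = f(x)/S(x)$ on a grid for $k = 1$ and $k = 2$. The grid, parameter values and function names are hypothetical choices.

```python
import math

def egtl_pdf(x, p, beta, k):
    """Assumed EGTL density (compound of exponential order statistics and truncated log-series)."""
    theta = 1.0 - p
    norm = -math.log(p) - sum(theta**j / j for j in range(1, k))
    e = math.exp(-beta * x)
    return beta * theta**k * e * (1.0 - e) ** (k - 1) / (norm * (1.0 - theta * e) ** k)

def survival(x, p, beta, k, width=40.0, steps=8000):
    """S(x) = P(X > x), by trapezoidal integration of the pdf over [x, x + width / beta]."""
    upper = width / beta
    h = upper / steps
    total = 0.5 * (egtl_pdf(x, p, beta, k) + egtl_pdf(x + upper, p, beta, k))
    for i in range(1, steps):
        total += egtl_pdf(x + i * h, p, beta, k)
    return total * h

def hazard(x, p, beta, k):
    """Failure rate h(x) = f(x) / S(x)."""
    return egtl_pdf(x, p, beta, k) / survival(x, p, beta, k)

grid = [0.1 * i for i in range(1, 31)]              # x = 0.1, 0.2, ..., 3.0
h1 = [hazard(x, 0.5, 1.0, 1) for x in grid]         # k = 1 (EL case)
h2 = [hazard(x, 0.5, 1.0, 2) for x in grid]         # k = 2
print(round(h1[0], 3), round(h1[-1], 3), round(h2[0], 3), round(h2[-1], 3))
```

For these parameter values the $k = 1$ hazard falls towards $\beta$ while the $k = 2$ hazard rises towards $\beta$; other parameter values would need to be checked separately.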
2.4 Random number generation
We can generate a random variable from the cdf of $X$ in equation (9) using the following steps:
Generate a random variable from the standard uniform distribution.
Solve the non linear equation in :
Calculate the values of such as:
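where $X$ is an EGTL random variable with parameters $p$ and $\beta$. Note that for the special case $k = 1$, we generate $X$ directly from the following equation, where $U$ is a standard uniform random variable:
$$X = \frac{1}{\beta} \ln\left( \frac{1-p}{1 - p^{\,1-U}} \right).$$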
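Two generators can be sketched along these lines. The first inverts the EL cdf for $k = 1$; the second does not follow the inversion steps listed above but instead simulates the compounding mechanism directly (draw $N$ from the truncated log-series distribution, draw $N$ exponential lifetimes, keep the $k$th smallest). Both assume shape $p$, scale $\beta$ and the pmf $(1-p)^n / n$ up to normalization; the function names are hypothetical.

```python
import math
import random

def el_cdf(x, p, beta):
    """EL cdf (k = 1), Tahmasbi & Rezaei (2008) parameterization."""
    return 1.0 - math.log(1.0 - (1.0 - p) * math.exp(-beta * x)) / math.log(p)

def el_inverse_cdf(u, p, beta):
    """Inverse of el_cdf for 0 <= u < 1."""
    return math.log((1.0 - p) / (1.0 - p ** (1.0 - u))) / beta

def sample_truncated_logseries(p, k, rng):
    """Draw N with P(N = n) proportional to (1 - p)**n / n for n = k, k + 1, ..."""
    theta = 1.0 - p
    norm = -math.log(p) - sum(theta**j / j for j in range(1, k))
    u = rng.random() * norm
    n = k
    term = theta**k / k
    cum = term
    while u > cum and n < 100_000:
        term *= theta * n / (n + 1)     # theta**(n+1)/(n+1) from theta**n/n
        n += 1
        cum += term
    return n

def sample_egtl(p, beta, k, rng):
    """kth-smallest of N iid exponential(beta) lifetimes, N truncated log-series."""
    n = sample_truncated_logseries(p, k, rng)
    lifetimes = sorted(rng.expovariate(beta) for _ in range(n))
    return lifetimes[k - 1]

rng = random.Random(1)
draws_k1 = sorted(sample_egtl(0.5, 1.5, 1, rng) for _ in range(4000))
draws_k2 = [sample_egtl(0.5, 1.5, 2, rng) for _ in range(1000)]
median_k1 = draws_k1[len(draws_k1) // 2]
print(round(median_k1, 3), round(el_inverse_cdf(0.5, 0.5, 1.5), 3))
```

For $k = 1$ the empirical median of the compound sampler should be close to the EL median $\ln(1+\sqrt{p})/\beta$, which is also what `el_inverse_cdf(0.5, p, beta)` returns.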
3 Estimation of the parameters
In this section, we determine the estimates of the parameters $p$ and $\beta$ for the new EGTL family of distributions. There are many methods available for estimating the parameters of interest. We present here the three most popular methods: maximum likelihood, the method of moments and Bayesian estimation.
3.1 Maximum Likelihood estimation
Let ($x_1, \dots, x_n$) be a random sample from the EGTL distribution. The log-likelihood function given the observed values, $x_{obs} = (x_1, \dots, x_n)$, is:
We subsequently derive the associated gradients:
We need the Fisher information matrix for interval estimation and tests of hypotheses on the parameters. It can be expressed in terms of the second derivatives of the log-likelihood function:
The maximum likelihood estimates (MLEs) $\hat{p}$ and $\hat{\beta}$ of the EGTL parameters $p$ and $\beta$, respectively, can be found numerically using the iterative EM algorithm, which handles incomplete-data problems (Dempster et al., 1977; Krishnan & McLachlan, 1997). The iterative method consists in repeatedly updating the parameter estimates by replacing the "missing data" with their newly estimated values. The standard method used to determine the MLEs is the Newton-Raphson algorithm, which requires second derivatives of the log-likelihood function at every iteration. The main drawback of the EM algorithm is its rather slow convergence, compared to the Newton-Raphson method, when the "missing data" contain a relatively large amount of information (Little & Rubin, 1983). Recently, several researchers have used the EM method, such as Adamidis & Loukas (1998), Adamidis et al. (2005), Karlis & Xekalaki (2003), Ng et al. (2002) and others. Newton-Raphson is required for the M-step of the EM algorithm. To start the algorithm, a hypothetical complete-data distribution is defined with the pdf in equation (6), and we then derive the conditional mass function of $N$ given $x$ as:
$$P(N = n \mid x; p, \beta) = \binom{n-1}{k-1} \left[ (1-p) e^{-\beta x} \right]^{n-k} \left[ 1 - (1-p) e^{-\beta x} \right]^{k}, \qquad n = k, k+1, \dots$$
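To make the E- and M-steps concrete, here is a minimal EM sketch for the special case $k = 1$ (the EL distribution) only. It is an illustration under assumptions, not the paper's algorithm for general $k$: it uses the complete-data density $\beta e^{-\beta n x} (1-p)^n / (-\ln p)$, for which $E(N \mid x) = 1 / (1 - (1-p) e^{-\beta x})$, and it replaces the Newton-Raphson M-step for $p$ by a bisection on $-(1-p)/(p \ln p)$. The data values and function names are hypothetical.

```python
import math

def el_loglik(xs, p, beta):
    """Observed-data log-likelihood of the EL distribution (k = 1)."""
    return sum(
        math.log(beta * (1.0 - p)) - beta * x - math.log(-math.log(p))
        - math.log(1.0 - (1.0 - p) * math.exp(-beta * x))
        for x in xs
    )

def solve_p(target, lo=1e-9, hi=1.0 - 1e-9):
    """Solve -(1 - p) / (p ln p) = target by bisection (left side decreases in p)."""
    g = lambda p: -(1.0 - p) / (p * math.log(p))
    if target <= g(hi):
        return hi                       # root lies at the upper boundary
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def em_el(xs, p=0.5, beta=1.0, iters=200):
    """EM iterations for (p, beta); returns the estimates and the log-likelihood path."""
    n = len(xs)
    history = []
    for _ in range(iters):
        # E-step: conditional expectation of the latent N_i given x_i.
        w = [1.0 / (1.0 - (1.0 - p) * math.exp(-beta * x)) for x in xs]
        # M-step: closed form for beta, one-dimensional root search for p.
        beta = n / sum(wi * xi for wi, xi in zip(w, xs))
        p = solve_p(sum(w) / n)
        history.append(el_loglik(xs, p, beta))
    return p, beta, history

# Hypothetical lifetimes, used only to exercise the code.
data = [0.05, 0.1, 0.12, 0.3, 0.35, 0.5, 0.8, 1.1, 1.9, 3.2, 0.02, 0.6, 0.25, 4.5, 0.9]
p_hat, beta_hat, history = em_el(data)
print(round(p_hat, 4), round(beta_hat, 4))
```

The log-likelihood path in `history` should be non-decreasing, which is the defining property of EM and a useful check on any implementation.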
3.2 Method of moments estimation
The method of moments involves equating theoretical moments with sample moments. The estimate of the $r$th moment is $m_r = \frac{1}{n} \sum_{i=1}^{n} x_i^r$. For the EGTL distribution, the $r$th moment is given by equation (12). The corresponding first and second moments are given by:
From equation (22) we obtain:
and then, we should solve the following equation in $p$:
Thereafter, we determine $\hat{\beta}$ by replacing $p$ with its estimated value, $\hat{p}$, in equation (24).
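The two-step logic (solve for the shape first, then recover the scale) can be sketched for the special case $k = 1$, where the EL moments are $E(X^r) = -r!\, \mathrm{Li}_{r+1}(1-p) / (\beta^r \ln p)$ with $\mathrm{Li}_s$ the polylogarithm. This is an illustration under assumptions, not the paper's equations for general $k$: the ratio $m_2 / m_1^2$ is free of $\beta$ and is treated here as decreasing in $p$ (checked numerically at a few points, not proved), so bisection is used. The function names, series truncation and search bracket are hypothetical choices.

```python
import math

def polylog(s, z, terms=5000):
    """Truncated series Li_s(z) = sum_{n>=1} z**n / n**s; adequate for 0 <= z <= 0.99."""
    total, power = 0.0, 1.0
    for n in range(1, terms + 1):
        power *= z
        total += power / n**s
    return total

def el_moment(r, p, beta):
    """rth moment of the EL distribution (k = 1)."""
    return -math.factorial(r) * polylog(r + 1, 1.0 - p) / (beta**r * math.log(p))

def mom_el(m1, m2, lo=0.01, hi=0.999):
    """Method-of-moments estimates (p, beta) from the first two sample moments."""
    ratio = lambda p: el_moment(2, p, 1.0) / el_moment(1, p, 1.0) ** 2
    target = m2 / m1**2                 # must exceed 2 for a solution to exist
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ratio(mid) > target:         # ratio assumed decreasing in p
            lo = mid
        else:
            hi = mid
    p_hat = 0.5 * (lo + hi)
    beta_hat = -polylog(2, 1.0 - p_hat) / (m1 * math.log(p_hat))
    return p_hat, beta_hat

# Feed in the exact moments for p = 0.5, beta = 1.5 and check that they are recovered.
m1 = el_moment(1, 0.5, 1.5)
m2 = el_moment(2, 0.5, 1.5)
p_hat, beta_hat = mom_el(m1, m2)
print(round(p_hat, 6), round(beta_hat, 6))
```

With real data, `m1` and `m2` would be the sample moments $m_1$ and $m_2$; a ratio $m_2/m_1^2$ at or below 2 means the moment equations have no solution in this family.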
3.3 Bayesian estimation
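In the Bayesian approach, inferences are expressed through a posterior distribution for the parameters which, according to Bayes' theorem, is given in terms of the likelihood and a prior density function by:
$$\pi(p, \beta \mid x) = \frac{L(x \mid p, \beta)\, \pi(p, \beta)}{m(x)},$$
where $\pi(p, \beta)$ is the prior probability density function and $L(x \mid p, \beta)$ is the likelihood of the observations $x = (x_1, \dots, x_n)$. Note that $m(x)$ is the normalizing constant, given by:
$$m(x) = \int_0^1 \int_0^\infty L(x \mid p, \beta)\, \pi(p, \beta)\, d\beta\, dp.$$
We should first specify our initial beliefs, or other sorts of knowledge, through the prior distribution $\pi(p, \beta)$. Here, we suppose that the standard uniform distribution on the interval $(0, 1)$ is the prior distribution for the parameter $p$ and that a gamma distribution, $G(a, b)$, is the prior distribution for the parameter $\beta$, where $a$ is a shape parameter and $b$ is a scale parameter. The prior probability function is then equal to:
$$\pi(p, \beta) = \frac{\beta^{a-1} e^{-\beta / b}}{\Gamma(a)\, b^{a}},$$
where $0 < p < 1$ and $\beta > 0$.
Using the mean square error as a risk function, we obtain the Bayes estimates as the means of the posterior distribution:
$$\hat{p} = E(p \mid x) = \int_0^1 \int_0^\infty p\, \pi(p, \beta \mid x)\, d\beta\, dp, \qquad \hat{\beta} = E(\beta \mid x) = \int_0^1 \int_0^\infty \beta\, \pi(p, \beta \mid x)\, d\beta\, dp.$$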
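Posterior means of this kind rarely have a closed form, so they are usually approximated numerically. The sketch below is one simple way to do so for the special case $k = 1$ (EL likelihood), using a midpoint grid over $(p, \beta)$ with a uniform prior on $p$ and a gamma prior (shape `a`, scale `b`) on $\beta$. The hyperparameter values, the grid size, the truncation point `beta_max`, the data and the function names are all hypothetical, and the paper's own computation may differ.

```python
import math

def el_loglik(xs, p, beta):
    """Log-likelihood of the EL distribution (k = 1)."""
    return sum(
        math.log(beta * (1.0 - p)) - beta * x - math.log(-math.log(p))
        - math.log(1.0 - (1.0 - p) * math.exp(-beta * x))
        for x in xs
    )

def posterior_means(xs, a=2.0, b=1.0, grid=60, beta_max=6.0):
    """Grid approximation of E(p | x) and E(beta | x)."""
    ps = [(i + 0.5) / grid for i in range(grid)]                  # midpoints in (0, 1)
    betas = [beta_max * (j + 0.5) / grid for j in range(grid)]    # midpoints in (0, beta_max)
    logpost = {}
    for p in ps:
        for bt in betas:
            # Gamma(shape a, scale b) prior up to a constant; the uniform prior on p adds 0.
            log_prior = (a - 1.0) * math.log(bt) - bt / b
            logpost[(p, bt)] = el_loglik(xs, p, bt) + log_prior
    top = max(logpost.values())                                   # subtract the max to avoid underflow
    weights = {key: math.exp(val - top) for key, val in logpost.items()}
    z = sum(weights.values())                                     # plays the role of the normalizing constant
    p_mean = sum(p * w for (p, _), w in weights.items()) / z
    beta_mean = sum(bt * w for (_, bt), w in weights.items()) / z
    return p_mean, beta_mean

# Hypothetical lifetimes, used only to exercise the code.
data = [0.05, 0.1, 0.12, 0.3, 0.35, 0.5, 0.8, 1.1, 1.9, 3.2, 0.02, 0.6, 0.25, 4.5, 0.9]
p_bayes, beta_bayes = posterior_means(data)
print(round(p_bayes, 3), round(beta_bayes, 3))
```

Because all grid cells have the same area, the cell area cancels in the ratio, so only the unnormalized posterior weights are needed.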
4 Simulation study
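As an illustration of the three methods of estimation above, numerical computations have been performed using the steps presented in section 2.4 for the random number generation. The numerical study was based on random samples of three different sizes from the EGTL distribution, for each of the parameter values $(p ; \beta) = (0.5 ; 0.5)$, $(0.7 ; 1.5)$ and $(0.3 ; 2)$ and the three cases $k = 1, 2, 3$. We have considered the initial values , and . For this purpose, we have used the program Mathcad 14.0. After determining the parameter estimates, we compute the biases, the variances and the mean square errors (MSEs), where, for an estimator $\hat{\theta}$ of a parameter $\theta$, $\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$ and $\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^2$. An estimator is said to be efficient if its MSE is the minimum among all competitors. In fact, $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if $\mathrm{MSE}(\hat{\theta}_1) < \mathrm{MSE}(\hat{\theta}_2)$.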
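The summary measures used in the simulation study can be computed from the replicated estimates as in the following sketch. The function name and the example estimates are hypothetical; the variance uses the divisor $m$ (number of replications) so that the decomposition MSE = variance + bias$^2$ holds exactly.

```python
def summarize(estimates, true_value):
    """Return (bias, variance, mse) of replicated estimates of one parameter."""
    m = len(estimates)
    mean = sum(estimates) / m
    bias = mean - true_value
    variance = sum((e - mean) ** 2 for e in estimates) / m
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    return bias, variance, mse

# Four hypothetical estimates of a parameter whose true value is 0.5.
bias, variance, mse = summarize([0.4, 0.5, 0.6, 0.7], true_value=0.5)
print(bias, variance, mse)
```

Comparing the `mse` values of two estimators of the same parameter gives the efficiency ranking described above.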
Table 3 reports the results from the simulated data, where the variances and the MSEs of the parameter estimates are given. The results show that, for each case $k = 1, 2, 3$, the variances and the MSEs decrease when the sample size increases. We see that the values from the Bayesian method are generally lower than those obtained using the ML approach.
Table 3: Variances and MSEs of the parameter estimates from the simulated data, by estimation method (ML = maximum likelihood, MoM = method of moments).
| Sample size | k | Parameters (shape ; scale) | ML | ML | ML | ML | MoM | MoM | MoM | MoM | Bayesian | Bayesian | Bayesian | Bayesian |
| first | 1 | (0.5 ; 0.5) | 0.2091 | 0.2008 | 0.2136 | 0.2043 | 0.2775 | 0.2241 | 0.2849 | 0.2351 | 0.1737 | 0.1414 | 0.1835 | 0.1586 |
| first | 1 | (0.7 ; 1.5) | 0.2779 | 0.6980 | 0.2861 | 0.7145 | 0.3066 | 1.0227 | 0.3078 | 1.0361 | 0.2208 | 0.5794 | 0.2287 | 0.6104 |
| first | 1 | (0.3 ; 2) | 0.4250 | 1.0930 | 0.4680 | 1.1160 | 0.6890 | 1.7260 | 0.7270 | 1.8930 | 0.2110 | 0.5010 | 0.2420 | 0.5870 |
| first | 2 | (0.5 ; 0.5) | 0.2725 | 0.2013 | 0.2773 | 0.2027 | 0.3087 | 0.2327 | 0.3347 | 0.2398 | 0.1934 | 0.1227 | 0.1957 | 0.1488 |
| first | 2 | (0.7 ; 1.5) | 0.1105 | 1.1326 | 0.1114 | 1.1380 | 0.1318 | 1.2050 | 0.1409 | 1.2078 | 0.0742 | 0.9730 | 0.0796 | 1.0117 |
| first | 2 | (0.3 ; 2) | 0.2700 | 0.6230 | 0.3660 | 0.6520 | 0.5750 | 1.0000 | 0.6810 | 1.1900 | 0.1000 | 0.3690 | 0.1230 | 0.4320 |
| first | 3 | (0.5 ; 0.5) | 0.1407 | 0.1164 | 0.1193 | 0.1536 | 0.2186 | 0.1498 | 0.2245 | 0.1515 | 0.1054 | 0.0490 | 0.1065 | 0.0533 |
| first | 3 | (0.7 ; 1.5) | 0.1782 | 0.6809 | 0.1938 | 0.7694 | 0.2247 | 0.9003 | 0.2269 | 0.9290 | 0.1653 | 0.5840 | 0.1682 | 0.5880 |
| first | 3 | (0.3 ; 2) | 0.4090 | 0.8630 | 0.4410 | 0.9040 | 1.2390 | 0.6680 | 1.2930 | 0.7200 | 0.1590 | 0.3380 | 0.2090 | 0.3810 |
| second | 1 | (0.5 ; 0.5) | 0.1099 | 0.1032 | 0.1119 | 0.1049 | 0.1534 | 0.1280 | 0.1848 | 0.1308 | 0.0681 | 0.0493 | 0.0723 | 0.0506 |
| second | 1 | (0.7 ; 1.5) | 0.1221 | 0.5182 | 0.1257 | 0.5431 | 0.1426 | 0.6526 | 0.1532 | 0.6929 | 0.0567 | 0.3863 | 0.0625 | 0.3878 |
| second | 1 | (0.3 ; 2) | 0.3630 | 0.8440 | 0.3820 | 0.9340 | 0.6330 | 1.6030 | 0.6580 | 1.6480 | 0.1380 | 0.3710 | 0.1730 | 0.4240 |
| second | 2 | (0.5 ; 0.5) | 0.1687 | 0.1097 | 0.1915 | 0.1239 | 0.2252 | 0.1422 | 0.2257 | 0.1544 | 0.1014 | 0.0443 | 0.1119 | 0.0456 |
| second | 2 | (0.7 ; 1.5) | 0.1051 | 1.0626 | 0.1072 | 1.0985 | 0.1287 | 1.1868 | 0.1381 | 1.1923 | 0.0691 | 0.9795 | 0.0735 | 0.9807 |
| second | 2 | (0.3 ; 2) | 0.2350 | 0.5620 | 0.2510 | 0.5770 | 0.5330 | 0.8980 | 0.5650 | 0.9390 | 0.0340 | 0.2670 | 0.0740 | 0.3050 |
| second | 3 | (0.5 ; 0.5) | 0.0941 | 0.0782 | 0.0947 | 0.0896 | 0.1782 | 0.1134 | 0.1810 | 0.1167 | 0.0624 | 0.0185 | 0.0686 | 0.0224 |
| second | 3 | (0.7 ; 1.5) | 0.0606 | 0.5371 | 0.0608 | 0.5459 | 0.1005 | 0.6989 | 0.1067 | 0.7072 | 0.0433 | 0.4644 | 0.0438 | 0.4685 |
| second | 3 | (0.3 ; 2) | 0.3360 | 0.7130 | 0.3790 | 0.8170 | 0.5770 | 1.1460 | 0.6340 | 1.1800 | 0.1310 | 0.2140 | 0.1430 | 0.2760 |
| third | 1 | (0.5 ; 0.5) | 0.0652 | 0.0398 | 0.0663 | 0.0592 | 0.1049 | 0.0827 | 0.1098 | 0.0879 | 0.0185 | 0.0022 | 0.0327 | 0.0082 |
| third | 1 | (0.7 ; 1.5) | 0.0531 | 0.4206 | 0.0605 | 0.4663 | 0.0870 | 0.5859 | 0.0901 | 0.5876 | 0.0013 | 0.1941 | 0.0052 | 0.2201 |
| third | 1 | (0.3 ; 2) | 0.2588 | 0.6168 | 0.2828 | 0.7518 | 0.4918 | 1.2348 | 0.5318 | 1.3888 | 0.0118 | 0.0058 | 0.0618 | 0.1188 |
| third | 2 | (0.5 ; 0.5) | 0.1059 | 0.0733 | 0.1140 | 0.0804 | 0.1897 | 0.1097 | 0.1971 | 0.1162 | 0.0479 | 0.0051 | 0.0574 | 0.0185 |
| third | 2 | (0.7 ; 1.5) | 0.0522 | 0.9943 | 0.0614 | 1.0109 | 0.0849 | 1.1288 | 0.0889 | 1.1395 | 0.0002 | 0.8165 | 0.0039 | 0.8951 |
| third | 2 | (0.3 ; 2) | 0.1739 | 0.4719 | 0.2009 | 0.5239 | 0.4229 | 0.7759 | 0.4979 | 0.8269 | 0.0049 | 0.0149 | 0.0219 | 0.0859 |
| third | 3 | (0.5 ; 0.5) | 0.0749 | 0.0560 | 0.0814 | 0.0613 | 0.1255 | 0.0888 | 0.1471 | 0.0901 | 0.0247 | 0.0005 | 0.0505 | 0.0059 |
| third | 3 | (0.7 ; 1.5) | 0.0174 | 0.4517 | 0.0199 | 0.4680 | 0.0543 | 0.6477 | 0.0586 | 0.6599 | 0.0006 | 0.3827 | 0.0011 | 0.4055 |
| third | 3 | (0.3 ; 2) | 0.2183 | 0.4733 | 0.2873 | 0.5973 | 0.4423 | 1.0013 | 0.4923 | 1.0643 | 0.0043 | 0.0073 | 0.0663 | 0.0323 |
5 Application examples
In this section, we fit the EGTL distribution to two real data sets by maximum likelihood. The first set (Table 4) consists of " failure times for right rear brakes on D9G-66A caterpillar tractors", reproduced from Barlow & Campo (1975) and also used by Chang & Rao (1993). These data are used in many applications of reliability (Adamidis et al., 2005; Tsokos, 2012; Shahsanaei et al., 2012). The second set of data involves observations (Table 5) of the results from an experiment concerning "the tensile fatigue characteristics of a polyester/viscose yarn". These data were presented by Picciotto (1970) to study the problem of warp breakage during weaving. The observations were obtained on the cycles to failure of a cm yarn sample put to test under strain level. The sample is used in Quesenberry & Kent (1982) as an example to illustrate a selection procedure among probability distributions used in reliability. The failure rate of these two data sets belongs to the increasing failure rate class (Doksum & Yandell, 1984; Adamidis et al., 2005). In addition to our class of distributions, the gamma and Weibull distributions were fitted to these data sets. The respective densities of gamma and Weibull distributions are and .
Table 6 shows the fitted parameters, the calculated values of the Kolmogorov-Smirnov (K-S) statistic and their respective p-values for the two sets of data. It should be noted that the K-S test compares an empirical distribution with a known (not estimated) one. It is used to decide if a sample comes from a population with a specific distribution ($H_0$: the data follow a specified distribution). We estimate some special cases ($k = 1, 2, 3, 4$) of the EGTL family of distributions at a fixed significance level. The p-values are only significant for the case $k = 1$ for the Barlow & Campo (1975) and Quesenberry & Kent (1982) data sets. In fact, the data exhibit increasing failure rates, whereas the EGTL distribution has a decreasing failure rate if $k = 1$ (see Figure 2). For the other cases, the new lifetime distribution provides a good fit to the data sets. The K-S test shows that the EGTL distribution is an attractive alternative to the popular gamma and Weibull distributions. It generalizes the reliability lifetime distributions to any order statistic. Indeed, as shown in section 2.3, if $k = 1$, the hazard rate function is decreasing, following Tahmasbi & Rezaei (2008), and there is an increasing hazard rate for $k \geq 2$.
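The K-S statistic itself is straightforward to compute against any fully specified cdf. The sketch below is a generic illustration with hypothetical function names and a toy sample; the EL cdf is included only as an example of a cdf that could be passed in for the case $k = 1$, assuming the $(p, \beta)$ parameterization of Tahmasbi & Rezaei (2008). It does not compute p-values.

```python
import math

def ks_statistic(sample, cdf):
    """Two-sided Kolmogorov-Smirnov distance between the empirical cdf and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))   # ECDF above the model
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))        # ECDF below the model
    return max(d_plus, d_minus)

def el_cdf(x, p, beta):
    """EL cdf (k = 1), usable as the reference distribution."""
    return 1.0 - math.log(1.0 - (1.0 - p) * math.exp(-beta * x)) / math.log(p)

# Toy check against the standard uniform cdf on (0, 1).
d_uniform = ks_statistic([0.1, 0.4, 0.7], lambda x: x)
# Example call with the EL cdf and hypothetical parameter values.
d_el = ks_statistic([0.05, 0.3, 0.9, 1.7], lambda x: el_cdf(x, 0.5, 1.5))
print(d_uniform, d_el)
```

When the parameters of the reference cdf are estimated from the same sample, the standard K-S critical values no longer apply exactly, which is the caveat noted in the text above.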
Table 6: Fitted parameters, K-S statistics and p-values for the Barlow & Campo (1975) and Quesenberry & Kent (1982) data sets, for the first-order ($k=1$), second-order ($k=2$), third-order ($k=3$) and fourth-order ($k=4$) cases.
6 Conclusion
We define a new two-parameter lifetime distribution called the EGTL distribution. Our procedure generalizes the EL distribution proposed by Tahmasbi & Rezaei (2008). We derive some mathematical properties and we present the plots of the pdf and the failure rate functions for some special cases. The parameters are estimated by maximum likelihood, the EM algorithm, the method of moments and the Bayesian approach, with numerical computations performed to illustrate the different methods of estimation. The application study is based on two real data sets used in many applications of reliability. We have shown that our proposed EGTL distribution is suitable for modelling the time to any failure and not only the time to the first or the last failure. It is very competitive compared with its standard counterparts.
Ordered random variables are usually considered in ascending order. The paper may be extended to the concept of dual generalized order statistics, introduced by Burkschat et al. (2003), which enables a common approach to descending ordered random variables such as reversed order statistics and lower record values.
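Let $T_1, \dots, T_N$ be iid exponential r.v. with pdf given by: $f(t; \beta) = \beta e^{-\beta t}$, for $t > 0$, where $N$ is a log-series r.v. with pmf, $P(N = n)$, given by:
$$P(N = n) = \frac{(1-p)^n}{-n \ln p}, \qquad n = 1, 2, \dots$$
From the Taylor series, for $0 < p < 1$ we have:
$$-\ln p = \sum_{n=1}^{\infty} \frac{(1-p)^n}{n}.$$
The logarithmic distribution with parameter $p$ truncated at $k-1$ is:
$$P(N = n) = \frac{(1-p)^n}{n \left[ -\ln p - \sum_{j=1}^{k-1} \frac{(1-p)^j}{j} \right]}, \qquad n = k, k+1, \dots$$
The pdf of the $k$th order statistic is:
$$f_{k:n}(x) = \frac{n!}{(k-1)!\,(n-k)!}\, \beta e^{-\beta x (n-k+1)} \left(1 - e^{-\beta x}\right)^{k-1}, \qquad x > 0.$$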
Let, , , and
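the marginal density of $X$ is:
$$f(x; p, \beta) = \sum_{n=k}^{\infty} P(N = n)\, f_{k:n}(x) = \frac{\beta (1-p)^k e^{-\beta x} \left(1 - e^{-\beta x}\right)^{k-1}}{\left[ -\ln p - \sum_{j=1}^{k-1} \frac{(1-p)^j}{j} \right] \left(1 - (1-p) e^{-\beta x}\right)^{k}}, \qquad x > 0.$$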
Let, , ,