A Generalization of the Exponential-Logarithmic Distribution for Reliability and Life Data Analysis
Abstract
In this paper, we introduce a new two-parameter lifetime distribution, called the exponential-generalized truncated logarithmic (EGTL) distribution, obtained by compounding the exponential and generalized truncated logarithmic distributions. Our procedure generalizes the exponential-logarithmic (EL) distribution, which models the reliability of systems through first-order concepts, where the minimum lifetime is considered (Tahmasbi & Rezaei, 2008). In our approach, we assume that a system fails if a given number of its components fail, and we therefore consider the kth smallest value of lifetime instead of the minimum lifetime.
The reliability and failure rate functions as well as their properties are presented for some special cases. The parameters are estimated by maximum likelihood, the expectation-maximization algorithm, the method of moments and the Bayesian approach, and a simulation study illustrates the different methods of estimation. The application is illustrated with two real data sets used in many applications of reliability.
Keywords: Lifetime distributions, reliability, failure rate, order statistics, exponential distribution, truncated logarithmic distribution.
1 Introduction
Lifetime distributions are often used in reliability theory and survival analysis for modelling real data. They play a fundamental role in reliability in diverse disciplines such as finance, manufacturing, biological sciences, physics and engineering. The exponential distribution is a basic model in reliability theory and survival analysis. It is often used to model system reliability at the component level, assuming the failure rate is constant (Balakrishnan & Basu, 1995; Barlow & Proschan, 1975; Sinha & Kale, 1980). In recent years, a growing number of scholarly papers have been devoted to lifetime distributions with increasing or decreasing failure rate functions. The motivation is to give a parametric fit for real data sets where the underlying failure rates, arising from a latent competing risk problem, present monotone shapes (non-constant hazard rates). The proposed distributions are introduced as extensions of the exponential distribution, following Adamidis & Loukas (1998) and Kuş (2007), by compounding some useful lifetime and truncated discrete distributions (for a review, see Barreto-Souza & Cribari-Neto (2009); Chahkandi & Ganjali (2009); Silva et al. (2010); Barreto-Souza et al. (2011); Cancho et al. (2011); Louzada-Neto et al. (2011); Morais & Barreto-Souza (2011); Hemmati et al. (2011); Nadarajah et al. (2013); Bakouch et al. (2014), and others). The genesis lies in competing risk scenarios in the presence of latent risks, i.e. there is no information about the causes of the component’s failure (Basu & Klein, 1982). In fact, a system may experience multiple failure processes that compete against each other, and whichever occurs first can cause the system to fail (Rafiee et al., 2017; Kalbfleisch & Prentice, 2002; Andersen et al., 2002; Tsiatis, 1998). The term competing risks refers to duration data where two or more causes compete to determine the observed time-to-failure.
The potential multiple causes of failure are not mutually exclusive, but the interest lies in the time to the first one to occur (Putter et al., 2007; Bakoyannis & Touloumi, 2012). For further details, see Basu (1981).
In the same way, the exponential-logarithmic (EL) distribution was proposed by Tahmasbi & Rezaei (2008) as a log-series mixture of exponential random variables. This two-parameter distribution with decreasing failure rate (DFR) is obtained by mixing the exponential and logarithmic distributions. It is based on the idea of modelling the system’s reliability where the time-to-failure is due to the presence of an unknown number of initial defects of some components. Suppose the breakdown (failure) of a system of components occurs due to the presence of a non-observable number, , of initial defects of the same kind, that can be identified only after causing failure and are repaired perfectly (Adamidis & Loukas, 1998; Kuş, 2007). Let be the failure time of the system due to the defect, for . If we assume that are iid exponential random variables independent of , which follows a truncated logarithmic distribution, then the time to the first failure is adequately modelled by the EL distribution (Barreto-Souza & Silva, 2015; Bourguignon et al., 2014; Ross, 1976). For reliability studies, and are used respectively in serial and parallel systems with identical components (Chahkandi & Ganjali, 2009; Ramos et al., 2015). However, one may determine the distribution of the kth smallest value of the time-to-failure (kth order statistic), instead of the minimum lifetime (first order statistic).
There is a huge literature on order statistics in reliability engineering (for a review, see Barlow & Proschan (1975); Sarhan & Greenberg (1962); Barlow & Proschan (1965); Pyke (1965); Gnedenko et al. (1969); Pledger & Proschan (1971); Barlow & Proschan (1981); David (1981); Bain & Engelhardt (1991), and references contained therein). The motivation arises in reliability theory, where the so-called k-out-of-n systems are studied (Xie & Wang, 2008; Xie et al., 2005; Proschan & Sethuraman, 1976; Kim et al., 1988). An engineering system consisting of components is working if at least out of the total components are operating and it breaks down if or more components fail. Hence, a k-out-of-n system fails at the time of the component failure (Barlow & Proschan, 1975; Kamps, 1995; Cramer & Kamps, 2001). This binary-state context is based on the assumption that a system or its components can be either fully working or completely failed. However, in reality, a system may provide its specified function at less than full capacity when some of its components operate in a degraded state (Ramirez-Marquez & Coit, 2005). The binary k-out-of-n system reliability models have been extended to multi-state k-out-of-n system reliability models by allowing more than two performance levels for the system and its components (Eryilmaz, 2014). Multi-state systems contain units presenting two or more failed states with multiple modes of failure and one working state (Anzanello, 2009). Multi-state reliability models provide more realistic representations of engineering systems (Yingkui & Jing, 2012). Many authors have contributed reliability estimation approaches for multi-state systems (Jenney & Sherwin, 1986; Page & Perry, 1988; Rocco et al., 2005; Ramirez-Marquez & Coit, 2004; Ramirez-Marquez & Levitin, 2008; Levitin, 2007; Levitin & Amari, 2008).
In this paper, we generalize the EL distribution (Tahmasbi & Rezaei, 2008), which models the time to the first failure, to a distribution more appropriate for modelling any order statistic (second, third, or any kth lifetime). For instance, suppose a machine produces a random number, units, of light bulbs or wire fuses which are put through a life test. Each item has a random lifetime , . The EL distribution (Tahmasbi & Rezaei, 2008) focuses only on the minimum time-to-failure among the functioning components. However, we may be interested in the duration and then determine the lifetime distribution for the kth order statistic, assuming the system will fail if k of the units fail. We may let be the order statistics of independent observations of the time and then consider the kth smallest value of lifetime instead of the minimum lifetime. We assume that the ’s are not observable, but that is. We would like to estimate the lifetime distribution given the observations on . The proposed new family of lifetime distributions is obtained by compounding the exponential and generalized truncated logarithmic distributions, and is named the exponential-generalized truncated logarithmic (EGTL) distribution.
The paper is organized as follows. In section 2, we present the new family of lifetime distributions and the probability density function (pdf) for some special cases. The moment generating function, the moments, the reliability, the failure rate function and the random number generation are discussed in this section. The estimation of the parameters of this new family of distributions is discussed in section 3. It is attained by the maximum likelihood (ML) method and the expectation-maximization (EM) algorithm. The method of moments and the Bayesian approach are also presented as possible alternatives to the ML method. As an illustration of these three estimation methods, numerical computations are performed in section 4. The application is illustrated with two real data sets in section 5. The last section concludes the paper.
2 Properties of the distribution
2.1 Distribution
The derivation of the new family of lifetime distributions depends on the generalization of the compound exponential and truncated logarithmic distributions as follows: Let be iid exponential random variables with scale parameter and a pdf given by: , for , where is a discrete random variable following a logarithmic-series distribution with parameter and a probability mass function (pmf), , given by:
(1) 
If is a truncated at logarithmic random variable with parameter , then the probability function will be given by:
(2) 
where,
(3) 
and,
(4) 
The pdf of the kth order statistic (the kth smallest value of lifetime) of an exponential sample is given by equation (5) (see David & Nagaraja (1970); Balakrishnan & Cohen (1991); Balakrishnan (1996)):
(5) 
From equations (2) and (5), the joint probability density is derived as follows (the proofs of all steps and equations are presented in the appendix):
(6) 
where, is the lifetime of a system and is the last order statistic. In equation (6) we consider the ascending order . The joint probability density is determined by compounding a truncated at logarithmic series distribution and the pdf of the kth order statistic (). The use of the truncated at logarithmic distribution is motivated by mathematical interest because we are interested in the kth order statistic. There is a left-truncation scheme, where only () individuals who survive a sufficient time are included, i.e. we observe only individuals or units with exceeding the time of the event that truncates individuals. In comparison with the formulation of Tahmasbi & Rezaei (2008) and Adamidis & Loukas (1998), we consider the kth smallest value of lifetime instead of the minimum lifetime . So, our proposed new lifetime distribution, named the exponential-generalized truncated logarithmic (EGTL) distribution, is the marginal density of given by:
(7) 
where is the shape parameter and is the scale parameter. This distribution is more appropriate for modelling any order statistic (, or any lifetime). The particular case of the EGTL density function, for , is the EL distribution modelling the time of the first failure, , given by Tahmasbi & Rezaei (2008):
For , this pdf decreases strictly in and tends to zero as . The modal value of the density of the EL distribution, at , is given by and hence, its median is . The EL distribution tends to an exponential distribution with rate parameter , as . The function is concave upward on . The graphs of the density resemble those of the exponential and Pareto II distributions (see, Figure 1).
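For reference, the compounding described in this section can be carried out explicitly. The following is only a sketch under assumed notation: the symbols $Z$ (number of components), $p \in (0,1)$ (shape), $\beta > 0$ (exponential rate) and $A(p,k)$ (normalizing constant) are our own labels, and the logarithmic pmf is assumed to be proportional to $(1-p)^z/z$ as in Tahmasbi & Rezaei (2008). The expressions may therefore differ in form from equations (1) to (7).

```latex
% Assumed notation: Z = number of components, p in (0,1), beta > 0, k = 1, 2, ...
\begin{align*}
P(Z = z \mid Z \ge k) &= \frac{(1-p)^{z}}{z\,A(p,k)}, \quad z = k, k+1, \dots,
\qquad A(p,k) = -\ln p - \sum_{j=1}^{k-1} \frac{(1-p)^{j}}{j}, \\
f_{k:z}(x) &= \frac{z!}{(k-1)!\,(z-k)!}\, \beta\, e^{-\beta x (z-k+1)} \left(1 - e^{-\beta x}\right)^{k-1}, \\
f(x) &= \sum_{z \ge k} P(Z = z \mid Z \ge k)\, f_{k:z}(x)
      = \frac{\beta\,(1-p)^{k}\, e^{-\beta x} \left(1 - e^{-\beta x}\right)^{k-1}}
             {A(p,k)\, \left[1 - (1-p)\, e^{-\beta x}\right]^{k}}, \quad x > 0, \\
% Special case k = 1 (EL distribution):
f(x)\big|_{k=1} &= \frac{\beta\,(1-p)\, e^{-\beta x}}{-\ln p\, \left[1 - (1-p)\, e^{-\beta x}\right]},
\qquad f(0)\big|_{k=1} = \frac{\beta\,(1-p)}{-p \ln p},
\qquad \text{median} = \frac{\ln\!\left(1 + \sqrt{p}\right)}{\beta}.
\end{align*}
```

The closed form of the sum uses the negative binomial series $\sum_{m \ge 0} \binom{m+k-1}{m} t^{m} = (1-t)^{-k}$ with $t = (1-p)e^{-\beta x}$. For $k = 1$ and $p \to 1$ the density reduces to $\beta e^{-\beta x}$, in line with the exponential limit stated above.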
Also, the cumulative distribution function (cdf) of corresponding to the pdf in equation (7) is given by:
(8) 
where,
Then, the final cdf can be reduced to:
(9) 
where,
2.2 Moment generating function and Moment
Suppose has the pdf in equation (7), then the moment generating function (mgf) is given by:
(10) 
where,
and hence, we can write the mgf as:
(11) 
The moment is given by:
(12) 
2.3 Reliability and failure rate functions
It is well known that the reliability (survival) function is the probability of being alive just before duration , given by which is the probability that the event of interest has not occurred by duration . So, the reliability is the probability that a system will be successful in the interval from time to time , where is a random variable denoting the time-to-failure or failure time. One may refer to the literature on reliability theory (Barlow & Proschan, 1975, 1981; Basu, 1988). The survival function corresponding to the pdf in equation (7) is given by equation (13). Table 1 presents the reliability function for some special cases.
(13) 
order statistic  k  S(x) 

first  
second  
third 
The failure rate, also known as the hazard rate function , is the instantaneous rate of occurrence of the event of interest at duration (i.e. the rate of event occurrence per unit of time). Mathematically, it is equal to the pdf of events at , divided by the probability of surviving to that duration without experiencing the event. Thus, we define the failure rate function as in Barlow et al. (1963) by . The hazard function for some special cases is given in Table 2.
order statistic  k  h(x) 

first  
second  
third 
The failure rate function is analytically related to the failure’s probability distribution. It leads to the examination of the increasing (IFR) or decreasing failure rate (DFR) properties of life-length distributions. is an IFR distribution, if increases for all such that . The motivation for the EGTL lifetime distribution is that the hazard rate in many real-life physical and non-physical systems is neither monotonically increasing, nor monotonically decreasing, nor constant. If k = 1, the hazard rate function is decreasing, following Tahmasbi & Rezaei (2008). In fact, if then and if then . For , there is an increasing failure rate. Indeed, if then . If then (see Figure 2).
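As a numerical illustration of these monotonicity statements, the sketch below evaluates the hazard rate as the ratio of density to survival function. It assumes the density obtained by summing truncated logarithmic weights proportional to (1-p)^z / z against the exponential order-statistic density, with p in (0,1) and rate beta > 0. The function names and this parametrization are our assumptions and not the paper's notation, and the survival function is computed by Simpson's rule rather than from equation (13).

```python
import math


def egtl_norm(p, k):
    """Assumed normalizing constant: sum over z >= k of (1-p)^z / z."""
    return -math.log(p) - sum((1 - p) ** j / j for j in range(1, k))


def egtl_pdf(x, p, beta, k):
    """Assumed EGTL density for the kth order statistic (k = 1 gives EL)."""
    w = math.exp(-beta * x)
    return (beta * (1 - p) ** k * w * (1 - w) ** (k - 1)
            / (egtl_norm(p, k) * (1 - (1 - p) * w) ** k))


def egtl_sf(x, p, beta, k, n=400):
    """Survival function via Simpson's rule after substituting v = exp(-beta*s)."""
    w = math.exp(-beta * x)

    def g(v):
        return (1 - v) ** (k - 1) / (1 - (1 - p) * v) ** k

    step = w / n  # n must be even for Simpson's rule
    total = g(0.0) + g(w)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * step)
    return (1 - p) ** k / egtl_norm(p, k) * total * step / 3


def egtl_hazard(x, p, beta, k):
    """Hazard rate h(x) = f(x) / S(x)."""
    return egtl_pdf(x, p, beta, k) / egtl_sf(x, p, beta, k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        values = [egtl_hazard(x, 0.5, 1.0, k) for x in (0.1, 0.5, 1.0, 2.0, 5.0)]
        print(k, [round(v, 4) for v in values])
```

Under these assumptions, and for the illustrative values p = 0.5 and beta = 1, the printed hazard decreases towards beta when k = 1 and rises from near zero towards beta when k = 2 or 3. This is a numerical check at a few points, not a proof of monotonicity.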
2.4 Random number generation
We can generate a random variable from the cdf of in equation (9) using the following steps:

1. Generate a random variable from the standard uniform distribution.
2. Solve the non-linear equation in :
(14) where,
3. Calculate the values of such as:
(15)
where is an EGTL random variable with parameters and . Note that for the special case , we generate directly from the following equation:
(16) 
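An alternative to inverting the cdf is to simulate the compounding construction itself: draw the number of components from the truncated logarithmic distribution, draw that many exponential lifetimes, and keep the kth smallest. The sketch below does this with the Python standard library. It assumes a logarithmic pmf proportional to (1-p)^z / z and an exponential rate beta; the function names are hypothetical, and the code is not a transcription of equations (14) to (16). For k = 1 it also gives the well-known closed-form inverse of the EL cdf.

```python
import math
import random


def sample_truncated_log(p, k, rng):
    """Draw Z with P(Z = z) proportional to (1-p)^z / z for z = k, k+1, ..."""
    norm = -math.log(p) - sum((1 - p) ** j / j for j in range(1, k))
    u = rng.random()
    z = k
    term = (1 - p) ** z / (z * norm)
    cum = term
    # Sequential inverse-transform search; stops if the terms underflow to zero.
    while cum < u and term > 0.0:
        z += 1
        term = (1 - p) ** z / (z * norm)
        cum += term
    return z


def sample_egtl(p, beta, k, rng):
    """kth smallest of Z iid exponential(rate=beta) lifetimes, with Z >= k."""
    z = sample_truncated_log(p, k, rng)
    lifetimes = sorted(rng.expovariate(beta) for _ in range(z))
    return lifetimes[k - 1]


def el_quantile(u, p, beta):
    """Closed-form inverse cdf for k = 1 (the EL case)."""
    return math.log((1 - p) / (1 - p ** (1 - u))) / beta


if __name__ == "__main__":
    rng = random.Random(2024)
    draws = [sample_egtl(0.5, 1.0, 2, rng) for _ in range(5)]
    print([round(x, 3) for x in draws])
```

Simulating the construction avoids solving a non-linear equation for each draw, at the cost of generating several exponential variates per sample.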
3 Estimation of the parameters
In this section, we determine the estimates of the parameters and for the new EGTL family of distributions. There are many methods available for estimating the parameters of interest. We present here the three most popular methods: maximum likelihood, method of moments and Bayesian estimation.
3.1 Maximum Likelihood estimation
Let () be a random sample from the EGTL distribution. The log-likelihood function given the observed values, , is:
(17) 
We subsequently derive the associated gradients:
We need the Fisher information matrix for interval estimation and tests of hypotheses on the parameters. It can be expressed in terms of the second derivatives of the log-likelihood function:
The maximum likelihood estimates (MLEs) and of the EGTL parameters and , respectively, can be found using the iterative EM algorithm to handle the incomplete data problem (Dempster et al., 1977; Krishnan & McLachlan, 1997). The iterative method consists in repeatedly updating the parameter estimates by replacing the "missing data" with the new estimated values. The standard method used to determine the MLEs is the Newton-Raphson algorithm, which requires the second derivatives of the log-likelihood function at every iteration. The main drawback of the EM algorithm is its rather slow convergence, compared to the Newton-Raphson method, when the "missing data" contain a relatively large amount of information (Little & Rubin, 1983). Several researchers have used the EM method, such as Adamidis & Loukas (1998), Adamidis et al. (2005), Karlis & Xekalaki (2003), Ng et al. (2002) and others. Newton-Raphson is required for the M-step of the EM algorithm. To start the algorithm, a hypothetical distribution of complete data is defined with the pdf in equation (6), and then we derive the conditional mass function as:
(18) 
E-step:
(19) 
M-step:
(20) 
(21) 
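For a quick numerical check of the estimates, the log-likelihood can also be evaluated and maximized directly. The sketch below assumes the density obtained by compounding truncated logarithmic weights proportional to (1-p)^z / z with the exponential order-statistic density (p in (0,1), rate beta > 0), and it uses a coarse grid search as a simple stand-in for the Newton-Raphson and EM iterations described above. The function names, the grid and the toy data are hypothetical.

```python
import math


def egtl_loglik(data, p, beta, k):
    """Log-likelihood of positive observations under the assumed EGTL density."""
    norm = -math.log(p) - sum((1 - p) ** j / j for j in range(1, k))
    total = 0.0
    for x in data:
        w = math.exp(-beta * x)
        total += (math.log(beta) + k * math.log(1 - p) - beta * x
                  - math.log(norm) - k * math.log(1 - (1 - p) * w))
        if k > 1:
            total += (k - 1) * math.log(1 - w)
    return total


def grid_mle(data, k, p_grid, beta_grid):
    """Return (p_hat, beta_hat, loglik) maximizing the log-likelihood on a grid."""
    best = max((egtl_loglik(data, p, b, k), p, b)
               for p in p_grid for b in beta_grid)
    return best[1], best[2], best[0]


if __name__ == "__main__":
    toy = [0.3, 0.8, 1.1, 1.9, 2.4]  # made-up values, for illustration only
    p_grid = [i / 20 for i in range(1, 20)]
    beta_grid = [0.1 * i for i in range(1, 31)]
    print(grid_mle(toy, 2, p_grid, beta_grid))
```

A grid search is slow and only as precise as the grid, but it needs no derivatives and gives reasonable starting values for an iterative scheme.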
3.2 Method of moments estimation
The method of moments involves equating theoretical with sample moments. The estimate of moment is . For the EGTL distribution, the moment is given by equation (12). The corresponding first and second moments are given by:
(22) 
(23) 
From equation (22) we obtain:
(24) 
and then, we should solve the following equation in :
(25) 
Thereafter, we determine by replacing with its estimated value, , in the equation (24).
3.3 Bayesian estimation
In the Bayesian approach inferences are expressed in a posterior distribution for the parameters which is, according to Bayes’ theorem, given in terms of the likelihood and a prior density function by:
(26) 
where, is a prior probability distribution function and is the likelihood of observations . Note that is the normalizing constant for the function given by:
(27) 
We should first specify our initial beliefs or other sorts of knowledge on the prior distribution . Here, we suppose that the standard uniform distribution on the interval is a prior distribution for the parameter and gamma, , is a prior distribution for the parameter , where is a shape parameter and is a scale parameter. The prior probability function is then equal to:
(28) 
where, and .
Using the mean square error as a risk function, we obtain the Bayes estimates as the means of the posterior distribution:
(29) 
(30) 
4 Simulation study
As an illustration of the three preceding methods of estimation, numerical computations have been performed using the steps presented in section 2.4 for the random number generation. The numerical study was based on random samples of the sizes , and from the EGTL distribution for each of the values of and the three cases . We have considered the initial values , and . For this purpose, we have used the program Mathcad 14.0. After determining the parameter estimates, we compute the biases, the variances and the mean square errors (MSEs), where and . An estimator is said to be efficient if its mean square error (MSE) is the minimum among all competitors. In fact, is more efficient than if .
Table 3 reports the results from the simulated data where the variances and the MSEs of the parameters are given. The results show that, for each case , the variances and the MSEs decrease when the sample size increases. We see that the values from the Bayesian method are generally lower than those obtained using the ML approach.
Maximum likelihood  Method of Moments  Bayesian methods  

n  k  
n=20  
1  (0.5 ; 0.5)  0.2091  0.2008  0.2136  0.2043  0.2775  0.2241  0.2849  0.2351  0.1737  0.1414  0.1835  0.1586  
(0.7 ; 1.5)  0.2779  0.6980  0.2861  0.7145  0.3066  1.0227  0.3078  1.0361  0.2208  0.5794  0.2287  0.6104  
(0.3 ; 2)  0.4250  1.0930  0.4680  1.1160  0.6890  1.7260  0.7270  1.8930  0.2110  0.5010  0.2420  0.5870  
2  (0.5 ; 0.5)  0.2725  0.2013  0.2773  0.2027  0.3087  0.2327  0.3347  0.2398  0.1934  0.1227  0.1957  0.1488  
(0.7 ; 1.5)  0.1105  1.1326  0.1114  1.1380  0.1318  1.2050  0.1409  1.2078  0.0742  0.9730  0.0796  1.0117  
(0.3 ; 2)  0.2700  0.6230  0.3660  0.6520  0.5750  1.0000  0.6810  1.1900  0.1000  0.3690  0.1230  0.4320  
3  (0.5 ; 0.5)  0.1407  0.1164  0.1193  0.1536  0.2186  0.1498  0.2245  0.1515  0.1054  0.0490  0.1065  0.0533  
(0.7 ; 1.5)  0.1782  0.6809  0.1938  0.7694  0.2247  0.9003  0.2269  0.9290  0.1653  0.5840  0.1682  0.5880  
(0.3 ; 2)  0.4090  0.8630  0.4410  0.9040  1.2390  0.6680  1.2930  0.7200  0.1590  0.3380  0.2090  0.3810  
n=50  
1  (0.5 ; 0.5)  0.1099  0.1032  0.1119  0.1049  0.1534  0.1280  0.1848  0.1308  0.0681  0.0493  0.0723  0.0506  
(0.7 ; 1.5)  0.1221  0.5182  0.1257  0.5431  0.1426  0.6526  0.1532  0.6929  0.0567  0.3863  0.0625  0.3878  
(0.3 ; 2)  0.3630  0.8440  0.3820  0.9340  0.6330  1.6030  0.6580  1.6480  0.1380  0.3710  0.1730  0.4240  
2  (0.5 ; 0.5)  0.1687  0.1097  0.1915  0.1239  0.2252  0.1422  0.2257  0.1544  0.1014  0.0443  0.1119  0.0456  
(0.7 ; 1.5)  0.1051  1.0626  0.1072  1.0985  0.1287  1.1868  0.1381  1.1923  0.0691  0.9795  0.0735  0.9807  
(0.3 ; 2)  0.2350  0.5620  0.2510  0.5770  0.5330  0.8980  0.5650  0.9390  0.0340  0.2670  0.0740  0.3050  
3  (0.5 ; 0.5)  0.0941  0.0782  0.0947  0.0896  0.1782  0.1134  0.1810  0.1167  0.0624  0.0185  0.0686  0.0224  
(0.7 ; 1.5)  0.0606  0.5371  0.0608  0.5459  0.1005  0.6989  0.1067  0.7072  0.0433  0.4644  0.0438  0.4685  
(0.3 ; 2)  0.3360  0.7130  0.3790  0.8170  0.5770  1.1460  0.6340  1.1800  0.1310  0.2140  0.1430  0.2760  
n=100  
1  (0.5 ; 0.5)  0.0652  0.0398  0.0663  0.0592  0.1049  0.0827  0.1098  0.0879  0.0185  0.0022  0.0327  0.0082  
(0.7 ; 1.5)  0.0531  0.4206  0.0605  0.4663  0.0870  0.5859  0.0901  0.5876  0.0013  0.1941  0.0052  0.2201  
(0.3 ; 2)  0.2588  0.6168  0.2828  0.7518  0.4918  1.2348  0.5318  1.3888  0.0118  0.0058  0.0618  0.1188  
2  (0.5 ; 0.5)  0.1059  0.0733  0.1140  0.0804  0.1897  0.1097  0.1971  0.1162  0.0479  0.0051  0.0574  0.0185  
(0.7 ; 1.5)  0.0522  0.9943  0.0614  1.0109  0.0849  1.1288  0.0889  1.1395  0.0002  0.8165  0.0039  0.8951  
(0.3 ; 2)  0.1739  0.4719  0.2009  0.5239  0.4229  0.7759  0.4979  0.8269  0.0049  0.0149  0.0219  0.0859  
3  (0.5 ; 0.5)  0.0749  0.0560  0.0814  0.0613  0.1255  0.0888  0.1471  0.0901  0.0247  0.0005  0.0505  0.0059  
(0.7 ; 1.5)  0.0174  0.4517  0.0199  0.4680  0.0543  0.6477  0.0586  0.6599  0.0006  0.3827  0.0011  0.4055  
(0.3 ; 2)  0.2183  0.4733  0.2873  0.5973  0.4423  1.0013  0.4923  1.0643  0.0043  0.0073  0.0663  0.0323 
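The summary measures used in this simulation study can be computed from a list of replicated estimates as sketched below. The sketch assumes the usual definitions (bias as mean estimate minus true value, variance around the mean estimate, MSE around the true value, all divided by the number of replications), so that MSE = variance + bias squared. The function name and the example numbers are hypothetical.

```python
def bias_variance_mse(estimates, true_value):
    """Return (bias, variance, mse) of replicated estimates of one parameter."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    variance = sum((e - mean) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, variance, mse


if __name__ == "__main__":
    # Made-up replicated estimates of a parameter whose true value is 0.5.
    print(bias_variance_mse([0.4, 0.5, 0.7], 0.5))
```

With these definitions an estimator with a smaller MSE is preferred, which is the efficiency criterion stated in the text.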
5 Application examples
In this section, we fit the EGTL distribution to two real data sets using the MLEs. The first set (Table 4) consists of 107 "failure times for right rear brakes on D9G66A caterpillar tractors", reproduced from Barlow & Campo (1975) and used also by Chang & Rao (1993). These data are used in many applications of reliability (Adamidis et al., 2005; Tsokos, 2012; Shahsanaei et al., 2012). The second set of data involves 100 observations (Table 5) of the results from an experiment concerning "the tensile fatigue characteristics of a polyester/viscose yarn". These data were presented by Picciotto (1970) to study the problem of warp breakage during weaving. The observations were obtained on the cycles to failure of a cm yarn sample put to test under strain level. The sample is used in Quesenberry & Kent (1982) as an example to illustrate a selection procedure among probability distributions used in reliability. The reliability function of these two data sets belongs to the increasing failure rate class (Doksum & Yandell, 1984; Adamidis et al., 2005). In addition to our class of distributions, the gamma and Weibull distributions were fitted to these data sets. The respective densities of the gamma and Weibull distributions are and .
56  753  1153  1586  2150  2624  3826  83  763  1154  1599  2156  2675  3995  104 
806  1193  1608  2160  2701  4007  116  834  1201  1723  2190  2755  4159  244  838 
1253  1769  2210  2877  4300  305  862  1313  1795  2220  2879  4487  429  897  1329 
1927  2248  2922  5074  452  904  1347  1957  2285  2986  5579  453  981  1454  2005 
2325  3092  5623  503  1007  1464  2010  2337  3160  6869  552  1008  1490  2016  2351 
3185  7739  614  1049  1491  2022  2437  3191  661  1069  1532  2037  2454  3439  673 
1107  1549  2065  2546  3617  683  1125  1568  2096  2565  3685  685  1141  1574  2139 
2584  3756 
86  146  251  653  98  249  400  292  131  169  175  176  76  264  15 
364  195  262  88  264  157  220  42  321  180  198  38  20  61  121 
282  224  149  180  325  250  196  90  229  166  38  337  65  151  341 
40  40  135  597  246  211  180  93  315  353  571  124  279  81  186 
497  182  423  185  229  400  338  290  398  71  246  185  188  568  55 
55  61  244  20  284  393  396  203  829  239  286  194  277  143  198 
264  105  203  124  137  135  350  193  188  236 
Table 6 shows the fitted parameters, the calculated values of the Kolmogorov-Smirnov (KS) statistic and their respective p-values for the two sets of data. It should be noted that the KS test compares an empirical distribution with a known (not estimated) one. It is used to decide if a sample comes from a population with a specific distribution (: the data follow a specified distribution). We estimate some special cases () of the EGTL family of distributions at significance level. The p-values are only significant for the case for the Barlow & Campo (1975) and Quesenberry & Kent (1982) data sets. In fact, the data exhibit increasing failure rates, but the EGTL distribution has a decreasing failure rate if k = 1 (see Figure 2). The new lifetime distribution provides a good fit to the data sets. The KS test shows that the EGTL distribution is an attractive alternative to the popular gamma and Weibull distributions. It generalizes the reliability lifetime distributions to any order statistic. Indeed, as shown in section 2.3, if k = 1, the hazard rate function is decreasing, following Tahmasbi & Rezaei (2008), and there is an increasing hazard rate for .
Distributions  KS value  p-value  

Barlow & Campo (1975) data set ():  
First order (k=1)  
Second order (k=2)  
Third order (k=3)  
Fourth order (k=4)  
Gamma  
Weibull  
Quesenberry & Kent (1982) data set ():  
First order (k=1)  
Second order (k=2)  
Third order (k=3)  
Fourth order (k=4)  
Gamma  
Weibull 
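The KS distance reported in Table 6 is the largest gap between the empirical cdf and a fitted cdf. The sketch below computes it for any cdf passed as a function. The helper names are hypothetical, and the example cdf is the closed-form EL case (k = 1) with p in (0,1) and rate beta, which is an assumed parametrization. As noted above, p-values from the standard KS tables are only approximate when the parameters are estimated from the same data.

```python
import math


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and `cdf`."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the fitted cdf with the empirical step just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def el_cdf(x, p, beta):
    """Closed-form cdf of the EL distribution (k = 1), assumed parametrization."""
    return 1 - math.log(1 - (1 - p) * math.exp(-beta * x)) / math.log(p)


if __name__ == "__main__":
    toy = [0.2, 0.5, 0.9, 1.4, 2.8]  # made-up values, for illustration only
    print(ks_statistic(toy, lambda x: el_cdf(x, 0.5, 1.0)))
```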
6 Conclusion
We define a new two-parameter lifetime distribution, the so-called EGTL distribution. Our procedure generalizes the EL distribution proposed by Tahmasbi & Rezaei (2008). We derive some mathematical properties and we present the plots of the pdf and the failure rate functions for some special cases. The parameters are estimated by maximum likelihood, the EM algorithm, the method of moments and the Bayesian approach, with numerical computations performed as an illustration of the different methods of estimation. The application is illustrated with two real data sets used in many applications of reliability. We have shown that our proposed EGTL distribution is suitable for modelling the time to any failure and not only the time to the first or the last failure. It is very competitive compared with its standard counterparts.
Ordered random variables are already known for their ascending order. The paper may be extended to the concept of dual generalized order statistics, introduced by Burkschat et al. (2003), which enables a common approach to descending ordered random variables such as reversed order statistics and lower record values.
Appendix
Let be iid exponential r.v. with pdf given by: , for , where is a log-series r.v. with pmf, , given by:
(31) 
From the Taylor series, for we have:
then,
The truncated at logarithmic distribution with parameter is:
(32) 
where,
(33) 
and,
(34) 
The pdf of the order statistic is:
(35) 
(36) 
Let, , , and
(37) 
the marginal density of is:
(38) 
Let, , ,
(39) 
(40) 
let