Marshall-Olkin Lehmann Lomax Distribution: Theory, Statistical Properties, Copulas and Real Data Modeling

In this work, a new four-parameter lifetime probability distribution called the Marshall-Olkin Lehmann Lomax distribution is defined and studied. The density function of the new distribution "asymmetric right skewed" and "symmetric" and the corresponding hazard rate can be monotonically increasing, increasing-constant, constant, upside down and monotonically decreasing. The coefficient of skewness can be negative and positive. We derive some new bivariate versions via Farlie Gumbel Morgenstern family, modified Farlie Gumbel Morgenstern family, Clayton Copula and Renyi's entropy.The method of maximum likelihood is used to estimate the unknown parameters. Using "biases" and "mean squared errors", a simulation study is performed for assessing the finite behavior of the maximum likelihood estimators.


Introduction
The Lomax or Pareto type II distribution (Lomax (1954)), is a heavy-tail probabilistic model used in modeling business, actuarial science, biological sciences, engineering, economics, income and wealth inequality, queueing theory, size of cities and Internet traffic data sets. Harris (1968) and Atkinson and Harrison (1978) employed the Lomax (Lx) distribution in modeling data obtained from income and wealth. Corbellini et al. (2007) used the Lx distribution firm size data modeling. For applications in reliability and life testing experiments see Hassan Al-Ghamdi (2009). The Lx model is known as a special distribution form of Pearson system (type VI) and has also considered as a mixture of standard exponential (Exp) and standard gamma (Gam) distributions. The Lx model belongs to the family of "monotonically decreasing" hazard rate function (HRF) and considered as a limiting model of residual lifetimes at great age (see Balkema and de Hann (1974) and Chahkandi and Ganjali (2009)). The Lx distribution has been suggested as heavy tailed alternative model to the standard Exp, standard Weibull (W) and standard Gam distributions (see Bryson (1074)). For details about relation between the Lx model and the Burr XII and Compound Gamma (CGam) models see Tadikamalla (1980) and Durbey (1970).
A random variable (rv) has the Lx distribution with two parameters and if it has cumulative distribution function (CDF) (for > 0 ) given by . (2) Due to , the CDF of the Marshall-Olkin Lehmann-G family (MOL-G) family is given by The Marshall-Olkin Lehmann Lomax (MOLLx) CDF is given by where ( ) = 1 − ( )| ( = , , , ) . The main objectives of this work is to derive a more flexible distribution by adding two extra parameter to the standard base line Lomax model and for improving the goodness-of-fit to real-life data.
The basic motivations for the MOLLx model in practice are: I. to make the kurtosis more flexible as compared to the baseline base line Lomax model.

II.
for producing skewness for symmetrical and asymmetrical distributions.

III.
for constructing heavy-tailed densities that are not longer-tailed for modeling various real-life data IV.
for providing consistently better fits than other generated Lomax extensions.
The PDF corresponding to (5) is given by The HRF for the new model can be derived from ( )/ ( ). For exploring the flexibility of the MOLLx PDF and the corresponding HRF we sketched Figure 1. Figure 1(left) provides some plots of the new PDF for some carefully selected parameters value. Figure 1 Figure 1(left) the MOLLx PDF can be "asymmetric right skewed" and "symmetric". Based on Figure 1  We are motivated to define and study the MOLLx model for the following reasons: I. The PDF of the MOLLx model can be "symmetric" and "asymmetric right skewed" with many useful cases. The failure rate of the MOLLx model can be monotonically increasing, increasing-constant, constant, upside down and monotonically decreasing. II. In statistical modeling, the new MOLLx proved adequate superiority in fitting real-life data sets against many common competitive models including standard base line Lomax, odd log-logistic Lomax, reduced odd loglogistic Lomax, the three-parameter gamma Lomax, the four-parameter Kumaraswamy Lomax, the fiveparameter Macdonald Lomax, reduced MOLLx, reduced Burr-Hatke Lomax, the four-parameter beta Lomax and special generalized mixture Lomax distributions. III. The MOLLx model showed better fits in modeling the bimodal right skewed and bimodal right skewed data sets. We prove empirically that the MOLLx distribution provides better fits to two real life data sets than other fourteen extended Lomax distributions with two, three and four parameters (see Section 5). IV. By analyzing the skewness kurtosis coefficients and index of dispersion numerically, it is noted that, skewness coefficient of the MOLLx distribution can be negative and also can be positive. The spread for the kurtosis coefficient of the MOLLx model is ranging from − 7.39 to 4074.332. The index of dispersion for the MOLLx model can be in (0,1) and also > 1 so it may be used as an "under-dispersed" and "over-dispersed" model (see Table 1).

Type I modified FGM:
Here, we consider the following functional form for both ( ) and ( ) as .

Type II modified FGM:
The CDF of the BvMOLLx-FGM (Type II) model can be derived from

Type III modified FGM:
Consider the following functional form for both ( ) and ( ) which satisfy all the conditions stated earlier where The corresponding bivariate copula (henceforth, BvMOLLx-FGM (Type III) copula) can be derived from

Type IV modified FGM:
Consider the following functional form for both ( ) and ( ) which satisfy all the conditions stated earlier where In this case, one can also derive a closed form expression for the associated CDF of the BvMOLLx-FGM (Type IV) as ℘ ( , ) = (1 + ℘ ( ) ( )).

BvMOLLx type via Clayton Copula
The Clayton Copula can be considered as Let us assume that ∼ MOLLx ( 1 ) and ∼ MOLLx ( 2 ) . Then, setting Then, the BvMOLLx type distribution can be derived as

MvMOLLx extention via Clayton Copula
A straightforward -dimensional extension from the above will be Then, the MvMOLLx extention can expressed as Recently, many authors used the above mentioned copulas to generate some new bivariate models such as see Elgohari

Mathematical properties 3.1 The quantile function (QF)
For simulation of the MOLLx model, we obtain the QF of by inverting (5), say = −1 ( ) , as Equation (7) is used for simulating the new model. Then, by expanding the quantity , , ( ) we get

Linear combinations Let
which can be repressed as where 0 = 2 and Similarly, we expand the quantity , , , ( ) we get which can be re-written as Using (8) and (9), the CDF of the MOLLx model can be expressed as where , , ( ) = , ( ) is the CDF of the exponentiated-Lx (expLx) distribution with power parameter and The PDF of the MOLLx model can also be expressed as a mixture of expLx densities. By differentiating (10), we obtain where ℎ +1, , ( ) is the exp-G pdf with power parameter + 1 . Equation (11) reveals that the MOLLx density is a linear combination of expLx densities.

Moments and incomplete moments
The ℎ ordinary moment of is given by Then, we obtain and where ( ) = 1 ′ is the mean of . The ℎ incomplete moment, say , ( ) , of can be expressed, from (11), as The first incomplete moment given by (11) with = 1 as The index of dispersion or the variance ( 2 ) to mean ratio can derived as 3 = 2 / 1 ′ . It is a measure used to quantify whether a set of observed occurrences are clustered or dispersed compared to a standard statistical model. By analyzing the 1 ′ , 2 , skewness ( 1 ) , kurtosis ( 2 ) and index of dispersion ( 3 ) numerically in Table 1, it is noted that, 1 of the MOLLx distribution can be negative and also can be positive. The spread for the 2 of the MOLLx model is ranging from − 7.39 to 4074.332. 3 for the MOLLx model can be in (0,1) and also > 1 so it may be used as an "under-dispersed" and "over-dispersed" model.

Some generating functions (GF)
The moment generating function (MGF) can be derived using (8)  The first cumulant is the mean ( 1 = 1 ′ ), the second cumulant is the variance, and the ℎ cumulant is the same as the third central moment 3 = 3 . But fourth and higher order cumulants are not equal to central moments, that being said In some cases, theoretical treatments of problems in terms of cumulants are simpler than those using moments. In particular, when two or more RVs are statistically independent, the ℎ order cumulant of their sum is equal to the sum of their ℎ order cumulants. Moreover, the cumulants can be also obtained from

Reversed residual life function
The ℎ moment of the reversed residual life, say . .

→ .
A direct consequence of the above theorem is that all asymptotic properties of the maximum likelihood estimators also hold for the Bayesian estimators (see Ibragimov (1962) and Chao (1970) for more details). Also, since the determination of the MLE is independent of the loss function and the prior measure, the asymptotic properties of Bayesian estimators hold for all priors and loss functions in a certain class. A separate article could be allocated for demonstrating this theorem.

Graphical assessment
Graphically and using the biases and mean squared errors (MSEs), we can perform the simulation experiments to assess the finite sample behavior of the MLEs. The assessment was based on =1000 replication for all | ( =50,100,…,500) . The following algorithm is considered: Generate =1000 samples of size | ( =50,100,…,500) from the MOLLx distribution using (7), II.
Compute the MLEs for the =1000 samples,

III.
Compute the SEs of the MLEs for the 1000 samples,

Applications
In this section, we provide two real life applications to two real data sets to illustrate the importance and flexibility of the MOLLx model. We compare the fit of the MOLLx with some well-known competitive models (see Table 2). First  For exploring the extreme values, the box plot is plotted (see Figure 6). Based on Figure 6, we note that no extreme values were found in the two real life data sets. For checking the normality, the Quantile-Quantile (Q-Q) plot is sketched (see Figure 7). Based on Figures 7, we note that the normality is nearly exists. For exploring the HRF for real data, the total time test (TTT) plot is provided (see Figure 8). Based on Figure 8, we note that the HRF is "monotonically increasing" for the two real life data sets. For exploring the initial shape of real data nonparametrically, kernel density estimation (KDE) is provided (see Figure 9). Figure 9 show nonparametric KDE for exploring the data. Figures 10 and 11 give the estimated Kaplan-Meier survival (EKMS) plot, Probability-Probability (P-P) plot, estimated PDF (EPDF), estimated CDF (ECDF) and estimated HRF (EHRF) for data set I and II respectively.
We estimate the unknown parameters of each model by maximum likelihood using "L-BFGS-B" method and the goodness-of-fit statistics Akaike information criterion (AIC), Bayesian IC (BIC), Consistent AIC (CAIC), Hannan-Quinn IC (HQIC), Anderson-Darling ( * ) and Cramér-von Mises ( * ) are used to compare the five models. In general, the smaller the values of these statistics, the better the fit to the data. The required computations are obtained by using the "maxLik" and "goftest" sub-routines in R-software. For failure times data: the analysis results of are listed in Tables 3 and 4. Table 3 gives the MLEs and standard errors (SEs) for failure times data. Table 4 gives the −l and goodness-of-fits statistics for failure times data. For service times data: the analysis results of are listed in Tables 5 and 6. Table 5 gives the MLEs and SEs for service times data. Table 6 give the −l and goodness-of-fits statistics for the service times data. Based on Tables 4 and 6, we note that the MOLLx model gives the lowest values for the AIC, CAIC, BIC, HQIC, * and * among all fitted models. Hence, it could be chosen as the best model under these criteria.  Gamma Lx GamLx 7 Transmuted Topp-Leone Lx TTLLx 8 Reduced TTLLx RTTLLx 9 Odd log-logistic Lx OLLLx 10 Reduced OLLLx ROLLLx 11 Reduced Burr-Hatke Lx RBHLx 12 Special generalized mixture Lomax SGMLx 13 Reduced MOLLx RMOLLx 14 Proportional reversed hazard rate Lx PRHRLx Failure times data Service times data Failure times data Service times data      Figure 11: EKMS plot, P-P plot, EPDF, ECDF and EHRF for data set II.

Conclusions
A new four parameter lifetime model called the Marshall-Olkin Lehmann Lomax (MOLLx) is proposed and studied. The MOLLx density function can be "monotonically right skewed", "symmetric", "monotonically left skewed" and "uniformed density". The MOLLx failure rate function can be "monotonically decreasing", " monotonically increasing" and "constant". The new MOLLx density can be expressed as a mixture of the exponentiated Lomax model. The skewness of the MOLLx distribution can negative and positive. The spread for the kurtosis of the MOLLx model is ranging from − 1129.85 to 311.698. The index of dispersion for the MOLLx model can be in (0,1) and also > 1 so it may be used as an "under-dispersed" and "over-dispersed" model. We derived some new bivariate versions of the MOLLx distribution via Farlie Gumbel Morgenstern family, modified Farlie Gumbel Morgenstern family, Clayton Copula and Renyi's entropy. The maximum likelihood method is used to estimate the MOLLx parameters. Using the "biases" and "mean squared errors", we performed simulation experiments for assessing the finite sample behavior of the maximum likelihood estimators. It is noted that, the biases for all parameters are generally negative and tends to 0 as → ∞ and the mean squared errors for all parameter decrease to 0 as → ∞ . The MOLLx model deserved to be chosen as the best model among many well-known Lomax extension such as exponentiated Lomax, gamma Lomax, Kumaraswamy Lomax, odd log-logistic Lomax, Macdonald Lomax, beta Lomax, reduced odd log-logistic Lomax, reduced Burr-Hatke Lomax, reduced MOLLx, special generalized mixture Lomax and the standard Lomax distributions in modeling the "failure times" and the "service times" data sets.