The Marshall–Olkin Pareto Type-I Distribution: Properties, Inference under Complete and Censored Samples with Application to Breast Cancer Data

In this paper, we introduce the Marshall–Olkin Pareto type-I (MOPTI) distribution. Statistical attributes of the MOPTI distribution including the quantile function, mean residual life, and a new theorem for strength-stress measure are introduced. Five methods of estimation for the MOPTI parameters based on complete samples are presented. Furthermore, we explore the estimation of the MOPTI parameters under type-I and type-II censoring. Two Monte Carlo simulation studies are conducted to evaluate the performance of the estimation methods under complete and censored samples. A real-life data set is used to validate the proposed methods.


Introduction
Many important distributions are widely used in statistical applications to model lifetime data in applied fields such as engineering, insurance, economics, medicine, and life testing, among others. The limitations of the standard distributions have aroused interest in finding new distributions by extending existing ones. Hence, many authors have proposed new generalized forms of standard distributions, as well as new generated families, to overcome these limitations, which may include the inability to model non-monotone hazard rate (HR) shapes.
Extensions based on the Marshall-Olkin (MO) family include the work of Jose (2011), the Marshall-Olkin extended Lindley distribution by Ghitany et al. (2012), and the MO log-logistic distribution by Gui (2013).
In this article, we are motivated to introduce a new flexible extension of the Pareto type-I (PTI) distribution using the MO-F family. The proposed model is called the Marshall-Olkin Pareto type-I (MOPTI) distribution. The MOPTI distribution provides flexible shapes for its HR function (HRF) and probability density function (PDF). It is also more flexible than the PTI distribution. We also study its mathematical characteristics in detail, including a new theorem to evaluate the strength-stress measure, an important tool for measuring system reliability. We also show that the exact values of the strength-stress measure are very close to its approximate values. The MOPTI parameters are estimated using different estimators under complete and censored samples. Comprehensive simulation studies are provided to assess the performance of the introduced estimators.
The rest of this article is organized as follows. Section 2 introduces the new MOPTI distribution. The structural properties of the MOPTI distribution are derived in Section 3. Five methods of estimation are introduced in Section 4. In Section 5, the MOPTI parameters are estimated under type-I and type-II censoring schemes. A real-life data set is analyzed in Section 6. Some conclusions are given in Section 7.
Then, using Equations (3) and (4) in (1) and (2), we obtain the CDF and PDF of the MOPTI distribution.
The graphs of the PDF $f(x)$ and the HRF $h(x)$ for different values of the parameters are presented in Figures 1 and 2, respectively. These plots show that the MOPTI density can be left-skewed, reversed-J shaped, or right-skewed. The failure rate of the MOPTI distribution can be increasing, unimodal, or decreasing.
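As a rough illustration of how such curves could be computed, the sketch below applies the standard Marshall-Olkin construction to the Pareto type-I survival function. The symbol names `alpha` (parameter added by the MO-F family), `beta` (PTI shape) and `theta` (PTI scale), and the exact parametrization, are assumptions made for this example, since the explicit formulas are not reproduced in the text above.

```python
import numpy as np

def mopti_sf(x, alpha, beta, theta):
    """Assumed survival function: alpha*s / (1 - (1 - alpha)*s), s = (theta/x)**beta."""
    x = np.asarray(x, dtype=float)
    s = (theta / np.maximum(x, theta)) ** beta  # equals 1 below the support
    return alpha * s / (1.0 - (1.0 - alpha) * s)

def mopti_cdf(x, alpha, beta, theta):
    """Assumed CDF, equal to 0 for x <= theta."""
    return 1.0 - mopti_sf(x, alpha, beta, theta)

def mopti_pdf(x, alpha, beta, theta):
    """Assumed PDF, obtained by differentiating the CDF above."""
    x = np.asarray(x, dtype=float)
    xs = np.maximum(x, theta)
    s = (theta / xs) ** beta
    dens = alpha * beta * theta**beta * xs ** (-beta - 1.0) / (1.0 - (1.0 - alpha) * s) ** 2
    return np.where(x >= theta, dens, 0.0)

def mopti_hrf(x, alpha, beta, theta):
    """Assumed hazard rate: pdf / survival, simplified algebraically."""
    x = np.asarray(x, dtype=float)
    xs = np.maximum(x, theta)
    haz = beta / (xs * (1.0 - (1.0 - alpha) * (theta / xs) ** beta))
    return np.where(x >= theta, haz, 0.0)
```

Under this assumed form, setting `alpha = 1` recovers the ordinary Pareto type-I distribution, which gives a quick sanity check on any implementation.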

Properties
In this section we derive some statistical attributes of the MOPTI distribution.

Quantile Function
In heavy-tailed distributions, measures of skewness and kurtosis based on quantiles are preferable to those based on moments, because the higher moments may not exist. The $p$th quantile, say $x_p$, of the MOPTI model is obtained by inverting the CDF. The Bowley skewness (Bsk) measure (Kenney and Keeping, 1962) and the Moors' kurtosis (Mkur) measure (Moors, 1988) are both based on quantiles. Table 1 gives the values of the quantiles, Bsk and Mkur of the MOPTI distribution for some choices of the parameters. It is noted that the skewness and kurtosis decrease as one of the varied parameters increases.
Random variates can be generated by replacing $x_p$ and $p$ in Equation (8) with $x_u$ and $u$, respectively, where $u$ is a uniform(0,1) random variable, or a vector of such variables, and $x_u$ is the generated value. If $u$ is a vector of uniform(0,1) draws, then $x_u$ is a vector of MOPTI variates of the same length.
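The inverse-transform idea described above can be sketched as follows. The closed-form quantile used here comes from inverting an assumed MOPTI CDF (standard Marshall-Olkin construction on the Pareto type-I model, with hypothetical parameter names `alpha`, `beta`, `theta`); the Bowley and Moors measures use their usual quartile and octile definitions.

```python
import numpy as np

def mopti_quantile(p, alpha, beta, theta):
    """Assumed quantile function: theta * ((1 - (1 - alpha) p) / (1 - p))**(1/beta)."""
    p = np.asarray(p, dtype=float)
    return theta * ((1.0 - (1.0 - alpha) * p) / (1.0 - p)) ** (1.0 / beta)

def mopti_rvs(size, alpha, beta, theta, rng=None):
    """Inverse-transform sampling: plug uniform(0,1) draws into the quantile function."""
    rng = np.random.default_rng() if rng is None else rng
    return mopti_quantile(rng.uniform(size=size), alpha, beta, theta)

def bowley_skewness(q):
    """Bowley skewness from a quantile function q(p), based on quartiles."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))

def moors_kurtosis(q):
    """Moors kurtosis from a quantile function q(p), based on octiles."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

# Example usage with illustrative (not fitted) parameter values.
q = lambda p: mopti_quantile(p, alpha=0.5, beta=2.0, theta=1.0)
sample = mopti_rvs(5, alpha=0.5, beta=2.0, theta=1.0, rng=np.random.default_rng(0))
```

Because both measures depend only on quantiles, they remain defined even when the moments of the distribution do not exist.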

Moments
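The $r$th moment of the MOPTI distribution is derived by expanding the denominator of the PDF $f(x)$ given in (6) as a power series. The expansion takes one form when the parameter added by the MO-F family lies in (0, 1) and another when it exceeds 1, so the moment is expressed by a separate series in each case. In the first case, the resulting expression holds for $r$ smaller than the PTI shape parameter.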
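When a series expression is inconvenient, a moment can also be checked numerically. The sketch below integrates the $r$th power of an assumed MOPTI quantile function over (0, 1), which equals the $r$th moment whenever it exists. The parametrization and the names `alpha`, `beta`, `theta` are assumptions for illustration, not the paper's notation.

```python
from scipy.integrate import quad

def mopti_quantile(p, alpha, beta, theta):
    """Assumed quantile function of the MOPTI model (MO construction on Pareto type-I)."""
    return theta * ((1.0 - (1.0 - alpha) * p) / (1.0 - p)) ** (1.0 / beta)

def mopti_moment(r, alpha, beta, theta):
    """r-th raw moment via E[X^r] = integral over (0,1) of Q(p)^r dp; needs r < beta."""
    if r >= beta:
        raise ValueError("the r-th moment is infinite unless r < beta")
    value, _abs_err = quad(lambda p: mopti_quantile(p, alpha, beta, theta) ** r, 0.0, 1.0)
    return value

# With alpha = 1 the model is plain Pareto, whose mean is beta*theta/(beta - 1).
mean_pareto = mopti_moment(1, alpha=1.0, beta=3.0, theta=1.0)
```

The integrand has an integrable singularity at p = 1, so accuracy can degrade when `r` is close to `beta`.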

Strength-Stress Measure of System Reliability
In the mechanical reliability of a system, the relation between the strength and the stress of a device can be measured using statistical methods. If $X$ is an RV representing the strength of a device that is subject to a stress $Y$, where $Y$ is another RV independent of $X$, then the probability $R = P(X \ge Y)$ is called the strength-stress measure in the statistical literature, and it is considered a measure of system reliability. The system fails when the stress exceeds the strength, so $R$ determines the probability of system failure. The measure $R$ has been studied when $X$ and $Y$ belong to different families (see Church and Harris (1970), Kotz et al. (2003) and Ali et al. (2010)).
More specifically, the case in which $X$ and $Y$ follow generalized inverted exponential distributions is studied by Abouammoh and Alshingiti (2009). Inference on $R$ for the logistic distribution is given in Babayi et al. (2014). Inference on $R$ for the Poisson distribution is given in Barbiero (2013). Estimation of $R$ for the generalized exponential and Weibull distributions is given in Kundu and Gupta (2005). Now we study the estimation of the measure $R$ when $X$ and $Y$ both follow MOPTI distributions, without any constraints on the parameters. The estimation of $R$ when $X$ and $Y$ follow Pareto type-I distributions with four parameters is then the special case in which both parameters added by the MO-F family equal 1.
Using (5) and (6), we obtain an integral expression for $R$. Now we introduce a new theorem to evaluate $R$. The theorem has four formulae, according to the values of the parameters added by the MO-F family for $X$ and $Y$, when both $X$ and $Y$ are MOPTI distributed, without any constraints on the parameters except that they are all positive real numbers.

Remark 1:
The values of the strength-stress measure given by Equation (13) can be calculated numerically using the integrate() function of the R software. Like all numerical integration routines, integrate() returns approximate values and can fail if misused. The exact values of the measure obtained from the above new theorem are given in Table 3. If the strength and the stress are two independent RVs having PTI distributions, the measure can be obtained for all cases of Theorem 1 by setting both parameters added by the MO-F family equal to 1. In each case we then obtain only the first term, which is expressed through the function $h(0, 0)$ of the theorem. We provide the exact values of the measure using (19) in Corollary 1 and the corresponding approximate values obtained by evaluating (13) with integrate(). These values are reported in Table 4. It is noted that the approximate values are very close to the exact values.
The expression obtained in this special case is the same result given in [27]. The exact values of the measure using (20) in Corollary 2 and the corresponding approximate values obtained by evaluating (13) with integrate() are given in Table 5.
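The numerical evaluation mentioned in Remark 1 can be mimicked outside R. The sketch below uses `scipy.integrate.quad` in place of R's integrate() and computes the probability that the strength is at least the stress as the expectation of the stress CDF evaluated at the strength. It assumes the direction P(strength >= stress), the standard Marshall-Olkin form of the MOPTI CDF, and hypothetical parameter names `alpha`, `beta`, `theta`; none of these are confirmed by the text above.

```python
from scipy.integrate import quad

def mopti_cdf(x, alpha, beta, theta):
    """Assumed MOPTI CDF for a scalar x; zero below the scale parameter theta."""
    if x <= theta:
        return 0.0
    s = (theta / x) ** beta
    return (1.0 - s) / (1.0 - (1.0 - alpha) * s)

def mopti_quantile(p, alpha, beta, theta):
    """Assumed MOPTI quantile function (inverse of the CDF above)."""
    return theta * ((1.0 - (1.0 - alpha) * p) / (1.0 - p)) ** (1.0 / beta)

def strength_stress(strength, stress):
    """P(X >= Y) for independent X ~ strength, Y ~ stress, each (alpha, beta, theta).

    Uses P(X >= Y) = E[G_Y(X)] = integral over (0,1) of G_Y(Q_X(u)) du,
    which keeps the integrand bounded on a finite interval.
    """
    integrand = lambda u: mopti_cdf(mopti_quantile(u, *strength), *stress)
    value, _abs_err = quad(integrand, 0.0, 1.0)
    return value

# Pareto special case (both alpha = 1, common theta): the value is beta_Y / (beta_X + beta_Y).
r_pareto = strength_stress((1.0, 2.0, 1.0), (1.0, 3.0, 1.0))
```

Comparing such numerical values with a closed-form special case, as done here for the Pareto case, is one way to cross-check exact formulae against approximate ones.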

Mean Residual Life (MRL)
In life testing, the expected remaining lifetime, given that a unit has survived until time $t$, is a function of $t$ called the MRL (see Bryson and Siddiqui (1969), Muth (1977), Hollander and Proschan (1975), Greenwood et al. (1979), Bradley and Gupta (2003), and Chaubey and Sen (1999)). Specifically, if the RV $X$ represents the life of a unit, then the MRL takes the form $\mathrm{MRL}(t) = E(X - t \mid X > t)$ (Balkema and de Haan, 1974).
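A minimal numerical sketch of this definition is given below. It uses the identity that the MRL equals the integral of the survival function beyond $t$ divided by the survival function at $t$, together with a simple empirical counterpart computed from data. The MOPTI survival function and the names `alpha`, `beta`, `theta` are assumptions (standard Marshall-Olkin construction on the Pareto type-I model), and the MRL is finite only when `beta > 1` under that assumption.

```python
import numpy as np
from scipy.integrate import quad

def mopti_sf(x, alpha, beta, theta):
    """Assumed MOPTI survival function for scalar x >= theta."""
    s = (theta / x) ** beta
    return alpha * s / (1.0 - (1.0 - alpha) * s)

def mopti_mrl(t, alpha, beta, theta):
    """MRL(t) = (integral of S(x) over x > t) / S(t), for t >= theta and beta > 1."""
    if beta <= 1.0:
        raise ValueError("the mean residual life is infinite unless beta > 1")
    if t < theta:
        raise ValueError("t must lie in the support, t >= theta")
    area, _abs_err = quad(mopti_sf, t, np.inf, args=(alpha, beta, theta))
    return area / mopti_sf(t, alpha, beta, theta)

def empirical_mrl(x, t):
    """Average excess over t among observations exceeding t (nan if none exceed t)."""
    x = np.asarray(x, dtype=float)
    tail = x[x > t]
    return float(np.mean(tail - t)) if tail.size else float("nan")

# Pareto check (alpha = 1): MRL(t) = t / (beta - 1).
mrl_pareto = mopti_mrl(2.0, alpha=1.0, beta=3.0, theta=1.0)
```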

Methods of Estimation
In this section, we estimate the three parameters of the MOPTI distribution from complete samples using five methods of estimation: maximum likelihood (ML), least-squares (LS), weighted least-squares (WLS), percentile (PC), and maximum product of spacings (MPS).

Maximum Likelihood Estimation
Let $x = (x_1, \dots, x_n)^T$ be a random sample of size $n$ from the PDF (6) with unknown parameter vector $\Theta$, and let $\ell$ denote the log-likelihood function. Setting the partial derivatives of $\ell$ with respect to the three parameters to zero and solving the resulting equations simultaneously yields the ML estimates (MLEs), say $\hat{\Theta}$. These equations cannot be solved analytically; hence we used the R software to solve them numerically.
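The paper solves the likelihood equations numerically in R. A comparable sketch in Python is shown below: it minimizes the negative log-likelihood directly with a derivative-free optimizer instead of solving the score equations. The density form and the names `alpha`, `beta`, `theta` are assumptions (standard Marshall-Olkin construction on the Pareto type-I model), and because `theta` bounds the support the problem is not regular, so this should be read as an illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def mopti_negloglik(params, x):
    """Negative log-likelihood under the assumed MOPTI density."""
    alpha, beta, theta = params
    if min(alpha, beta, theta) <= 0.0 or theta > x.min():
        return np.inf  # outside the parameter space or the support
    s = (theta / x) ** beta
    loglik = np.sum(
        np.log(alpha) + np.log(beta) + beta * np.log(theta)
        - (beta + 1.0) * np.log(x)
        - 2.0 * np.log1p(-(1.0 - alpha) * s)
    )
    return -loglik

def fit_mopti(x, start):
    """Minimize the negative log-likelihood over log-parameters with Nelder-Mead."""
    x = np.asarray(x, dtype=float)
    objective = lambda u: mopti_negloglik(np.exp(u), x)
    result = minimize(objective, np.log(np.asarray(start, dtype=float)), method="Nelder-Mead")
    return np.exp(result.x), result.fun
```

Optimizing on the log scale keeps all three parameters positive without explicit bounds; the starting value for `theta` should not exceed the sample minimum.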

Least-Squares and Weighted Least-Squares
We use the LS method to estimate the three MOPTI parameters. The LS estimates (LSEs) are obtained by minimizing, with respect to the parameters, a sum of squared differences based on $F(x_{(i)})$, where $F(x)$ is the MOPTI CDF in (5) and $x_{(i)}$ is the $i$th order statistic of the sample from the MOPTI model.
The WLS estimates (WLSEs) of the MOPTI parameters are obtained by minimizing a weighted version of the same quantity with respect to the parameters, where $F(x)$ is as given by (5). Charnes et al. [3] showed that the WLSEs are identical to those obtained using the ML method in the exponential family under mild conditions. In this paper, the simulation results show that the LSEs and WLSEs of the MOPTI parameters have smaller MSEs than the MLEs in most cases.
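Since the objective functions are not displayed above, the sketch below uses the forms commonly adopted for LS and WLS estimation: squared differences between the fitted CDF at the order statistics and the plotting positions i/(n+1), with weights (n+1)^2 (n+2) / (i (n-i+1)) in the weighted case. Treat these forms, the assumed MOPTI CDF, and the names `alpha`, `beta`, `theta` as assumptions rather than the paper's exact definitions.

```python
import numpy as np

def mopti_cdf(x, alpha, beta, theta):
    """Assumed MOPTI CDF (vectorized); zero for x <= theta."""
    s = np.minimum(theta / np.asarray(x, dtype=float), 1.0) ** beta
    return (1.0 - s) / (1.0 - (1.0 - alpha) * s)

def ls_objective(params, x, weighted=False):
    """LS (or WLS) criterion comparing F(x_(i)) with i/(n+1)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    resid = mopti_cdf(x, *params) - i / (n + 1)
    # Weights are the reciprocals of Var[F(X_(i))] for uniform order statistics.
    w = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) if weighted else 1.0
    return float(np.sum(w * resid**2))
```

Either criterion can be passed to a general-purpose minimizer; the weighted version gives more influence to observations in the tails, where the variance of the plotting positions is smallest.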

Percentile Estimation
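This method was originally introduced by Kao (1985) and Kao (1959). The PC estimates (PCEs) of the MOPTI parameters are determined by minimizing, with respect to the parameters, a sum of squared differences between the order statistics and the corresponding fitted quantiles, where $x_{(i)}$ is the $i$th order statistic of the sample from the MOPTI distribution.
The maximum product of spacings (MPS) estimates (MPSEs) of the MOPTI parameters are determined by maximizing the logarithm of the product of spacings with respect to the parameters.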
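The two criteria can be sketched as follows, using the forms most often seen in the literature: the PC criterion sums squared differences between the order statistics and the model quantiles at i/(n+1), and the MPS criterion averages the log spacings of the fitted CDF at consecutive order statistics. These forms, the assumed MOPTI CDF and quantile function, and the names `alpha`, `beta`, `theta` are assumptions, since the displayed formulas are not reproduced above.

```python
import numpy as np

def mopti_cdf(x, alpha, beta, theta):
    """Assumed MOPTI CDF (vectorized); zero for x <= theta."""
    s = np.minimum(theta / np.asarray(x, dtype=float), 1.0) ** beta
    return (1.0 - s) / (1.0 - (1.0 - alpha) * s)

def mopti_quantile(p, alpha, beta, theta):
    """Assumed MOPTI quantile function."""
    p = np.asarray(p, dtype=float)
    return theta * ((1.0 - (1.0 - alpha) * p) / (1.0 - p)) ** (1.0 / beta)

def pc_objective(params, x):
    """Percentile criterion: sum of (x_(i) - Q(i/(n+1)))^2, to be minimized."""
    x = np.sort(np.asarray(x, dtype=float))
    p = np.arange(1, x.size + 1) / (x.size + 1)
    return float(np.sum((x - mopti_quantile(p, *params)) ** 2))

def mps_objective(params, x):
    """Mean log spacing of the fitted CDF, to be maximized."""
    x = np.sort(np.asarray(x, dtype=float))
    cdf = np.concatenate(([0.0], mopti_cdf(x, *params), [1.0]))
    spacings = np.diff(cdf)  # n + 1 spacings, including both end gaps
    return float(np.mean(np.log(spacings)))
```

Tied observations produce zero spacings and a log of zero, so real data with ties would need special handling that this sketch omits.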

Simulation Results
In this subsection, a Monte Carlo simulation study is provided to explore the performance of the different estimation methods. We apply the above estimation methods to estimate the three parameters of the MOPTI distribution using the R software with $N = 10000$ replications. The results are obtained from samples of size $n = (20, 200)$, with the three parameters taking the values (1, 2), (0.5, 1.5) and (0.5, 1.5), respectively. The average values of the MLEs, LSEs, WLSEs, PCEs and MPSEs, as well as their mean square errors (MSEs), are calculated and reported in Tables 6 and 7. The results in these tables illustrate that the MSEs decrease as the sample size increases, which indicates the consistency of all estimates. The results also show that the LS and WLS methods provide the best estimates of the three parameters, in terms of their MSEs, in most cases. Accordingly, the performance ordering of the estimates from best to worst, in terms of their MSEs, is LSEs, WLSEs, MLEs, PCEs and MPSEs.

Estimation under Type-I and Type-II Censoring
In this section, we use type-I and type-II censoring to estimate the three parameters of the MOPTI distribution using the ML method.

The ML under Censoring Type-I
In type-I censoring, unit $i$ is observed for a fixed time $T_i$, $i = 1, 2, \dots, n$, where $n$ is the sample size. The number of failures in the sample is random and is defined by $r = \sum_{i=1}^{n} \delta_i$, where $\delta_i$ is the death indicator, such that $\delta_i = 1$ if unit $i$ dies and $\delta_i = 0$ otherwise. Hence, the likelihood function takes the form $L = \prod_{i=1}^{n} [h(t_i)]^{\delta_i} S(t_i)$, where $\delta_i = 1$ if $t_i \le T_i$, $\delta_i = 0$ if $t_i > T_i$, and $S(\cdot)$ and $h(\cdot)$ are given by (7) and (8).
Then, the log-likelihood function is $\ell = \sum_{i=1}^{n} \delta_i \ln h(t_i) + \sum_{i=1}^{n} \ln S(t_i)$. If the fixed observation times for all units are equal to a constant value $T$, the log-likelihood function reduces to the same expression with $t_i \le T$ for all $i$. Setting the partial derivatives of the log-likelihood function with respect to the three parameters to zero and solving them simultaneously yields the MLEs $\hat{\Theta}$. These equations cannot be solved analytically; hence the R statistical package is used to solve them numerically.
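For a common censoring time, the type-I censored log-likelihood can be sketched as below. It is written in the equivalent density/survival form: observed failures contribute the log density and censored units contribute the log survival function at the censoring time. The MOPTI density and survival function, and the names `alpha`, `beta`, `theta`, are assumptions (standard Marshall-Olkin construction on the Pareto type-I model).

```python
import numpy as np

def log_sf(t, alpha, beta, theta):
    """Log of the assumed MOPTI survival function for t >= theta."""
    s = (theta / np.asarray(t, dtype=float)) ** beta
    return np.log(alpha) + np.log(s) - np.log1p(-(1.0 - alpha) * s)

def log_pdf(t, alpha, beta, theta):
    """Log of the assumed MOPTI density for t >= theta."""
    t = np.asarray(t, dtype=float)
    s = (theta / t) ** beta
    return (np.log(alpha) + np.log(beta) + beta * np.log(theta)
            - (beta + 1.0) * np.log(t) - 2.0 * np.log1p(-(1.0 - alpha) * s))

def loglik_type1(params, lifetimes, T):
    """Type-I censored log-likelihood with a common censoring time T."""
    x = np.asarray(lifetimes, dtype=float)
    delta = x <= T                # True where the failure is observed
    t = np.minimum(x, T)          # recorded time: failure time or T
    contrib = np.where(delta, log_pdf(t, *params), log_sf(t, *params))
    return float(np.sum(contrib))
```

Passing the negative of this function to a numerical optimizer gives censored-sample estimates; with `T` set to infinity it reduces to the complete-sample log-likelihood.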

The ML under Censoring Type-II
In type-II censoring, the sample is followed until $r$ units have failed. The number of failures $r$ determines the precision of the study and is fixed in advance. Let $x_{(1)}, x_{(2)}, \dots, x_{(n)}$ be the sample order statistics. The likelihood function under type-II censoring then combines the density at the $r$ observed failures with the survival function at the last observed failure, and the log-likelihood function takes the form $\ell = \sum_{i=1}^{n} \delta_i \ln f(x_{(i)}) + (n - r) \ln S(x_{(r)})$, where $\delta_i = 1$ for $i \le r$ and $\delta_i = 0$ for $i > r$.
Setting the partial derivatives of the above function with respect to the three parameters to zero and solving them simultaneously yields the MLEs. These equations cannot be solved analytically; hence the R software can be used to solve them numerically.
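The type-II log-likelihood above can be evaluated directly, as in the following sketch: the first `r` order statistics contribute the log density and the remaining `n - r` units contribute the log survival function at the `r`th order statistic. The MOPTI density and survival function, and the names `alpha`, `beta`, `theta`, are assumptions for illustration.

```python
import numpy as np

def log_sf(t, alpha, beta, theta):
    """Log of the assumed MOPTI survival function for t >= theta."""
    s = (theta / np.asarray(t, dtype=float)) ** beta
    return np.log(alpha) + np.log(s) - np.log1p(-(1.0 - alpha) * s)

def log_pdf(t, alpha, beta, theta):
    """Log of the assumed MOPTI density for t >= theta."""
    t = np.asarray(t, dtype=float)
    s = (theta / t) ** beta
    return (np.log(alpha) + np.log(beta) + beta * np.log(theta)
            - (beta + 1.0) * np.log(t) - 2.0 * np.log1p(-(1.0 - alpha) * s))

def loglik_type2(params, lifetimes, r):
    """Type-II censored log-likelihood: follow the sample until r failures occur."""
    x = np.sort(np.asarray(lifetimes, dtype=float))
    n = x.size
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= n")
    observed = np.sum(log_pdf(x[:r], *params))          # first r order statistics
    censored = (n - r) * log_sf(x[r - 1], *params)      # survivors at x_(r)
    return float(observed + censored)
```

The combinatorial constant of the order-statistics likelihood is omitted because it does not depend on the parameters.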

Simulation Results for Censored Samples
In this section, a Monte Carlo simulation study is employed to explore the performance of proposed methods.
We generate censored data from the MOPTI distribution under Type-I and Type-II censoring schemes for different choices of the sample size $n$, where $n$ = 20, 50, 100, 200, with 1000 replications. The initial values of the three MOPTI parameters are 0.5, 1.5 and 0.01.
The average estimate (AVE), mean square error (MSE), lower (L) and upper (U) limits of the asymptotic (ASY) confidence intervals (CIs), and the average interval length (AIL) with coverage percentage (CP) are reported for the Type-I and Type-II censoring schemes in Tables 8 and 9, respectively.
The MLE results for any sample size $n$ under complete sampling can be read from the corresponding rows of these tables. For example, the results for $n = 200$ under complete sampling are given in the last three rows.
Now, we apply the methods studied in the previous sections to a real-life data set. We first check whether the MOPTI distribution is suitable for analyzing this data set. The MLEs of the parameters and the value of the Kolmogorov-Smirnov (K-S) test statistic are reported to judge the goodness of fit. The MOPTI distribution is then fitted to the data under the considered censoring schemes. The Akaike information criterion (AIC), the negative log-likelihood criterion (NLC) and the Bayesian information criterion (BIC) are calculated to compare these methods. The MLEs of the MOPTI parameters, along with their standard errors (SEs), AIC, BIC and NLC, are listed in Tables 10 and 11 for the type-I and type-II censoring schemes, respectively. It is shown that the MOPTI distribution provides a better fit under the two types of censoring.
Furthermore, the MRL and the empirical MRL (EMRL) are calculated from the real-life data for the first and the last 10 values of $t$, with $n = 121$ and parameter values 0.2914, 1.8541 and 8659.4946. The results are reported in Table 12. Table 12 shows that the values of the MRL and EMRL are approximately the same for small values of $t$, and that both decrease as the survival time $t$ increases.
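The model-comparison quantities named above follow from a fitted negative log-likelihood by standard formulas, and the K-S distance is the largest gap between the empirical and fitted CDFs. The sketch below shows both computations; the function names and the numeric inputs in the usage lines are hypothetical, chosen only for illustration.

```python
import numpy as np

def information_criteria(nll, k, n):
    """NLC, AIC and BIC from a negative log-likelihood nll, k parameters, n observations."""
    return {"NLC": nll, "AIC": 2.0 * k + 2.0 * nll, "BIC": k * np.log(n) + 2.0 * nll}

def ks_distance(x, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF of x and a fitted cdf."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    fitted = cdf(x)
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - fitted)         # empirical CDF above the fitted CDF
    d_minus = np.max(fitted - (i - 1) / n)  # fitted CDF above the empirical CDF
    return float(max(d_plus, d_minus))

# Hypothetical inputs: a three-parameter model, 121 observations, nll = 100.
criteria = information_criteria(nll=100.0, k=3, n=121)
```

Smaller AIC, BIC and NLC values indicate a better fit among the candidates being compared on the same data.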

Conclusion
In this paper, we introduced a three-parameter extended Marshall-Olkin Pareto type-I (MOPTI) distribution. Structural properties including moments, the quantile function, the mean residual life, and a new theorem for the strength-stress measure are provided. The MOPTI parameters are estimated using five methods of estimation based on complete sampling. Furthermore, the MOPTI parameters are also estimated by maximum likelihood under censoring schemes of types I and II. A real data set on breast cancer is analyzed to validate our results and to illustrate the usefulness of the new MOPTI distribution in applications.
The work in this paper can be extended in several ways. For example, entropy estimation based on the MOPTI distribution can be studied following the works of Zamanzade and Mahdizadeh (2016 and 2017).
Other models based on the MO family include the MO extended generalized Rayleigh by MirMostafaee et al. (2017), the MO power generalized Weibull by Afify et al. (2020), the MO alpha power class by Nassar et al. (2019), the MO Burr family by Al-Babtain et al. (2021) and the Marshall-Olkin-Weibull-H family by Afify et al. (2022), among many others.
The CDF and PDF of the MOPTI distribution are defined for positive values of the three parameters and for $x$ no smaller than the PTI scale parameter.

Figure 1: Density plots for various values of the MOPTI parameters.

Figure 2: Failure rate plots for various values of the MOPTI parameters.
Entropy estimation can also draw on the works of Mahdizadeh and Zamanzade (2017 and 2019). Bayesian inference for the MOPTI distribution based on lower k-record values can be developed following the work of MirMostafaee et al. (2016). Other extended versions of the PTI distribution can be established following the works of Mead (2014), Fatima and Roohi (2015), and Tamandi et al. (2019).

Table 1: Quantiles, Bsk and Mkur of the MOPTI distribution with one parameter fixed at 0.2 and different values of the other two parameters.

Table 6: Numerical results of the estimates and their MSEs for $n = 20$.

Table 7: Numerical results of the estimates and their MSEs for $n = 200$.

Table 8: Average estimated values of the MLE, MSE and associated CI estimates, AILs and CPs (in %) of the MOPTI distribution under the Type-I censoring scheme for different values of $n$.

Table 9: Average estimated values of the MLE, MSE and associated CI estimates, AILs and CPs (in %) of the MOPTI distribution under the Type-II censoring scheme for different values of $n$.

The calculated K-S distance between the empirical and the fitted MOPTI distribution is 0.1061 and its p-value is 0.1310, where the parameter estimates are 6859.49, 1.8541 and 0.2914. This shows that the MOPTI distribution can be considered an adequate model for the given data. Now, one can suppose the following schemes for Type-I and Type-II censoring, with censoring fractions (30%, 60%, 90%, 100%) and the corresponding numbers of failures, as follows:

Table 10: The MLEs, SEs, AIC, BIC and NLC of the MOPTI distribution using real data under Type-I censoring.

Table 11: The MLEs, SEs, AIC, BIC and NLC of the MOPTI distribution using real data under Type-II censoring.

Table 12: The values of the MRL and EMRL for the real data set.