On Truncated Zeghdoudi Distribution: Posterior Analysis under Different Loss Functions for Type II Censored Data

We perform a Bayesian analysis of the upper truncated Zeghdoudi distribution based on type II censored data. Using various loss functions, including the generalized quadratic, entropy and Linex loss functions, we obtain Bayes estimators and the corresponding posterior risks. As tractable analytical forms of these estimators are out of reach, we propose a Markov chain Monte Carlo (MCMC) simulation approach to study their performance. Moreover, given initial values for the parameters of the model, we obtain the maximum likelihood estimators. Furthermore, we compare their performance with that of the Bayesian estimators using Pitman's closeness criterion and the integrated mean square error. Finally, we illustrate our approach through an example with real data.

Keywords: Truncated Zeghdoudi distribution, Bayes estimators, generalized quadratic loss function, Linex loss function, posterior risk, Metropolis-Hastings algorithm, Pitman closeness criterion.


Introduction
Truncation is the process of excluding all values that lie outside predetermined bounds in a statistical experiment. The data points remaining inside these bounds are called truncated data. A random variable X is upper (lower) truncated at a given threshold level c if only the outcomes of X for which X ≤ c (X > c) are considered, i.e. we omit all values of X for which X > c (X ≤ c). For example, age is lower truncated, since it starts at zero (c = 0). It is widely used in modelling lifetime data, including in the series of papers by Zeghdoudi and Nedjar (2016a, 2016b, 2016c), where it is shown that it fits a large class of real data sets well. Messaadia and Zeghdoudi (2018) suggested a one-parameter exponential family of distributions based on a mixture of the gamma(2, θ) and gamma(3, θ) distributions, known as the Zeghdoudi distribution, and showed that it fits lifetime data sets well where the Lindley distribution gives a poor fit.
In this paper, we propose a Bayesian analysis of the new upper truncated Zeghdoudi distribution. We first introduce the upper truncated Zeghdoudi (UTZ) distribution, which depends on two parameters. Then we derive the maximum likelihood (ML) estimators of these parameters for type II censored data. Furthermore, we derive the Bayesian estimators of these parameters under the generalized quadratic (GQ), the entropy and the Linex loss functions. We perform a simulation experiment to study the behaviour of the proposed estimators and compare them with the ML estimators using Pitman's closeness criterion. Finally, we compute the integrated mean square error (IMSE) for the three Bayesian estimators.
The rest of the paper is organized as follows. In section 2, we derive the truncated version of the Zeghdoudi distribution. The estimation of the parameters is displayed in section 3. Monte-Carlo simulation results are presented in section 4. In section 5, to illustrate the obtained results, we present an example based on real data. Finally, we conclude the paper in section 6.

The upper truncated Zeghdoudi distribution
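The probability density function of the Zeghdoudi distribution (see Messaadia and Zeghdoudi (2018)) is

$$f(x;\theta) = \frac{\theta^{3}\, x\,(1+x)\, e^{-\theta x}}{\theta+2}, \qquad x > 0,\ \theta > 0, \tag{1}$$

and its cumulative function is

$$F(x;\theta) = 1 - \left(1 + \theta x + \frac{\theta^{2}x^{2}}{\theta+2}\right) e^{-\theta x}. \tag{2}$$

The probability density function of the upper truncated Zeghdoudi distribution with the parameter β > 0 is given by

$$f_{UTZ}(x;\theta,\beta) = \frac{f(x;\theta)}{F(\beta;\theta)}, \qquad 0 < x \le \beta,$$

which, in view of (1) and (2), is explicitly given by the formula

$$f_{UTZ}(x;\theta,\beta) = \frac{\theta^{3}\, x\,(1+x)\, e^{-\theta x}}{(\theta+2) - \left[(\theta+2)(1+\theta\beta) + \theta^{2}\beta^{2}\right] e^{-\theta\beta}}, \qquad 0 < x \le \beta.$$

Thus, the cumulative function is

$$F_{UTZ}(x;\theta,\beta) = \frac{F(x;\theta)}{F(\beta;\theta)}, \qquad 0 < x \le \beta.$$

Estimation of the unknown parameters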

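Let x_(1) ≤ ... ≤ x_(m) denote the m smallest observations in a sample of size n from the UTZ distribution (type II censoring), and let f_UTZ and F_UTZ denote the density and the cumulative function of the UTZ distribution. By setting A = n!/(n − m)!, the likelihood function reads

$$L(\theta,\beta) = A \prod_{i=1}^{m} f_{UTZ}(x_{(i)};\theta,\beta)\, \left[1 - F_{UTZ}(x_{(m)};\theta,\beta)\right]^{n-m}.$$

The corresponding logarithm is

$$\ell(\theta,\beta) = \ln A + \sum_{i=1}^{m} \ln f_{UTZ}(x_{(i)};\theta,\beta) + (n-m)\,\ln\left[1 - F_{UTZ}(x_{(m)};\theta,\beta)\right].$$

The solution of the following non-linear system yields the maximum likelihood estimators θ_MLE and β_MLE of the parameters θ and β, respectively:

$$\frac{\partial \ell(\theta,\beta)}{\partial \theta} = 0, \qquad \frac{\partial \ell(\theta,\beta)}{\partial \beta} = 0. \tag{7}$$

The solution of the system (7) seems analytically intractable, so we rely on numerical methods to obtain approximate solutions. We use the R package BB to obtain approximate values of the maximum likelihood estimators θ_MLE and β_MLE. The R package BB has been used successfully for solving non-linear systems of equations; see Varadhan and Gilbert (2010).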
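
The numerical maximization can be illustrated with a minimal sketch in Python. This is not the authors' R code and it does not use the BB package: it replaces the root-finding step with a plain grid search over the log-likelihood, which is slower but easy to check. The density and cumulative function are those of the gamma(2, θ) and gamma(3, θ) mixture with weights θ/(θ + 2) and 2/(θ + 2); the sample `x`, the sample size `n` and the grid ranges are hypothetical values chosen only for illustration.

```python
import math

def zeghdoudi_log_pdf(x, theta):
    """Log-density of the Zeghdoudi distribution at x > 0."""
    return (3.0 * math.log(theta) + math.log(x) + math.log1p(x)
            - theta * x - math.log(theta + 2.0))

def zeghdoudi_cdf(x, theta):
    """Cumulative function of the Zeghdoudi distribution."""
    tx = theta * x
    return 1.0 - (1.0 + tx + tx * tx / (theta + 2.0)) * math.exp(-tx)

def log_likelihood(theta, beta, x, n):
    """Type II censored log-likelihood of the upper truncated distribution.

    x holds the m smallest observations in increasing order, n is the full
    sample size. Returns -inf outside the region theta > 0, beta > x[-1].
    """
    m = len(x)
    if theta <= 0.0 or beta <= x[-1]:
        return -math.inf
    f_beta = zeghdoudi_cdf(beta, theta)
    tail = f_beta - zeghdoudi_cdf(x[-1], theta)  # F(beta) - F(x_(m))
    if tail <= 0.0:
        return -math.inf
    log_a = math.lgamma(n + 1) - math.lgamma(n - m + 1)  # ln(n!/(n-m)!)
    value = log_a + sum(zeghdoudi_log_pdf(xi, theta) for xi in x)
    value -= n * math.log(f_beta)
    value += (n - m) * math.log(tail)
    return value

def grid_mle(x, n, theta_grid, beta_grid):
    """Return (log-likelihood, theta, beta) at the best grid point."""
    best = (-math.inf, None, None)
    for t in theta_grid:
        for b in beta_grid:
            value = log_likelihood(t, b, x, n)
            if value > best[0]:
                best = (value, t, b)
    return best

# Hypothetical censored sample: m = 10 observed out of n = 12.
x = [0.11, 0.23, 0.35, 0.42, 0.58, 0.66, 0.79, 0.91, 1.02, 1.10]
n = 12
theta_grid = [0.05 * k for k in range(1, 101)]
beta_grid = [1.11 + 0.01 * k for k in range(200)]
ll_hat, theta_hat, beta_hat = grid_mle(x, n, theta_grid, beta_grid)
print(theta_hat, beta_hat, ll_hat)
```

The grid point returned here could serve as a starting value for a proper non-linear solver; the resolution of the estimate is limited by the grid spacing.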

Bayesian Estimation under different loss functions
In the Bayesian approach, unlike in the classical approach, the unknown parameters are treated as random variables rather than fixed constants. This allows prior information about the parameters to be expressed through prior distributions.

(i) Prior and posterior distributions
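We assume here that the parameters θ and β of the UTZ distribution have independent gamma prior distributions,

$$\theta \sim \mathrm{Gamma}(a, b), \qquad \beta \sim \mathrm{Gamma}(c, d),$$

where the constants a, b, c, d are called hyper-parameters. There is no objective motivation for choosing the gamma family for the priors, other than its flexibility, its tractability and the fact that it is the natural conjugate prior for the exponential distribution. Other prior distributions may well be used. Writing π_1 and π_2 for the prior densities of θ and β, the joint prior distribution of (θ, β) is then

$$\pi(\theta,\beta) = \pi_{1}(\theta)\,\pi_{2}(\beta).$$

The joint posterior distribution of (θ, β) reads

$$\pi(\theta,\beta \mid x) = K\, L(\theta,\beta)\,\pi(\theta,\beta),$$

where L is the likelihood function and K is a normalizing constant.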
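
Since the normalizing constant K is not needed by the Metropolis-Hastings algorithm, posterior draws can be obtained from the unnormalized posterior alone. The following is a minimal sketch in Python, not the authors' implementation. It assumes a Gaussian random-walk proposal, gamma priors parameterized by shape and rate (an assumption; the parameterization is not fixed by the text), and a hypothetical censored sample `x` with `n = 12`.

```python
import math
import random

def zeghdoudi_log_pdf(x, theta):
    """Log-density of the Zeghdoudi distribution at x > 0."""
    return (3.0 * math.log(theta) + math.log(x) + math.log1p(x)
            - theta * x - math.log(theta + 2.0))

def zeghdoudi_cdf(x, theta):
    """Cumulative function of the Zeghdoudi distribution."""
    tx = theta * x
    return 1.0 - (1.0 + tx + tx * tx / (theta + 2.0)) * math.exp(-tx)

def log_posterior(theta, beta, x, n, a=1.0, b=1.0, c=1.0, d=1.0):
    """Unnormalized log-posterior for type II censored data x (sorted)."""
    m = len(x)
    if theta <= 0.0 or beta <= x[-1]:
        return -math.inf
    f_beta = zeghdoudi_cdf(beta, theta)
    tail = f_beta - zeghdoudi_cdf(x[-1], theta)
    if tail <= 0.0:
        return -math.inf
    log_lik = sum(zeghdoudi_log_pdf(xi, theta) for xi in x)
    log_lik += (n - m) * math.log(tail) - n * math.log(f_beta)
    # Assumed shape/rate form of the gamma priors, up to additive constants.
    log_prior = ((a - 1.0) * math.log(theta) - b * theta
                 + (c - 1.0) * math.log(beta) - d * beta)
    return log_lik + log_prior

def metropolis_hastings(x, n, start, n_iter, step, rng):
    """Random-walk Metropolis-Hastings chain on (theta, beta)."""
    theta, beta = start
    current = log_posterior(theta, beta, x, n)
    if not math.isfinite(current):
        raise ValueError("start must lie in the support of the posterior")
    chain = []
    for _ in range(n_iter):
        cand_theta = theta + rng.gauss(0.0, step)
        cand_beta = beta + rng.gauss(0.0, step)
        candidate = log_posterior(cand_theta, cand_beta, x, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, beta, current = cand_theta, cand_beta, candidate
        chain.append((theta, beta))
    return chain

# Hypothetical censored sample: m = 10 observed out of n = 12.
x = [0.11, 0.23, 0.35, 0.42, 0.58, 0.66, 0.79, 0.91, 1.02, 1.10]
rng = random.Random(123)
chain = metropolis_hastings(x, n=12, start=(1.0, 1.5), n_iter=2000, step=0.2, rng=rng)
kept = chain[500:]  # discard a burn-in period
theta_mean = sum(t for t, _ in kept) / len(kept)
beta_mean = sum(b for _, b in kept) / len(kept)
print(theta_mean, beta_mean)
```

The step size and the burn-in length are tuning choices; in practice they would be adjusted by inspecting the acceptance rate and the trace of the chain.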

(ii) Loss functions
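We obtain the Bayesian estimators under three different loss functions, namely the generalized quadratic (GQ), the entropy and the Linex loss functions. Throughout, E_π denotes expectation with respect to the posterior distribution.
Let δ be an estimator of λ. The generalized quadratic loss function is defined as

$$L(\lambda,\delta) = \tau(\lambda)\,(\lambda-\delta)^{2}, \qquad \tau(\lambda) = \lambda^{\alpha-1},$$

where α is a selected constant. The Bayesian estimator of λ is

$$\hat{\lambda}_{GQ} = \frac{E_{\pi}(\lambda^{\alpha})}{E_{\pi}(\lambda^{\alpha-1})},$$

and the associated posterior risk is

$$PR(\hat{\lambda}_{GQ}) = E_{\pi}(\lambda^{\alpha+1}) - \frac{\left[E_{\pi}(\lambda^{\alpha})\right]^{2}}{E_{\pi}(\lambda^{\alpha-1})}.$$

The entropy loss function is defined as

$$L(\lambda,\delta) \propto \left(\frac{\delta}{\lambda}\right)^{p} - p\,\ln\left(\frac{\delta}{\lambda}\right) - 1,$$

where p is a selected constant. The Bayesian estimator of λ is

$$\hat{\lambda}_{E} = \left[E_{\pi}(\lambda^{-p})\right]^{-1/p},$$

and the associated posterior risk is

$$PR(\hat{\lambda}_{E}) = p\,E_{\pi}(\ln\lambda) + \ln E_{\pi}(\lambda^{-p}).$$

The Linex loss function is defined as L(λ, δ) = exp(r(δ − λ)) − r(δ − λ) − 1, where r is a selected constant. The Bayesian estimator of λ is

$$\hat{\lambda}_{L} = -\frac{1}{r}\,\ln E_{\pi}\left(e^{-r\lambda}\right),$$

and the corresponding posterior risk is

$$PR(\hat{\lambda}_{L}) = r\left[E_{\pi}(\lambda) - \hat{\lambda}_{L}\right].$$

(iii) Bayesian estimators and their posterior risks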
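In what follows, E_π denotes expectation with respect to the joint posterior distribution of (θ, β), and λ stands for either of the two parameters θ and β. In the case of the generalized quadratic loss function, the Bayes estimators are given by the formula

$$\hat{\lambda}_{GQ} = \frac{E_{\pi}(\lambda^{\alpha})}{E_{\pi}(\lambda^{\alpha-1})}, \qquad \lambda \in \{\theta, \beta\}.$$

The corresponding posterior risks are then

$$PR(\hat{\lambda}_{GQ}) = E_{\pi}(\lambda^{\alpha+1}) - \frac{\left[E_{\pi}(\lambda^{\alpha})\right]^{2}}{E_{\pi}(\lambda^{\alpha-1})}.$$

Under the entropy loss function, we obtain the following estimators:

$$\hat{\lambda}_{E} = \left[E_{\pi}(\lambda^{-p})\right]^{-1/p}, \qquad \lambda \in \{\theta, \beta\}.$$

The corresponding posterior risks are then

$$PR(\hat{\lambda}_{E}) = p\,E_{\pi}(\ln\lambda) + \ln E_{\pi}(\lambda^{-p}).$$

Under the Linex loss function, we obtain the following estimators:

$$\hat{\lambda}_{L} = -\frac{1}{r}\,\ln E_{\pi}\left(e^{-r\lambda}\right), \qquad \lambda \in \{\theta, \beta\},$$

and the corresponding posterior risks are

$$PR(\hat{\lambda}_{L}) = r\left[E_{\pi}(\lambda) - \hat{\lambda}_{L}\right].$$

None of these posterior expectations is available in closed form. In the next section, we will use an MCMC method to evaluate these estimators.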
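
Once posterior draws of a parameter are available, each posterior expectation can be approximated by a sample average. The sketch below, in Python, shows one way to do this. It assumes the weight λ^(α−1) for the generalized quadratic loss and the form (δ/λ)^p − p ln(δ/λ) − 1 for the entropy loss; the function names and the example draws are hypothetical, and the draws here come from a gamma generator purely as a stand-in for MCMC output.

```python
import math
import random

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def gq_estimator(draws, alpha):
    """Bayes estimate and posterior risk under the generalized quadratic
    loss with assumed weight lambda**(alpha - 1)."""
    e_a = mean([v ** alpha for v in draws])
    e_am1 = mean([v ** (alpha - 1.0) for v in draws])
    e_ap1 = mean([v ** (alpha + 1.0) for v in draws])
    return e_a / e_am1, e_ap1 - e_a ** 2 / e_am1

def entropy_estimator(draws, p):
    """Bayes estimate and posterior risk under the entropy loss."""
    e_neg_p = mean([v ** (-p) for v in draws])
    estimate = e_neg_p ** (-1.0 / p)
    risk = p * mean([math.log(v) for v in draws]) + math.log(e_neg_p)
    return estimate, risk

def linex_estimator(draws, r):
    """Bayes estimate and posterior risk under the Linex loss."""
    estimate = -math.log(mean([math.exp(-r * v) for v in draws])) / r
    return estimate, r * (mean(draws) - estimate)

# Stand-in for posterior draws of a positive parameter (not real MCMC output).
rng = random.Random(7)
draws = [rng.gammavariate(4.0, 0.25) for _ in range(5000)]

print("GQ      (alpha = -2):", gq_estimator(draws, -2.0))
print("Entropy (p = -0.5)  :", entropy_estimator(draws, -0.5))
print("Linex   (r = -0.5)  :", linex_estimator(draws, -0.5))
```

With α = 1 the generalized quadratic estimator reduces to the posterior mean and its risk to the posterior variance, which gives a quick sanity check of the implementation.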

Simulation study
In this section, we present simulation results comparing the performance of the estimators proposed in this paper, namely the ML and the Bayes estimators of the unknown parameters of the UTZ distribution under type II censored data. For given hyper-parameters a = b = c = d = 1, and θ = 1, β = 1, 5, using N = 5000 samples from the UTZ distribution, we obtain the following results.
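
Generating one type II censored sample from the UTZ distribution can be sketched as follows in Python. This is an illustrative sketch rather than the authors' simulation code: it inverts the truncated cumulative function by bisection, and the values n = 50, m = 40 and β = 1.5 are assumptions made for the example only.

```python
import math
import random

def zeghdoudi_cdf(x, theta):
    """Cumulative function of the (untruncated) Zeghdoudi distribution."""
    tx = theta * x
    return 1.0 - (1.0 + tx + tx * tx / (theta + 2.0)) * math.exp(-tx)

def utz_cdf(x, theta, beta):
    """Cumulative function of the upper truncated version on (0, beta]."""
    return zeghdoudi_cdf(x, theta) / zeghdoudi_cdf(beta, theta)

def utz_quantile(u, theta, beta, tol=1e-10):
    """Invert the truncated cumulative function by bisection on [0, beta]."""
    lo, hi = 0.0, beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if utz_cdf(mid, theta, beta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def type2_censored_sample(n, m, theta, beta, rng):
    """Draw n values by inversion and keep the m smallest ones."""
    draws = sorted(utz_quantile(rng.random(), theta, beta) for _ in range(n))
    return draws[:m]

rng = random.Random(2024)
sample = type2_censored_sample(n=50, m=40, theta=1.0, beta=1.5, rng=rng)
print(len(sample), sample[0], sample[-1])
```

Repeating this step N times and applying the estimators to each censored sample gives the replicate estimates on which the comparisons of this section are based.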

Maximum Likelihood estimators
We used the R package BB to compute the numerical values of the ML estimators. From Table 1, we see that the estimated values of θ are close to its true value. Moreover, the quadratic error is small. However, the estimated values of β are not close to its true value.

Bayesian estimators
To evaluate the Bayesian estimators, we use the Metropolis-Hastings algorithm. In the case of the generalized quadratic loss function, we use several values of α, among them 2 and −1. In Table 2 below, we display the values of the Bayesian estimators and, in brackets, their corresponding posterior risks under the generalized quadratic loss function. We note that the value α = −2 provides the smallest posterior risks and hence the best estimator in the generalized quadratic loss function case. Also, the posterior risks are smallest when n is large. Table 3 presents the values of the Bayesian estimators and, in brackets, their corresponding posterior risks under the entropy loss function.
From that table, the value p = −0.5 provides the smallest posterior risks and hence the best estimates in the entropy loss function case. Also, the posterior risks are smallest when n = 100 and n = 200.
In the following table, we present the values of the Pitman probabilities, which allow us to compare the Bayesian estimators with the ML estimators under the three loss functions when α = −2, p = −0.5 and r = −0.5. If the probability is greater than 0.5, the Bayesian estimator is better than the ML estimator.
According to Pitman's criterion, the Bayesian estimators of θ are better than θ_MLE when n is small. Also, the generalized quadratic loss function gives the best values in comparison with the other two loss functions. However, β_MLE is closer to the true value than all the Bayesian estimators. Moreover, θ_MLE and β_MLE perform better than the corresponding Bayesian estimators when n is large.
Definition 4.2. The integrated mean square error is defined as

$$IMSE(\hat{\lambda}) = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{\lambda}_{i} - \lambda\right)^{2},$$

where λ̂_i is the estimate of the parameter λ obtained from the i-th of the N simulated samples.
In Table 7 below, we display the values of the integrated mean square error of the estimators under the three loss functions and of the ML estimators. We remark that when n is small, the Bayesian estimators of θ and β have a smaller IMSE than θ_MLE and β_MLE. Also, the values provided by the generalized quadratic loss function are relatively close to those provided by the entropy and the Linex loss functions. To conclude, the Bayesian estimators perform better than the ML estimators, and the generalized quadratic loss function gives the smallest IMSE.
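
The two comparison criteria used in this section can be computed from replicate estimates as in the following Python sketch. It assumes that the Pitman probability is estimated by the proportion of replicates in which the first estimator is strictly closer to the true value than the second, and that the IMSE is the average squared error over replicates. The replicate values below are hypothetical numbers, not results from the paper.

```python
def pitman_closeness(est_a, est_b, true_value):
    """Proportion of replicates where est_a is strictly closer to the
    true value than est_b. Values above 0.5 favour est_a."""
    wins = sum(abs(a - true_value) < abs(b - true_value)
               for a, b in zip(est_a, est_b))
    return wins / len(est_a)

def imse(estimates, true_value):
    """Average squared error of the replicate estimates."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

# Hypothetical replicate estimates of theta with true value 1.
bayes = [1.1, 0.9, 1.3]
mle = [1.2, 0.7, 1.1]

print("Pitman probability (Bayes vs ML):", pitman_closeness(bayes, mle, 1.0))
print("IMSE Bayes:", imse(bayes, 1.0))
print("IMSE ML   :", imse(mle, 1.0))
```

The two criteria need not agree: an estimator can be closer to the true value in most replicates and still have a larger IMSE if its occasional errors are large.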

Application to real data
In this section, we illustrate the applicability of the UTZ distribution by performing the above estimations on a set of real data. The data set consists of the number of gold particles observed on each dystrophin unit (dystrophin is a gene product of possible importance in muscular dystrophies), discussed in Matthews and Appleton (1993) and Cullen et al. (1990). The Kolmogorov-Smirnov (K-S) test confirms that the Zeghdoudi distribution fits these data. The K-S test value is 0.012901, which is smaller than the corresponding critical value at the 5% level of significance, 0.025449 (for n = 198). Its p-value is equal to 0.793548. The complete observations are displayed in the next table.
x_i    1     2    3    4    5    n
n_i    122   50   18   4    4    198

It is clear that the truncation point is 5 in the case of complete data. We assume it to be 3 for censored data, choosing m = 180.
Table 8 presents the Bayesian estimates of the parameters θ and β under the three loss functions with the corresponding posterior risks, together with the ML estimates. We note that the estimators based on the complete data set provide smaller posterior risks than those based on the censored data, which is expected since part of the information is lost under censoring. We also remark that the entropy loss function gives the smallest posterior risk.
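
The choice of 3 as the truncation point under censoring can be checked directly from the frequency table. The short Python sketch below expands the table into individual observations and keeps the m = 180 smallest; it only restates the data given above, and the variable names are illustrative.

```python
# Frequency table of the gold particle counts: value -> number of units.
counts = {1: 122, 2: 50, 3: 18, 4: 4, 5: 4}

# Expand the table into the ordered sample of individual observations.
observations = sorted(value for value, freq in counts.items() for _ in range(freq))

n = len(observations)          # total number of dystrophin units
m = 180                        # number of observations kept under type II censoring
censored = observations[:m]    # the m smallest observations

print("n =", n)
print("largest complete observation:", observations[-1])
print("largest retained observation:", censored[-1])
```

The cumulative frequencies are 122, 172 and 190, so the 180th ordered observation equals 3, which appears consistent with taking 3 as the truncation point for the censored data.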
In the next tables, we present the values of Pitman's criterion comparing the Bayesian estimators with the ML estimators on the real data set. According to Pitman's criterion, the ML estimator performs better than the Bayesian estimators. Table 9 presents the values of the integrated mean square error of the Bayesian estimators and the ML estimators.
We note that the values provided by the generalized quadratic loss function are the smallest, and that all the Bayesian estimators perform better than the ML estimators.

Concluding Remarks
In this study, we proposed a new model, the upper truncated Zeghdoudi (UTZ) distribution. We compared Bayesian estimators of the UTZ distribution under various loss functions. The Monte-Carlo study showed that the Bayesian approach based on the entropy loss function yielded the best estimator compared with those based on the other proposed loss functions. The selected Bayesian estimators were compared with the maximum likelihood estimators of the unknown parameters using the Pitman closeness criterion and the integrated mean square error. This comparison showed that when n is small, the Bayesian estimators give better results, while when n is large enough, the ML estimators are closer to the true values but have a higher IMSE than the Bayesian estimators. Finally, we showed that the same conclusions hold for a set of real data.
In future work, we plan to construct a mixture of the loss functions used in this paper to obtain an optimal estimator.

Acknowledgement
We thank the anonymous reviewers for their valuable comments, corrections and recommendations.