A Unified Approach for Generalizing Some Families of Probability Distributions, with Applications to Reliability Theory

In this paper, we propose a new method for generating families of continuous distributions based on the star-shaped property, which guarantees the existence of some well-known properties for the generated classes and distributions for any non-negative random variable. We refer to the new class as the composed−G Q generator, or (C − G Q) generator for short. We study some mathematical properties of the new family and discuss some special families and sub-models of the C − G Q generator. To examine the performance of the new generator and the generated models in fitting data, we use two real sets of data, censored and uncensored, and compare the fit of a newly produced model, the composed-Lomax Weibull (C − L W), with that of some well-known models; the C − L W model provides the best fit to all of the data. A simulation is performed to assess the behavior of the maximum likelihood estimates of the parameters in finite samples.


Introduction
In recent years there has been an increased interest in defining new generators for univariate continuous distributions by introducing one or more additional parameters to the baseline distribution. Adding parameters has proved useful for exploring tail properties and for improving the goodness-of-fit of the proposed family, and it provides great flexibility in modeling data in practice. Starting from a distribution with ( ), ( ) and ̅ ( ) as the probability density function (pdf), the cumulative distribution function (cdf) and the survival function (sf), respectively, many generators have been introduced in the literature that use the pdf, the cdf or the sf of the base distribution to introduce new classes. In what follows, we introduce a new method for generating a large number of classes with known characteristics in reliability theory.

One can use ( ) and ( ) for generating numerous families with a wide number of distributions based on two or more cdfs.

Definition 2.
Let be a continuous cdf of an absolutely continuous random variable and let (. ) be the cumulative hazard rate function. Now define a cdf, 1 , out of and as follows: and its corresponding pdf is given by
1 ( ) = ( . ( )). { . ℎ( ) + ( )}. (3)
Next we show that a set of important results from theoretical reliability holds for our newly introduced family. These results make the family much richer in applications. They are taken from Barlow and Proschan (1981) and are listed below for convenience. Let and be continuous distributions, be strictly increasing on its support, and
b) The relationship < is unaffected by a translation transformation of either or , assuming the random variables remain non-negative.
c) The relationship < * may be destroyed by a translation transformation of either or , assuming the random variables remain non-negative.
d) Let ( ) = 1 − − , be a continuous distribution function, with (0) = 0.
The Single Crossing Property. Let < , then
i) ̅ ( ) crosses ̅ ( ) at most once, and from above, as x increases from 0 to ∞, for > 0.
ii) If, in addition, and have the same mean, then a single crossing does occur, and has smaller variance than .
iii) If we take to be the exponential distribution, then must be IFRA by the previous results.
To this end, we present the following arguments.
Based on Definition 1, we can see that the new generator enjoys the star-shaped property, which means that any distribution derived from the new family enjoys the results from a) to e). Suppose that ( ) and ( ) are the cdfs of the exponential distribution, given respectively by ( ) = 1 − − and ( ) = 1 − − (for > 0 and , > 0). Then a new distribution, called the composed-exponential exponential (C − E E), can be derived based on (1); its cdf is given by , > 0, > 0, while its corresponding pdf is given by . Now, we check the existence of the star-shaped property for the newly generated model C − E E.
i) For any given values of , and , ̅ ( ) crosses ̅ ( ) at most once, and from above, as increases from 0 to ∞, for > 0.
ii) Let = 1, = 3 and = 0.9385107; then and have the same mean, so a single crossing does occur, and has smaller variance than . Figure 2 shows this property visually for given values of the unknown parameters. At = 0.9385107, = 3 and = 1, the variance of is 1.032239, while the variance of at the same values is 0.6280421. Clearly, has smaller variance than .
iii) C − E E is IFRA. The cumulative hazard rate function of the composed-exponential exponential (C − E E) distribution is given by for , > 0. The quantity − controls the behavior of the function ( ); since − is a decreasing function, 1 − − is an increasing function, and we can conclude that is IFRA.
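The three checks above can be reproduced numerically. The following is a minimal sketch under an explicit assumption: it takes the C − E E survival function to be exp(−a·x·(1 − e^(−b·x))), a form that is not printed in the text but agrees with the IFRA argument in iii). The names `a`, `b` and `c` are placeholders for the three parameters, and the Simpson integrator is a stand-in for any quadrature routine. With a = 1, b = 3 and an exponential rate c = 0.9385107, the two means agree closely and the survival curves cross once on the grid.

```python
import math

def sf_cee(x, a=1.0, b=3.0):
    """Assumed C-EE survival function exp(-a*x*(1 - exp(-b*x)))."""
    return math.exp(-a * x * (1.0 - math.exp(-b * x)))

def sf_exp(x, c=0.9385107):
    """Survival function of the exponential distribution with rate c."""
    return math.exp(-c * x)

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

# The mean of a non-negative variable is the integral of its survival function.
mean_cee = simpson(sf_cee, 0.0, 60.0)
mean_exp = 1.0 / 0.9385107

# i) Count sign changes of sf_cee - sf_exp on a grid over (0, 20].
grid = [0.01 * k for k in range(1, 2001)]
signs = [sf_cee(x) > sf_exp(x) for x in grid]
crossings = sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# iii) IFRA: the cumulative hazard divided by x must be nondecreasing.
ratios = [-math.log(sf_cee(x)) / x for x in grid]
is_ifra = all(r2 >= r1 - 1e-12 for r1, r2 in zip(ratios, ratios[1:]))

print(mean_cee, mean_exp, crossings, is_ifra)
```

A grid-based crossing count only supports the claim on the grid; it does not replace the analytic argument.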
The rest of the paper is outlined as follows. In Section 2, special families and distributions are derived from the proposed generator. The statistical properties, including quantile functions, moments and incomplete moments, are derived in Section 3. Probability weighted moments and the order statistics and their moments are investigated in Sections 4 and 5. Section 6 discusses the entropies of the proposed generator. The reliability properties, including the survival function, the hazard and cumulative hazard functions, and the mean residual and mean reversed residual life functions, are derived in Section 7. In Section 8, we discuss the method of maximum likelihood estimation and derive the equations used for estimating the unknown parameters. To examine the performance of the new generator, Section 9 gives a simulation for one model generated from it and compares the performance of that model against different models.

New Families
In this section, we introduce new families that can be used to produce a wide range of new useful distributions. Such families are listed below.

The composed exponential-generated family
Suppose ( ) = 1 − − , ( , > 0) is the cdf of the exponential distribution; then a new exponential family can be introduced using (1). This family will be named the composed-exponential family ( − exponential ( − )), and its cdf is given by with corresponding pdf . The following models are derived directly from the family in (4).

The composed Lomax-generated family
Suppose ( ) = 1 − (1 + ) − , ( , , > 0) is the cdf of the Lomax distribution; then a new Lomax family can be introduced using (1). This family will be named the composed-Lomax family ( −Lomax ( − )), and its cdf is given by with corresponding pdf . The following models are derived directly from the family in (5).

The composed Lomax-Weibull distribution.
Inserting and corresponding pdf results.

Other new families
Every time one selects a different base distribution, a new family arises. To illustrate: selecting ( ) as a Weibull, log-logistic, Pareto, Burr or extreme value distribution gives rise to the composed Weibull, composed log-logistic, composed Pareto, composed Burr or composed extreme value-generated family. Many more families can be derived in the same way, and a wide range of distributions can be derived from these families.

Statistical Properties
This section explains the statistical properties of the new family in general terms. We then focus on the composed Lomax-Weibull distribution given by (7). Among the statistical properties considered are: the quantiles, the non-central moments and the incomplete moments.

Quantiles of the distribution
The -th quantile, , of the new class is the real solution of ( ) = , which turns out to be the solution of . ( ) = −1 ( ). (9) Although (9) has no closed-form solution in general, it can be solved numerically. The -th quantile, , of the C − L W distribution is the real solution of the following equation:
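An equation of the form (9) can be solved with any bracketing root finder. The sketch below uses plain bisection and is only an illustration: `inner` and `outer_inv` are hypothetical names for the function multiplying x on the left-hand side of (9) and for the inverse cdf on the right-hand side, and the example plugs in exponential choices with placeholder parameters `a` and `b`, not the C − L W model.

```python
import math

def quantile(p, inner, outer_inv, hi=1.0, tol=1e-12):
    """Solve x * inner(x) = outer_inv(p) for x >= 0 by bisection.

    Assumes x * inner(x) is continuous, increasing and equal to 0 at x = 0.
    """
    target = outer_inv(p)
    while hi * inner(hi) < target:      # expand the bracket until it contains the root
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mid * inner(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative (assumed) choices: exponential inner cdf and exponential outer cdf.
a, b = 1.0, 3.0
inner = lambda x: 1.0 - math.exp(-b * x)
outer_inv = lambda p: -math.log(1.0 - p) / a

median = quantile(0.5, inner, outer_inv)
# Plugging the root back into the assumed cdf should return p.
check = 1.0 - math.exp(-a * median * inner(median))
print(median, check)
```

Bisection is chosen here because it needs only monotonicity, not derivatives; a Newton step would converge faster when the pdf is available.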

The moments
The ℎ non-central moment, , of the new generator can be formulated as where 1 ( ) is the solution for of the function = . ( ).

Probability weighted moments
The probability weighted moments (PWMs) method can generally be used for estimating parameters of a distribution whose inverse form cannot be expressed explicitly. We calculate the PWMs of the new class since they can be used to obtain the moments of the class. The PWMs of a random variable are formally defined by where and are non-negative integers and (. ) and (. ) are the cdf and pdf of the random variable . The PWMs of the new class with cdf (1) and pdf (2), are given by where 2 ( ) is the solution for of = ( . ( )). The ℎ non-central moment of the new class can be obtained by putting s = 0 in (17).

Moment of order Statistics
Order statistics make their appearance in many areas of statistical theory and practice. Let the random variable : be the -th order statistic ( 1: ≤ 2: ≤ . . . ≤ : ) in a sample of size , with pdf denoted by : ( ) and cdf denoted by : ( ).
The -th moment about zero of the -th order statistic is obtained by using a result in Barakat and Abdelkader (2004) and becomes . In particular, the -th moment about zero of the -th order statistic for the C − L W distribution is given by:

Rényi and Shannon entropies
The entropy measure of a random variable with density function ( ) is a measure of variation of the uncertainty. One of the popular entropy measures is the Rényi entropy given by where > 0, ≠ 1.

The Shannon entropy is defined by
The Rényi entropy for the C − L W distribution is given by

Reliability Analysis
This section presents the survival function, the hazard rate function, the cumulative hazard rate function, and the residual and reversed residual lifetime functions for the new generator and, in particular, for the C − L W distribution.

Survival Function
The new generator is very flexible and can be useful for characterizing the lifetime data of a given system. The survival function of the new class is defined as: while the survival function of the C − L W distribution is given by: , , , , > 0, ≥ 0.

Hazard Rate and Cumulative Hazard Rate Functions
Another characteristic of interest of a random variable is the hazard rate function ℎ( ), which is defined as: while the cumulative hazard rate is given by The hazard rate and the cumulative hazard rate functions of the C − L W distribution are, respectively, given by:  Applying the binomial expansion for ( − ) and substituting ( ) given by (2) into ̈( ), the -th moment of the residual life of the new family is given by An expression for the mean residual lifetime function follows by taking = 1.
In particular, the -th moment of the residual life for the C − L W distribution is given by: . ( + 1) . ( + 1) . As before, the -th moment of the reversed residual life of the new family is given by An expression for the mean reversed residual lifetime function (or the mean inactivity time) follows by taking = 1.
In particular, the -th moment of the reversed residual life for the C − L W distribution is given by:

Estimation
In this section, we use the method of maximum likelihood to derive the equations used for estimating the unknown parameters of the new class. Let 1 , … , be a random sample from (2). Let be a

The maximum likelihood estimate(s) (MLE(s)) of , say ̂, is (are) the simultaneous solution(s) of the equation(s) .
Maximization of (32) can be performed using well-established routines in the R statistical software.

Application
In this section, we use simulated data and real data sets (censored and uncensored) to compare the fit of the new model (composed-Lomax Weibull) with that of other models and to illustrate its usefulness.

Simulation study
To assess the behavior of the maximum likelihood estimators of the parameters , , θ and in finite samples, we conduct a Monte Carlo simulation for the composed-Lomax Weibull (C − L W) distribution. All results were obtained from 3000 Monte Carlo replications, and the simulations were carried out using the statistical software package R. In each replication, a random sample of size n is drawn from the C − L W distribution. The true parameter values used in the data generating process are = 3.5, = 6.5, = 2.8 and = 5.2. Table 1 presents the mean maximum likelihood estimates of the parameters, the bias and the root mean squared error (RMSE) for sample sizes = 50, = 80 and = 100. Based on the results in Table 1, we notice that the biases and root mean squared errors of the maximum likelihood estimators of , , and decay toward zero as the sample size increases. For the parameter , the bias increases while the root mean squared error decreases.
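The bookkeeping behind a table of mean estimates, biases and RMSEs can be sketched as follows. This is not the simulation reported above: to keep the example self-contained it swaps in a one-parameter exponential model, whose maximum likelihood estimator has a closed form, in place of the C − L W distribution, and the true rate of 2.0 is a hypothetical value. Only the replication count (3000) and the sample sizes (50, 80, 100) are taken from the text.

```python
import math
import random

def simulate(true_rate=2.0, n=50, reps=3000, seed=1):
    """Monte Carlo mean, bias and RMSE of the exponential-rate MLE (stand-in model)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        estimates.append(n / sum(sample))   # closed-form MLE of the exponential rate
    mean_est = sum(estimates) / reps
    bias = mean_est - true_rate
    rmse = math.sqrt(sum((e - true_rate) ** 2 for e in estimates) / reps)
    return mean_est, bias, rmse

# One row per sample size, as in a typical simulation table.
results = {n: simulate(n=n) for n in (50, 80, 100)}
for n, (mean_est, bias, rmse) in results.items():
    print(n, round(mean_est, 4), round(bias, 4), round(rmse, 4))
```

For a four-parameter model the closed-form estimator would be replaced by a numerical optimizer applied to the log-likelihood in each replication; the bias and RMSE computations stay the same.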

Prices of children's wooden toys -uncensored
The data are obtained from the Open University (1993). They represent the prices of 31 different children's wooden toys on sale in a Suffolk craft shop in April 1991. In order to determine the shape of the most appropriate hazard function for modeling, a graphical analysis of the data may be used. In this context, the total time on test (TTT) plot is very useful (for more details see Aarset (1987)). The TTT plot for the prices of children's wooden toys data is displayed in Figure 5, which provides evidence that a constant hazard rate is adequate. In each case, the parameters are estimated by maximum likelihood, and model selection is carried out using the Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), Hannan-Quinn information criterion (HQIC), Bayesian information criterion (BIC), Anderson-Darling (A * ) and Cramér-von Mises (W * ) statistics to compare the fitted models. The calculations were carried out using the R package AdequacyModel. In general, the smaller the values of these statistics, the better the fit to the data. The estimates of the parameters and their standard errors are listed in Table 2, while Table 3 gives the AIC, CAIC, BIC, HQIC, W * , A * and K-S values. Table 3 shows that the C − L W distribution fits the data better than the other models. In order to assess whether the model is appropriate, we plot in Figure 6 (a) the histogram of the data together with the fitted C-L W, LW, Kw-LL, N-MW, AW and Kw-P densities and in Figure 6 (b) the empirical cdf together with the estimated cdfs. These plots indicate that the C − L W distribution provides a better fit to these data than all the other competing lifetime models.
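For readers who want to recompute the likelihood-based criteria by hand, the sketch below uses the textbook definitions of AIC, BIC and HQIC in terms of the maximized log-likelihood, the number of parameters k and the sample size n. The input values are hypothetical (only n = 31 matches the toy-price data), and CAIC is left out on purpose because its definition differs between references and software; the A * , W * and K-S statistics need the fitted cdf and are not covered here.

```python
import math

def information_criteria(loglik, k, n):
    """Return AIC, BIC and HQIC from a maximized log-likelihood."""
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return {"AIC": aic, "BIC": bic, "HQIC": hqic}

# Hypothetical inputs: a 4-parameter model, n = 31 observations, log-likelihood -100.
crit = information_criteria(loglik=-100.0, k=4, n=31)
print(crit)
```

All three criteria penalize the same log-likelihood differently, so rankings can disagree when the candidate models have different numbers of parameters.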

Waiting times before service -uncensored
The data were reported in Merovci and Elbatal (2013). The data set represents the waiting times (in minutes) before service of 100 bank customers.
Figure 7. The TTT plot of the waiting times before service data.
The TTT plot for the current data, displayed in Figure 7, is concave and, according to Aarset (1987), provides evidence that a monotonic (increasing) hazard rate is adequate.
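The points of a scaled TTT plot are straightforward to compute. The sketch below uses the usual empirical scaled total time on test transform, T(r/n) = (sum of the r smallest observations + (n − r) times the r-th smallest) / (sum of all observations); the sample is a hypothetical toy example, not one of the data sets analysed here. A curve close to the diagonal suggests a constant hazard, a concave curve an increasing hazard, and a convex curve a decreasing hazard.

```python
def scaled_ttt(data):
    """Return the points (r/n, T(r/n)) of the empirical scaled TTT transform."""
    x = sorted(data)
    n = len(x)
    total = sum(x)
    points = [(0.0, 0.0)]
    partial = 0.0
    for r, value in enumerate(x, start=1):
        partial += value                      # sum of the r smallest observations
        points.append((r / n, (partial + (n - r) * value) / total))
    return points

# Hypothetical sample used only to exercise the function.
pts = scaled_ttt([0.8, 1.9, 2.7, 3.1, 4.6, 5.0])
for u, t in pts:
    print(round(u, 3), round(t, 3))
```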
We compare the fit of the C − L W model with that of 7 non-nested models. In each case, the parameters are estimated by maximum likelihood, and model selection is carried out using AIC, CAIC, HQIC, BIC, A * and W * to compare the fitted models. In general, the smaller the values of these statistics, the better the fit to the data. The estimates of the parameters and the numerical values of the statistics are listed in Table 4, while Table 5 gives the remaining statistics: the AIC, CAIC, BIC, HQIC, W * , A * and K-S values. Table 5 shows that the C − L W distribution fits the data better than the other models. In order to assess if the model is appropriate, we plot in Figure 8

Leukemia data-censored
The data are remission times for patients receiving a particular leukemia therapy. Lawless (1982, page 136) gives the results of a study to investigate the effect of a certain kind of therapy for 20 leukemia patients. After the therapy, patients go into remission for some period of time, the length of which is random. The observed times were 1, 1, 2, 2, 2, 6, 6, 6, 7, 8, 9, 9, 10, 12, 13, 14, 18, 19, 24, 26, 29, 31+, 42, 45+, 50+, 57, 60, 71+, 85+, 91 weeks. The times marked with a + indicate patients who were still in remission at the time that the data were analyzed. These are known as right-censored observations because all that is known about them is that they did not come out of remission up to the given time and, presumably, would have come out at some later point in time (to the right of the observed survival times). Table 6 gives the maximum likelihood estimates of the parameters for different models.
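With right-censored data, each observed event contributes the log-density and each censored time contributes the log-survival function to the log-likelihood. The sketch below shows this for the values listed above, using an exponential model purely as a stand-in because its censored MLE has a closed form (number of events divided by total time); it is not one of the fits reported in Table 6. The function name `censored_loglik` and the variable names are illustrative.

```python
import math

# Remission times in weeks as listed in the text; a trailing "+" marks right-censoring.
raw = ("1 1 2 2 2 6 6 6 7 8 9 9 10 12 13 14 18 19 24 26 29 31+ "
       "42 45+ 50+ 57 60 71+ 85+ 91").split()
times = [float(t.rstrip("+")) for t in raw]
events = [0 if t.endswith("+") else 1 for t in raw]   # 1 = remission ended, 0 = censored

def censored_loglik(log_pdf, log_sf, times, events):
    """Sum log f(t) over observed events and log S(t) over censored times."""
    return sum(log_pdf(t) if d else log_sf(t) for t, d in zip(times, events))

# Stand-in model: exponential with rate lam; censored MLE = events / total time.
lam_hat = sum(events) / sum(times)
loglik = censored_loglik(lambda t: math.log(lam_hat) - lam_hat * t,
                         lambda t: -lam_hat * t,
                         times, events)
print(len(times), sum(events), lam_hat, loglik)
```

For another model, only the two callables passed to `censored_loglik` change; the maximization then has to be done numerically.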

Conclusions
In this paper, we propose a new method for generating families of continuous distributions, called the composed−G Q family, or (C − G Q) family for short, based on the star-shaped property. Special families and sub-models of the new generator are presented to demonstrate its flexibility. Statistical properties such as quantiles and moments are discussed in Section 3, while the probability weighted moments, moments of order statistics, and Rényi and Shannon entropies are discussed in Sections 4, 5 and 6, respectively. Section 7 presents the reliability properties of the new generator, such as the survival function, the hazard and cumulative hazard functions, and the moments of the residual and reversed residual life functions. To examine the performance of the new generator and the generated models in fitting data, we use two real sets of data, censored and uncensored, and compare the fit of a newly produced model, the composed-Lomax Weibull (C − L W), with that of some well-known models; the C − L W model provides the best fit to all of the data. A simulation has been performed to assess the behavior of the maximum likelihood estimates of the parameters in finite samples.