Goodness of Fit Tests for Marshall-Olkin Extended Rayleigh distribution

A class of goodness of fit tests for the Marshall-Olkin Extended Rayleigh distribution with estimated parameters is proposed. The tests are based on the empirical distribution function. The Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling, Watson, and Liao-Shimokawa test statistics are used, and their asymptotic percentage points for the Marshall-Olkin Extended Rayleigh distribution are obtained by Monte Carlo simulation. Moreover, the power of the goodness of fit test statistics is investigated for this lifetime model against several alternatives.


Introduction
The Rayleigh distribution is one of the most widely used lifetime distributions. It is applied in the study of vibrations and waves, in communication theory to describe the instantaneous peak power and hourly median of signals received at a radio, in modeling wind speed over a year at wind turbine sites under certain circumstances, and in modeling the lifetimes of devices. There are a variety of methods for adding parameters to an existing probability distribution to obtain a new distribution, which generally provides much more flexibility for modeling lifetime data. Marshall and Olkin (1997) suggested that a new lifetime distribution with survival function $\bar{G}(x; \alpha)$ can be formed by adding a parameter $\alpha$ to another distribution $F(x)$, i.e.
$$\bar{G}(x; \alpha) = \frac{\alpha \bar{F}(x)}{1 - \bar{\alpha}\,\bar{F}(x)}, \qquad (1)$$
where $\bar{\alpha} = 1 - \alpha$ and $\bar{F}(x)$ is the survival function of a continuous distribution. They discussed this method with application to the Exponential and Weibull families.
Goodness of fit tests evaluate the degree of agreement between the observed sample distribution and the theoretical distribution. Many goodness of fit tests are available in the literature, some of which suffer from severe limitations. The chi-square goodness of fit test requires grouping of the data and does not work well for small samples. Tests based on correlations and tests based on moment ratios are often under-estimated.
In most practical situations, the parameters are unknown and are estimated from sample data, using an estimation method such as maximum likelihood. Goodness of fit test statistics are not distribution-free when the parameters are unknown and have to be estimated from the sample. The distribution of the test statistics depends on the sample size, the population parameters being estimated, the estimation technique, and the hypothesized distribution.
Empirical distribution function (EDF) based tests are appropriate when the population parameters are unknown and are estimated from sample data, as they provide higher power than other tests. Moreover, the empirical

Estimating the parameters of Marshall-Olkin extended Rayleigh distribution
A two-parameter distribution, named the Marshall-Olkin Extended Rayleigh (MOR) distribution, is obtained by using equation (1) as a generalization of the standard Rayleigh distribution. Let X be a continuous random variable that follows the MOR distribution with shape parameter α and scale parameter β. The cumulative distribution function of the MOR distribution is given in equation (2) and its density function in equation (3).
Figure 1. Probability density function of the Marshall-Olkin Extended Rayleigh distribution for β = 1 and different values of the shape parameter α.
Let $X_1, X_2, \ldots, X_n$ be a random sample from the MOR distribution; its log-likelihood function is given in equation (4). The MLEs of α and β can be determined from the solution of the non-linear equations (5) and (6). Since the MLEs are not available in explicit form, we use the Newton-Raphson iterative method to obtain numerical estimates of the parameters of the MOR distribution.
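As a worked illustration of how equation (1) produces the MOR quantities, the following sketch substitutes a Rayleigh survival function into the Marshall-Olkin form. The Rayleigh parametrization used here, $\bar{F}(x) = \exp(-x^2/(2\beta^2))$, is an assumption; the exact form in equations (2) to (6) may differ by a reparametrization of β.

```latex
% Assumed baseline (not confirmed by the source): Rayleigh survival function with scale \beta
\[
\bar{F}(x;\beta) = \exp\!\left(-\frac{x^{2}}{2\beta^{2}}\right), \qquad x > 0,\ \beta > 0 .
\]
% Substituting into the Marshall-Olkin form, with \bar{\alpha} = 1 - \alpha and \alpha > 0
\[
G(x;\alpha,\beta) = \frac{1 - e^{-x^{2}/(2\beta^{2})}}{1 - \bar{\alpha}\,e^{-x^{2}/(2\beta^{2})}},
\qquad
g(x;\alpha,\beta) = \frac{\alpha\,x\,e^{-x^{2}/(2\beta^{2})}}
                         {\beta^{2}\left[1 - \bar{\alpha}\,e^{-x^{2}/(2\beta^{2})}\right]^{2}} .
\]
% Log-likelihood of a sample x_1, ..., x_n
\[
\ell(\alpha,\beta) = n\ln\alpha + \sum_{i=1}^{n}\ln x_i - 2n\ln\beta
  - \frac{1}{2\beta^{2}}\sum_{i=1}^{n}x_i^{2}
  - 2\sum_{i=1}^{n}\ln\!\left[1 - \bar{\alpha}\,e^{-x_i^{2}/(2\beta^{2})}\right] .
\]
% Score equations, writing s_i = e^{-x_i^{2}/(2\beta^{2})}
\[
\frac{\partial \ell}{\partial \alpha}
  = \frac{n}{\alpha} - 2\sum_{i=1}^{n}\frac{s_i}{1 - \bar{\alpha}\,s_i} = 0,
\qquad
\frac{\partial \ell}{\partial \beta}
  = -\frac{2n}{\beta} + \frac{1}{\beta^{3}}\sum_{i=1}^{n}x_i^{2}
  + \frac{2\bar{\alpha}}{\beta^{3}}\sum_{i=1}^{n}\frac{x_i^{2}\,s_i}{1 - \bar{\alpha}\,s_i} = 0 .
\]
```

Setting α = 1 gives $\bar{\alpha} = 0$, and all three expressions reduce to those of the ordinary Rayleigh distribution, which is a quick check on the algebra. Neither score equation can be solved in closed form, which is why an iterative method is needed.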

Random Number Generator.
The random number generator of the Marshall-Olkin Extended Rayleigh (MOR) distribution is given by equation (7), where R is a random number from the uniform distribution U(0, 1).
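A generator of this kind can be sketched by inverting the cumulative distribution function. The code below assumes the Rayleigh survival function $\exp(-x^2/(2\beta^2))$ as the baseline, so its quantile formula may differ from equation (7) by a reparametrization of β; the function names are illustrative.

```python
import math
import random


def mor_quantile(r, alpha, beta):
    """Inverse CDF of the MOR distribution for 0 <= r < 1.

    Assumes the Rayleigh baseline survival function exp(-x^2 / (2 beta^2)).
    Solving G(x) = r gives exp(-x^2 / (2 beta^2)) = (1 - r) / (1 - (1 - alpha) r).
    """
    ratio = (1.0 - (1.0 - alpha) * r) / (1.0 - r)
    return beta * math.sqrt(2.0 * math.log(ratio))


def mor_cdf(x, alpha, beta):
    """CDF of the MOR distribution under the same assumed parametrization."""
    s = math.exp(-x * x / (2.0 * beta * beta))
    return (1.0 - s) / (1.0 - (1.0 - alpha) * s)


def mor_sample(n, alpha, beta, rng=random):
    """Draw n MOR variates by transforming uniform(0, 1) random numbers."""
    return [mor_quantile(rng.random(), alpha, beta) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    print(mor_sample(5, alpha=0.5, beta=1.0, rng=rng))
    # Round trip: the CDF of the 0.3 quantile should return 0.3.
    print(mor_cdf(mor_quantile(0.3, 0.5, 1.0), 0.5, 1.0))
```

Because the quantile function is increasing in r, applying it to ordered uniform numbers yields an ordered MOR sample directly.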

Simulations and Power Study for Marshall-Olkin extended Rayleigh distribution
In the test statistics, $F_0(x; \hat{\alpha}, \hat{\beta})$ is the cumulative distribution function of the MOR$(\alpha, \beta)$ distribution, $n$ is the sample size, and $\hat{\alpha}$, $\hat{\beta}$ are the maximum likelihood estimates of $\alpha$ and $\beta$ obtained from (5) and (6), respectively.
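For orientation, the five EDF statistics can be computed from the fitted values $z_i = F_0(x_{(i)}; \hat{\alpha}, \hat{\beta})$ at the ordered sample. The sketch below uses the standard computational forms of the Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling and Watson statistics and the usual form of the Liao-Shimokawa statistic; it is assumed, not confirmed by the text above, that these match the definitions used in this article. The function name and dictionary keys are illustrative.

```python
import math


def edf_statistics(z):
    """EDF statistics from fitted CDF values z_i, each strictly between 0 and 1."""
    z = sorted(z)
    n = len(z)
    # Kolmogorov-Smirnov: largest gap between the EDF and the fitted CDF.
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    # Cramer-von Mises.
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2 for i, zi in enumerate(z)) + 1 / (12 * n)
    # Anderson-Darling: pairs z_i with z_(n+1-i), which weights both tails.
    a2 = -n - sum(
        (2 * i + 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i])) for i in range(n)
    ) / n
    # Watson: Cramer-von Mises corrected for the mean of the z_i.
    u2 = w2 - n * (sum(z) / n - 0.5) ** 2
    # Liao-Shimokawa: sum of standardized EDF discrepancies.
    ls = sum(
        max((i + 1) / n - zi, zi - i / n) / math.sqrt(zi * (1 - zi))
        for i, zi in enumerate(z)
    ) / math.sqrt(n)
    return {"D": max(d_plus, d_minus), "W2": w2, "A2": a2, "U2": u2, "L": ls}


if __name__ == "__main__":
    print(edf_statistics([0.25, 0.75]))
```

All five statistics are large when the fitted distribution disagrees with the sample, so each test rejects in the upper tail.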

Calculation of critical values for Marshall-Olkin extended Rayleigh distribution
For calculating critical values, we proceeded as follows:
1. We generated a random sample $X_1, X_2, \ldots, X_n$ from the Marshall-Olkin Extended Rayleigh distribution with probability density function (3).
2. For this, we first generated a random sample of n order statistics, $U_{(1)}, U_{(2)}, \ldots, U_{(n)}$, from the uniform distribution U(0, 1).
3. Then, using (7), the random number generator of MOR(α, β) with the parameters set to 1 and 0.50, we obtained an ordered sample of size n from the MOR distribution.
4. We used this random sample to estimate the unknown parameters of the MOR distribution by the method of maximum likelihood. Because the normal equations (5) and (6) for the MLEs are non-linear, we used the Newton-Raphson iterative method to solve them.
5. We performed this iterative procedure using the 'rootSolve' package in R (Soetaert, 2009).
6. Then we determined the cumulative distribution function of the hypothesized Marshall-Olkin Extended Rayleigh distribution using these maximum likelihood estimates of the unknown parameters.
7. We selected sample sizes n = 5(5)50(10)100, that is, from 5 to 50 in steps of 5 and from 50 to 100 in steps of 10, and calculated all five test statistics for these sample sizes.
8. We repeated this procedure for 10,000 Monte Carlo runs, from which we obtained 10,000 independent values of each test statistic.
9. We then ranked these 10,000 values for each statistic.
10. We selected seven levels of significance (γ), namely 0.01, 0.025, 0.05, 0.10, 0.15, 0.20 and 0.25, at which we computed the critical values for each test statistic and each sample size.
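The steps above can be sketched as follows. This is not the R code used for the article: it assumes the Rayleigh baseline survival function $\exp(-x^2/(2\beta^2))$, takes α = 0.5 and β = 1 as illustrative generating values, computes only the Kolmogorov-Smirnov statistic to stay short, and replaces the Newton-Raphson solver with a derivative-free compass search on (log α, log β). All function names are hypothetical.

```python
import math
import random


def mor_cdf(x, alpha, beta):
    q = x * x / (2.0 * beta * beta)
    one_minus_s = -math.expm1(-q)                # 1 - exp(-q), stable for small q
    return one_minus_s / (one_minus_s + alpha * math.exp(-q))


def mor_sample(n, alpha, beta, rng):
    """Ordered MOR sample by inversion of uniform(0, 1) numbers."""
    out = []
    while len(out) < n:
        r = rng.random()
        x = beta * math.sqrt(2.0 * math.log((1.0 - (1.0 - alpha) * r) / (1.0 - r)))
        if x > 0.0:                              # log-likelihood needs x > 0
            out.append(x)
    return sorted(out)


def mor_loglik(xs, alpha, beta):
    total = 0.0
    for x in xs:
        q = x * x / (2.0 * beta * beta)
        denom = -math.expm1(-q) + alpha * math.exp(-q)   # 1 - (1 - alpha) exp(-q)
        total += math.log(alpha * x / (beta * beta)) - q - 2.0 * math.log(denom)
    return total


def mor_fit(xs, max_iter=200, tol=1e-4, bound=30.0):
    """Approximate MLE by compass search (a stand-in for Newton-Raphson)."""
    # Start at alpha = 1 and the Rayleigh MLE of beta.
    theta = [0.0, 0.5 * math.log(sum(x * x for x in xs) / (2.0 * len(xs)))]
    best = mor_loglik(xs, math.exp(theta[0]), math.exp(theta[1]))
    step = 1.0
    for _ in range(max_iter):
        improved = False
        for j in (0, 1):
            for d in (step, -step):
                cand = list(theta)
                cand[j] += d
                if abs(cand[j]) > bound:         # keep parameters in a safe range
                    continue
                val = mor_loglik(xs, math.exp(cand[0]), math.exp(cand[1]))
                if val > best:
                    theta, best, improved = cand, val, True
        if not improved:
            step /= 2.0
            if step < tol:
                break
    return math.exp(theta[0]), math.exp(theta[1])


def ks_stat(z):
    n = len(z)
    return max(max((i + 1) / n - zi, zi - i / n) for i, zi in enumerate(z))


def critical_values(n, gammas, reps=10000, alpha=0.5, beta=1.0, seed=1):
    """Upper-tail Monte Carlo critical values of the KS statistic."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        xs = mor_sample(n, alpha, beta, rng)
        a_hat, b_hat = mor_fit(xs)
        stats.append(ks_stat([mor_cdf(x, a_hat, b_hat) for x in xs]))
    stats.sort()
    # Level-g critical value: empirical (1 - g) quantile of the ranked values.
    return {g: stats[min(reps - 1, math.ceil((1.0 - g) * reps) - 1)] for g in gammas}


if __name__ == "__main__":
    # Small run for illustration; the procedure above uses 10,000 replicates.
    print(critical_values(n=20, gammas=(0.01, 0.05, 0.25), reps=200))
```

The compass search only accepts moves that increase the log-likelihood, which makes it slower than Newton-Raphson but less sensitive to starting values when the fit is repeated over thousands of small samples.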
The following table presents the critical points of the five goodness of fit test statistics for different significance levels and sample sizes, obtained by the Monte Carlo method. […] statistic also decreases as the sample size increases for all significance levels.

Power study for Marshall-Olkin extended Rayleigh distribution
In hypothesis testing, power is a useful tool for evaluating a particular test or for comparing two competing tests. The power of a test is the probability that the statistic leads to rejection of the null hypothesis $H_0$ when it is in fact false, that is, one minus the probability of making a type-II error. We calculated the power of the tests for the Marshall-Olkin Extended Rayleigh distribution by simulating data from the alternative distribution ($H_1$) and fitting the MOR distribution ($H_0$) to these data. We repeated this process 10,000 times for sample sizes n = 5(5)30 and significance levels γ = 0.01, 0.025, 0.05, 0.10, 0.15, 0.20, 0.25. We consider n = 5 as small, n = 15 as moderate and n = 30 as large sample sizes. The power of each test was obtained from the number of times its statistic exceeded the respective critical value at each level of significance. The power results are presented in Tables 2 to 7.
Table 2 shows that the power of all goodness of fit tests for the MOR distribution increases monotonically as the level of significance increases. For small to moderate sample sizes, […]$^2$ is more powerful than the other four statistics. For large sample sizes, all test statistics show good power.
We computed the power of the tests for the Cauchy, Gamma and Logistic distributions using all five test statistics. The power of the tests is very close to one even for small sample sizes, at every probability of type-I error considered. Thus, these alternative distributions are not very informative.
Table 4 shows that, for the MOR distribution, the power of the tests improves as the significance level increases. For small sample sizes, […]$^2$ shows higher power than the other tests, while the Anderson-Darling statistic $A^2$ appears to be more powerful than the other four statistics for moderate and large sample sizes.
Table 5 shows that for sample size n = 5 and significance levels γ = 0.01 and γ = 0.025, the Kolmogorov-Smirnov statistic appears to be more powerful than the other tests, but for the same sample size and significance levels 0.05 to 0.25, the Anderson-Darling statistic $A^2$ has higher power than the other four statistics. For all significance levels and sample sizes from 10 to 30, […]$^2$ is the most powerful among all test statistics, except for n = 30 and γ = 0.01.
In Table 6 we can see that for n = 5 and γ = 0.01, 0.025, the Kolmogorov-Smirnov statistic is more powerful than the other tests, but for the same sample size with γ = 0.05, 0.10, 0.15, 0.20, 0.25, the Anderson-Darling statistic $A^2$ shows higher power than the other four statistics, while for all significance levels and sample size n = 10, […]$^2$ is the most powerful among all test statistics. We also observe that for sample size 15, […]$^2$ has the highest power for γ = 0.01, 0.025, while for γ = 0.05 to γ = 0.25, […] is more powerful. For sample sizes 20 and 25, […] appears to be most powerful except at γ = 0.01, and for n = 30 and γ = 0.01, […] is most powerful among all statistics.
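The power calculation in this section reduces to a rejection rate: the fraction of statistics simulated under the alternative that exceed the critical value obtained under the MOR null. A minimal sketch follows; the function name is hypothetical and the numbers in the usage example are placeholders for illustration only, not results from this study.

```python
def empirical_power(alt_stats, critical_values):
    """Rejection rate at each significance level.

    alt_stats: test statistic values computed on samples drawn from the
        alternative distribution, with the null model fitted to each sample.
    critical_values: mapping from significance level to upper-tail critical value.
    """
    m = len(alt_stats)
    return {
        level: sum(stat > crit for stat in alt_stats) / m
        for level, crit in critical_values.items()
    }


if __name__ == "__main__":
    # Placeholder inputs, not values from the tables in this article.
    alt_stats = [0.21, 0.35, 0.18, 0.40, 0.27, 0.31, 0.15, 0.45]
    crit = {0.01: 0.38, 0.05: 0.30, 0.10: 0.25}
    print(empirical_power(alt_stats, crit))
```

Because the critical value shrinks as the significance level grows, the rejection rate computed this way can never decrease with the significance level, which is the monotone pattern reported for the tables.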

Real life application
In this section, we present a real life application using a well-known data set to show the wider applicability of our proposed model over other competing models in survival and reliability. We considered the Exponential, Rayleigh, Generalized Exponential (GE) (