A New Unit Distribution Based On The Unbounded Johnson Distribution Rule: The Unit Johnson SU Distribution

This paper proposes a new probability distribution, a member of the exponential family, defined on the unit interval (0,1). The new unit model is obtained by relating a random variable defined on an unbounded interval to the unit interval through the standard logistic function. Some basic statistical properties of the newly defined distribution are derived and studied. Different estimation methods and some inferences for the model parameters are presented. We assess the performance of the estimators of these estimation methods under three different simulation scenarios. The analysis of three real data examples, one of which concerns coronavirus data, shows that the proposed distribution fits better than many known distributions on the unit interval under several comparison criteria.

The signs of this equation cannot be determined analytically. Therefore, the pdf and its shape regions are plotted in Figure 1 to display the possible density shapes. From Figure 1, we see various pdf shapes such as w-shaped, U-shaped, uni-modal, N-shaped, inverse N-shaped, decreasing and increasing. The same shapes can also be read off the plot of the pdf regions. As a result, we can say that both the μ and σ parameters are shape parameters of the model. Because these two parameters give the model different shapes, we obtain a very flexible family of distributions on the unit interval. On the other hand, it is interesting that when μ = 0 the distribution is symmetric about 1/2 and its expected value equals 1/2. To see this, we can write the pdf and cdf as

f(x, μ, σ) = σ φ(μ + σ sinh⁻¹[log(x/(1 − x))]) / ( x(1 − x) √(1 + log²(x/(1 − x))) ), 0 < x < 1,
F(x, μ, σ) = Φ(μ + σ sinh⁻¹[log(x/(1 − x))]),

where φ and Φ denote the standard normal pdf and cdf. It can easily be seen that f(1/2 − x, μ, σ) = f(1/2 + x, −μ, σ) from the above equations, using sinh⁻¹(t) = −sinh⁻¹(−t) for t ∈ ℜ. From the above equations, it is also noticed that f(x, μ, σ) = f(1 − x, −μ, σ). Moreover, if the UJSU(μ, σ) is right skewed, then the UJSU(−μ, σ) is left skewed (see the moments subsection). Consequently, we can say that the shapes of this distribution, especially the w-shaped, N-shaped and inverse N-shaped pdfs, can be a distinguishing feature in data modeling. Some of its distributional properties are given in the following subsections.
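The symmetry relations above can be checked numerically. The sketch below evaluates the density under the form f(x; μ, σ) = σ φ(μ + σ sinh⁻¹[log(x/(1 − x))]) / (x(1 − x)√(1 + log²(x/(1 − x)))) assumed throughout this section; the function names are ours, for illustration only.

```python
import math

def logit(x):
    return math.log(x / (1.0 - x))

def ujsu_pdf(x, mu, sigma):
    """Density of UJSU(mu, sigma) on (0,1), assuming the cdf Phi(mu + sigma*asinh(logit x))."""
    t = math.asinh(logit(x))
    phi = math.exp(-0.5 * (mu + sigma * t) ** 2) / math.sqrt(2.0 * math.pi)
    return sigma * phi / (x * (1.0 - x) * math.sqrt(1.0 + logit(x) ** 2))

# Symmetry relation: f(1/2 - d; mu, sigma) = f(1/2 + d; -mu, sigma)
for d in (0.1, 0.25, 0.4):
    assert abs(ujsu_pdf(0.5 - d, 1.3, 0.8) - ujsu_pdf(0.5 + d, -1.3, 0.8)) < 1e-12

# With mu = 0 the density is symmetric about 1/2
assert abs(ujsu_pdf(0.3, 0.0, 1.5) - ujsu_pdf(0.7, 0.0, 1.5)) < 1e-12
```

The check uses only the hyperbolic identity sinh⁻¹(t) = −sinh⁻¹(−t) and the fact that logit(1 − x) = −logit(x).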

Hazard rate function
The hazard rate function (hrf) of the distribution is given by

h(x, μ, σ) = f(x, μ, σ) / (1 − F(x, μ, σ)), 0 < x < 1.
To characterize the hrf shapes, we obtain the first derivative of log h(x, μ, σ). The signs of this equation cannot be determined analytically. So, the hrf and its shape regions are plotted in Figure 2. From this figure, the hrf shapes can be w-shaped, bathtub shaped and increasing. The same shapes can also be read off the plot of the hrf regions. It is a distinctive feature that the distribution has a w-shaped hrf on the unit interval. So, the w-shaped hrf, like the corresponding density shape, can be a striking property in data modeling.
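As a quick numerical illustration of h = f/(1 − F) (again assuming the cdf form Φ(μ + σ sinh⁻¹[log(x/(1 − x))]); helper names are ours):

```python
import math

def logit(x):
    return math.log(x / (1.0 - x))

def Phi(z):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ujsu_cdf(x, mu, sigma):
    return Phi(mu + sigma * math.asinh(logit(x)))

def ujsu_pdf(x, mu, sigma):
    t = math.asinh(logit(x))
    phi = math.exp(-0.5 * (mu + sigma * t) ** 2) / math.sqrt(2.0 * math.pi)
    return sigma * phi / (x * (1.0 - x) * math.sqrt(1.0 + logit(x) ** 2))

def ujsu_hrf(x, mu, sigma):
    return ujsu_pdf(x, mu, sigma) / (1.0 - ujsu_cdf(x, mu, sigma))

# The hrf is positive and grows without bound toward the right end of the support
assert ujsu_hrf(0.5, 0.0, 1.0) > 0
assert ujsu_hrf(0.99, 0.0, 1.0) > ujsu_hrf(0.5, 0.0, 1.0)
```

Because the support is bounded, the survival function vanishes at x = 1 faster than the density, which is why the hrf eventually increases for every parameter combination.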

Moments
The rth raw moment of the UJSU(μ, σ) distribution is given by

μ′_r = E(Xʳ) = ∫₀¹ xʳ f(x, μ, σ) dx = ∫_{−∞}^{∞} φ(z) [1 + exp(−sinh((z − μ)/σ))]^{−r} dz.

From the above equation, we see that the rth raw moment cannot be obtained analytically; however, numerical integration can be applied to calculate the mean and other important related measures. The rth central moment can be obtained from the raw moments in the usual way, and the skewness and kurtosis coefficients are respectively given by μ₃ / μ₂^{3/2} and μ₄ / μ₂². These calculations can easily be carried out in many software packages such as R, S-Plus, SAS and Wolfram Mathematica. The plots of the skewness and kurtosis coefficients for selected values of the μ and σ parameters are shown in Figure 3. From this figure, we see that both parameters affect the skewness and kurtosis of the distribution, which can be left skewed, right skewed or symmetrical. For μ = 0, the skewness of the distribution is equal to zero, as we expect. For fixed μ, when σ increases, the skewness goes to zero and the kurtosis goes to 3. These plots indicate that this distribution can model various data types on the unit interval in terms of skewness and kurtosis. Furthermore, the distribution belongs to the exponential family and, according to the factorization theorem,

T(x) = ( Σᵢ₌₁ⁿ sinh⁻¹[log(xᵢ/(1 − xᵢ))], Σᵢ₌₁ⁿ (sinh⁻¹[log(xᵢ/(1 − xᵢ))])² )

is a sufficient statistic for (μ, σ), where n is the sample size.
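The numerical integration mentioned above is straightforward: after the substitution z = μ + σ sinh⁻¹[log(x/(1 − x))], the rth raw moment becomes a one-dimensional Gaussian-weighted integral. The sketch below (illustrative names; Simpson quadrature is our choice, not the paper's) computes it and checks the symmetry claims for μ = 0.

```python
import math

def _logistic(s):
    # numerically stable logistic to avoid overflow for large |s|
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    es = math.exp(s)
    return es / (1.0 + es)

def ujsu_raw_moment(r, mu, sigma, n=4001):
    """E[X^r] by composite Simpson quadrature on z in [-8, 8],
    using X = logistic(sinh((Z - mu)/sigma)) with Z ~ N(0, 1)."""
    a, b = -8.0, 8.0
    h = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        z = a + i * h
        w = 1 if i in (0, n - 1) else (4 if i % 2 else 2)
        x = _logistic(math.sinh((z - mu) / sigma))
        total += w * (x ** r) * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return total * h / 3.0

mu, sigma = 0.0, 1.2
m1, m2, m3 = (ujsu_raw_moment(r, mu, sigma) for r in (1, 2, 3))
var = m2 - m1 ** 2
skew = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / var ** 1.5

assert abs(m1 - 0.5) < 1e-6   # mean is 1/2 when mu = 0
assert abs(skew) < 1e-6       # and the skewness vanishes, as claimed
```

Truncating the z-range at ±8 discards a Gaussian tail mass of about 1e-15, which is negligible here.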

Stochastic ordering
Stochastic ordering of positive continuous random variables is an important tool for judging the comparative behavior of such variables. For this purpose, we shall recall some basic definitions. Let us denote the pdf, cdf and hrf of a positive continuous rv X by f_X(⋅), F_X(⋅) and h_X(⋅), respectively, and those of another positive continuous rv Y by f_Y(⋅), F_Y(⋅) and h_Y(⋅), respectively. A rv X is said to be smaller than a rv Y in the
(i) stochastic order (X ≤_st Y) if F_X(x) ≥ F_Y(x) for all x;
(ii) hazard rate order (X ≤_hr Y) if h_X(x) ≥ h_Y(x) for all x;
(iii) likelihood ratio order (X ≤_lr Y) if f_X(x)/f_Y(x) decreases in x.
The following implications (see Shaked and Shanthikumar, 2007) are well justified:

X ≤_lr Y ⇒ X ≤_hr Y ⇒ X ≤_st Y.

The following proposition shows that the UJSU distributions are ordered with respect to these stochastic orderings.
Proposition. Let X ~ UJSU(μ₁, σ) and Y ~ UJSU(μ₂, σ). If μ₁ ≥ μ₂, then X ≤_lr Y, and hence X ≤_hr Y and X ≤_st Y.
Proof. For any 0 < x < 1, the likelihood ratio is given by

f_X(x)/f_Y(x) = exp( (μ₂² − μ₁²)/2 + (μ₂ − μ₁) σ sinh⁻¹[log(x/(1 − x))] ),

which is decreasing in x since μ₂ − μ₁ ≤ 0 and sinh⁻¹[log(x/(1 − x))] is increasing in x. Hence, the proof is completed.

Relative Entropy
The relative entropy or Kullback-Leibler distance (also known as cross entropy or discrimination function) of a random variable with density function f with respect to another random variable with density g is defined as D(f : g) = ∫ f log(f/g) dx (Kullback, 1997). This is a discrimination function since D(f : g) ≥ 0, with equality if and only if f = g almost everywhere. The relative entropy is used in information theory as a measure for comparing the information content of distributions. Since the distribution that is symmetric about 0.5 (the case μ = 0) is a sub-model of the UJSU distribution, one may want to compare the information content of the UJSU(μ, σ) family to that of the symmetric UJSU(0, σ) distribution. Under this definition, the relative entropy is obtained as

D(f : g) = ∫₀¹ f(x, μ, σ) log( f(x, μ, σ) / f(x, 0, σ) ) dx.
By using the transformation t = sinh⁻¹[log(x/(1 − x))] and after some calculations, we derive D(f : g) = μ²/2. As can be seen, the relative entropy does not depend on σ; it is a function of μ only. Therefore, no information is added to or subtracted from the information content of the distribution by varying the parameter σ. Furthermore, this quantity attains its minimum value 0 at μ = 0, as expected, and as μ approaches either ∞ or −∞, the relative entropy approaches ∞.
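The closed form D(f : g) = μ²/2 can be verified by quadrature. Under the transformation t = sinh⁻¹[log(x/(1 − x))], the density of t under UJSU(μ, σ) is σφ(μ + σt), so the divergence reduces to a one-dimensional integral; the sketch below (our code, assuming that cdf form) evaluates it without using the closed form.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def kl_to_symmetric(mu, sigma, n=4001):
    """D(UJSU(mu, sigma) : UJSU(0, sigma)) by Simpson quadrature in
    t = asinh(logit x); t has density sigma*phi(mu + sigma*t)."""
    # cover ~10 standard deviations of t ~ N(-mu/sigma, 1/sigma^2)
    a = -mu / sigma - 10.0 / sigma
    b = -mu / sigma + 10.0 / sigma
    h = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        t = a + i * h
        w = 1 if i in (0, n - 1) else (4 if i % 2 else 2)
        dens = sigma * phi(mu + sigma * t)
        log_ratio = math.log(phi(mu + sigma * t)) - math.log(phi(sigma * t))
        total += w * dens * log_ratio
    return total * h / 3.0

# matches mu^2 / 2 for several parameter choices, independently of sigma
for mu, sigma in ((0.5, 1.0), (1.5, 0.7), (-2.0, 2.5)):
    assert abs(kl_to_symmetric(mu, sigma) - mu * mu / 2.0) < 1e-6
```

That the result is the same for every σ is exactly the "no information added or subtracted" statement above.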

Order statistics
Let X₁, X₂, …, Xₙ be a random sample of size n from the UJSU distribution, and let X_(1) ≤ X_(2) ≤ … ≤ X_(n) denote the corresponding order statistics. It is well known that the cdf and pdf of the rth order statistic, X_(r), are given by

F_(r)(x) = Σ_{k=r}^{n} (n choose k) F(x)ᵏ [1 − F(x)]^{n−k} and
f_(r)(x) = f(x) F(x)^{r−1} [1 − F(x)]^{n−r} / B(r, n − r + 1),

respectively, where r = 1, 2, …, n and B(⋅,⋅) is the beta function. The cdf and pdf of the rth order statistic of the UJSU distribution follow by substituting its cdf and pdf into these expressions. For r = 1 and r = n, we obtain the pdfs of X_(1) = min{X₁, X₂, …, Xₙ} and X_(n) = max{X₁, X₂, …, Xₙ}, respectively.

Quantile function and random number generation
Let X be a rv with cdf (3). Then the quantile function, Q(u, μ, σ) = F⁻¹(u, μ, σ), of the UJSU distribution can be given by

Q(u, μ, σ) = [1 + exp( −sinh((Φ⁻¹(u) − μ)/σ) )]⁻¹,

where 0 < u < 1 and Φ⁻¹(u) is the uth quantile of the standard normal distribution. Hence, if U is a uniform rv on (0,1), then Q(U, μ, σ) is a UJSU rv. To generate random variables from the distribution, we have the following algorithm.
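The inverse-transform algorithm can be sketched as follows (our code; the normal quantile is computed by bisection purely to keep the example self-contained):

```python
import math, random

def Phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Phi_inv(u):
    """Standard normal quantile by bisection (adequate for illustration)."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ujsu_quantile(u, mu, sigma):
    s = math.sinh((Phi_inv(u) - mu) / sigma)
    # numerically stable logistic
    return 1.0 / (1.0 + math.exp(-s)) if s >= 0 else math.exp(s) / (1.0 + math.exp(s))

def ujsu_cdf(x, mu, sigma):
    return Phi(mu + sigma * math.asinh(math.log(x / (1.0 - x))))

random.seed(1)
mu, sigma = 0.8, 1.5
sample = [ujsu_quantile(random.random(), mu, sigma) for _ in range(5)]
assert all(0.0 < x < 1.0 for x in sample)

# round trip: F(Q(u)) = u
for u in (0.05, 0.25, 0.5, 0.75, 0.95):
    assert abs(ujsu_cdf(ujsu_quantile(u, mu, sigma), mu, sigma) - u) < 1e-9
```

The round-trip check confirms that the displayed quantile formula inverts the cdf Φ(μ + σ sinh⁻¹[log(x/(1 − x))]).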

Different methods of the parameter estimation
In this section, we present six different estimators for the model parameters: the maximum likelihood estimator (with inferences based on it), and the maximum product spacings, least squares, weighted least squares, Anderson-Darling and Cramer-von Mises estimators. The details follow.

Maximum likelihood estimation
In this subsection, we estimate the parameters of the UJSU distribution via the method of maximum likelihood estimation (MLE). Let X₁, X₂, …, Xₙ be a random sample from the distribution with observed values x₁, x₂, …, xₙ, and let Ξ = (μ, σ) be the vector of the model parameters. Writing tᵢ = sinh⁻¹[log(xᵢ/(1 − xᵢ))], the log-likelihood function for Ξ may be expressed as

ℓ(Ξ) = n log σ − (n/2) log(2π) − (1/2) Σᵢ₌₁ⁿ (μ + σtᵢ)² − Σᵢ₌₁ⁿ log(xᵢ(1 − xᵢ)) − (1/2) Σᵢ₌₁ⁿ log(1 + log²(xᵢ/(1 − xᵢ))).

The MLEs, μ̂ and σ̂, of the μ and σ parameters can be obtained as the simultaneous solution of the following non-linear equations:

∂ℓ/∂μ = −Σᵢ₌₁ⁿ (μ + σtᵢ) = 0 and ∂ℓ/∂σ = n/σ − Σᵢ₌₁ⁿ tᵢ(μ + σtᵢ) = 0.

From the first equation, μ̂ is obtained as a function of σ and is given by μ̂(σ) = −σ t̄, where t̄ = (1/n) Σᵢ₌₁ⁿ tᵢ. Substituting this into the log-likelihood, we obtain the profile log-likelihood of σ. Therefore, σ̂ is obtained by maximizing the profile log-likelihood with respect to σ. Following the normal routine of parameter estimation for σ̂², we have σ̂² = n / Σᵢ₌₁ⁿ (tᵢ − t̄)².
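Because the transformed observations tᵢ = sinh⁻¹[log(xᵢ/(1 − xᵢ))] are normal with mean −μ/σ and variance 1/σ² under the cdf form assumed in this section, the profile-likelihood solution reduces to sample moments of the tᵢ. A sketch (our function names; the deterministic quantile-grid "sample" is only for testing):

```python
import math

def Phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Phi_inv(u):
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ujsu_sample(n, mu, sigma):
    """Deterministic pseudo-sample via the quantile function at u_i = (i - 0.5)/n."""
    out = []
    for i in range(1, n + 1):
        s = math.sinh((Phi_inv((i - 0.5) / n) - mu) / sigma)
        out.append(1.0 / (1.0 + math.exp(-s)) if s >= 0 else math.exp(s) / (1.0 + math.exp(s)))
    return out

def ujsu_mle(data):
    """Closed-form MLE: sigma_hat = (n / sum (t_i - tbar)^2)^(1/2), mu_hat = -sigma_hat * tbar."""
    t = [math.asinh(math.log(x / (1.0 - x))) for x in data]
    n = len(t)
    tbar = sum(t) / n
    s2 = sum((ti - tbar) ** 2 for ti in t) / n  # MLE (biased) variance of t
    sigma_hat = 1.0 / math.sqrt(s2)
    mu_hat = -sigma_hat * tbar
    return mu_hat, sigma_hat

mu, sigma = 1.0, 2.0
mu_hat, sigma_hat = ujsu_mle(ujsu_sample(2000, mu, sigma))
assert abs(mu_hat - mu) < 0.05 and abs(sigma_hat - sigma) < 0.05
```

This is the same estimator the profile log-likelihood yields; the grid-based sample just makes the recovery of (μ, σ) reproducible without randomness.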
The elements of this information matrix are the expectations I_{μμ} = E[−∂²ℓ/∂μ²], I_{μσ} = E[−∂²ℓ/∂μ∂σ] and I_{σσ} = E[−∂²ℓ/∂σ²]. With tᵢ = sinh⁻¹[log(xᵢ/(1 − xᵢ))], these expectations are calculated as E[−∂²ℓ/∂μ²] = n, E[−∂²ℓ/∂μ∂σ] = E[Σᵢ₌₁ⁿ tᵢ] = −nμ/σ and E[−∂²ℓ/∂σ²] = n/σ² + E[Σᵢ₌₁ⁿ tᵢ²] = n(2 + μ²)/σ². Hence, the Fisher information matrix is obtained as

I(Ξ) = n ( 1        −μ/σ
           −μ/σ     (2 + μ²)/σ² ).

Maximum product spacing estimation
The maximum product spacing (MPS) method is an alternative to MLE for parameter estimation. It was proposed by Cheng and Amin (1979) and independently developed by Ranneby (1984) as an approximation to the Kullback-Leibler measure of information. The method is based on the idea that the differences (spacings) between the values of the cdf at consecutive ordered data points should be identically distributed. The geometric mean of the spacings is given by

GM = [ Πᵢ₌₁ⁿ⁺¹ Dᵢ ]^{1/(n+1)}, where Dᵢ = F(x_(i), μ, σ) − F(x_(i−1), μ, σ), i = 1, 2, …, n + 1,

with F(x_(0), μ, σ) = 0 and F(x_(n+1), μ, σ) = 1. The maximum product spacing estimates (MPSEs), μ̂ and σ̂, of the μ and σ parameters are obtained by maximizing this geometric mean (GM) of the spacings. Substituting the cdf of the UJSU distribution and taking the logarithm of the above expression, we obtain the log-spacing function, and the MPSEs can then be obtained as the simultaneous solution of the corresponding non-linear equations.
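The MPS criterion is easy to evaluate directly. The sketch below (our code, assuming the cdf form used throughout; the coarse grid search stands in for a proper optimizer) maximizes the mean log-spacing and recovers the generating parameters from a quantile-grid pseudo-sample.

```python
import math

def Phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Phi_inv(u):
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ujsu_cdf(x, mu, sigma):
    return Phi(mu + sigma * math.asinh(math.log(x / (1.0 - x))))

def ujsu_quantile(u, mu, sigma):
    s = math.sinh((Phi_inv(u) - mu) / sigma)
    return 1.0 / (1.0 + math.exp(-s)) if s >= 0 else math.exp(s) / (1.0 + math.exp(s))

def mean_log_spacing(sorted_x, mu, sigma):
    # log of the geometric mean of the (n+1) cdf spacings D_i
    F = [0.0] + [ujsu_cdf(x, mu, sigma) for x in sorted_x] + [1.0]
    return sum(math.log(max(F[i + 1] - F[i], 1e-300)) for i in range(len(F) - 1)) / (len(F) - 1)

def mps_fit_grid(sorted_x):
    # crude two-parameter grid search; step 0.1 is purely illustrative
    best, arg = -float("inf"), (0.0, 1.0)
    for mi in range(-20, 21):
        for si in range(2, 31):
            mu, sigma = mi / 10.0, si / 10.0
            val = mean_log_spacing(sorted_x, mu, sigma)
            if val > best:
                best, arg = val, (mu, sigma)
    return arg

n, mu, sigma = 150, 0.5, 1.0
data = sorted(ujsu_quantile((i - 0.5) / n, mu, sigma) for i in range(1, n + 1))
mu_hat, sigma_hat = mps_fit_grid(data)
assert abs(mu_hat - mu) <= 0.1 and abs(sigma_hat - sigma) <= 0.1
```

In practice one would replace the grid search with Newton-Raphson or a quasi-Newton routine, as noted at the end of this section.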

Least squares estimation
Let x_(1), x_(2), …, x_(n) be the ordered statistics of a sample of size n from the UJSU distribution. The least squares estimates (LSEs), μ̂ and σ̂, of the μ and σ parameters are obtained by minimizing

S(μ, σ) = Σᵢ₌₁ⁿ [ F(x_(i), μ, σ) − i/(n + 1) ]²,

where E[F(X_(i))] = i/(n + 1) is the expected value of the empirical cdf for i = 1, 2, …, n. Then μ̂ and σ̂ are the solutions of the equations ∂S/∂μ = 0 and ∂S/∂σ = 0.
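The least-squares criterion above can be sketched directly (our function names; the deterministic quantile-grid sample is only for testing that the criterion is small at the generating parameters):

```python
import math

def Phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Phi_inv(u):
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ujsu_cdf(x, mu, sigma):
    return Phi(mu + sigma * math.asinh(math.log(x / (1.0 - x))))

def ujsu_quantile(u, mu, sigma):
    s = math.sinh((Phi_inv(u) - mu) / sigma)
    return 1.0 / (1.0 + math.exp(-s)) if s >= 0 else math.exp(s) / (1.0 + math.exp(s))

def ls_objective(sorted_x, mu, sigma):
    """Sum of squared gaps between the fitted cdf at the order statistics
    and the expected empirical cdf values i/(n+1)."""
    n = len(sorted_x)
    return sum((ujsu_cdf(x, mu, sigma) - i / (n + 1.0)) ** 2
               for i, x in enumerate(sorted_x, start=1))

n, mu, sigma = 200, 0.5, 1.0
data = sorted(ujsu_quantile((i - 0.5) / n, mu, sigma) for i in range(1, n + 1))

# the criterion is (near-)minimal at the generating parameters
at_truth = ls_objective(data, mu, sigma)
assert at_truth < ls_objective(data, 1.5, 1.0)
assert at_truth < ls_objective(data, 0.5, 2.0)
```

The weighted least-squares criterion of the next subsection differs only in multiplying each squared gap by a weight wᵢ.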

Weighted least squares estimation
Let x_(1), x_(2), …, x_(n) be the ordered statistics of a sample of size n from the UJSU distribution. The weighted least squares estimates (WLSEs), μ̂ and σ̂, of the μ and σ parameters are obtained by minimizing

W(μ, σ) = Σᵢ₌₁ⁿ wᵢ [ F(x_(i), μ, σ) − i/(n + 1) ]², with wᵢ = (n + 1)²(n + 2) / ( i(n − i + 1) ).

Anderson-Darling estimation
This estimator is based on the Anderson-Darling goodness-of-fit statistic introduced by Anderson and Darling (1952). The Anderson-Darling estimates (ADEs), μ̂ and σ̂, of the μ and σ parameters are obtained by minimizing

A(μ, σ) = −n − (1/n) Σᵢ₌₁ⁿ (2i − 1) [ log F(x_(i), μ, σ) + log(1 − F(x_(n+1−i), μ, σ)) ].
Therefore, μ̂ and σ̂ can be obtained as the solution of the corresponding system of non-linear equations.

The Cramer-von Mises estimation
The Cramer-von Mises (CVM) minimum distance estimates, μ̂ and σ̂, of the μ and σ parameters are obtained by minimizing

C(μ, σ) = 1/(12n) + Σᵢ₌₁ⁿ [ F(x_(i), μ, σ) − (2i − 1)/(2n) ]².

Therefore, μ̂ and σ̂ can be obtained as the solution of the corresponding system of equations. We refer the reader to Chen and Balakrishnan (1995) for details of the AD and CVM goodness-of-fit statistics.
As can be seen, all of the estimating equations except those of the MLE method involve non-linear functions, so it is not possible to obtain explicit forms of these estimators directly. They therefore have to be solved by numerical methods such as the Newton-Raphson and quasi-Newton algorithms.

Simulation experiments
In this section, we perform graphical simulation studies to examine the performance of the above estimators of the UJSU distribution with respect to varying sample size n. We generate N = 1000 samples of size n = 20, 25, …, 1000 from the UJSU distribution.

Data Analysis
In this section we analyze three data sets, one of which is related to coronavirus data while the others concern burr measurements on iron sheets, to compare the UJSU distribution with well-known unit distributions in the literature. These competing models are (with their pdfs for 0 < x < 1): The first two data sets were introduced and studied by Dasgupta (2011) for burr measurements on iron sheets. For the first data set of 50 observations on burr (in millimeters), the hole diameter is 12 mm and the sheet thickness is 3.15 mm. For the second data set of 50 observations, the hole diameter and sheet thickness are 9 mm and 2 mm, respectively. Hole diameter readings are taken on jobs with respect to one hole, selected and fixed as per a predetermined orientation. The two data sets relate to two different machines under comparison; see Dasgupta (2011) for the technical details of the measurements. These data sets have also been analyzed by Korkmaz. We give the summary statistics of the data sets in Table 1. All three data sets are right skewed, although the second one is close to symmetric. Tables 2-4 show the fitting results for the candidate models. From these tables, the UJSU model can be chosen as the best model since it has the smallest values of the AIC, BIC, A* (Anderson-Darling) and W* (Cramer-von Mises) statistics. Further, Figures 8, 9 and 10 show all fitted densities and their cdfs for data sets I, II and III, respectively. We observe that the UJSU fit has successfully captured the shape, skewness and kurtosis of the empirical data for all data sets. Furthermore, we give the parameter estimates and goodness-of-fit statistics of the UJSU distribution based on the other estimation methods in Table 5. These results are better than the best results of Tables 2-4 according to the goodness-of-fit statistics. Consequently, the fitting results of the UJSU distribution to the three data sets should not be ignored.

Concluding remarks
A new alternative unit distribution belonging to the exponential family has been introduced, and its statistical properties have been studied in detail. For the model parameters, six different estimators have been presented, based on the maximum likelihood, maximum product spacing, least squares, weighted least squares, Anderson-Darling and Cramer-von Mises estimation methods. Three simulation studies based on different model scenarios were performed to illustrate the performance of these estimation methods. Three applications to real data sets, one of which concerns coronavirus data, show that the fitting results of the UJSU distribution should not be overlooked. It is hoped that the UJSU distribution will be a remarkable model in different disciplines as well as in applied probability and statistics.