Estimation Based on Ranked Set Sampling for Farlie–Gumbel–Morgenstern Bivariate Weibull Distribution Parameters with an Application to Medical Data

In this article, we address the problem of estimating the parameters of the Farlie-Gumbel-Morgenstern bivariate Weibull (FGMBW) distribution using the ranked set sampling (RSS) design. The suggested estimators of the FGMBW distribution parameters are compared with their counterparts based on simple random sampling (SRS) via Monte Carlo simulation studies. A real data set consisting of times (in days) to the first and second recurrence of infection for 30 kidney patients is considered for illustration. It turns out that the RSS estimators result in an improvement in efficiency compared with the SRS estimators based on the same number of measured units for all cases considered in this study.


Introduction
Due to its flexibility, the Weibull distribution has received a great deal of attention in the recent literature. It has been widely studied and applied in several fields, including reliability, lifetime data analysis, climatology, biology, and engineering (see, for instance, Helu et al. (2010) and the references therein). This paper deals with the problem of estimating the parameters of the Farlie-Gumbel-Morgenstern bivariate Weibull (FGMBW) distribution, which was introduced by Almetwally et al. (2020) as a modification of the base Weibull distribution, based on the ranked set sampling (RSS) design. The study has several objectives. Firstly, it addresses the problem of estimating the parameters of the FGMBW distribution under the RSS design. Secondly, it evaluates the performance of the proposed RSS estimators and their counterparts based on simple random sampling (SRS) by means of Monte Carlo simulations, which provide a quantitative framework for assessing the efficiency and resilience of the estimators under various scenarios. Furthermore, to demonstrate the practical applicability of the proposed estimators, it presents an illustrative example using a real dataset comprising times (in days) to the first and second recurrence of infection for 30 kidney patients.
Lastly, the study conducts an efficiency analysis that quantifies the gain achieved by the RSS estimators over the SRS estimators, highlighting the benefits of adopting this sampling design for parameter estimation.
A convenient method for describing a multivariate distribution is a copula. One of the most well-known parametric families of copulas is the Farlie-Gumbel-Morgenstern (FGM) family, which was explored by Gumbel (1960). A random variable X has a univariate Weibull distribution with scale parameter α and shape parameter β, characterized by its cumulative distribution function (cdf) and probability density function (pdf). The cdf and pdf of the FGMBW distribution follow by coupling two Weibull marginals, with parameters (α1, β1) and (α2, β2), through the FGM copula with dependence parameter θ. The FGMBW distribution is thus a statistical model for the joint distribution of two random variables with Weibull marginal distributions. It is often employed in modeling the reliability of systems and lifetime data. The copula models the dependence structure between the two variables and accommodates both positive and negative association.
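For reference, the construction described above can be written out as follows. This is a sketch under a stated assumption: the Weibull marginals are taken in the standard scale-shape parameterization, which matches the description of α as scale and β as shape but is not confirmed by the text; the subscripts 1 and 2 index the two marginals. The FGM copula and the resulting joint cdf and pdf in terms of the marginals are standard.

```latex
% Assumed scale-shape Weibull marginals (parameterization not confirmed by the source)
\begin{align*}
F(x;\alpha,\beta) &= 1 - \exp\{-(x/\alpha)^{\beta}\}, \qquad x > 0,\ \alpha, \beta > 0,\\
f(x;\alpha,\beta) &= \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1}
                     \exp\{-(x/\alpha)^{\beta}\}.
\end{align*}
% FGM copula with dependence parameter theta
\begin{align*}
C(u,v) &= uv\,[1 + \theta(1-u)(1-v)], \qquad -1 \le \theta \le 1.
\end{align*}
% Joint cdf and pdf obtained with F_i(\cdot) = F(\cdot;\alpha_i,\beta_i), i = 1, 2
\begin{align*}
F(x,y) &= F_1(x)\,F_2(y)\,[1 + \theta\,(1-F_1(x))\,(1-F_2(y))],\\
f(x,y) &= f_1(x)\,f_2(y)\,[1 + \theta\,(1-2F_1(x))\,(1-2F_2(y))].
\end{align*}
```

Setting θ = 0 recovers independent Weibull marginals, which gives a quick sanity check on any implementation of these formulas.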
Researchers and analysts can use this distribution to gain insight into the joint behavior of variables in various fields. In reliability analysis, it can be used to model the lifetime of systems with two failure modes. In actuarial science, it can model the joint distribution of two insurance-related variables, such as the claim amounts for two different types of insurance policies. It can also be applied to environmental data in which two variables are related, such as the joint distribution of temperature and wind speed for wind energy assessment. In finance, it can model the joint distribution of losses or returns for two different assets or portfolios, allowing for their dependence. In medical research, it can model the joint distribution of two health-related variables or the occurrence of two medical events. In hydrological studies, it can describe the joint distribution of two hydrological variables, such as rainfall and river discharge (Conway (2014)).
The moment generating function of (X, Y) is given by [...]. Figure 1 presents pdf plots of the FGMBW distribution for selected parameter values.
For more details about the FGMBW distribution, see Almetwally et al. (2020).
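To make the copula construction concrete, the following is a minimal Python sketch of how pairs (X, Y) could be generated from an FGM-coupled bivariate Weibull model by conditional inversion. It assumes the standard scale-shape Weibull quantile function; the function names (`fgm_conditional_inverse`, `weibull_quantile`, `sample_fgmbw`) are hypothetical and are not taken from the paper or from any package.

```python
import math
import random


def fgm_conditional_inverse(u, w, theta):
    """Solve C(v | u) = w for v under the FGM copula.

    The conditional cdf is v * (1 + a * (1 - v)) with a = theta * (1 - 2u),
    a quadratic in v; the root lying in [0, 1] is written in a stable form.
    """
    a = theta * (1.0 - 2.0 * u)
    b = 1.0 + a
    disc = max(b * b - 4.0 * a * w, 0.0)
    denom = b + math.sqrt(disc)
    return 2.0 * w / denom if denom > 0.0 else 0.0


def weibull_quantile(p, alpha, beta):
    """Inverse cdf of the Weibull law (assumed scale alpha, shape beta)."""
    return alpha * (-math.log1p(-p)) ** (1.0 / beta)


def sample_fgmbw(n, a1, b1, a2, b2, theta, seed=None):
    """Draw n pairs (x, y) with Weibull marginals and FGM dependence."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()
        v = fgm_conditional_inverse(u, rng.random(), theta)
        pairs.append((weibull_quantile(u, a1, b1), weibull_quantile(v, a2, b2)))
    return pairs


if __name__ == "__main__":
    # Parameter values borrowed from the first simulation scenario.
    for x, y in sample_fgmbw(5, 3.3, 2.4, 1.2, 0.7, 0.5, seed=1):
        print(round(x, 3), round(y, 3))
```

With θ = 0 the conditional inverse returns its uniform input unchanged, so the two coordinates are independent, as expected for the FGM family.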
To achieve accurate statistical inference without excessive cost or time, it is common to select samples using an efficient methodology such as RSS. RSS is a sampling design that uses ranking of the observations to improve parametric estimation. It was first proposed by McIntyre (1952) as a cost-efficient alternative to SRS in situations where quantifying the sampling units is difficult, time-consuming, or expensive, but ranking them according to the variable under investigation is relatively easy and cheap. As an illustration of the RSS design, assume that we would like to estimate the average height of trees in a field. Actual measurement of the height of the sampled trees may be difficult; however, ranking the heights of a small group of trees can be done easily by eye. In some cases visual inspection may not be clear, and the ranking can then be based on a concomitant variable that is correlated with the variable of interest. For instance, the diameters of the trees can be used as a concomitant variable, since diameter is highly correlated with height.
The RSS design has been an active research area in the statistical community and continues to attract widespread attention in many ecological and agricultural studies. The fundamental theoretical properties of RSS were established by Takahasi and Wakimoto (1968), who proved that the mean estimator obtained under RSS is more efficient than the corresponding SRS estimator of the population mean. Dell and Clutter (1972) showed that the RSS mean remains unbiased whether or not the ranking is perfect, i.e., even when there are errors in ranking. For recent research and a detailed discussion of the theory and applications of RSS, see, for instance, Al-Omari and Bouza (2014) and the references therein.
The RSS procedure can be described as follows:
1. Randomly draw m² units from the target population, where m is the set size.
2. Distribute the m² units randomly into m sets, each of size m.
3. Rank the m units within each set visually, or by any other inexpensive method, with respect to the variable of interest.
4. Select the m RSS measurements by quantifying the i-th smallest ranked unit from the i-th set, i = 1, 2, ..., m. This completes one cycle of the RSS.
5. Repeat the cycle r times, if necessary, until the desired sample size n = rm is obtained.
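The five steps above can be sketched in a few lines of Python. This is an illustration only: `ranked_set_sample`, `draw_unit` and `rank_key` are hypothetical names, and in a simulation the ranking is done on the generated values themselves, whereas in the field it would be done by eye or through a concomitant variable.

```python
import random


def ranked_set_sample(draw_unit, m, r, rank_key=None, seed=None):
    """Collect an RSS of size n = r * m.

    draw_unit(rng) returns one population unit; rank_key maps a unit to the
    value used for ranking (identity by default, i.e. perfect ranking).
    """
    rng = random.Random(seed)
    key = rank_key if rank_key is not None else (lambda unit: unit)
    sample = []
    for _cycle in range(r):              # step 5: repeat the cycle r times
        for i in range(m):               # steps 1-2: m sets of m units each
            units = [draw_unit(rng) for _ in range(m)]
            units.sort(key=key)          # step 3: rank within the set
            sample.append(units[i])      # step 4: i-th smallest of the i-th set
    return sample


if __name__ == "__main__":
    # Univariate illustration: Weibull units, set size 3, 10 cycles.
    rss = ranked_set_sample(lambda g: g.weibullvariate(3.3, 2.4), m=3, r=10, seed=0)
    print(len(rss))
```

For bivariate units stored as (x, y) tuples, passing `rank_key=lambda xy: xy[0]` ranks on X and carries Y along as a concomitant, which is one way to mimic perfect ranking on one variable and imperfect ranking on the other.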
This process can be presented in the following table.

Table 1: Display of m² quantifications in the r-th cycle (columns: sets of size m, ranked sample units, RSS).

The resulting units form an RSS sample denoted by X(1:m), X(2:m), ..., X(m:m). These units are independent but not identically distributed. Note that, in practice, the set size m should be small to avoid ranking errors; a larger sample size can be obtained by repeating the procedure over several cycles. The resulting sample is denoted by X(i:m)j, the i-th smallest ranked unit in a set of size m in the j-th cycle, where i = 1, 2, ..., m and j = 1, 2, ..., r. Based on the above steps, the joint pdf of an RSS is as given in Arnold et al. (2008).

To the best of our knowledge, there are no published papers on the estimation of the FGMBW distribution parameters under the RSS design. The remainder of the paper is organized as follows. In Section 2, the maximum likelihood estimation (MLE) of the FGMBW parameters based on RSS is presented. Section 3 is devoted to a simulation study comparing the performance of the suggested RSS estimators with that of the SRS estimators. An application to real data is discussed in Section 4. Finally, the paper is concluded in Section 5.

Maximum Likelihood Estimation
In this section, we derive the maximum likelihood estimators (MLEs) of the FGMBW distribution parameters under the SRS and RSS methods.
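As an illustration of the SRS case, the log-likelihood of an FGM-coupled bivariate Weibull model can be coded directly from the product form of its density. The sketch below assumes the standard scale-shape Weibull parameterization and independent pairs; `fgmbw_loglik` and its helpers are hypothetical names, and the RSS likelihood, which involves order-statistic densities, is not covered here.

```python
import math


def weibull_cdf(x, alpha, beta):
    """Weibull cdf in the assumed scale-shape form."""
    return 1.0 - math.exp(-((x / alpha) ** beta))


def weibull_logpdf(x, alpha, beta):
    """Log of the Weibull pdf in the assumed scale-shape form."""
    z = x / alpha
    return math.log(beta / alpha) + (beta - 1.0) * math.log(z) - z ** beta


def fgmbw_loglik(params, data):
    """SRS log-likelihood for pairs (x, y) with x, y > 0.

    params = (alpha1, beta1, alpha2, beta2, theta); returns -inf outside
    the parameter space so the function can be passed to a maximizer.
    """
    a1, b1, a2, b2, theta = params
    if min(a1, b1, a2, b2) <= 0.0 or abs(theta) > 1.0:
        return -math.inf
    total = 0.0
    for x, y in data:
        f1 = weibull_cdf(x, a1, b1)
        f2 = weibull_cdf(y, a2, b2)
        dep = 1.0 + theta * (1.0 - 2.0 * f1) * (1.0 - 2.0 * f2)
        if dep <= 0.0:
            return -math.inf
        total += weibull_logpdf(x, a1, b1) + weibull_logpdf(y, a2, b2) + math.log(dep)
    return total


if __name__ == "__main__":
    toy = [(1.0, 2.0), (0.5, 0.7)]  # made-up pairs, not the kidney data
    print(fgmbw_loglik((1.0, 1.0, 1.0, 1.0, 0.0), toy))
```

Maximizing this function numerically over the five parameters, for example by minimizing its negative with a general-purpose optimizer, would give the SRS estimates; the choice of optimizer is left open here.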

Simulation Study
In this section, a Monte Carlo simulation with 10000 replications is conducted in R to compare the performance of the SRS and RSS designs, based on the MLE procedure, in estimating the FGMBW distribution parameters in terms of mean squared error (MSE) and bias. The efficiency of $\hat{\Phi}_{RSS}$ with respect to $\hat{\Phi}_{SRS}$ is defined, as usual, as the ratio of the mean squared errors:

$$\mathrm{eff}(\hat{\Phi}_{RSS}, \hat{\Phi}_{SRS}) = \frac{\mathrm{MSE}(\hat{\Phi}_{SRS})}{\mathrm{MSE}(\hat{\Phi}_{RSS})},$$

so that values greater than one indicate that the RSS estimator is more efficient.

Different scenarios of the parameter values are considered with sample sizes n = 30, 45, 90, 120. Two ranking situations are also considered. In the first, the ranking of X is perfect while the ranking of Y is subject to errors; the results are presented in Tables 2-4 for (α1 = 3.3, β1 = 2.4, α2 = 1.2, β2 = 0.7, θ = 0.5), (α1 = 0.75, β1 = 0.75, α2 = 0.75, β2 = 0.75, θ = 0.75), and (α1 = 2.7, β1 = 2.3, α2 = 1.3, β2 = 1.7, θ = 0.50), respectively. In the second, the ranking of Y is perfect while the ranking of X is subject to errors; the results are given in Tables 5-7 for the same parameter values.

Based on Tables 2-4, the suggested RSS estimators of the FGMBW distribution parameters are more efficient than their SRS competitors for all scenarios considered in this study when the ranking is performed on Y. The same conclusion can be drawn from Tables 5-7 when the ranking is performed on X, except for β1 in Tables 5 and 7. Moreover, the simulation results are in line with the asymptotic properties of MLEs: the estimates converge to the true parameter values as the sample size increases, which indicates consistency. In addition, the RSS estimators exhibit asymptotic efficiency relative to the SRS-based estimators in almost all cases, meaning that they have smaller asymptotic variances and are therefore statistically more precise.
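The logic of the efficiency comparison can be illustrated with a much simpler stand-in than the FGMBW MLEs. The Python sketch below (the study itself used R) compares the sample mean of a univariate Weibull variable under SRS and under RSS with perfect ranking, and reports the ratio MSE(SRS)/MSE(RSS). The estimator, the function names and the number of replications are all assumptions made for brevity; the sketch does not reproduce any table of the paper.

```python
import math
import random
import statistics


def srs_mean(rng, n, alpha, beta):
    """Sample mean of n independent Weibull(scale=alpha, shape=beta) draws."""
    return statistics.fmean([rng.weibullvariate(alpha, beta) for _ in range(n)])


def rss_mean(rng, m, r, alpha, beta):
    """Sample mean of an RSS with set size m and r cycles (perfect ranking)."""
    picked = []
    for _ in range(r):
        for i in range(m):
            ranked = sorted(rng.weibullvariate(alpha, beta) for _ in range(m))
            picked.append(ranked[i])
    return statistics.fmean(picked)


def mc_efficiency(reps=2000, m=3, r=10, alpha=3.3, beta=2.4, seed=1):
    """Monte Carlo estimate of MSE(SRS mean) / MSE(RSS mean), n = m * r."""
    rng = random.Random(seed)
    true_mean = alpha * math.gamma(1.0 + 1.0 / beta)
    se_srs = [(srs_mean(rng, m * r, alpha, beta) - true_mean) ** 2 for _ in range(reps)]
    se_rss = [(rss_mean(rng, m, r, alpha, beta) - true_mean) ** 2 for _ in range(reps)]
    return statistics.fmean(se_srs) / statistics.fmean(se_rss)


if __name__ == "__main__":
    print(mc_efficiency())
```

Both designs measure the same number of units, n = mr, so a ratio above one reflects the gain from ranking alone; replacing the sample mean with the MLEs of the five FGMBW parameters would follow the same pattern.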

Application
In this section, an analysis of real data is provided to investigate the efficiency of the suggested RSS estimators relative to their SRS competitors based on the same number of measured units. The dataset presented by [18] consists of times (in days) to the first and second recurrence of infection, at the point of insertion of the catheter, for 30 kidney patients using a portable dialysis machine. Recurrence time can be defined as the time from one infection until the next. The first recurrence time is measured from the first insertion of the catheter, while the second recurrence time is the time elapsed between the second insertion of the catheter and the second infection.
Let X denote the first recurrence time and Y the second recurrence time. [...] (2020) showed that the FGMBW distribution fits these data well compared with other lifetime models under the SRS method. The same data are considered by [18]. In this section, using these data, we consider a sample of size n = 9 under both the SRS and RSS methods, with set size equal to 3 for RSS. To evaluate how well the model fits the data, two criteria are considered, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), defined as AIC = 2K − 2log(L) and BIC = Klog(n) − 2log(L), where K is the number of estimated parameters and L is the maximized value of the likelihood function. The method with the smallest values of AIC and BIC is preferred. The results are presented in Table 8. To describe the data further, Figures ?? and ?? present the density, box, histogram and TTT plots of the X and Y variables, while Figure 4 shows the scatter plot of the data. It can be seen that the data are skewed to the right.
From Table 8, it turns out that the AIC and BIC values based on RSS are smaller than those based on SRS regardless of the error case. Also, the results are more efficient when the ranking is performed on X than when it is performed on Y.
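The two criteria are straightforward to compute once the maximized log-likelihood is available. The following minimal Python sketch uses K = 5 (the parameters α1, β1, α2, β2, θ) and n = 9 as in this section; the log-likelihood value in the usage line is a made-up placeholder, not a result from Table 8.

```python
import math


def aic(k, loglik):
    """Akaike Information Criterion: 2K - 2 log(L)."""
    return 2.0 * k - 2.0 * loglik


def bic(k, n, loglik):
    """Bayesian Information Criterion: K log(n) - 2 log(L)."""
    return k * math.log(n) - 2.0 * loglik


if __name__ == "__main__":
    placeholder_loglik = -10.0  # hypothetical value for illustration only
    print(aic(5, placeholder_loglik), bic(5, 9, placeholder_loglik))
```

Because K and n are the same for the SRS and RSS fits compared here, both criteria order the two methods exactly as the maximized log-likelihood does.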

Conclusion
In this paper, the RSS method is used to estimate the FGMBW distribution parameters, and it is compared with the SRS method based on the same number of measured units. Maximum likelihood is used to estimate the distribution parameters. An application to a real data set is also considered for illustration. It is found that the RSS estimators are more efficient than their SRS counterparts for all scenarios when the ranking is performed on X, and more efficient than SRS in most scenarios when the ranking is performed on Y.
El-Sherpieny et al. (2023) considered Bayesian and non-Bayesian estimation for the parameter of bivariate generalized Rayleigh distribution based on the Clayton copula. Muhammed and Almetwally (2023) proposed Bayesian and non-Bayesian estimation for the bivariate inverse Weibull distribution. Muhammed et al. (2021) investigated the dependency measures for new bivariate models based on copula function. Blier-Wong et al. (2022) developed some theoretical properties of FGM copulas.

Figure 1: Plots of the FGMBW distribution pdf for some selected parameter values.

The reliability and hazard rate functions of the FGMBW distribution are [...].

Hanandeh and Al-Saleh (2013) introduced some inferences on Downton's bivariate exponential distribution based on moving extreme RSS. Samuh et al. (2020) estimated the parameters of the new Weibull-Pareto distribution based on RSS. Pedroso et al. (2021) considered estimation of the parameters of the two-parameter Birnbaum-Saunders distribution based on RSS. Zamanzade et al. (2020) discussed efficient cdf estimation and reliability parameter estimation using moving extreme RSS, highlighting its superior efficiency in the tail of the distribution compared with SRS and RSS. Mahdizadeh and Zamanzade (2021) investigated the estimation of the area under the receiver operating characteristic (ROC) curve using multistage RSS and compared it with SRS. Hanandeh et al. (2022b) proposed a new mixed RSS, and a new double-stage RSS for estimating the population mean was suggested by Hanandeh et al. (2022a).

Table 2: The efficiency of RSS with respect to SRS in estimating the parameters of the FGMBW distribution where the ranking of Y is with errors and α1 = 3.3, β1 = 2.4, α2 = 1.2, β2 = 0.7, θ = 0.5

The findings of this paper may be extended in future work based on other modifications of RSS.

Table 4: The efficiency of RSS with respect to SRS in estimating the parameters of the FGMBW distribution where the ranking of Y is with errors and α1 = 2.7, β1 = 2.3, α2 = 1.3, β2 = 1.7, θ = 0.50

Table 7: The efficiency of RSS with respect to SRS in estimating the parameters of the FGMBW distribution where the ranking of X is with errors and α1 = 2.7, β1 = 2.3, α2 = 1.3, β2 = 1.7, θ = 0.50

Table 8: The estimated parameters of the FGMBW distribution based on SRS and RSS for the infection data.