A New Extreme Value Model with Different Copula, Statistical Properties and Applications

In this article, we define and study a new distribution for modeling extreme values. Some of its mathematical properties are derived and analyzed. Simple types of copula are employed to propose several bivariate and multivariate type extensions. The maximum likelihood method is used to estimate the model parameters. Simulation experiments are performed and assessed graphically to examine the finite-sample behavior of the maximum likelihood estimators. The flexibility of the new model is illustrated using three real data applications.


Introduction and motivation
The Fréchet (Fr) model is one of the most important distributions for modeling extreme values. It was originally proposed by Fréchet (1927) and has applications ranging from accelerated life testing to earthquakes, floods, wind speeds, horse racing, rainfall, queues in supermarkets and sea waves. More details about the Fr model can be found in the literature; for example, Nadarajah and Kotz (2003) investigated the exponentiated Fr distribution (see also Nadarajah and Kotz (2008)).
In this article, we extend extreme value theory (EVT) by proposing and studying a new version of the Fr model called the Odd-Burr generalized Fréchet (OB-Fr) model (for more details about EVT, see Fréchet (1927) and Fisher and Tippett (1928)). The new model is derived by combining the standard Fr model with the Odd-Burr generalized (OB-G) family (see Alizadeh et al. (2016)). Simple types of copula are employed to propose several bivariate OB-Fr (BvOB-Fr) and multivariate OB-Fr (MOB-Fr) type extensions.
A random variable (RV) with the Fr distribution is characterized by its probability density function (PDF) and cumulative distribution function (CDF), which serve as the baseline for the new model. The OB-G family contains two well-known families as special cases, each obtained by setting one of its shape parameters equal to 1: the Odd G (O-G) family (see Gleaton and Lynch (2006)) and the proportional reversed hazard rate (PRHR) family (see Gupta and Gupta (2007)). The OB-Fr survival function (SF) is given in (5), and it inherits these reductions: setting one shape parameter equal to 1 gives the O-Fr model, and setting the other equal to 1 gives the PRHR-Fr model. The PDF corresponding to (5) is given in (6). The hazard rate function (HRF) of the new model is the ratio of the PDF in (6) to the SF in (5).
The new model in (6) can be used for modeling extreme data such as extreme floods, maximum sizes of ecological populations, the size of freak waves, the amounts of large insurance losses, equity risks, day-to-day market risk, side effects of drugs (e.g., Ximelagatran), survival times, strengths and breaking stress experiments, large wildfires and repair times. Asymptotic analysis describes the limiting behavior of a function, and asymptotic expressions for the CDF, PDF and HRF of the new model can be obtained both in the lower tail and as the argument tends to ∞. For simulating from the new model, we obtain the quantile function (QF) by inverting (5).
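The inversion step can be illustrated with a minimal sketch. The displayed formulas are not reproduced here, so the code rests on two stated assumptions: the Fr baseline is parameterized as G(x) = exp(-(a/x)^b), and the OB-G CDF has the Burr XII-of-odds form F = 1 - [1 + (G/(1 - G))^alpha]^(-beta). The function names and the symbols alpha, beta, a and b are ours and purely illustrative; the exact form used for (5) should be checked against Alizadeh et al. (2016).

```python
import math
import random


def frechet_cdf(x, a, b):
    """Frechet CDF G(x) = exp(-(a/x)^b) with scale a > 0 and shape b > 0."""
    return math.exp(-((a / x) ** b)) if x > 0 else 0.0


def obfr_cdf(x, alpha, beta, a, b):
    """ASSUMED OB-G CDF with a Frechet baseline (form not confirmed by the text)."""
    g = frechet_cdf(x, a, b)
    if g <= 0.0:
        return 0.0
    if g >= 1.0:
        return 1.0
    odds = g / (1.0 - g)
    return 1.0 - (1.0 + odds ** alpha) ** (-beta)


def obfr_quantile(u, alpha, beta, a, b):
    """Invert the assumed CDF for 0 < u < 1 (moderate parameter values)."""
    # Baseline odds t = G/(1-G) = ((1-u)^(-1/beta) - 1)^(1/alpha),
    # written with log1p/expm1 to stay accurate for u close to 0.
    t = math.expm1(-math.log1p(-u) / beta) ** (1.0 / alpha)
    if t == 0.0:  # underflow guard: the quantile collapses to the lower bound
        return 0.0
    # Since -log G = log(1 + 1/t), the Frechet quantile is a * (-log G)^(-1/b).
    return a * math.log1p(1.0 / t) ** (-1.0 / b)


def obfr_sample(n, alpha, beta, a, b, rng):
    """Inverse-transform sampling: push uniform draws through the quantile."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0.0
            out.append(obfr_quantile(u, alpha, beta, a, b))
    return out


if __name__ == "__main__":
    draws = obfr_sample(5, 2.0, 1.5, 1.0, 3.0, random.Random(1))
    print(draws)
```

Under this assumed form, setting alpha = beta = 1 returns the Fr baseline itself, which gives a quick sanity check on any implementation.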
The multivariate OB-Fr (MOB-Fr) type is also presented. Future work may be devoted to studying these new models in detail. For more details, see Ali et al. (2021a, b), Elgohari and Yousof (2020b and 2021) and Shehata and Yousof (2021a, b).

BvOB-Fr-FGM (Type I) model
Here, we consider a common functional form for the two functions that enter the FGM-type copula. The corresponding bivariate copula (henceforth, the BvOB-Fr-FGM (Type II) copula) can then be derived from that form.
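For orientation, the classical Farlie-Gumbel-Morgenstern (FGM) copula, C(u, v) = uv[1 + theta(1 - u)(1 - v)] with -1 <= theta <= 1, can be sketched as follows. The sketch shows only the standard copula and how a bivariate CDF is assembled from two margins; the Fr margins with G(x) = exp(-(a/x)^b) stand in for the OB-Fr margins, and all function names are illustrative assumptions.

```python
import math


def fgm_copula(u, v, theta):
    """Classical FGM copula C(u, v) = u v [1 + theta (1 - u)(1 - v)]."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def fgm_density(u, v, theta):
    """Copula density c(u, v) = 1 + theta (1 - 2u)(1 - 2v)."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def frechet_cdf(x, a, b):
    """Stand-in margin: Frechet CDF exp(-(a/x)^b)."""
    return math.exp(-((a / x) ** b)) if x > 0 else 0.0


def bivariate_cdf(x, y, theta, margin_x, margin_y):
    """Joint CDF H(x, y) = C(F1(x), F2(y)) for two marginal CDF callables."""
    return fgm_copula(margin_x(x), margin_y(y), theta)


if __name__ == "__main__":
    h = bivariate_cdf(
        1.2, 0.8, 0.5,
        lambda x: frechet_cdf(x, 1.0, 2.0),
        lambda y: frechet_cdf(y, 1.5, 3.0),
    )
    print(h)
```

Any other pair of marginal CDFs can be passed as callables in place of the two Fr margins.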

Clayton Copula
Consider the Clayton copula, and assume that the two margins follow OB-Fr distributions, each with its own parameter set. Substituting the two margins into the copula, the BvOB-Fr type distribution can be derived.
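A minimal sketch of the standard Clayton copula, C(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta) with theta > 0, is given below, together with sampling by conditional inversion. The function names are illustrative assumptions. Dependent uniforms (u, v) produced this way can then be pushed through the marginal quantile functions to obtain a bivariate sample.

```python
import random


def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_conditional(v, u, theta):
    """Conditional CDF of V given U = u, i.e. the partial derivative dC/du."""
    inner = u ** -theta + v ** -theta - 1.0
    return u ** (-(theta + 1.0)) * inner ** (-(1.0 + theta) / theta)


def clayton_inverse_conditional(w, u, theta):
    """Solve clayton_conditional(v, u, theta) = w for v."""
    return (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)


def _open_uniform(rng):
    """Uniform draw on (0, 1), excluding the endpoint 0.0."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def clayton_pair(theta, rng):
    """Draw one dependent pair (u, v) by conditional inversion."""
    u = _open_uniform(rng)
    w = _open_uniform(rng)
    return u, clayton_inverse_conditional(w, u, theta)


if __name__ == "__main__":
    rng = random.Random(7)
    print([clayton_pair(2.0, rng) for _ in range(3)])
```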

Mathematical properties

Useful representations
Following Alizadeh et al. (2016), the PDF in (6) can be expressed as a weighted sum whose terms are PDFs of the Fr model with a rescaled scale parameter. Integrating this representation term by term gives the analogous expansion of the CDF, whose terms are the CDFs of the Fr distribution with the same rescaled scale parameter.

Moments and incomplete moments
The r-th ordinary moment of the new model follows from (10); setting r = 1 gives the mean. The r-th incomplete moment can be expressed, from (9), in terms of the incomplete gamma function.

Index of dispersion (IxDis)
The index of dispersion (IxDis), or variance-to-mean ratio, is defined as the variance divided by the mean. It quantifies whether a set of observed occurrences is clustered or dispersed compared to a standard statistical model. It therefore indicates whether a certain statistical model is suitable for over-dispersed or under-dispersed datasets, and it is widely used in ecology as a standard measure of clustering (over-dispersion) or repulsion (under-dispersion). Thus, the measure can be used to assess whether observed data can be modeled using a Poisson process. For any real dataset, when the IxDis is less than 1 the dataset is said to be "under-dispersed"; this condition relates to occurrence patterns that are more regular than the randomness associated with a Poisson process. Conversely, when the IxDis exceeds 1 the dataset is said to be "over-dispersed".
The numerical results for the OB-Fr model are reported in Table 1 with useful comments. The same analysis for the standard Fr model is given in Table 2. Based on Tables 1 and 2, the skewness of the OB-Fr distribution ranges over the interval (4.5, 1097), whereas the skewness of the Fr model varies only over the interval (1.2115, 3.5). Further, the kurtosis of the OB-Fr model ranges from 25.16 to 1354275, whereas the kurtosis of the Fr model varies only from 4.5 to 98.8 for the parameter values considered. The IxDis of the OB-Fr model is always greater than 1 for these values, so it may be used as an "over-dispersed" model. However, the IxDis of the Fr model lies between 0 and 1, so it may be used as an "under-dispersed" model.
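As a hedged illustration of how such summary measures are computed, the sketch below evaluates the mean, variance, IxDis, skewness and kurtosis of the baseline Fr model from its raw moments. It assumes the parameterization G(x) = exp(-(a/x)^b), for which the r-th raw moment is a^r Γ(1 - r/b) whenever r < b; the function names are illustrative, and the sketch does not reproduce the OB-Fr entries of Table 1.

```python
import math


def frechet_raw_moment(r, a, b):
    """r-th raw moment a^r * Gamma(1 - r/b) of the Frechet model (needs r < b)."""
    if r >= b:
        raise ValueError("the moment of order r exists only for r < b")
    return a ** r * math.gamma(1.0 - r / b)


def frechet_summary(a, b):
    """Mean, variance, IxDis, skewness and kurtosis (requires shape b > 4)."""
    m1, m2, m3, m4 = (frechet_raw_moment(r, a, b) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3  # third central moment
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4  # fourth central
    return {
        "mean": m1,
        "variance": var,
        "ixdis": var / m1,  # variance-to-mean ratio
        "skewness": mu3 / var ** 1.5,
        "kurtosis": mu4 / var ** 2,
    }


if __name__ == "__main__":
    for name, value in frechet_summary(1.0, 5.0).items():
        print(name, round(value, 4))
```

Because the variance scales with a^2 and the mean with a, the IxDis computed this way is proportional to the scale a, so its value depends on the units of the data.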

Some generating functions (GF)
The moment generating function (MGF) can be derived using (8). The first r derivatives of the MGF with respect to its argument, evaluated at zero, yield the first r moments about the origin. In some cases, theoretical treatments of problems in terms of cumulants are simpler than those using moments. In particular, when two or more RVs are statistically independent, the r-th order cumulant of their sum is equal to the sum of their r-th order cumulants. Moreover, the cumulants can also be obtained from the ordinary moments by a recursive relation.
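The standard moment-to-cumulant recursion, κ_r = μ'_r - Σ_{k=1}^{r-1} C(r-1, k-1) κ_k μ'_{r-k}, can be sketched as below. This is the textbook relation and is not claimed to be the exact expression in the paper; the function name is illustrative. The usage lines check the additivity property on a case with known moments: a unit exponential RV has raw moments r!, and the sum of two independent copies has raw moments (r+1)!.

```python
from math import comb


def cumulants_from_moments(raw):
    """Convert raw moments [mu'_1, ..., mu'_R] into cumulants [k_1, ..., k_R]."""
    kappa = []
    for r in range(1, len(raw) + 1):
        # Subtract the contribution of the lower-order cumulants.
        correction = sum(
            comb(r - 1, k - 1) * kappa[k - 1] * raw[r - k - 1]
            for k in range(1, r)
        )
        kappa.append(raw[r - 1] - correction)
    return kappa


if __name__ == "__main__":
    single = cumulants_from_moments([1, 2, 6, 24])      # Exp(1)
    total = cumulants_from_moments([2, 6, 24, 120])     # sum of two Exp(1)
    print(single, total)
```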

Residual life and reversed residual life functions
The n-th moment of the residual life

The maximum likelihood estimation (MLE) method
Let x_1, x_2, …, x_n be a random sample of size n from the OB-Fr distribution with its four parameters. The MLEs of the parameters are obtained by maximizing the log-likelihood function ℓ. The components of the score vector are available if needed. Setting the four score components equal to zero and solving the resulting equations simultaneously yields the MLEs. To solve these equations, it is usually more convenient to use nonlinear optimization methods, such as the quasi-Newton algorithm, to maximize ℓ numerically. For interval estimation of the parameters, we obtain the 4 × 4 observed information matrix, whose elements are the second-order partial derivatives of ℓ with respect to the parameters.
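To show the mechanics of solving score equations numerically, the sketch below fits the two-parameter Fr baseline only, assuming G(x) = exp(-(a/x)^b). It deliberately swaps in a simpler technique than the quasi-Newton algorithm: the scale is profiled out in closed form, a^b = n / Σ x_i^(-b), and the remaining one-dimensional score equation in b is solved by bisection. The function name is illustrative; for the four-parameter OB-Fr model a general optimizer, for example SciPy's `scipy.optimize.minimize` with `method="BFGS"`, would be the natural choice.

```python
import math
import random


def frechet_mle(xs):
    """MLE of (scale a, shape b) for the Frechet model via profile likelihood."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    lmin = min(logs)
    mean_log = sum(logs) / n

    def profile_score(b):
        # Profile score divided by n; decreasing in b, with a unique root.
        # Weights are proportional to x_i^(-b), rescaled to avoid overflow.
        w = [math.exp(-b * (l - lmin)) for l in logs]
        return 1.0 / b - mean_log + sum(wi * li for wi, li in zip(w, logs)) / sum(w)

    lo, hi = 1e-6, 1.0
    while profile_score(hi) > 0.0:  # expand the bracket until the sign changes
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("could not bracket the shape estimate")
    for _ in range(200):  # bisection on the shape parameter
        mid = 0.5 * (lo + hi)
        if profile_score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    w_sum = sum(math.exp(-b * (l - lmin)) for l in logs)
    log_a = lmin + (math.log(n) - math.log(w_sum)) / b  # from a^b = n / sum(x^-b)
    return math.exp(log_a), b


if __name__ == "__main__":
    rng = random.Random(42)
    # If E ~ Exp(1), then a * E^(-1/b) follows the Frechet model above.
    sample = [2.0 * rng.expovariate(1.0) ** (-1.0 / 3.0) for _ in range(2000)]
    print(frechet_mle(sample))
```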

Graphical assessment
Using the biases and mean squared errors (MSEs), we perform simulation experiments to assess graphically the finite-sample behavior of the MLEs. The assessment was based on the following algorithm:
i. Generate 1000 samples of size n (n = 50, 100, …, 1500) from the OB-Fr distribution using (7).
ii. Compute the MLEs for the 1000 samples.
iii. Compute the standard errors (SEs) of the MLEs for the 1000 samples.
iv. Compute the biases and mean squared errors for each of the four parameters.
We repeated these steps for n = 50, 100, …, 1500 with all four parameters set equal to 1, computing the biases and MSEs for each n. Figures 1-4 show how the four biases and the four MSEs vary with respect to n. The broken line in Figure 1 corresponds to the biases being 0. From Figures 1-4, the biases for each parameter are generally negative and approach zero as n → ∞, and the MSEs for each parameter decrease to zero as n → ∞.
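The structure of such a study can be sketched on a deliberately simplified case. The code below assumes the Fr baseline G(x) = exp(-(a/x)^b) with the shape b treated as known, so that the scale MLE has the closed form (n / Σ x_i^(-b))^(1/b). This is not the four-parameter OB-Fr fit described above, and the function names, sample sizes and replication count are illustrative choices.

```python
import random


def frechet_sample(n, a, b, rng):
    """Draw n Frechet variates: if E ~ Exp(1), then a * E^(-1/b) is Frechet."""
    out = []
    while len(out) < n:
        e = rng.expovariate(1.0)
        if e > 0.0:  # guard against an exact zero draw
            out.append(a * e ** (-1.0 / b))
    return out


def scale_mle(xs, b):
    """Closed-form MLE of the scale a when the shape b is known."""
    return (len(xs) / sum(x ** (-b) for x in xs)) ** (1.0 / b)


def bias_and_mse(n, reps, a, b, rng):
    """Steps i to iv for one sample size: simulate, estimate, summarize."""
    estimates = [scale_mle(frechet_sample(n, a, b, rng), b) for _ in range(reps)]
    bias = sum(estimates) / reps - a
    mse = sum((est - a) ** 2 for est in estimates) / reps
    return bias, mse


def run_study(sizes=(25, 100, 400), reps=200, a=1.0, b=1.0, seed=1):
    """Repeat the steps over a grid of sample sizes."""
    rng = random.Random(seed)
    return {n: bias_and_mse(n, reps, a, b, rng) for n in sizes}


if __name__ == "__main__":
    for n, (bias, mse) in run_study().items():
        print(n, round(bias, 4), round(mse, 5))
```

Plotting the returned biases and MSEs against n gives curves of the kind described for Figures 1-4.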

Modeling uncensored real data for comparing the competitive models
To illustrate the wide applicability of the new OB-Fr model, we consider the Cramér-von Mises (CVM) statistic, the Anderson-Darling (A-D) statistic, and the Kolmogorov-Smirnov (K-S) statistic with its corresponding p-value (P). Table 3 below gives the competitive models.
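A minimal sketch of the three statistics in their basic textbook forms is given below, for a fully specified CDF passed as a callable. The paper may use modified versions of CVM and A-D, and the K-S p-value is not computed here; the function name is an illustrative assumption. The fitted CDF must return values strictly between 0 and 1 at every observation for the A-D logarithms to be defined.

```python
import math


def gof_statistics(data, cdf):
    """Return (K-S, CVM, A-D) statistics of `data` against the callable `cdf`."""
    x = sorted(data)
    n = len(x)
    u = [cdf(v) for v in x]  # fitted CDF evaluated at the order statistics
    # K-S: largest gap between the empirical and the fitted CDF.
    ks = max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))
    # CVM: W^2 = 1/(12n) + sum (u_(i) - (2i - 1)/(2n))^2 with i = 1..n.
    cvm = 1.0 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))
    # A-D: A^2 = -n - (1/n) sum (2i - 1)[log u_(i) + log(1 - u_(n+1-i))].
    ad = -n - sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    ) / n
    return ks, cvm, ad


if __name__ == "__main__":
    # Uniform(0, 1) reference CDF applied to four evenly spread points.
    print(gof_statistics([0.125, 0.375, 0.625, 0.875], lambda t: t))
```

Smaller values of all three statistics indicate a closer agreement between the fitted and empirical distributions.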

Stress data
The first data set (data set I) is an uncensored data set consisting of 100 observations on the breaking stress of carbon fibers (in GPa) given by Nichols and Padgett (2006); these data were used by Mahmoud and Mandouh (2013) to fit the transmuted Fr distribution. Figure 5 gives the total time test (TTT) plot (see Aarset (1987) for more details) for data set I along with the box plot for discovering the extreme values (EVs). It indicates that the empirical HRF of data set I is increasing and that there are six EVs. The statistics CVM, A-D, K-S and P for all fitted models are presented in Table 4. The MLEs and corresponding standard errors (SEs) are given in Table 5. From Table 4, the OB-Fr model gives the lowest values of CVM = 0.0664, A-D = 0.4706 and K-S = 0.0630, and the largest P = 0.822, compared with the other Fr models. Therefore, the OB-Fr model can be chosen as the best model. Figure 6 gives the estimated PDF (E-PDF) and the estimated CDF (E-CDF). Figure 7 gives the probability-probability (P-P) plot and the estimated HRF (E-HRF) for data set I. From Figures 6 and 7, the new OB-Fr model provides adequate fits to the empirical functions.
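The scaled TTT transform behind such plots can be sketched as follows: for the ordered sample x_(1) <= … <= x_(n), the point at r/n is [Σ_{i<=r} x_(i) + (n - r) x_(r)] / Σ_i x_(i). A curve lying above the diagonal (concave) suggests an increasing HRF, and one lying below it (convex) suggests a decreasing HRF. The function name and the toy data are illustrative and are not the carbon fiber observations.

```python
def scaled_ttt(data):
    """Return the points (r/n, scaled TTT value) for r = 1, ..., n."""
    x = sorted(data)
    n = len(x)
    total = sum(x)
    points = []
    running = 0.0
    for r, xr in enumerate(x, start=1):
        running += xr  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * xr) / total))
    return points


if __name__ == "__main__":
    for position, value in scaled_ttt([3.0, 1.0, 2.0]):
        print(round(position, 3), round(value, 3))
```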

Glass fibers data
The second data set (data set II) consists of generated data simulating the strengths of glass fibers, given by Smith and Naylor (1987). Figure 8 gives the box plot for discovering the EVs (left panel) along with the TTT plot (right panel) for data set II. It indicates that the empirical HRF of data set II is increasing and that there are four EVs. The statistics CVM, A-D, K-S and P for all fitted models are presented in Table 6. The MLEs and corresponding SEs are given in Table 7. From Table 6, the OB-Fr model gives the lowest values of CVM = 0.05447, A-D = 0.3858 and K-S = 0.0797, and the largest P = 0.88270, compared with the other Fr models. Therefore, the OB-Fr model can be chosen as the best model. Figure 9 gives the E-PDF and the E-CDF. Figure 10 gives the P-P plot and E-HRF for data set II. Based on Figures 9 and 10, the new OB-Fr model provides adequate fits to the empirical functions.
Figure 8: Box plot (left panel) and TTT plot (right panel) for data set II.
Figure 9: Estimated density, estimated CDF, P-P plot and estimated HRF for data set II.
Figure 10: P-P plot and estimated HRF for data set II.

Relief time data
The third data set (data set III, the Wingo data) is a complete sample from a clinical trial describing relief times (in hours) for 50 arthritic patients. Figure 11 gives the box plot for discovering the EVs (left panel) along with the TTT plot (right panel) for data set III. Based on Figure 11, the empirical HRF of data set III is increasing and there are no EVs. The statistics CVM, A-D, K-S and P for all fitted models are presented in Table 6. The MLEs and corresponding SEs are given in Table 7. From Table 6, the OB-Fr model gives the lowest values of CVM = 0.04903, A-D = 0.42081 and K-S = 0.09124, and the largest P = 0.7994. Therefore, the OB-Fr model may be chosen as the best model. Figure 12 gives the E-PDF and the E-CDF. Figure 13 gives the P-P plot and E-HRF for data set III. Based on Figures 12 and 13, the new OB-Fr model provides adequate fits to the empirical functions.
Figure 11: Box plot (left panel) and TTT plot (right panel) for data set III.