Modeling Real-life Data Sets with a Novel G Family of Continuous Probability Distributions: Statistical Properties, and Copulas

This work presents a novel two-parameter G family of continuous probability distributions with compounded parameters, and derives and examines its pertinent mathematical properties. In a dedicated section, the standard inverse-Rayleigh baseline model is emphasized mathematically and statistically. A number of bivariate and multivariate extensions are generated using the copula method; these new distributions will aid in the modelling of bivariate and multivariate data. The applicability and flexibility of the new compounded two-parameter G family are demonstrated through three applications to real-life data sets.


Introduction
In the fields of statistics and probability theory, one of the most important classes of distributions is the geometric G (GC-G) family of probability distributions. These distributions find use in a wide variety of domains. In a series of independent Bernoulli trials, the geometric distribution, a member of the GC-G family, is frequently used to describe waiting times or the number of trials until the first success. It has found use in reliability analysis, queueing theory, and a variety of real-life scenarios that involve waiting times. The GC-G family of distributions includes models that can accurately simulate difficult-to-predict occurrences. For instance, the negative binomial distribution, which belongs to this family, can be used to model the number of trials that must be conducted in order to observe a predetermined number of successes. Because of this, it is useful for modelling unusual events, such as the number of accidents that occur over a specific period or the number of defective items produced during a given manufacturing process. Applications in risk analysis and insurance frequently make use of GC-G distributions, including the geometric-exponential and geometric-gamma distributions, among others. These distributions can be utilized to represent claim frequencies or inter-arrival durations between claims, both of which are significant components in evaluating risk and calculating insurance rates. Within reliability and survival research, the GC-G family likewise has a wide variety of potential applications.
Utilizing the geometric distribution is one way to model the length of time a system has left before it fails, or the amount of time individuals in a population have left to live. Reliability engineers and medical researchers can make predictions about the reliability and longevity of specific systems or populations by analyzing these distributions. Modelling discrete data is also made easier by the GC-G family's versatile architecture. Count data and discrete events can both be modelled with different members of this family, such as the geometric, negative binomial, and Poisson distributions. Analysis and modelling of discrete phenomena are common applications for these distributions, particularly in epidemiology, ecology, finance, and telecommunications, to name a few. The GC-G family is frequently utilised for simulation research as well as random number generation. Simulating events using a geometric distribution, or one of its related distributions, can be helpful in generating random sequences that replicate real-life conditions, which is useful for testing theories, evaluating algorithms, and conducting Monte Carlo simulations.
In this research we define a novel family of probability distributions, which we call the compound G family. This new family, referred to as the geometric generated Rayleigh G (GCGR-G) family, is derived from the geometric G (GC-G) family and is founded on the generated Rayleigh-G (GR-G) family. Let G_V(y) and g_V(y) denote the cumulative distribution function (CDF) and the corresponding probability density function (PDF) of a baseline model with parameter vector V. First, consider the CDF of the well-known generated Rayleigh-G (GR-G) family:

Π_θ(y) = 1 − exp[−θ Δ_V²(y)], θ > 0, (1)

where Δ_V²(y) refers to the squared odd ratio function, with Δ_V(y) = G_V(y) / [1 − G_V(y)], and π_θ(y) = dΠ_θ(y)/dy is the PDF corresponding to the CDF in (1). On the other hand, for any random variable (RV) following a standard baseline model with CDF G_V(y), the CDF of the geometric G (GC-G) family is defined as

F_γ(y) = G_V(y) / [1 − (1 − γ) S_V(y)], γ ∈ (0, 1), (2)

where S_V(y) = 1 − G_V(y) refers to the reliability function (RF) of the baseline model. By substituting (1) into (2) as the baseline CDF, we obtain a new extension of the GC-G class that provides a new flexible compound G family. The new compound G family has a wide physical interpretation, as given later in Section 2, and is derived by combining the GC-G family with the GR-G family.
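The compounding construction can be sketched numerically. The following is a minimal illustration, not the paper's code: it assumes the odd-ratio Rayleigh transform 1 − exp(−θ Δ²) and the geometric compounding G / [1 − (1 − γ)(1 − G)], and the unit-exponential baseline is a hypothetical choice made only so the example is self-contained.

```python
import math

def odd_ratio(G):
    """Odd ratio G/(1 - G) of a baseline CDF value G in (0, 1)."""
    return G / (1.0 - G)

def gr_g_cdf(G, theta):
    """Generated Rayleigh-G transform: 1 - exp(-theta * odd_ratio(G)**2)."""
    return 1.0 - math.exp(-theta * odd_ratio(G) ** 2)

def geometric_g_cdf(G, gamma):
    """Geometric-G compounding: G / (1 - (1 - gamma)*(1 - G)), gamma in (0, 1)."""
    return G / (1.0 - (1.0 - gamma) * (1.0 - G))

def gcgr_g_cdf(x, theta, gamma, baseline_cdf):
    """Compound GCGR-G CDF: geometric compounding applied to the GR-G CDF."""
    return geometric_g_cdf(gr_g_cdf(baseline_cdf(x), theta), gamma)

# Hypothetical baseline chosen only for illustration: unit exponential.
baseline = lambda x: 1.0 - math.exp(-x)

# The compound CDF is monotone and approaches 0 and 1 at the support ends.
values = [gcgr_g_cdf(x, theta=0.5, gamma=0.3, baseline_cdf=baseline)
          for x in (0.1, 0.5, 1.0, 2.0)]
```

Because each of the three maps (baseline CDF, Rayleigh transform, geometric compounding) is increasing, the composite is a valid CDF for any increasing baseline.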
In fact, there are many helpful contributions in the statistical literature on compound G families. Many applied disciplines and fields, including medicine, reliability, actuarial sciences, engineering, insurance, econometrics, biology, demography, and the environmental sciences, employ these contributions in modelling real-life datasets. The new GCGR-G class stands out for having a wide range of applications. Using three examples, we show that the GCGR class provides better fits than many other families. Figures 5, 8, and 11 (first row, right graphs) depict asymmetric monotonically increasing hazard rate functions (HRF); Figures 5 and 8 (second row, right and left graphs) and Figures 5 and 7 (third row, right and left graphs) depict real-life data with some extreme values. The new family may be useful in modelling such real-life data: data without extreme values, as seen in the second row (right and left graphs) of Figure 10; data with symmetric and unimodal nonparametric kernel densities, as seen in Figures 5 and 8 and the first row (left graph) of Figure 10; data with bimodal and heavy-tailed nonparametric kernel densities, as seen in Figures 5 and 7, respectively; and data whose nonparametric kernel density cannot be fit by common models.

The new family and its related representations
We build a new two-parameter family of continuous probability distributions, called the GCGR-G family, by utilizing (1) and (2), with (1) serving as the baseline CDF of (2). The CDF of the GCGR-G family can be defined in the following manner:

F(y; Ψ) = [1 − exp(−θ Δ_V²(y))] / {1 − (1 − γ) exp(−θ Δ_V²(y))}, (3)

where Ψ = (γ, θ, V), γ ∈ (0, 1), θ > 0, refers to the parameter vector.
Some power series expansions for Equations (3) and (4) can be obtained through the exponentiated generalized G (GZX-G) family of distributions. Differentiating (3) gives the PDF

f(y; Ψ) = γ π_θ(y) / {1 − (1 − γ)[1 − Π_θ(y)]}², (4)

which, by using the generalized binomial expansion and then a Taylor expansion, can be rewritten as the mixture

f(y; Ψ) = Σ_{k ≥ 0} c_k h_{δ_k}(y), (5)

where h_δ(y) = δ g_V(y) G_V(y)^(δ−1) denotes the PDF of the GZX-G family with power parameter δ, and the coefficients c_k are determined by the expansions. As a consequence, certain mathematical properties of the GCGR-G family can be inferred directly from those of the GZX-G family. Equation (5) represents the most important result of this section. In the same way, the CDF of the GCGR-G family can be expressed as a combination of CDFs from the GZX-G family; integrating Equation (5) yields the analogous mixture representation

F(y; Ψ) = Σ_{k ≥ 0} c_k H_{δ_k}(y), (6)

where H_δ(y) = G_V(y)^δ refers to the baseline CDF of the GZX-G family with power parameter δ.

Copulas
A copula, a fundamental statistical concept, is used to represent data with two or more variables. It is a function that links the marginal distributions of two or more variables to their joint distribution. Copulas have become increasingly popular in recent years as a result of their versatility and their ability to capture intricate dependency patterns between variables. The significance of copulas and their use in the statistical modelling of bivariate data are demonstrated by the following examples: 1. Copulas are used to describe the variables' relationship to one another, which is a key idea in many different fields of data and statistical research. We can describe the joint distribution of all variables while maintaining the marginal distributions of each variable by using copulas. This makes it considerably easier to find intricate interactions between variables that cannot be measured using conventional correlation techniques. 2. Copulas are used in finance to capture the connection between the returns on different assets.
This is very necessary for portfolio optimisation, which tries to create an asset allocation that maximises returns while minimising risk. When we use copulas as a modelling tool for the dependency between assets, we are able to assess the risk associated with a portfolio more accurately and construct portfolios that are more effective. 3. Copulas can be used in risk management to estimate the likelihood of catastrophic occurrences like market collapses and natural disasters.
Because copulas model the dependence structure that exists between variables, they can give a more precise evaluation of the likelihood that such occurrences will occur; this is a crucial element of risk management and insurance. 4. Copulas can also be used to generate data, which is useful in circumstances where obtaining the required data is challenging or costly. By simulating new data sets with the identical dependence structure as the original data, characterised by a copula, we are able to generate brand-new data for testing and analysis.
The FAGM copula, a family of parametric copulas, is utilized to replicate the dependence structure that exists between two or more random variables. It is a first-order approximation of the Ali-Mikhail-Haq, Frank, and Plackett copulas, and it has a simple form that enables explicit calculus and exact results (see Ali et al. (1978) for more information). The FAGM copula is attractive due to the ease with which it can represent a broad variety of dependency patterns, including positive, negative, and mixed dependence. It has been applied in a variety of contexts, such as financial risk management, insurance underwriting, and environmental risk evaluation. For instance, the FAGM copula has been utilized to represent the connection between asset values and insurance claims, as well as the connection between environmental variables and other factors. Thanks to its flexibility, the FAGM copula can model a wide variety of dependence structures.
It is simple to implement and has a number of benefits over other families of copulas, including the capacity to represent mixed dependency. In this section, we provide some new bivariate GCGR-G types based on the FAGM copula developed by Morgenstern (1956), Farlie (1960), Gumbel (1958, 1960), Johnson and Kotz (1975, 1977), and Balakrishnan and Lai (2009). The "modified FAGM (MFAGM)" copula, the "Clayton" copula, the "Archimedean Ali-Mikhail-Haq" copula, and the "Renyi's entropy" copula are also discussed (for more information, see Balakrishnan and Lai (2009), Ali et al. (1978), and Pougaza and Djafari (2011)). In addition, the multivariate GCGR-G type (MvGCGR-G) is provided (for further information, refer to Balakrishnan and Lai (2009)). Some attempts to investigate and evaluate these new models may be made in the near future.

BGCGR-G type via FAGM copula
The following are some applications that make use of the FAGM copula: I. The FAGM copula may be utilized to represent the dependence of asset prices; this can be used in developing hedging strategies and in conducting risk assessments of asset portfolios. II. The interdependence of insurance claims can be simulated with the help of the FAGM copula; this can be utilized to decide pricing and reserving techniques, in addition to evaluating the risk faced by an insurance firm. III. The interdependence of environmental variables can be simulated with the assistance of the FAGM copula; this can be used to determine the likelihood of environmental catastrophes and to design strategies for mitigating their effects. The joint CDF of the FAGM family is

C_λ(u, v) = u v [1 + λ (1 − u)(1 − v)],

where u ∈ (0, 1) and v ∈ (0, 1) are the two continuous marginal functions and λ ∈ [−1, 1] is a dependence parameter. Then we have C_λ(u, 0) = C_λ(0, v) = 0 for u, v ∈ (0, 1), which is the "grounded minimum condition", and C_λ(u, 1) = u and C_λ(1, v) = v, which is the "grounded maximum condition".
In the context of statistical copulas, the grounded minimum condition is an important property that copulas must satisfy to ensure their validity and consistency. A copula is a mathematical function that links marginal distributions to a joint distribution; it plays a crucial role in multivariate statistical analysis, especially in applications like risk management, finance, and insurance, where modelling the dependence structure between variables is essential. The grounded minimum condition ensures that the copula is well-behaved and interpretable. The fact that the copula attains its minimum value at the lower bounds of its arguments aligns with our intuition about dependence: when all variables are at their minimum value, the joint probability should also be minimized, which is exactly what this condition reflects. The grounded minimum condition is part of the more general concept of coherence, which imposes certain properties on copulas to ensure their consistency and validity; coherence is a fundamental requirement for any valid copula function, and the grounded minimum condition contributes to fulfilling this requirement. Then, setting u = F(y₁; Ψ₁) and v = F(y₂; Ψ₂), we obtain the joint CDF of the bivariate GCGR-G (BGCGR-G) type:

C_λ(F(y₁; Ψ₁), F(y₂; Ψ₂)) = F(y₁; Ψ₁) F(y₂; Ψ₂) [1 + λ (1 − F(y₁; Ψ₁))(1 − F(y₂; Ψ₂))].
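The grounded minimum and maximum conditions can be verified numerically. This minimal sketch assumes only the standard FAGM (Farlie-Gumbel-Morgenstern) form C(u, v) = uv[1 + λ(1 − u)(1 − v)].

```python
def fagm_copula(u, v, lam):
    """FAGM copula C(u, v) = u*v*(1 + lam*(1 - u)*(1 - v)), lam in [-1, 1]."""
    return u * v * (1.0 + lam * (1.0 - u) * (1.0 - v))

# Grounded minimum condition: C(u, 0) = C(0, v) = 0.
# Grounded maximum condition (uniform margins): C(u, 1) = u and C(1, v) = v.
joint = fagm_copula(0.42, 0.61, 0.5)  # joint CDF evaluated at the margins (0.42, 0.61)
```

Any two GCGR-G marginal CDF values can be substituted for u and v to obtain the BGCGR-G joint CDF.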

BGCGR-G type via MFAGM copula
The MFAGM copula is a flexible tool produced from the primary version of the FAGM; it can be used to model a large range of different dependence structures. It is simple to implement, and it offers a number of benefits over other families of copulas, such as the ability to model mixed dependency (for more details, refer to Rodriguez-Lallena and Ubeda-Flores (2004)). As a result, the FAGM copula is a popular choice for a variety of applications thanks to its many advantageous characteristics. In this section we consider the MFAGM copula in the following form:

C_λ(u, v) = u v + λ φ(u) ψ(v),

where the two continuous functions φ(u) and ψ(v) are defined on (0, 1) and satisfy the boundary condition φ(0) = φ(1) = ψ(0) = ψ(1) = 0. A model of the correlation between asset prices can be constructed utilizing the MFAGM copula; the creation of hedging strategies for asset portfolios and risk assessments are also possible using this information. A model of the interdependence of environmental variables can likewise be constructed with the help of the MFAGM copula; this can be used to make predictions about the possibility of environmental disasters and to devise measures for mitigating their effects. It is also possible to capture the interconnectedness of insurance claims by utilizing the MFAGM copula; this can be used to design pricing and reserving policies, in addition to providing an estimate of the risk level that an insurance company faces. In light of the limitations imposed by the ongoing study, we investigate the following kinds of MFAGM copulas.
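A minimal sketch of the MFAGM construction follows. The perturbation functions φ(u) = ψ(u) = u(1 − u) used below are a hypothetical illustrative choice that satisfies the boundary condition, not necessarily the ones used for the types discussed in this paper.

```python
def mfagm_copula(u, v, lam, phi, psi):
    """Modified FAGM copula C(u, v) = u*v + lam*phi(u)*psi(v), where phi and psi
    vanish at 0 and 1 (Rodriguez-Lallena and Ubeda-Flores, 2004)."""
    return u * v + lam * phi(u) * psi(v)

# Hypothetical illustrative perturbation satisfying phi(0) = phi(1) = 0.
phi = lambda t: t * (1.0 - t)
```

Setting λ = 0 recovers the independence copula, and the boundary condition on φ and ψ preserves the grounded conditions of the base copula.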

The Type I BGCGR-G via the MFAGM copula:
Consider two functional forms φ(u) and ψ(v) satisfying the boundary condition above; then the Type I BGCGR-G according to the MFAGM copula can be expressed by inserting the GCGR-G marginals u = F(y₁; Ψ₁) and v = F(y₂; Ψ₂) into C_λ(u, v) = u v + λ φ(u) ψ(v).

The Type II BGCGR-G via the MFAGM copula:
Consider a second pair of functional forms φ(u) and ψ(v); then the Type II BGCGR-G according to the MFAGM copula can be expressed analogously, and the new Type II BGCGR-G version is derived by inserting the GCGR-G marginals into the resulting copula.

The Type III BGCGR-G via the MFAGM copula:
According to Type III of the MFAGM copula, two new modified functions φ̃(u) and ψ̃(v) can be considered. Then, following Ghosh and Ray (2016), one can easily apply the corresponding type by inserting the GCGR-G marginals.

BGCGR-G type under the Clayton copula
The Clayton copula is an Archimedean copula that permits any non-zero degree of (lower) tail dependence between the variables to be modelled; it is also exchangeable. Its bivariate form can be written as

C_δ(u, v) = (u^(−δ) + v^(−δ) − 1)^(−1/δ), δ > 0.

After setting u = F(y₁; Ψ₁) and v = F(y₂; Ψ₂) in the Clayton copula, the BGCGR-G type distribution can then be managed in the same manner.
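The lower-tail dependence of the Clayton copula can be illustrated numerically: for the standard form C(u, v) = (u^(−δ) + v^(−δ) − 1)^(−1/δ), the ratio C(t, t)/t tends to the tail-dependence coefficient 2^(−1/δ) as t → 0. A minimal sketch:

```python
def clayton_copula(u, v, delta):
    """Bivariate Clayton copula C(u, v) = (u**-delta + v**-delta - 1)**(-1/delta)."""
    return (u ** (-delta) + v ** (-delta) - 1.0) ** (-1.0 / delta)

# Lower tail dependence: C(t, t)/t approaches 2**(-1/delta) as t -> 0.
t = 1e-6
tail_ratio = clayton_copula(t, t, 2.0) / t
```

For δ = 2 the limiting ratio is 2^(−1/2) ≈ 0.707, a strong lower-tail dependence that the FAGM copula cannot produce.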

BGCGR-G type under the Renyi's entropy
The Renyi's entropy copula allows us to build bivariate probability distributions without requiring costly mathematical derivations, making the mathematics and statistical modelling of bivariate data easy. The bivariate version can be deduced directly, and the Renyi's entropy copula introduces no new parameters into the new bivariate model; see Pougaza and Djafari (2011). Then, according to the theorem of Pougaza and Djafari (2011), we have

C(u₁, u₂) = u₁² u₂ + u₁ u₂² − u₁² u₂²,

and the associated CDF of the BGCGR-G follows by setting u₁ = F(y₁; Ψ₁) and u₂ = F(y₂; Ψ₂).
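The polynomial form makes the copula conditions easy to check directly. This sketch assumes the entropy-based polynomial copula C(x, y) = x²y + xy² − x²y², whose density 2x + 2y − 4xy is nonnegative on the unit square:

```python
def renyi_copula(x, y):
    """Entropy-based polynomial copula C(x, y) = x**2*y + x*y**2 - x**2*y**2."""
    return x * x * y + x * y * y - x * x * y * y

def renyi_density(x, y):
    """Copula density c(x, y) = d2C/dxdy = 2x + 2y - 4*x*y, nonnegative on [0, 1]^2."""
    return 2.0 * x + 2.0 * y - 4.0 * x * y

# The density attains its minimum (zero) at the corners (0, 0) and (1, 1).
min_density = min(renyi_density(i / 10.0, j / 10.0)
                  for i in range(11) for j in range(11))
```

Since no dependence parameter appears, the resulting bivariate GCGR-G model carries only the parameters of its two margins.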

BGCGR-G type via the Archimedean Ali-Mikhail-Haq copula
The Archimedean Ali-Mikhail-Haq copula makes it simpler to build bivariate probability distributions without requiring significant mathematical derivations, which makes it easier to characterize bivariate data mathematically and statistically. The bivariate form of the Archimedean Ali-Mikhail-Haq copula (see Ali et al. (1978)) can be generated in a straightforward manner by utilizing the two marginal functions mentioned above. Under the Archimedean Ali-Mikhail-Haq copula, the new bivariate model adds only one parameter. Balakrishnan and Lai (2009) and Ali et al. (1978) are the primary references utilized in its construction. The joint CDF of the Archimedean Ali-Mikhail-Haq copula is given by the fundamental formula

C_δ(u, v) = u v / [1 − δ (1 − u)(1 − v)], δ ∈ (−1, 1),

and the matching joint PDF can be determined by differentiation, c_δ(u, v) = ∂²C_δ(u, v)/∂u∂v. Then, for any RVs Y₁ ∼ GCGR-G(γ₁, θ₁, V) and Y₂ ∼ GCGR-G(γ₂, θ₂, V), the main result follows by setting u = F(y₁; Ψ₁) and v = F(y₂; Ψ₂), with δ ∈ (−1, 1).
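A quick numerical check of the Ali-Mikhail-Haq form, assuming the standard parameterization C(u, v) = uv / [1 − δ(1 − u)(1 − v)] with δ ∈ (−1, 1):

```python
def amh_copula(u, v, delta):
    """Archimedean Ali-Mikhail-Haq copula C(u, v) = u*v / (1 - delta*(1-u)*(1-v)),
    with dependence parameter delta in (-1, 1)."""
    return u * v / (1.0 - delta * (1.0 - u) * (1.0 - v))

# delta = 0 recovers the independence copula C(u, v) = u*v.
independent = amh_copula(0.3, 0.5, 0.0)
```

The single parameter δ controls the strength and sign of the dependence added to the two GCGR-G margins.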

The MvGCGR-G type
The Clayton copula is typically considered one of the simplest and most widely used methods for generating multivariate distributions or families, because both its mathematical formulation and its application can be carried out with a manageable amount of work. The multivariate distributions or families based on the Clayton copula are adaptable in practical applications of multivariate data modelling in engineering, medicine, insurance, and reinsurance, as well as other fields. In accordance with the Clayton copula, the multivariate GCGR-G version can be formed as follows:

C_δ(u₁, u₂, …, u_m) = (u₁^(−δ) + u₂^(−δ) + ⋯ + u_m^(−δ) − m + 1)^(−1/δ), δ > 0,

where u_i = F(y_i; Ψ_i), i = 1, 2, …, m.
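The m-variate Clayton form can be sketched in a few lines; this is a minimal illustration assuming the standard formula (Σ u_i^(−δ) − m + 1)^(−1/δ):

```python
def clayton_mv_copula(us, delta):
    """m-variate Clayton copula: (sum(u_i**-delta) - m + 1)**(-1/delta), delta > 0."""
    m = len(us)
    return (sum(u ** (-delta) for u in us) - m + 1.0) ** (-1.0 / delta)

# With all but one coordinate at 1, the copula reduces to the remaining margin.
margin = clayton_mv_copula([0.4, 1.0, 1.0], 2.0)
pair = clayton_mv_copula([0.3, 0.7], 1.5)
```

For δ > 0 the Clayton copula is positively quadrant dependent, so each evaluation lies between the product of the margins and the smallest margin.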
Owing to the limitations of the current study, we may devote future independent works to the investigation of a number of these bivariate and multivariate distributions, along with their application to bivariate and multivariate data in fields such as engineering, health, insurance, and the actuarial sciences.

Some properties
It is essential to comprehend these mathematical properties for a number of reasons. First, they enable us to construct a variety of statistical measures that characterize the behavior of a random variable; for instance, the mean and variance can be used to describe the central tendency and variability of a probability distribution. Second, these characteristics enable us to forecast a random variable's future behavior: by understanding the PMF or PDF, we may forecast the probability of specific events and base our judgements on that knowledge. Finally, creating and evaluating statistical models requires a comprehension of these mathematical features; we build models that precisely mimic the behavior of real-life phenomena using probability distributions with well-defined mathematical properties.

Ordinary moment
Moments are fundamentally important pieces of statistical equipment in a wide variety of fields, including physics, engineering, insurance, economics, and finance. They are a mathematical device that discloses information on the shape, position, and variability of a probability distribution.
From the mixture representation in (5), the r-th ordinary moment can be expressed as

μ′_r = E(Y^r) = Σ_{k ≥ 0} c_k E(Y_{δ_k}^r),

where Y_{δ_k} denotes the GZX-G RV with power parameter δ_k. Moreover, many statistical measures, such as the variance, skewness, and kurtosis, can now be calculated via simple relations.
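The simple relations turning raw moments into variance, skewness, and kurtosis can be sketched as follows; the unit exponential, with raw moments E(Y^r) = r!, is a hypothetical illustration chosen only because its measures are known in closed form.

```python
import math

def central_measures(raw):
    """Variance, skewness and kurtosis computed from raw moments raw[r] = E(Y**r)."""
    m1, m2, m3, m4 = raw[1], raw[2], raw[3], raw[4]
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2
    return var, skew, kurt

# Illustration with the unit exponential, whose raw moments are E(Y**r) = r!.
raw = {r: float(math.factorial(r)) for r in range(1, 5)}
var, skew, kurt = central_measures(raw)  # known values: 1, 2, 9
```

The same relations apply once the μ′_r of any GCGR-G member are available from the mixture representation.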

Moment generating function
The moment generating function (MGF) is significant because it offers detailed information about the characteristics of a probability distribution. By employing the moment generating function, one can precisely pinpoint the distribution's moments, including the mean, variance, skewness, and kurtosis. The characteristic function and the cumulant generating function are further statistical measures derived from the moment generating function. Using (5), the MGF can be calculated from

M_Y(t) = E(e^{tY}) = Σ_{k ≥ 0} c_k M_{δ_k}(t),

where M_{δ_k}(t) is the MGF of the RV Y_{δ_k}. That being said, M_Y(t) can be easily derived after deriving M_{δ_k}(t) for any baseline model.
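When no closed form is available, the MGF can be approximated by numerical integration of E(e^{tY}). A minimal sketch, using the exponential distribution as a hypothetical test case because its MGF λ/(λ − t) is known exactly:

```python
import math

def mgf_numeric(t, pdf, upper=40.0, n=40000):
    """Approximate the MGF M(t) = E(exp(t*Y)) on [0, upper] by the trapezoid rule."""
    h = upper / n
    total = 0.5 * (math.exp(t * 0.0) * pdf(0.0) + math.exp(t * upper) * pdf(upper))
    for k in range(1, n):
        y = k * h
        total += math.exp(t * y) * pdf(y)
    return total * h

# Test case: exponential with rate lam has M(t) = lam/(lam - t) for t < lam.
lam = 2.0
pdf = lambda y: lam * math.exp(-lam * y)
approx = mgf_numeric(1.0, pdf)  # exact value is 2.0/(2.0 - 1.0) = 2.0
```

The truncation point must be large enough that e^{ty} f(y) is negligible beyond it, which holds here since the integrand decays like e^{−y}.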

The incomplete moments
Both moments and incomplete moments are significant in terms of the information that can be gleaned about the properties of a probability distribution. They are helpful in finding, among other statistical measures, the mean, variance, skewness, and kurtosis of a distribution, and they are important in determining its shape. The r-th incomplete moment of Y, say I_{r,Y}(t), can be expressed from (5) as

I_{r,Y}(t) = ∫_{−∞}^{t} y^r f(y; Ψ) dy = Σ_{k ≥ 0} c_k ∫_{−∞}^{t} y^r h_{δ_k}(y) dy.
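An incomplete moment is just a truncated integral, so it is straightforward to evaluate numerically. A minimal sketch, validated against the unit exponential, for which I_1(t) = 1 − (t + 1)e^{−t} in closed form (a hypothetical test case, not the paper's model):

```python
import math

def incomplete_moment(r, t, pdf, n=20000):
    """Lower incomplete moment I_r(t) = integral_0^t y**r * f(y) dy (trapezoid rule)."""
    h = t / n
    total = 0.5 * ((0.0 ** r) * pdf(0.0) + (t ** r) * pdf(t))
    for k in range(1, n):
        y = k * h
        total += (y ** r) * pdf(y)
    return total * h

# Unit exponential check: I_1(t) = 1 - (t + 1)*exp(-t).
pdf = lambda y: math.exp(-y)
approx = incomplete_moment(1, 1.0, pdf)
exact = 1.0 - 2.0 * math.exp(-1.0)
```

The same routine applies to any GCGR-G density with support on the positive half-line.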

The residual life function and the reversed residual life function
The term "residual life" describes the amount of time that has passed from a particular moment in time, or since the system reached a particular age, without experiencing a failure. The r-th moment of the residual life of Y is given by

m_r(t) = E[(Y − t)^r | Y > t] = (1 / S(t; Ψ)) ∫_t^{+∞} (y − t)^r f(y; Ψ) dy,

where S(t; Ψ) = 1 − F(t; Ψ). The moments of the residual life provide crucial insights into the functioning and dependability of systems. They are helpful in determining the remaining useful life of an item, organizing maintenance activities, evaluating hazards, establishing warranty durations, and making decisions based on accurate information. When firms take these quantities into account in their dependability analyses, they are able to improve the performance of their systems, lower their expenses, and increase their overall operational efficiency. The reversed residual life (RSRL) refers to the time elapsed since failure, given that failure occurred before a specific time point; its r-th moment is

M_r(t) = E[(t − Y)^r | Y ≤ t] = (1 / F(t; Ψ)) ∫_{−∞}^{t} (t − y)^r f(y; Ψ) dy.

In reliability analysis, the moments of the RSRL play a crucial role, particularly in the context of survival analysis and deterioration modelling. They can be helpful in determining the pace of improvement in a system's dependability when it is increasing over time, such as during the burn-in phase or when it is improving as a result of learning effects. The evaluation of dependability growth models, as well as the identification of prospective trends or anomalies, can be made possible through the monitoring of changes in these moments.
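The residual-life moment is again a truncated integral divided by the survival function. A minimal numerical sketch, checked against the memoryless unit exponential, for which E(Y − t | Y > t) = 1 for every t (a hypothetical test case):

```python
import math

def residual_life_moment(r, t, pdf, sf, upper=50.0, n=50000):
    """r-th residual life moment E((Y - t)**r | Y > t)
    = (1/S(t)) * integral_t^upper (y - t)**r * f(y) dy (trapezoid rule)."""
    h = (upper - t) / n
    total = 0.5 * (0.0 + ((upper - t) ** r) * pdf(upper))
    for k in range(1, n):
        y = t + k * h
        total += ((y - t) ** r) * pdf(y)
    return total * h / sf(t)

# Memoryless check with the unit exponential: E(Y - t | Y > t) = 1 for every t.
pdf = lambda y: math.exp(-y)
sf = lambda y: math.exp(-y)
m1 = residual_life_moment(1, 2.0, pdf, sf)
```

Replacing pdf and sf with a GCGR-G density and survival function gives its mean residual life curve at any age t.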

Studying a special model
The inverse-Rayleigh (IR) model is without a doubt among the most important distributions utilized in modelling extreme values. Fréchet (1927) is credited with first recommending the IR model. It has a wide range of applications, including accelerated life tests, earthquakes, floods, wind speeds, horse racing, showers, supermarket queues, and sea waves (for further information, see Von Mises (1964)). The baseline IR CDF is

G(x; σ) = exp(−σ / x²), x > 0,

where σ > 0 is a scale parameter of the baseline IR model. With reference to equation (3), the CDF of the geometrically generated Rayleigh inverse-Rayleigh (GCGR-IR) model can then be defined as

F(x; Ψ) = [1 − exp(−θ Δ²(x))] / {1 − (1 − γ) exp(−θ Δ²(x))}, with Δ(x) = exp(−σ / x²) / [1 − exp(−σ / x²)],

where Ψ = (γ, θ, σ) refers to the parameter vector.
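The baseline IR model is easy to sample by inversion, since its CDF inverts in closed form: exp(−σ/x²) = u gives x = √(σ / (−ln u)). A minimal sketch, assuming the parameterization G(x; σ) = exp(−σ/x²):

```python
import math, random

def ir_cdf(x, sigma):
    """Inverse-Rayleigh CDF G(x) = exp(-sigma/x**2), x > 0, sigma > 0."""
    return math.exp(-sigma / (x * x))

def ir_quantile(u, sigma):
    """Quantile function, used for inversion sampling: x = sqrt(sigma/(-ln u))."""
    return math.sqrt(sigma / (-math.log(u)))

random.seed(1)
sample = [ir_quantile(random.random(), 2.0) for _ in range(5000)]
frac_below_median = sum(1 for x in sample if x < ir_quantile(0.5, 2.0)) / 5000.0
```

The same inversion idea carries over to the compound GCGR-IR model once its CDF is inverted numerically.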

The maximum likelihood (MLE) method
One of the most popular techniques used to estimate unknown parameters in statistical models is the maximum likelihood approach. The likelihood function, which measures how likely the observed data are under the assumed model, is used to determine the parameter values that maximise this likelihood; the method provides a systematic way of doing so. Maximum likelihood estimation (MLE) is widely employed and typically adaptable for generating parameter estimates in a range of statistical models. Efficiency is one of the desirable characteristics of maximum likelihood estimates: the maximum likelihood technique yields estimates that are asymptotically efficient, meaning that they achieve the lowest asymptotic variance among all consistent estimators. In other words, compared to estimates produced by other estimators, the maximum likelihood estimates tend to be quite close to the actual values of the parameters being estimated, and they exhibit comparatively low variability.
To obtain the MLE of Ψ, the log-likelihood function ℓ(Ψ) can be expressed from (4). The function ℓ(Ψ) can be numerically maximized either directly, by using software such as R (the optim function), SAS (PROC NLMIXED), or the Ox program (the MaxBFGS sub-routine), or analytically, by solving the nonlinear likelihood equations obtained by differentiating ℓ(Ψ).
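Numerical maximization of a log-likelihood can be sketched in a few lines. The example below is a hypothetical one-parameter case (the exponential model, whose MLE 1/ȳ is known in closed form, so the numerical answer can be checked); the full GCGR-G fit would maximize ℓ(Ψ) over three parameters with one of the optimizers named above.

```python
import math

def loglik_exponential(lam, data):
    """Exponential log-likelihood l(lam) = n*ln(lam) - lam*sum(y)."""
    return len(data) * math.log(lam) - lam * sum(data)

def golden_maximize(f, lo, hi, tol=1e-9):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

data = [0.5, 1.2, 0.3, 2.1, 0.9, 1.7]  # hypothetical observations
mle = golden_maximize(lambda lam: loglik_exponential(lam, data), 1e-6, 50.0)
closed_form = len(data) / sum(data)  # the known exponential MLE, 1/ybar
```

The exponential log-likelihood is strictly concave, so the unimodality assumption of golden-section search holds.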
The maximum likelihood method has a close connection to information theory.The likelihood function can be interpreted as a measure of the information contained in the data about the unknown parameters.Maximizing the likelihood is equivalent to maximizing the information extracted from the data.This connection to information theory has led to several developments in statistical theory and provides a theoretical foundation for the maximum likelihood method.

Simulation study and assessing the estimation method
In this section, we would like to highlight the significance of simulation by presenting a full simulation study.
Additionally, we want to evaluate the estimators produced by the MLE approach using specific statistical criteria. Simulation studies are an extremely important component of evaluating the effectiveness of the MLE technique. The following is a list of the most essential reasons why simulations are necessary in this setting: I. Simulations allow us to compare the estimated values derived by MLE with the actual parameter values in a controlled environment. By repeating the simulation process many times, we are able to determine the bias of the estimator, i.e., the systematic difference between the estimated values and the true values. Simulations help researchers understand how MLE behaves in a variety of settings and with varying sample sizes, so that any biases present can be identified and addressed.
II. The term "efficiency" refers to the accuracy and dependability of the estimator in determining the true values of the parameters being estimated. Simulations offer a method for determining whether MLE is efficient by analyzing the degree to which estimated values vary across a number of different iterations of the simulation.
Researchers are able to better comprehend the accuracy of the estimations and evaluate the MLE's performance under a variety of circumstances as a result of this.
III. Consistency is a desired quality in an estimator: as the sample size rises, the estimated values should converge to the true parameter values. By producing data sets of varied sizes and checking whether the estimated values approach the true values as the sample size expands, simulations can assist in verifying this property. The consistency of the MLE is a basic characteristic, and simulations are helpful in validating it.
IV. Researchers are able to test the robustness of MLE by subjecting it to breaches of the assumptions that underpin the estimation approach. By purposefully introducing deviations from the anticipated model conditions, simulations allow the evaluation of how sensitive MLE is to various kinds of violations. Robustness analysis assists in identifying potential constraints and guides researchers in selecting acceptable estimation methods or, when necessary, in modifying the model's underlying assumptions.
V. Simulations offer a foundation upon which MLE can be compared with many other estimation approaches.
Researchers are able to evaluate the relative effectiveness of various methods of estimation, such as the method of moments and Bayesian estimation, by simulating data sets under predetermined conditions. Simulations help identify the benefits and drawbacks of MLE compared to other methods, which in turn makes it easier to choose the most useful technique for any specific situation.
The following main algorithm is performed to assess the performance of the estimation method under the new compound G family and the IR baseline model:
1) Using the inversion method, where x = F^{-1}(u) and u is generated from the standard uniform distribution, generate N = 500 samples of size n from the GCGR-IR distribution.
2) Compute the MLEs for each of the 500 samples.
3) Compute the standard errors of the estimates (SErEs) of the MLEs for the 500 samples. The SErEs are computed by the well-known method of inverting the obtained information matrix.
4) Compute the biases and mean squared errors (MSEs) for all parameters.
5) Repeat steps 1-4 for n = 50, 60, ..., 500 with an initial value of one for all parameters, then compute the biases and MSEs for each parameter for all n = 50, 60, ..., 500.
As n increases, the biases for each parameter move closer and closer to zero, as can be seen in Figures 1, 2 and 3, and the MSEs for each parameter likewise approach zero as n increases.
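The Monte Carlo scheme above can be sketched in a few lines. Since the GCGR-IR quantile function is not reproduced in this section, the sketch below uses the plain inverse-Rayleigh baseline, with CDF F(x) = exp(-theta/x^2), as a stand-in target; `ir_quantile` and `neg_loglik` are our own illustrative names, and a single batch (one fixed n) is shown rather than the full grid n = 50, 60, ..., 500.

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in baseline: inverse Rayleigh, F(x) = exp(-theta / x^2).
# Structure (inversion sampling -> MLE -> bias/MSE over replications)
# follows steps 1-4 of the algorithm above.

rng = np.random.default_rng(0)
theta_true = 1.0          # initial value of one, as in step 5
N, n = 500, 100           # N replications, each of size n

def ir_quantile(u, theta):
    # Inversion method: solve exp(-theta/x^2) = u  ->  x = sqrt(-theta/log u)
    return np.sqrt(-theta / np.log(u))

def neg_loglik(theta, x):
    # IR log-density: log f(x) = log(2 theta) - 3 log x - theta / x^2
    if theta <= 0:
        return np.inf
    return -np.sum(np.log(2 * theta) - 3 * np.log(x) - theta / x**2)

estimates = []
for _ in range(N):
    u = rng.uniform(size=n)
    x = ir_quantile(u, theta_true)
    res = minimize(neg_loglik, x0=[1.0], args=(x,), method="Nelder-Mead")
    estimates.append(res.x[0])

estimates = np.array(estimates)
bias = estimates.mean() - theta_true
mse = np.mean((estimates - theta_true) ** 2)
print(f"bias = {bias:.4f}, MSE = {mse:.4f}")
```

Both quantities shrink toward zero as n grows, which is the pattern Figures 1-3 report for the full GCGR-IR model.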

Real data modeling for comparing the competing models
One of the most significant activities in statistics is the comparison, or "competition", of probability distributions. It allows researchers to draw inferences about the features of a population and contributes to a better understanding of the data-generating process. Selecting the ideal probability distribution, however, can be a difficult task, especially when several candidate distributions might be applied to a single data set. In this setting, the importance of real-life data modelling for comparing probability distributions cannot be understated. One of the tasks in real-life data modelling is comparing how well various probability distributions fit the actual data sets being modelled. Using this strategy, researchers can identify the optimal distribution for a given data set and draw inferences about the characteristics of the population.
Real data modelling is a crucial instrument for comparing and studying candidate probability distributions because it is grounded in actual data. Researchers can select the most suitable distribution for the data set at hand, draw inferences about the population parameters, and evaluate the adequacy of the fitted distribution. Real data modelling also leads to more accurate and reliable conclusions regarding the population parameters and the underlying data-generating process. Table 1 reports some competing models.
Table 1: Some competing models with authors and their corresponding abbreviations, including TR-IR, the odd log-logistic Inverse Rayleigh (OLOGL-IR), the odd log-logistic generalized Inverse Rayleigh (OLLGZ-IR), and the generalized odd log-logistic Inverse Rayleigh (GOLOGL-IR).
Quantitative analysis, visual inspection, and hybrid methods combining both are all viable options for assessing real-world data sets. The "nonparametric kernel PDF estimation (NK-PDFE)" method for investigating the initial density shape, the "quantile-quantile (QN-QN)" plot for investigating the normality of the data, the "total time in test (TTT)" plot for investigating the initial shape of the empirical HRFs (for more information, see Aarset (1987)), and the "box plot" for investigating the extremes will all be considered.
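The four diagnostics just listed reduce to simple computations on the sample. A minimal numerical sketch on toy data (function and variable names are ours; the TTT transform follows Aarset (1987), and the printed quantities are what the NK-PDFE, QN-QN, TTT, and box plots display):

```python
import numpy as np
from scipy import stats

def scaled_ttt(x):
    # Scaled total-time-on-test transform (Aarset, 1987):
    # G(i/n) = [sum of the i smallest values + (n - i) * x_(i)] / sum(x).
    # A concave TTT curve suggests a monotonically increasing HRF.
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    return i / n, (np.cumsum(x) + (n - i) * x) / x.sum()

rng = np.random.default_rng(1)
x = stats.weibull_min.rvs(2.0, size=100, random_state=rng)  # toy sample

# NK-PDFE: nonparametric kernel estimate of the density shape
kde = stats.gaussian_kde(x)
grid = np.linspace(x.min(), x.max(), 200)
density = kde(grid)

# QN-QN: correlation of sample vs. normal quantiles (near 1 => near-normal)
(osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm")

# TTT: empirical HRF shape
u, g = scaled_ttt(x)

# Box plot quantities: quartiles, IQR fences, and any extreme values
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]

print(f"peak kernel density estimate: {density.max():.3f}")
print(f"normal Q-Q correlation:       {r:.3f}")
print(f"box-plot outliers:            {len(outliers)}")
```

Any plotting library can then draw the four panels from these arrays, mirroring the layout used in Figures 4, 7 and 10.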

Modeling the reliability stress data
The first raw data set consists of 100 measurements of the breaking stress of carbon fibers (in GPa) provided by Nichols and Padgett (2006). Figure 4 displays the NK-PDFE plot (first row, left panel), the TTT plot (first row, right panel), and the box plot and QN-QN plot (second row). The first row, left panel of Figure 4 shows that the distribution of the breaking stress data is asymmetric and bimodal with a heavy right tail. The first row, right panel of Figure 4 shows that the HRF is monotonically increasing. The extreme values in these data are readily apparent in the second row of Figure 4. The third row, right panel of Figure 4 shows that the current data cannot be characterized by theoretical distributions such as the normal, uniform, exponential, logistic, beta, lognormal, and Weibull distributions. Table 2 shows the Cramér-von Mises statistic, the Anderson-Darling statistic, the Kolmogorov-Smirnov statistic, and the P-value for all of the fitted models. The MLEs and associated SEs are tabulated in Table 3. Table 2 shows that the GCGR-IR model has the smallest goodness-of-fit statistics and the largest P-value, with CRVM = 0.061251, ANDR = 0.444536, KG-SM = 0.057898, and P-value = 0.888821. Because of this, we can choose the GCGR-IR as our preferred model. Figure 5 displays the estimated PDF and CDF. The P-P plot and estimated HRF for the current data set are displayed in Figure 6. Figures 5 and 6 show that the GCGR-IR model provides satisfactory fits to the empirical functions.
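The model-ranking statistics used throughout (CRVM, ANDR, KG-SM) can be computed directly from the fitted CDF evaluated at the ordered sample. A minimal sketch, assuming a Weibull fit to toy data in place of the fitted GCGR-IR CDF (`gof_statistics` is our own name; the formulas are the standard ones):

```python
import numpy as np
from scipy import stats

def gof_statistics(cdf_values):
    """CRVM (Cramér-von Mises), ANDR (Anderson-Darling) and
    KG-SM (Kolmogorov-Smirnov) statistics from fitted CDF values."""
    z = np.sort(np.asarray(cdf_values, dtype=float))
    n = len(z)
    i = np.arange(1, n + 1)
    # W^2 = 1/(12n) + sum_i (z_(i) - (2i-1)/(2n))^2
    crvm = 1.0 / (12 * n) + np.sum((z - (2 * i - 1) / (2 * n)) ** 2)
    # A^2 = -n - (1/n) sum_i (2i-1) [ln z_(i) + ln(1 - z_(n+1-i))]
    andr = -n - np.mean((2 * i - 1) * (np.log(z) + np.log1p(-z[::-1])))
    # D = max_i max(i/n - z_(i), z_(i) - (i-1)/n)
    kgsm = np.max(np.maximum(i / n - z, z - (i - 1) / n))
    return crvm, andr, kgsm

rng = np.random.default_rng(2)
x = stats.weibull_min.rvs(1.5, size=100, random_state=rng)  # toy data
shape, loc, scale = stats.weibull_min.fit(x, floc=0)        # fitted model
z = stats.weibull_min.cdf(x, shape, loc=loc, scale=scale)
crvm, andr, kgsm = gof_statistics(z)
print(f"CRVM={crvm:.4f}  ANDR={andr:.4f}  KG-SM={kgsm:.4f}")
```

The model with the smallest CRVM, ANDR, and KG-SM values, and the largest accompanying P-value, is preferred, which is how Tables 2, 4, and 6 are read.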

Modeling the reliability glass fibers dataset
The second data set pertains to the glass fiber strengths described by Smith and Naylor (1987). Figure 7 displays the NK-PDFE plot (first row, left panel), the TTT plot (first row, right panel), and the box plot and QN-QN plot (second row). The first row, left panel of Figure 7 shows that the glass fibers data exhibit an asymmetric bimodal distribution with a heavy right tail. The HRF of the glass fibers exhibits a monotonically increasing trend, as shown in the first row, right panel of Figure 7. As may be seen in the second row, left and right panels of Figure 7, the glass fibers data include some extreme values. The third row, right panel of Figure 7 demonstrates that the glass fiber data cannot be explained by the normal, uniform, exponential, logistic, beta, lognormal, or Weibull distributions.
The CRVM, ANDR, KG-SM, and P-value statistics for all fitted models are shown in Table 4. The MLEs and corresponding SEs are shown in Table 5. Table 4 shows that, compared with the other models, the GCGR-IR model has the smallest CRVM (0.114399), ANDR (0.895333), and KG-SM (0.003421) statistics and the largest P-value (0.700635). As a result, we can choose the GCGR-IR distribution as our preferred model. Figure 8 displays the estimated PDF and CDF. Figure 9 displays the P-P plot and estimated HRF for the glass fiber data. Figures 8 and 9 show that the GCGR-IR model provides a good fit to the empirical functions.

Modeling the medical relief times data
The third data set, the "Wingo data", consists of the reported relief times (in hours) for 50 arthritis patients during a clinical trial. Figure 10 gives a graphical description of these data, and the estimated PDF and CDF are displayed in Figure 11. The P-P plot and estimated HRF for the relief times data are shown in Figure 12. Figures 11 and 12 show that the suggested GCGR-IR model provides a good fit to the empirical functions.

Conclusions
In this work, a new two-parameter compound G family of distributions was introduced. Essential statistical properties, such as the generating function, ordinary moments, and incomplete moments, are derived. The Inverse-Rayleigh model is investigated specifically as the baseline model. The estimation approach most likely to produce accurate results is selected, and numerical simulations are run to compare the different estimation approaches. A number of bivariate and multivariate distributions were generated with the copula method; these new distributions can be of great assistance in modelling bivariate and multivariate data. In addition, three real data sets are used to assess and compare the estimation approaches. The flexibility and importance of the proposed family are demonstrated by three separate applications to observed data, which show that the new G family performs better than some longer-established G families:
I.
The Cramér-von Mises statistic, the Anderson-Darling statistic, the Kolmogorov-Smirnov test statistic, and the accompanying P-value all indicate that the new family performs better than the odd Burr G family, the odd log-logistic G family, the generalized G family, the transmuted G family, the McDonald G family, the Marshall-Olkin G family, and the Beta G family in modeling the breaking stress of carbon fibers.
II.
When it comes to modeling the glass fibers, the proposed family performs better than the odd log-logistic G family and the odd log-logistic generalized G family, according to the Cramér-von Mises statistic, the Anderson-Darling statistic, the Kolmogorov-Smirnov test statistic, and its accompanying P-value.
III.
The performance of the new family is superior to that of the odd Burr G family, the generalized odd log-logistic G family, the odd log-logistic generalized G family, the generalized G family, the Beta G family, and the transmuted G family when it comes to modeling the relief times. The Cramér-von Mises statistic, the Anderson-Darling statistic, the Kolmogorov-Smirnov test statistic, and its accompanying P-value all support this conclusion. Based on the above applications we conclude that:
I.
Compound G families of continuous probability distributions allow us to model and describe real-world data accurately. By identifying the distribution that best fits the data, we gain insights into its underlying behavior and characteristics. This can be crucial for understanding the data's central tendency, variability, and potential outliers.

II.
Compound families of probability distributions enable us to perform statistical inference and estimate parameters. This includes tasks like hypothesis testing, confidence intervals, and maximum likelihood estimation. These techniques help us draw conclusions about the population from which the data is sampled.

III.
In many fields like finance, insurance, and engineering, understanding and quantifying risk is essential. Compound G families of continuous probability distributions can be used to model risks and estimate probabilities of extreme events, which are crucial for risk assessment and management.

IV.
Compound G families of continuous probability distributions can be integrated into decision-making processes. They help assess uncertainty and make more informed decisions, particularly in situations where outcomes are uncertain and risks need to be considered.

V.
Probability distributions are fundamental in forecasting and prediction tasks. By fitting historical data to an appropriate distribution, we can generate probabilistic forecasts for future events or outcomes.

VI.
In quality control and manufacturing, compound continuous probability distributions are used to monitor and control production processes. They help identify deviations and anomalies in the manufacturing process.

VII.
In engineering and reliability studies, compound probability distributions are used to model failure times and lifetimes of products or systems. This information is crucial for design improvements and maintenance planning.
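As a concrete illustration of points III, V, and VII, once a distribution has been fitted, exceedance probabilities and the hazard rate function follow directly from its CDF and PDF. The sketch below uses the plain inverse-Rayleigh baseline, F(x) = exp(-theta/x^2), with an assumed theta chosen only for illustration (the GCGR-IR expressions are not reproduced in this section); the same two helper functions apply to any fitted CDF/PDF pair.

```python
import numpy as np

theta = 2.0  # assumed fitted parameter, for illustration only

def ir_cdf(x, theta):
    # Inverse-Rayleigh CDF: F(x) = exp(-theta / x^2)
    return np.exp(-theta / x**2)

def ir_pdf(x, theta):
    # f(x) = dF/dx = 2 theta x^{-3} exp(-theta / x^2)
    return 2 * theta * x**-3 * np.exp(-theta / x**2)

def survival(x, theta):
    # P(X > x): probability of exceeding a threshold (an "extreme event")
    return 1.0 - ir_cdf(x, theta)

def hazard(x, theta):
    # HRF h(x) = f(x) / S(x): instantaneous failure rate at lifetime x
    return ir_pdf(x, theta) / survival(x, theta)

x0 = 3.0
print(f"P(X > {x0}) = {survival(x0, theta):.4f}")
print(f"h({x0})     = {hazard(x0, theta):.4f}")
```

In a reliability setting, survival() answers "what fraction of units outlive x0?" while hazard() quantifies the instantaneous failure risk used for maintenance planning.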
Here are some future research points in this area: I.
Investigate the extension of the novel geometrically generated Rayleigh family to multivariate distributions. Develop and explore new multivariate distributions with applications in fields like finance, environmental modeling, and engineering.

II.
Apply Bayesian methods for parameter estimation and inference in the context of the new geometric Rayleigh family and copula models. Study the advantages of Bayesian techniques in handling uncertainty.

III.
Explore the capabilities of the proposed models in modeling extreme events, such as natural disasters, financial crises, or rare diseases. Investigate the tail behavior of the distributions and copulas.

IV.
Apply the new models to time series data. Investigate their ability to capture temporal dependencies and trends. Develop time series versions of the proposed distributions and copulas.

V.
Conduct robustness and sensitivity analyses to evaluate how the proposed models perform under different data conditions, parameter settings, and model assumptions.

VI.
Extend the models to incorporate covariates or explanatory variables. Analyze how these factors affect the distributions and dependence structures in real-life data.
Afify et al. (2016a) studied the complementary transmuted geometric G family with modelling of real-life data sets, Afify et al. (2016b) the complementary geometric transmuted G family with properties and applications, and Aryal and Yousof (2017) the exponentiated generalized Poisson family, among the families that have been studied. Cordeiro and de Castro (2011) studied the Kumaraswamy family with some applications; see also Cordeiro et al. (2014).

Figure 4: Graphical description for the breaking stress of carbon fibers data.

Figure 5: Estimated PDF (left application plot) and estimated CDF (right application plot) for the breaking stress of carbon fibers data.

Figure 6: P-P plot (left application plot) and estimated HRF (right application plot) for the breaking stress of carbon fibers data.

Figure 7: Graphical description for the glass fibers data.

Figure 8: Estimated PDF (left application plot) and CDF (right application plot) for the glass fibers data.

Figure 9: P-P plot (left application plot) and estimated HRF (right application plot) for the glass fibers data.
Figure 10 displays the NK-PDFE plot (first panel from the left), a box plot (second panel from the left), the QN-QN plot (second panel from the right), and the TTT plot (first panel from the right). In the first panel on the left of Figure 10, the relief times appear symmetric. The HRF for these data is monotonically increasing, as shown in the right panel of the first row of Figure 10. The second-row left and right panels of Figure 10 show that the relief times have no outlying values. As shown in the third panel from the right of Figure 10, theoretical distributions including the normal, uniform, exponential, logistic, beta, lognormal, and Weibull cannot account for the relief times.

Figure 10: Graphical description for the relief times data.

Figure 11: Estimated PDF (left application plot) and estimated CDF (right application plot) for the relief times data.

Figure 12: P-P plot (left application plot) and estimated HRF (right application plot) for the relief times data.

Table 2: CRVM, ANDR, KG-SM and P-value for the breaking stress of carbon fibers data.

Table 3: MLEs and SEs for the breaking stress of carbon fibers data.

Table 4: CRVM, ANDR, KG-SM and P-value for the glass fibers data.

Table 5: MLEs and SEs for the glass fibers data.

Table 6 displays the CRVM, ANDR, KG-SM, and P-values for all of the fitted models. Table 7 details the MLEs and their corresponding SEs. From Table 6, the minimum values for the GCGR-IR model are CRVM = 0.048111, ANDR = 0.401121, and KG-SM = 0.082471, with P-value = 0.900632. Therefore, the GCGR-IR may be selected as the best model. The estimated PDF and CDF are shown in Figure 11.

Table 6: CRVM, ANDR, KG-SM and P-value for the relief times data.

Table 7: MLEs and SEs for the relief times data.