Multimodal Alpha Skew Normal Distribution: A New Distribution to Model Skewed Multimodal Observations

The multimodal alpha skew normal (MMASN) distribution is proposed for modelling skewed observations in the presence of multiple modes at arbitrary points. To this end, the multimodal skew normal distribution of Chakraborty et al. (2015) is extended by integrating it with the alpha skew normal distribution of Elal-Olivero (2010). The cumulative distribution function (cdf), moments, skewness and kurtosis of the proposed distribution are derived in compact form. The data modelling ability of the proposed distribution is checked on three multimodal data sets from the literature, in comparison with some nested and known distributions. The Akaike Information Criterion (AIC) and the likelihood ratio (LR) test both clearly favor the proposed model over its nested models, as expected.


Introduction
Azzalini (1985) introduced the skew-normal (SN) distribution as a natural extension of the normal distribution by introducing an additional skewness parameter, with probability density function (pdf) given by

$$f(z; \lambda) = 2\,\varphi(z)\,\Phi(\lambda z), \tag{1}$$

where $\varphi(.)$ and $\Phi(.)$ are the pdf and cdf of the standard normal distribution, respectively. Many extensions and generalizations of this distribution have been proposed and studied (for details see Chakraborty and Hazarika (2011), Ali et al. (2008), Arnold and Beaver (2000), Shah et al. (2022), Shah et al. (2023a) and Shah et al. (2023b), among others). We focus on two extensions, namely the alpha skew normal $ASN(\alpha)$ distribution of Elal-Olivero (2010) and the multimodal skew normal (MMSN) distribution of Chakraborty et al. (2015).
Pakistan Journal of Statistics and Operation Research

A random variable Z is said to follow the alpha skew normal distribution of Elal-Olivero (2010) if its density function is given by

$$f(z; \alpha) = \frac{(1 - \alpha z)^2 + 1}{2 + \alpha^2}\,\varphi(z).$$

This distribution, denoted by $ASN(\alpha)$, can be bimodal, unlike the skew normal distribution, which is unimodal.

A random variable Z is said to follow the multimodal skew normal distribution, denoted by MMSN, if its density function is the one given in Chakraborty et al. (2015). This distribution can be multimodal, unlike the $ASN(\alpha)$, and can thus overcome the limitations of both the skew normal and alpha skew normal distributions with respect to multiple modality.
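As a rough illustration of the bimodality of the alpha skew normal density of Elal-Olivero (2010), the following sketch evaluates that density on a grid and counts its local maxima. The function names, the grid limits and the choice $\alpha = 2$ are assumptions made here for illustration only; they are not taken from the article.

```python
import math


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def asn_pdf(z, alpha):
    """Alpha skew normal density: ((1 - alpha*z)^2 + 1) / (2 + alpha^2) * phi(z)."""
    return ((1.0 - alpha * z) ** 2 + 1.0) / (2.0 + alpha ** 2) * std_normal_pdf(z)


def count_modes(pdf, lo=-8.0, hi=8.0, n=16001):
    """Count local maxima of pdf on an evenly spaced grid (illustrative helper)."""
    step = (hi - lo) / (n - 1)
    ys = [pdf(lo + i * step) for i in range(n)]
    # A grid point is a mode if the curve rises into it and does not rise after it.
    return sum(1 for i in range(1, n - 1) if ys[i - 1] < ys[i] >= ys[i + 1])


def integrate(pdf, lo=-12.0, hi=12.0, n=24001):
    """Trapezoidal rule, used only to check that the density integrates to one."""
    step = (hi - lo) / (n - 1)
    total = 0.5 * (pdf(lo) + pdf(hi))
    total += sum(pdf(lo + i * step) for i in range(1, n - 1))
    return total * step


if __name__ == "__main__":
    for alpha in (0.0, 2.0):
        modes = count_modes(lambda z: asn_pdf(z, alpha))
        print(f"alpha = {alpha}: {modes} mode(s)")
```

With $\alpha = 0$ the density reduces to the standard normal and has a single mode, while $\alpha = 2$ gives two modes; no choice of $\alpha$ produces more than two, which is the limitation the MMASN is meant to remove.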
In this article we propose a new extension which includes both the ASN and the MMSN as particular cases and call it the multimodal alpha skew normal (MMASN) distribution. This new distribution is derived by integrating the ideas of the ASN and the MMSN so as to overcome both the periodic nature of the MMSN's multiple modes and the inability of the ASN to accommodate more than two modes. The prime motivation behind developing the MMASN is to do away with the periodic multimodality of the MMSN and to offer multiple modes at arbitrary locations. The distribution is shown to be flexible enough to support unimodality, bimodality and multimodality, and thus provides an improved fit compared with its sub-models and some other distributions when dealing with multimodal data.
To understand the relevance of the issue raised in the last paragraph, let us look at a real-life data set, which gives the Otis IQ scores for 52 non-white males. It is easy to see from Figure 1 that these data are multimodal with 6 modes. Naturally the ASN, which is at most bimodal, fails to fit the data well, as can be seen in Figure 2. We then tried to fit the MMSN, which is multimodal but with an inherently periodic pattern of modes. This distribution also fails to fit the data correctly, as is apparent from Figure 3. This motivated us to propose a new distribution which can accommodate multimodality at arbitrary locations in order to fit such data sets. We discuss three such multimodal data fitting examples in Section 5, including the one above, to establish the benefit of the proposed distribution.

The article is organized as follows. In the next section, we define the proposed distribution and derive its basic properties such as the cdf, moments, skewness and kurtosis. Characterizations of the MMASN distribution via a simple relationship between two truncated moments are given in Section 3. In Section 4, parameter estimation is discussed and a simulation study is conducted. The real-life comparative data modelling with the proposed distribution and the LR test among the nested models are provided in Section 5. The article ends with concluding remarks in Section 6.

Multimodal Alpha Skew Normal Distribution
In this section we introduce a new multimodal skewed extension of the skew normal distribution and investigate some of its basic properties.

It is observed from Figure 4 that as one parameter increases the number of peaks also increases, as another increases the curve tends to the normal curve (but not in all cases), and the skewness is positive or negative according to the sign of a parameter. Also, the pdf approaches the normal pdf as two of the parameters tend to zero. So the three parameters play an important role in determining the shape of the proposed distribution.

Theorem 1:
The cdf of the proposed distribution is expressed in terms of $\varphi(.)$ and $\Phi(.)$, which respectively are the pdf and cdf of the standard normal distribution, together with a characteristic function and an mgf (Chakraborty et al., 2015). By substituting n = 1, 2, 3, etc. in (8), one can easily obtain the moments; in particular, n = 1 gives the mean.

Skewness and Kurtosis
The skewness is measured by Pearson's $\beta_1$, and the distribution can be positively or negatively skewed. The values of $\beta_1$ are plotted in Figure 5 against one parameter for different values of the other two.

From Figure 5, it is observed that $\beta_1$ increases as the parameter increases and remains constant beyond 5. From Figure 6, it is observed that $\beta_2$ likewise increases as the parameter increases and remains constant beyond 5.

Characterization of MMASN distribution
This section is devoted to characterizations of the MMASN distribution via a simple relationship between two truncated moments. The characterization applies a theorem of Glänzel (1987), given below as Theorem 3. Clearly, the result holds as well when H is not a closed interval. This characterization is stable in the sense of weak convergence (Glänzel, 1990).

Theorem 3: Let Z be a continuous random variable with the distribution function F, and let k and h be two real functions defined on H such that the required relationship between the two truncated moments holds and F is a twice continuously differentiable and strictly monotone function on the set H. Finally, assume that the associated equation in k and h has no real solution in the interior of H. Then F is uniquely determined by these functions.

Let Z be a continuous random variable and let h(z) be as in Proposition 1. Then Z has pdf (4) if and only if there exist functions, defined as in Theorem 3, satisfying a first-order differential equation whose general solution involves a constant D. A set of functions satisfying this differential equation is presented in Proposition 1 and satisfies the conditions of Theorem 3.

Maximum Likelihood Estimation and Simulation
A location and scale extension of Z is obtained by introducing a location parameter and a scale parameter, which gives the five-parameter form of the MMASN density. The corresponding normal equations and information matrix are provided in Appendix D.

Simulation
In order to study the efficiency of the maximum likelihood estimates of the parameters of the MMASN distribution, random samples are generated by a stepwise algorithm whose last step either accepts the generated value or, otherwise, steps back to Step I and continues the process. The number of replications is r = 1500. For each sample, the MLEs were computed using the GenSA package in R, and then the bias and MSE were computed.
From the simulation results presented in Tables 1, 2 and 3 (in Appendix E), it is observed that the average bias and MSE gradually decrease as the sample size increases, as expected.
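The bias and MSE computation described above can be sketched as follows. Because the MMASN density is not reproduced here, the sketch uses a normal model with closed-form MLEs purely as a stand-in; the function names, the seed and the true parameter values are assumptions for illustration, and only the replication count r = 1500 and the sample sizes 100, 300 and 500 mirror the study.

```python
import math
import random


def normal_mle(sample):
    """Closed-form MLEs (mean, standard deviation) of a normal sample.

    Stand-in for the numerically computed MMASN estimates.
    """
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    return mu, sigma


def bias_and_mse(true_params, n, r=1500, seed=0):
    """Average bias and MSE of the MLEs over r replicated samples of size n."""
    rng = random.Random(seed)
    mu, sigma = true_params
    p = len(true_params)
    sum_err = [0.0] * p
    sum_sq = [0.0] * p
    for _ in range(r):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        est = normal_mle(sample)
        for j in range(p):
            err = est[j] - true_params[j]
            sum_err[j] += err
            sum_sq[j] += err * err
    bias = [s / r for s in sum_err]
    mse = [s / r for s in sum_sq]
    return bias, mse


if __name__ == "__main__":
    for n in (100, 300, 500):
        bias, mse = bias_and_mse((0.0, 1.0), n)
        print(n, [round(b, 4) for b in bias], [round(m, 4) for m in mse])
```

For a real study the stand-in `normal_mle` would be replaced by a numerical optimizer applied to the MMASN log-likelihood, and the sampler by one for the MMASN distribution.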

Data Modelling Applications
We provide three applications of the proposed distribution to real data to illustrate its flexibility and usefulness.
The first data set concerns the Otis IQ scores for 52 non-white males hired by a large insurance company in 1971, given in Roberts (1988). Using the GenSA package in R, the MLEs of the parameters are obtained by a global numerical optimization routine. In order to compare the models, we consider the AIC as the model selection criterion.
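To give a feel for likelihood maximization by a stochastic global search, the sketch below fits a location-scale alpha skew normal model with a plain simulated annealing loop. This is not the generalized simulated annealing algorithm of the GenSA package, and it fits the ASN rather than the MMASN; the data are synthetic, and every name, the cooling schedule and the step size are assumptions chosen for illustration.

```python
import math
import random


def asn_loc_scale_logpdf(x, alpha, mu, sigma):
    """Log-density of mu + sigma * Z, where Z follows ASN(alpha)."""
    z = (x - mu) / sigma
    return (math.log((1.0 - alpha * z) ** 2 + 1.0) - math.log(2.0 + alpha ** 2)
            - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma))


def log_lik(theta, data):
    """Log-likelihood of theta = (alpha, mu, sigma); -inf outside sigma > 0."""
    alpha, mu, sigma = theta
    if sigma <= 0.0:
        return -math.inf
    return sum(asn_loc_scale_logpdf(x, alpha, mu, sigma) for x in data)


def anneal_mle(data, start, n_iter=2000, temp0=1.0, step=0.2, seed=0):
    """Plain simulated annealing with a Metropolis acceptance rule."""
    rng = random.Random(seed)
    current, current_ll = list(start), log_lik(start, data)
    best, best_ll = list(current), current_ll
    for k in range(n_iter):
        temp = temp0 / (1.0 + k)  # simple cooling schedule (assumed)
        proposal = [c + rng.gauss(0.0, step) for c in current]
        prop_ll = log_lik(proposal, data)
        accept = prop_ll >= current_ll or \
            rng.random() < math.exp((prop_ll - current_ll) / temp)
        if accept:
            current, current_ll = proposal, prop_ll
            if current_ll > best_ll:
                best, best_ll = list(current), current_ll
    return best, best_ll


def synthetic_bimodal_sample(n=200, seed=1):
    """Synthetic two-component normal mixture, used only as demo data."""
    rng = random.Random(seed)
    return [rng.gauss(-1.5, 0.6) if rng.random() < 0.4 else rng.gauss(1.0, 0.8)
            for _ in range(n)]


if __name__ == "__main__":
    data = synthetic_bimodal_sample()
    start = [0.0, 0.0, 1.5]
    theta_hat, ll_hat = anneal_mle(data, start)
    print("start log-lik:", round(log_lik(start, data), 3))
    print("estimate:", [round(t, 3) for t in theta_hat], "log-lik:", round(ll_hat, 3))
```

The loop keeps the best point visited, so the returned log-likelihood is never below that of the starting value; a dedicated global optimizer would normally be preferred for multimodal likelihoods.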
Table 5 shows the MLEs, log-likelihood and AIC values of the above-mentioned distributions. A graphical representation of the results, restricted to the top three competitors of the proposed model, is given in Figure 7.
It is found from Table 5 that the proposed five-parameter MMASN distribution provides the best fit to all three data sets in terms of AIC. The plots of observed and expected densities presented in Figure 7 confirm these findings. It is important to note that the proposed distribution captures the multiple modes in all three examples much better than the others.
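For reference, the AIC comparison can be sketched as below, using the usual definition AIC = 2k - 2 log L with k the number of fitted parameters. The model names and log-likelihood values in the usage lines are hypothetical placeholders, not the values of Table 5.

```python
def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def rank_by_aic(fits):
    """Sort model names by increasing AIC.

    fits maps a model name to a (log_lik, n_params) pair.
    """
    return sorted(fits, key=lambda name: aic(*fits[name]))


if __name__ == "__main__":
    # Hypothetical fits for illustration only.
    fits = {"model_A": (-120.0, 2), "model_B": (-112.0, 5)}
    for name in rank_by_aic(fits):
        print(name, aic(*fits[name]))
```

The penalty term 2k means that a model with more parameters is preferred only if its log-likelihood gain outweighs the extra parameters.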

Likelihood Ratio
The results of the LR tests are shown in Table 6. From Table 6, we observe that in seven out of the nine test cases the value of the LR test statistic exceeds the corresponding critical value at the 5% level of significance. Thus, in those cases there is evidence in support of the alternative hypothesis, and we may conclude that the sampled data come from the proposed MMASN distribution.
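The decision rule used in such a test can be sketched as follows, assuming the standard form of the statistic, LR = -2(log L0 - log L1), compared with the upper 5% point of a chi-square distribution whose degrees of freedom equal the number of parameters restricted under the null. The log-likelihood values in the usage lines are hypothetical, and the function names are chosen here for illustration.

```python
# Upper 5% points of the chi-square distribution (standard table values).
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def lr_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic -2 (log L0 - log L1)."""
    return -2.0 * (loglik_null - loglik_alt)


def lr_test(loglik_null, loglik_alt, df):
    """Return the statistic and whether the null is rejected at the 5% level."""
    stat = lr_statistic(loglik_null, loglik_alt)
    return stat, stat > CHI2_CRIT_5PCT[df]


if __name__ == "__main__":
    # Hypothetical log-likelihoods of a nested model and the full model.
    stat, reject = lr_test(loglik_null=-110.0, loglik_alt=-105.0, df=2)
    print(stat, reject)
```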

Conclusion
In this article, a new family of skew distributions is introduced which can model unimodal, bimodal and multimodal data. Some of its distributional properties are investigated. A simulation study has been conducted to study the behaviour of the MLEs. The numerical results show that the five-parameter MMASN distribution provides a better fit than the other known distributions applied here. The methodology of this article can be applied to extend the logistic and Laplace distributions, which will be considered in follow-up work. Further, bivariate generalizations and logarithmically transformed distributions can also be considered in future.
D: Normal equations and information matrix

Solving the normal equations simultaneously would give the estimates of the parameters, but solving them is not mathematically tractable. Hence the maximum likelihood estimates of the five parameters are obtained numerically. The generalized simulated annealing algorithm implemented in the GenSA package in R is used for the numerical optimization. The variance-covariance matrix of the estimators can be obtained by inverting the Fisher information matrix (I). While it is not easy to get closed-form expressions for the elements of I, they can be well approximated by substituting the parameters by their corresponding MLEs.
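One common way to carry out this approximation is to evaluate the Hessian of the negative log-likelihood numerically at the MLE (the observed information) and invert it. The sketch below does this for a normal model, used here only as a stand-in for the MMASN likelihood; the finite-difference step, the function names and the synthetic sample are assumptions for illustration.

```python
import math
import random


def neg_log_lik(theta, data):
    """Negative log-likelihood of N(mu, sigma^2); stand-in for the MMASN one."""
    mu, sigma = theta
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return n * math.log(sigma) + 0.5 * n * math.log(2.0 * math.pi) + ss / (2.0 * sigma ** 2)


def normal_mle(data):
    """Closed-form MLEs of the stand-in model."""
    n = len(data)
    mu = sum(data) / n
    return [mu, math.sqrt(sum((x - mu) ** 2 for x in data) / n)]


def numerical_hessian(f, theta, h=1e-4):
    """Central-difference Hessian of f at theta."""
    p = len(theta)
    hess = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di * h
                t[j] += dj * h
                return f(t)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return hess


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def covariance_from_hessian(f, theta_hat):
    """Approximate variance-covariance matrix: inverse observed information."""
    return invert(numerical_hessian(f, theta_hat))


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(2.0, 1.5) for _ in range(200)]
    theta_hat = normal_mle(data)
    cov = covariance_from_hessian(lambda t: neg_log_lik(t, data), theta_hat)
    print([math.sqrt(cov[i][i]) for i in range(2)])  # approximate standard errors
```

For the stand-in model the result can be checked against the known values, namely variance of the location estimate close to the squared scale estimate divided by n, which is why it is a convenient test case before moving to a five-parameter likelihood.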

Figure 1. Histogram of the data.
Figure 2. Histogram and fitted ASN of the data.

The values of $\beta_2$ are plotted in Figure 6 against one parameter for different values of the other two.

Figure 5. Plots of skewness of the MMASN distribution.

Applying this method, a simulation study was conducted for sample sizes n = 100, 300 and 500 with different combinations of the true values of the three shape parameters, for fixed values of the remaining parameters.

It may be worth noting that the other related distributions we have fitted, namely the alpha-skew-logistic distribution of Hazarika and Chakraborty (2014), the alpha-skew-Laplace distribution of Harandi and Alamatsaz (2013), the alpha-beta-skew-normal and beta-skew-normal distributions of Shafiei et al. (2016), the generalized alpha skew normal distribution of Sharafi et al. (2017), the Balakrishnan alpha skew normal distribution of Hazarika et al. (2020), the log-Balakrishnan alpha skew normal distribution of Shah et al. (2020a), the Balakrishnan alpha skew logistic distribution of Shah et al. (2020b) and the Balakrishnan alpha skew Laplace distribution of Shah et al. (2020c), have not been reported here, as all of them also perform worse than the proposed one.
from the other distributions considered in seven out of nine tests.

Figure 7. Plots of observed and expected densities of Data Sets I, II and III.

Table 5. MLEs, log-likelihood and AIC for Data Sets I, II and III.

Table 6. The values of the LR test statistic for different hypotheses.