Kumaraswamy regression modeling for Bounded Outcome Scores

In this paper, we use a regression model for modeling bounded outcome scores (BOS), where the outcome is Kumaraswamy distributed. Similar to the Beta distribution, this distribution can take a variety of shapes while being computationally easier to use. Thus, it is deemed as a suitable alternative distribution to the Beta in modeling bounded random processes. In the proposed model, the median of a bounded response is modeled by the linear predictors which are defined through regression parameters and explanatory variables. We obtained the maximum likelihood estimates (MLEs) of the parameters, provided closed-form expressions for the score functions and Fisher information matrix, and presented some diagnostic measures. We conducted Monte Carlo simulations to investigate the finite-sample performance of the MLEs of the parameters. Finally, two practical applications of this model to the real data sets are presented and discussed.


Introduction
Bounded outcome scores (BOS) are measures restricted to a finite interval and are common in many health and medical disciplines. Visual Analogue Scale (VAS), which is often used to assess the severity of pain in pain relief studies (Wewers and Lowe, 1990) ; the PASI score that is used to record intensity psoriasis (Hu, Yeilding, Davis, and Zhou, 2011); the Barthel index for Activities of Daily Living (ADL) in stroke patients (Lesaffre, Rizopoulos, andTsonaka, 2006); quality-of-life (QoL) measures in survey studies involving health assessment (Arostegui, Núñez-Antón, and Quintana, 2007; Hunger, Baumret, and Holle, 2011; Hunger, Döring, and Holle, 2012) are wellknown instances of BOS measure outcomes. The distributions of these outcomes can often take different shapes including unimodal, U-shaped, and J-shaped. Due to the variety of their distributions, BOS typically display heteroscedasticity, where the variance is smaller near the extremes, and asymmetry. Therefore, usual regression models such as simple linear or nonlinear regression models, are not suitable for these outcomes (Kieschnick and McCullough, 2003 Xu et al., 2013). Beta distribution is a well-known distribution which is widely used to model double bounded data. The popularity of Beta distribution lies in the high flexibility of its density function which can take many shapes (e.g. unimodal, uniantimodal, increasing, decreasing, and uniform) through changing its parameter. Beta regression model, introduced by Ferrari and Cribari-Neto (Ferrari and Cribari-Neto, 2004) has been used in various disciplines to model random variables limited on a finite interval. An alternative to the Beta distribution, which possesses similar characteristics, yet easier to use computationally is Kumaraswamy ("Kum for short") distribution (Kumaraswamy, 1980). Kum distribution was proposed to model hydrological data, but is in fact useful for modeling many other outcomes that have lower and upper bounds such as heights of individuals, scores obtained in a test, atmospheric temperatures, and so on. A random variable ̃ following a Kum distribution on the interval [ , ] with respective shape parameters p > 0 and q > 0, is denoted by ( , , , ) and has a density function given by The mean and variance of this distribution are respectively give by: and where (. , . ) is the Beta function.
The Kum distribution has many of the properties of the Beta distribution with some advantages in terms of tractability. Jones (Jones, 2009) specified the basic properties of the Kum distribution and discussed some similarities and differences between the Beta and Kum distributions. According to him, both distributions are very flexible and can take many different shapes depending on the values of their parameters. The author also emphasized several advantages of the Kum distribution over the Beta distribution: the normalizing constant is very simple; the distribution has a simple explicit formulae and quantile functions which do not involve any special functions; a simple formula for random variate generation; explicit formulae for moments of order statistics and L-moments. In addition, Jones noted that the Beta distribution has the following advantages over the Kum distribution: simpler formulae for moments and moment-generating function; a one-parameter sub-family of symmetric distributions; simpler moment estimation and more ways of generating the distribution via physical processes. Due to the similarity of Kum and Beta distributions and some advantages of Kum to the Beta distribution, including its explicit expression of the distribution and quantile functions, the Kum distribution has attracted some attention in the literature (Cordeiro, Nadarajah, and Ortega, 2012; Cordeiro, Ortega, and Nadarajah, 2010; de Pascoa, Ortega, and Cordeiro, 2011; Kızılaslan and Nadar, 2016). This study aimed to use the Kum distribution for modeling BOS. However, since in the standard parameterization of the Kum distribution, shape parameters are functions of the covariates, formulation of regression modeling and interpretation of the parameters are more difficult. In addition, the lack of simple closed-form expressions for the mean and variance of the Kum distribution, given in (2) and (3), has limited its use for modeling purposes. On the other hand, the median of this distribution has the following simple expression, Therefore, we consider the median-based re-parameterization proposed by (Mitnik and Baek, 2013) aiming to facilitate its use in regression models. In order to develop the proposed model, we will follow the generalized linear models (McCullagh and Nelder, 1989) and Beta regression (Ferrari and Cribari-Neto, 2004) methodologies,; although as stated before, we will be modeling the median instead of the mean. The paper unfolds as follows. Section 2 presents the Kum regression model and discusses the maximum likelihood estimation of this model. A Monte Carlo simulation study for assessing the finite sample performance of the conditional maximum likelihood approach is performed in Section 3. Finally, two applications using real data are presented in Section 4 and concluding remarks are given in Section 5.
We use the general theory of maximum likelihood estimation. The estimates in this approach have properties such as asymptotic normality, consistency, and efficiency (see Ferguson, 2017;and Millar, 2011). Let a random variable ̃ has the Kum distribution with density given by (1). With = (̃− )/( − ), its medianbased re-parameterization density function (Mitnik and Baek, 2013) is denoted by ( , ) which is given by where = (̃− )/( − ), 0 < < 1, and p > 0. This form of density function allows us to use Kum distribution and develop models analogous to the Beta regression models. With a Kum-distributed dependent variable, ~ ( , ), t=1,…,n ; the model is obtained by assuming that the median of can be written as: is the linear predictor, = ( 1 , … , ) is a vector of unknown regression parameters ( ∈ ℝ ) and 1 , … , are observations on k known covariates ( < ). Moreover, we assume that the link function (. ): (0,1) ⟶ ℝ is a strictly monotonic and twice differentiable. Note that a number of various link functions such as logit, probit, log-log, and complementary log-log can be used. A comparison of these link functions can be found in Atkinson (1985, Ch . 7) and (Atkinson and Atkinson, 1985;McCullagh and Nelder, 1989). Our focus will be on the logit link, ( ) = log ( 1− ), which is one of the more popular link function that can be used. The popularity of the logit function is due to its straightforward interpretation, since the resulting regression coefficients can be interpreted as odds ratios. Let = ( 1 , … , ) be a random sample of size n from Kum distribution and = ( , ) be a ( + 1)-dimensional parameter vector. The log-likelihood function for the Kum regression model can be written as:  (6) is a function of β. The MLE can be calculated by maximizing the log-likelihood function (7). The components of the score vector are obtained by differentiating the log-likelihood function with respect to the unknown parameters and . The derivative of the log-likelihood function with respect to the i th element of , is given, for i=1, . . ., k, as = , and thus , and The matrix form of (9) is given by and by using (10), it can be written as follows: ). An iterative optimization algorithm requires initial value. The starting value of the regressors parameter β was selected from the maximum likelihood estimate from a Beta regression.
In the next step, we obtain an expression for Fisher information matrix as minus the expected value of the second derivatives of the log-likelihood. It is shown in Appendix A that Fisher information matrix is given by:

Numerical results
In this section, we present the results of Monte Carlo simulation, where we study the finite-sample distributions of the MLEs of the parameters and along with the Kum regression model in which (ω t ) = 0 + 1 1 + 2 2 , = 1, … , , where (. ) is the logit link function.
The covariate values are obtained as random draws from the following distributions: covariate 1 is generated from a continuous Uniform U (1, 5) distribution, while 2 is generated from a Binomial distribution with three number of trials and 0.5 success probability in each trial (x 2t~( 3,0.5)). The covariate values remain constant throughout the simulations. We take 0 = −6.0, 1 = 1.0 and 2 = 2.0, which resulted in median values between zero and one. The value of p is also set at p = 2. We take samples of sizes n = 50, 100, 200, and 500. For each sample size, we perform 1000 simulations and compute the mean of the estimates, percentage relative bias and mean square errors. It should be noted that the percentage relative bias is defined as the ratio between the bias and the true parameter value times 100. Table 1 presents the simulation results, and the plot of corresponding percentage relative bias (RB %) for the different sample sizes n, is shown in Figure 1. We note that the mean of estimates is close to the true parameter values, the percentage relative bias and mean square errors decrease as sample size increases. Overall, we observe that the estimates improve with increasing sample size.

Application
In this Section, we present two applications of the Kum regression model described in Section 2.

Health-related quality of life in patients with epilepsy
The first application is based on an empirical dataset that comes from a longitudinal study on the quality of life in patients with epilepsy in Iran from March 2014 to December 2015. Epilepsy is one of the most common chronic brain disorders that has a considerable impact on person's QoL. We will examine the relationship between medication adherence and seizure severity on QoL. Medication Adherence and seizure severity were assessed by the Medication Adherence Report Scale (MARS-5) and the Liverpool Seizure Severity Scale (LSSS) questionnaires, respectively. The quality of life in epilepsy (QOLIE-31) was used for measuring health-related QoL in epileptic patients. QOLIE-31 has seven sub-scales, which are seizure worry, cognitive functioning, energy/fatigue, emotional well-being, social functioning, medication effects, and overall QoL (Cramer et al., 1998).The subscales are constrained to a bounded interval (e.g. at 0 and 1) and a higher score indicates better QoL.
Our interest is to model one of the sub-scales of QoL, namely social functioning, as a function of medication adherence and seizure severity. Table 2 lists some summary statistics for the social functioning. We note that the median of social functioning is greater than its mean, indicating left skewness of its distribution. Given that the outcome, social functioning, is bounded between 0 and 1 and its distribution is negatively skewed, the traditional regression models may not be suitable for regression modeling. Therefore, the Beta and Kum regression models which is proposed in Section 2 will be used instead. We determine the MLEs of the model parameters using the quasi-Newton optimization algorithm known as BFGS with analytical derivatives. The choice of starting values for the unknown parameters is based on the suggestion made in the second section. The MLEs of the model parameters, their estimated standard errors, and the Akaike Information Criterion (AIC) are presented in Table 3. We note that the AIC value for the Kum model is smaller than the AIC value for Beta model. So, based on the AIC criterion, the Kum distribution is preferable in fitting these data than the Beta distribution.

Food expenditure
The next application is based on the food expenditure data analyzed by (Ferrari and Cribari-Neto, 2004), the source of the data is (Griffiths, Carter Hill, and Judge, 1993). We model the proportion of income spent on food as a function of the level of income and the number of persons in the household using Beta and Kum regression models. The link function used is the logit link.
Estimates of the parameters, their standard errors, and the Akaike Information Criterion (AIC) are given in Table 4. The values in Table 4 show that the estimated coefficients associated with the covariates are quite similar in both models. The lower AIC value for the Kum model indicates that this model is a better fit for the current data than the Beta model.

Concluding remarks
In this paper we introduced a regression model for the double bounded dependent outcomes. The underlying assumption was that the response variable follows a Kum distribution. In the proposed regression model, the median of the dependent variable was modeled by a linear predictor that was defined by regression parameters and explanatory variables. Inference about the Kum regression model was discussed and the parameter estimation was performed via maximum likelihood. The closed-form expressions were obtained for the score functions and the Fisher's information matrix. Based on the asymptotic normality of the maximum likelihood estimator, confidence interval and hypothesis testing were obtained. Furthermore, some diagnostic measures were provided. A Monte-Carlo simulation study was performed to evaluate the finite-sample performance of the MLEs of β and p with the Kum regression model. The simulation results showed that the MLE performs well even for small sample sizes. Finally, to illustrate its usefulness, two applications of the Kum regression model to the empirical real data were presented, where one of them was based on a previously published results. We hope the current model could attract applications in several areas.

Appendix A
In this appendix we provide observed information matrix for ( , ) in the class of kum regression models. The information matrix can be obtained as minus the expected value of the second derivatives of the log-likelihood.

Observed information matrix
The second-order partial derivative of equation (7) with respect to s is given by = . .
In matrix notation, we have  Finally, the second derivative of ( , ) with respect to comes by differentiating the expression in equation (12) with respect to p. Therefore ( 2 ( , ) 2 ) = ∑ =1 ,which, in matrix notation, can be written as Now, the Fisher information matrix for ( , ) as given in (13), can be obtained.