Bayesian Nonlinear Latent Variable Models with Mixed Non-normal Variables and Covariates for Multi-sample Psychological Data

The purpose of this paper is to develop a latent variable model with nonlinear covariates and latent variables. Mixed ordered categorical and dichotomous variables and covariates, with two types of thresholds (equally and unequally spaced), are analysed in Bayesian multi-sample nonlinear latent variable models, and the Gibbs sampling method is applied for estimation and model comparison. Hidden continuous normal distributions (a censored normal distribution and a truncated normal distribution with known parameters) are used to handle the mixed ordered categorical and dichotomous manifest variables. A hidden continuous normal distribution (truncated normal distribution with known parameters) is used to handle the mixed ordered categorical and dichotomous covariates. The statistical analysis, which involves estimating the parameters, their standard deviations, and their highest posterior density intervals, is discussed. The proposed procedure is illustrated using psychological data, with the results obtained from the OpenBUGS program.


Introduction
Latent variable models (LVMs) (Lee, 2007) are a statistical technique for modelling correlated data in order to estimate the interrelationships among manifest and latent variables. In recent years, many researchers have proposed models that contain nonlinear terms among the manifest and latent variables, among them Lee and Song (2003). Lee and Song (2002) offer a specific method for applying the Bayesian approach in factor analysis. They developed a procedure that produces joint Bayesian estimates of the factor scores and the structural parameters subject to the specified constraints, so that all of these quantities are estimated simultaneously. This procedure, which combines the Metropolis-Hastings algorithm with the Gibbs sampler, has been shown to be efficient in computing these estimates. Song and Lee (2006b) developed a Bayesian nonlinear latent variable model with nonlinear fixed covariates and latent variables in the structural model and linear fixed covariates and latent variables in the measurement model. Mixed continuous and dichotomous data are used in that study, and a hidden continuous normal distribution (truncated normal with unknown parameters) is used to handle the dichotomous data. Lee (2007) used an underlying latent continuous normal distribution (truncated normal distribution with unknown parameters) to handle ordered categorical variables in Bayesian multi-sample nonlinear latent variable models, with the Gibbs sampling method used to estimate the parameters. Multi-sample analysis is very important in many applications, for instance cross-cultural research. In practice, nonlinear effects such as quadratic and interaction effects among the covariates and latent variables are important in establishing substantive theory in many areas.
The rapid growth of LVMs is due to the demand for subtle models and related statistical methods for solving complex research problems in various fields. The Bayesian approach is developed with the Gibbs sampler algorithm (Geman and Geman, 1984), and the hidden continuous normal measurements and the latent variables in the multiple groups are treated as hypothetical missing data. Non-informative priors are used for the thresholds (cut points with equal and unequal distances) and conjugate priors are used for the structural parameters. The main objective of this paper is to propose a Bayesian approach for analysing multi-sample nonlinear LVMs with mixed variables and covariates. The Deviance Information Criterion (DIC; see Spiegelhalter et al., 2002) is used for model comparison. The main idea is to handle the mixed variables and covariates in the Bayesian analysis by treating the hidden continuous measurements as missing data and augmenting the observed data with them in the posterior analysis. The paper is organized as follows. The model is described in Section 2. The Bayesian estimation of multi-sample latent variable models that contain nonlinear terms is described in Section 3. The comparison of models using the DIC is described in Section 4. A case study of psychological data is presented in Section 5. The results and discussion are given in Section 6, and some concluding remarks are given in Section 7.

Model Description
The latent variable models considered here have linear covariates and latent variables in the measurement equation, and nonlinear covariates and nonlinear latent variables in the structural equation. For the $g$th group, the following LVM is considered:
$$v_i^{(g)} = A^{(g)} c_i^{(g)} + \Lambda^{(g)} \omega_i^{(g)} + \varepsilon_i^{(g)}, \tag{1}$$
where $v_i^{(g)} = \{x_i^{(g)}, y_i^{(g)}\}$ is a $p \times 1$ random vector of manifest variables, $x = (x_1, x_2, \ldots, x_r)^T$ is a subset of variables whose exact continuous measurements are unobservable, $y = (y_1, y_2, \ldots, y_s)^T$ is the remaining subset of variables, with $p \ge s = p - r \ge 0$, whose corresponding continuous measurements are also unobservable, $c_i^{(g)}$ is a vector of linear covariates with coefficient matrix $A^{(g)}$, $d_i^{(g)}$ ($r_1 \times 1$) is a vector of mixed ordered categorical and dichotomous covariates, $\Lambda^{(g)}$ ($p \times q$) is a matrix of unknown parameters, usually called the factor loading matrix, $\omega_i^{(g)}$ is a $q \times 1$ random vector of latent variables, and $\varepsilon_i^{(g)}$ is a $p \times 1$ random vector of residuals. It is assumed that the $\varepsilon_i^{(g)}$ are independently distributed as $N[0, \Psi_\varepsilon^{(g)}]$, where $\Psi_\varepsilon^{(g)}$ is a diagonal matrix with diagonal elements $\psi_{\varepsilon 1}^{(g)}, \ldots, \psi_{\varepsilon p}^{(g)}$, and that $\omega_i^{(g)}$ and $\varepsilon_i^{(g)}$ are independent.
To handle more complex situations, the latent vector $\omega_i^{(g)}$ is partitioned into $(\eta_i^{(g)T}, \xi_i^{(g)T})^T$, where $\eta_i^{(g)}$ ($q_1 \times 1$) is the vector of endogenous latent variables and $\xi_i^{(g)}$ ($q_2 \times 1$) is the vector of exogenous latent variables. The aim is to assess the potentially significant causal effects of $\xi_i^{(g)}$ and of the vector of mixed ordered categorical and dichotomous covariates $d_i^{(g)}$ on $\eta_i^{(g)}$. However, if $d_i^{(g)}$ is non-normal, then $\eta_i^{(g)}$ is non-normal as well. Thus, it is important to solve the problem of mixed ordered categorical and dichotomous data in the covariates. The endogenous latent vector is defined by the following general nonlinear structural equation:
$$\eta_i^{(g)} = \Pi^{(g)} \eta_i^{(g)} + \Gamma^{(g)} F(\xi_i^{(g)}, d_i^{(g)}) + \delta_i^{(g)}, \tag{2}$$
where $F(\xi, d) = (f_1(\xi, d), \ldots, f_t(\xi, d))^T$ is a vector-valued function with differentiable functions $f_1, \ldots, f_t$, and $\Pi^{(g)}$ ($q_1 \times q_1$) and $\Gamma^{(g)}$ ($q_1 \times t$) are matrices of unknown parameters. For simplicity, (2) can be expressed as
$$\eta_i^{(g)} = \Lambda_\omega^{(g)} G(\eta_i^{(g)}, \xi_i^{(g)}, d_i^{(g)}) + \delta_i^{(g)},$$
where $\delta_i^{(g)}$ is a vector of error measurements, $\Lambda_\omega^{(g)} = (\Pi^{(g)}, \Gamma^{(g)})$, and $G(\eta_i^{(g)}, \xi_i^{(g)}, d_i^{(g)}) = (\eta_i^{(g)T}, F(\xi_i^{(g)}, d_i^{(g)})^T)^T$. It is assumed that $\xi_i^{(g)} \sim N[0, \Phi^{(g)}]$ and $\delta_i^{(g)} \sim N[0, \Psi_\delta^{(g)}]$, where $\Psi_\delta^{(g)}$ is a diagonal matrix with diagonal elements $\psi_{\delta 1}^{(g)}, \ldots, \psi_{\delta q_1}^{(g)}$, and that $\xi_i^{(g)}$ and $\delta_i^{(g)}$ are independent.

A more specific example of the general nonlinear structural equation (2), with a single endogenous latent variable $\eta^{(g)}$, $\xi^{(g)} = (\xi_1^{(g)}, \xi_2^{(g)})^T$, and $d^{(g)} = (d_1^{(g)}, d_2^{(g)})^T$, is
$$\eta_i^{(g)} = \sum_{j=1}^{5} b_j^{(g)} f_j(\xi_i^{(g)}, d_i^{(g)}) + \sum_{j=1}^{4} \gamma_j^{(g)} f_{5+j}(\xi_i^{(g)}, d_i^{(g)}) + \delta_i^{(g)}, \qquad g = 1, 2,$$
in which the components $f_1, \ldots, f_9$ are linear, interaction, and quadratic terms in the elements of $d^{(g)}$ and $\xi^{(g)}$. Here, $\Gamma^{(1)} = (b_1^{(1)}, b_2^{(1)}, b_3^{(1)}, b_4^{(1)}, b_5^{(1)}, \gamma_1^{(1)}, \gamma_2^{(1)}, \gamma_3^{(1)}, \gamma_4^{(1)})$ and $\Gamma^{(2)} = (b_1^{(2)}, b_2^{(2)}, b_3^{(2)}, b_4^{(2)}, b_5^{(2)}, \gamma_1^{(2)}, \gamma_2^{(2)}, \gamma_3^{(2)}, \gamma_4^{(2)})$. Quadratic terms of the elements of $d^{(g)}$ and $\xi^{(g)}$ can therefore be assessed via an appropriately defined LVM. Because the covariates may be mixed ordered categorical and dichotomous data with arbitrary distributions, the proposed nonlinear latent variable model can accommodate a wide variety of situations.
Furthermore, in nonlinear latent variable models one must be careful to interpret the mean vector $\mu$ of the manifest variables correctly, in particular how it relates to $Ac$ (the group superscript is suppressed here). Let $A_k^T$ and $\Lambda_k^T$ denote the $k$th rows of $A$ and $\Lambda$, respectively, for $k = 1, \ldots, p$. It follows from Equation (1) that the mean of the $k$th manifest variable is $\mu_k = A_k^T c + \Lambda_k^T E(\omega)$. Although $E(\xi) = 0$, it follows from Equation (2) that $E(\eta) \ne 0$ if $F(\xi, d)$ is a nonlinear function of $\xi$. Therefore, in general, $E(\omega) \ne 0$ and $\mu_k \ne A_k^T c$. Let $\Lambda_k^T = (\Lambda_{k\eta}^T, \Lambda_{k\xi}^T)$ be the partition of $\Lambda_k^T$ that corresponds to the partition $\omega = (\eta^T, \xi^T)^T$. Because $E(\xi) = 0$ and $E(\eta) = (I - \Pi)^{-1} \Gamma E(F(\xi, d))$, it follows from Equation (1) that
$$\mu_k = A_k^T c + \Lambda_{k\eta}^T (I - \Pi)^{-1} \Gamma E(F(\xi, d)).$$
However, $F(\xi, d)$ is generally not complicated in practical applications, so $E(F(\xi, d))$ is also relatively simple and $\mu_k$ can be computed without difficulty. It is also worth noting an indirect method for modelling covariates such as those above: first augment $\xi$ with $d$, and then treat each element of $d$ as an exogenous latent variable that is measured exactly by a single indicator. To manage the difficulties that arise from mixed ordered categorical and dichotomous outcomes, assume that $y^{(g)}$ is an $s \times 1$ sub-vector of unobservable continuous responses, and that the information it carries is reflected by an observable ordered categorical vector $z^{(g)}$.
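The remark about the mean structure can be checked numerically. The sketch below is only an illustration under stated assumptions, not the model fitted in this paper: it takes a single endogenous latent variable, no covariates, and a hypothetical structural equation eta = gamma1*xi + gamma2*xi^2 + delta with xi ~ N(0, 1). The function name and the coefficient values are invented for the example. Since E(xi^2) = 1 in this toy case, the theoretical mean is E(eta) = gamma2, which is nonzero even though E(xi) = 0.

```python
import random
from statistics import fmean


def simulate_eta_mean(gamma1, gamma2, psi_delta, n_draws, seed=0):
    """Monte Carlo estimate of E(eta) for eta = gamma1*xi + gamma2*xi**2 + delta.

    Hypothetical toy structural equation (an assumption for illustration):
    xi ~ N(0, 1) and delta ~ N(0, psi_delta), independent of each other.
    """
    rng = random.Random(seed)
    sd_delta = psi_delta ** 0.5
    draws = []
    for _ in range(n_draws):
        xi = rng.gauss(0.0, 1.0)          # exogenous latent variable, mean zero
        delta = rng.gauss(0.0, sd_delta)  # structural residual, mean zero
        draws.append(gamma1 * xi + gamma2 * xi * xi + delta)
    return fmean(draws)


if __name__ == "__main__":
    # The quadratic term shifts the mean of eta away from zero (towards gamma2).
    print(simulate_eta_mean(gamma1=0.5, gamma2=0.6, psi_delta=0.36, n_draws=100_000))
```

Setting gamma2 to zero in this sketch removes the nonlinear term, and the simulated mean returns to approximately zero, which is the linear case in which the mean of the manifest variable reduces to the covariate part alone.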
In keeping with this idea, an ordered categorical variable $z_k^{(g)}$ is defined through its related latent continuous random variable $y_k^{(g)}$ by
$$z_k^{(g)} = a \quad \text{if} \quad \alpha_{k,a}^{(g)} \le y_k^{(g)} < \alpha_{k,a+1}^{(g)}, \qquad a = 0, 1, \ldots, b_k,$$
where $-\infty = \alpha_{k,0}^{(g)} < \alpha_{k,1}^{(g)} < \cdots < \alpha_{k,b_k}^{(g)} < \alpha_{k,b_k+1}^{(g)} = \infty$ are the thresholds (cut points). For dichotomous data, the relationship between the latent continuous variable and the observed variable is likewise determined by the cut points. With $x$ in Equation (8) denoting the dichotomous variables, it follows that
$$x_k^{(g)} = \begin{cases} 1, & y_k^{(g)} \ge \alpha_k^{(g)}, \\ 0, & y_k^{(g)} < \alpha_k^{(g)}. \end{cases} \tag{8}$$
To handle mixed ordered categorical and dichotomous data in the covariates, we use a hidden continuous normal distribution (truncated normal distribution with known parameters). For an ordered categorical covariate, the observed category determines the interval between adjacent cut points to which the hidden continuous normal variable is truncated, in the same way as for $z$ and $y$ above. For a dichotomous covariate, the hidden continuous normal variable is truncated on one side of a single cut point. It should be understood that, for every mixed ordered categorical and dichotomous variable, the number of thresholds (cut points) is the same in each group. We use two types of thresholds (equally and unequally spaced categories). Lee et al. (1990) showed that a single-sample model with mixed ordered categorical and dichotomous variables is not identified without imposing special identification conditions.
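The threshold mechanism described above can be sketched in a few lines. This is a minimal illustration under assumptions that are not taken from the paper: the helper names are hypothetical, the equally spaced cut points are placed on an arbitrary interval, and the unequally spaced cut points are obtained as standard normal quantiles of cumulative category proportions, which is one common choice rather than necessarily the one used here. Categories are coded 1 to 5 to match the response scale of the case study.

```python
from bisect import bisect_right
from itertools import accumulate
from statistics import NormalDist


def equally_spaced_thresholds(lower, upper, n_categories):
    """Return n_categories - 1 cut points evenly spaced on [lower, upper]."""
    n_cuts = n_categories - 1
    if n_cuts == 1:
        return [(lower + upper) / 2.0]  # dichotomous case: a single cut point
    step = (upper - lower) / (n_cuts - 1)
    return [lower + j * step for j in range(n_cuts)]


def thresholds_from_proportions(proportions):
    """Unequally spaced cut points from strictly positive category proportions.

    Assumption for illustration: each cut point is the standard normal
    quantile of the cumulative proportion of the categories below it.
    """
    cumulative = list(accumulate(proportions))[:-1]  # drop the final total of 1
    return [NormalDist().inv_cdf(p) for p in cumulative]


def categorize(y, thresholds, first_category=1):
    """Map a latent continuous value y to its observed category.

    The category is determined by the pair of adjacent cut points that
    bracket y, with the lower cut point included in the interval.
    """
    return first_category + bisect_right(thresholds, y)


if __name__ == "__main__":
    cuts = equally_spaced_thresholds(-1.5, 1.5, 5)
    print(cuts, [categorize(v, cuts) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)])
    print(thresholds_from_proportions([0.1, 0.2, 0.4, 0.2, 0.1]))
```

A dichotomous variable is the special case with a single cut point, for example `categorize(y, [0.0], first_category=0)`.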

Bayesian Analysis of Multi-sample Nonlinear Latent Variable Models
Let $\theta^{(g)}$ be the unknown parameter vector in the model defined above and let $\alpha^{(g)}$ be the vector of unknown thresholds for the mixed ordered categorical and dichotomous variables in the $g$th group. In multi-sample analysis, certain types of parameters in $\theta^{(g)}$ are frequently taken to be invariant across groups. For example, equality constraints across the $G$ groups on selected parameter matrices, and on the cut points, are frequently imposed on the model. Hence, when analysing the data, we allow common parameters in $\theta^{(1)}, \ldots, \theta^{(G)}$. More specifically, let $\theta$ be the vector containing the complete set of unknown distinct parameters in $\theta^{(1)}, \ldots, \theta^{(G)}$, and let $\alpha$ be the vector containing all the unknown thresholds. The Bayesian estimates of $\theta$ and $\alpha$ are obtained with the Gibbs sampler. Let $Z^{(g)} = (z_1^{(g)}, \ldots, z_{N_g}^{(g)})$ denote the observed ordered categorical data matrix, $X^{(g)} = (x_1^{(g)}, \ldots, x_{N_g}^{(g)})$ the observed dichotomous data, and $\Omega^{(g)} = (\omega_1^{(g)}, \ldots, \omega_{N_g}^{(g)})$ and $Y^{(g)} = (y_1^{(g)}, \ldots, y_{N_g}^{(g)})$ the matrices of latent variables and latent continuous measurements, respectively. In the posterior analysis, the observed data are augmented with $Y$. Once $Y$ is given, all the data are continuous, so the problem is simpler to handle. Moreover, once $\Omega$ is given, the nonlinear structural and measurement equations reduce to regular simultaneous regression models, so the complications caused by the nonlinear relationships among the latent variables are greatly alleviated. Thus, the more complex elements of the model can be dealt with by augmenting the data: in the posterior analysis, the observed data $Z$ are augmented with $(Y, \Omega)$.
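The augmentation of the observed categories with latent continuous measurements can be illustrated by a single draw step. The sketch below is an assumption-laden illustration rather than the sampler used in this paper: it supposes that, given the current parameter values, the latent measurement is normal with a known conditional mean and standard deviation, and that the observed category restricts it to the interval between two adjacent cut points. The function names are hypothetical, and inverse-CDF sampling is used only because it needs nothing beyond the Python standard library.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def draw_truncated_normal(mean, sd, lower, upper, rng):
    """Draw from N(mean, sd^2) restricted to [lower, upper] by inverse-CDF sampling."""
    p_lower = _STD_NORMAL.cdf((lower - mean) / sd)
    p_upper = _STD_NORMAL.cdf((upper - mean) / sd)
    u = rng.uniform(p_lower, p_upper)
    # inv_cdf requires 0 < u < 1, so clamp away from the extreme tails.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    y = mean + sd * _STD_NORMAL.inv_cdf(u)
    return min(max(y, lower), upper)  # guard against rounding outside the interval


def augment_latent_response(z, mean, sd, thresholds, rng, first_category=1):
    """Draw the latent measurement for observed category z.

    The draw is restricted to the interval between the two cut points that
    bracket category z, with -inf and +inf as the outermost bounds.
    """
    bounds = [-math.inf] + list(thresholds) + [math.inf]
    k = z - first_category
    return draw_truncated_normal(mean, sd, bounds[k], bounds[k + 1], rng)


if __name__ == "__main__":
    rng = random.Random(0)
    cuts = [-1.5, -0.5, 0.5, 1.5]  # illustrative cut points
    print([round(augment_latent_response(z, 0.3, 1.0, cuts, rng), 3) for z in range(1, 6)])
```

Within a Gibbs sampler, a step of this kind would be repeated for every observed categorical entry at each iteration, using the conditional mean implied by the current parameter and latent variable values.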
We therefore consider the joint posterior distribution $[\theta, \alpha, Y, \Omega \mid Z]$. The Gibbs sampler (Geman and Geman, 1984) can be applied to generate a sequence of observations from this joint posterior distribution, and the Bayesian solution is then obtained through standard inference based on the generated sample of observations. In the Gibbs sampler, observations are generated iteratively from the full conditional distributions: each block of unknown quantities (the latent variables, the thresholds together with the latent continuous measurements, and the structural parameters) is drawn in turn given the observed data and the current values of the other blocks. The prior distributions of the unconstrained parameters in different groups are assumed to be independent. When estimating the unconstrained parameters, it is necessary to specify their own prior distributions, and only the data in the corresponding groups are used. For parameters that are constrained across groups, a single prior distribution for the constrained parameters is required, and the data from all the related groups are combined in the estimation (see Song and Lee, 2001). This section describes the Bayesian estimation and model comparison for multiple-group nonlinear LVMs with ordered categorical variables. The general scheme uses the idea of data augmentation together with MCMC tools. Theoretically, a multiple-group nonlinear LVM is a special case of the two-level LVM, so some of the conditional distributions required in the Gibbs sampler can be obtained from the results for that model. However, as specific constraints among the parameters in different groups are imposed, more attention must be paid to specifying the corresponding prior distributions. Likewise, model comparison in two-level LVMs requires some care in applying the path sampling procedure (Lee and Song, 2012).
The goal of this section is to describe how to analyse the preceding nonlinear LVM with mixed ordered categorical and dichotomous variables using the Bayesian approach. This approach is beneficial in several ways. In particular, useful prior knowledge can be incorporated directly into the analysis, which produces more accurate parameter estimates. It should be noted, however, that the likelihood $p(X, Z \mid \theta)$ depends on the sample size, whereas the prior density $p(\theta)$ does not. As such, for problems with a large sample, $p(\theta)$ plays a less important role and the posterior density function $p(\theta \mid X, Z)$ is close to the likelihood function $p(X, Z \mid \theta)$. Hence, the Bayesian and maximum likelihood (ML) approaches are asymptotically equivalent, and the Bayesian estimates have the same optimal asymptotic properties as the ML estimates. However, $p(\theta)$ plays an important role in the Bayesian approach when the sample size is small or when the information in $Z$ comes from mixed ordered categorical and dichotomous data. In this case, MCMC methods are applied by letting $y_i$ be the unobserved variables that underlie the manifest mixed ordered categorical and dichotomous variables in $Z_r$ and $Z_s$. It is therefore necessary to specify the prior distributions for the components of $\theta$ in order to derive the conditional distribution of $\theta$ described in Step (1). In general, in Bayesian analysis, conjugate prior distributions have proven to be flexible and convenient (Broemeling, 1985). This kind of prior distribution has been widely applied in Bayesian analyses of latent variable models (see Song and Lee, 2007). Hence, the following well-known conjugate prior distributions are used (the group superscript is suppressed):
$$\psi_{\varepsilon k}^{-1} \sim \mathrm{Gamma}[\alpha_{0\varepsilon k}, \beta_{0\varepsilon k}], \qquad [\Lambda_k \mid \psi_{\varepsilon k}] \sim N[\Lambda_{0k}, \psi_{\varepsilon k} H_{0yk}], \qquad A_k \sim N[A_{0k}, H_{0k}],$$
$$\psi_{\delta k}^{-1} \sim \mathrm{Gamma}[\alpha_{0\delta k}, \beta_{0\delta k}], \qquad [\Lambda_{\omega k} \mid \psi_{\delta k}] \sim N[\Lambda_{0\omega k}, \psi_{\delta k} H_{0\omega k}], \qquad \Phi^{-1} \sim W_{q_2}[R_0, \rho_0],$$
where "$\sim$" means "is distributed as", $\psi_{\varepsilon k}$ and $\psi_{\delta k}$ are the $k$th diagonal elements of $\Psi_\varepsilon$ and $\Psi_\delta$, and $\Lambda_k^T$ and $\Lambda_{\omega k}^T$ are the $k$th rows of $\Lambda$ and $\Lambda_\omega$, respectively.
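The hyperparameters $A_{0k}$, $\Lambda_{0k}$ and $\Lambda_{0\omega k}$, $\alpha_{0\varepsilon k}$, $\beta_{0\varepsilon k}$, $\alpha_{0\delta k}$, $\beta_{0\delta k}$, $\rho_0$, $H_{0k}$, $H_{0yk}$, $H_{0\omega k}$, and $R_0$ of the conjugate prior distributions are assumed to be known from prior information. Generally speaking, prior information is obtained from casual observation, theoretical considerations, or the analysis of past data.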
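To show how a conjugate prior leads to a closed-form update inside the Gibbs sampler, the sketch below treats a deliberately simplified case, and the simplification is an assumption of this example: the residuals of one manifest variable are taken as known (as if the loadings and latent variables were fixed at their current values), and the residual precision has a gamma prior with shape alpha0 and rate beta0. Under these assumptions the posterior of the precision is again gamma, with shape alpha0 + n/2 and rate beta0 + (1/2) times the sum of squared residuals. The function names and hyperparameter values are illustrative only.

```python
import random


def posterior_gamma_params(residuals, alpha0, beta0):
    """Posterior (shape, rate) of a residual precision under a Gamma(alpha0, beta0) prior.

    Simplifying assumption: the residuals are treated as known, normal with
    mean zero, and independent given the precision.
    """
    n = len(residuals)
    shape = alpha0 + n / 2.0
    rate = beta0 + 0.5 * sum(e * e for e in residuals)
    return shape, rate


def draw_residual_variance(residuals, alpha0, beta0, rng):
    """Draw one value of the residual variance from its conditional posterior."""
    shape, rate = posterior_gamma_params(residuals, alpha0, beta0)
    # random.gammavariate is parameterised by shape and scale, so scale = 1 / rate.
    precision = rng.gammavariate(shape, 1.0 / rate)
    return 1.0 / precision


if __name__ == "__main__":
    rng = random.Random(0)
    residuals = [1.0, -1.0, 2.0, 0.0]  # illustrative values, not real data
    print(posterior_gamma_params(residuals, alpha0=9.0, beta0=4.0))
    print(draw_residual_variance(residuals, alpha0=9.0, beta0=4.0, rng=rng))
```

In the full model the loadings are also unknown, so the rate of the corresponding conditional distribution involves additional terms from the normal prior on the loadings; the sketch only conveys the shape of the update.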
More specifically, when using the Bayesian approach with non-normal covariates, it is necessary to evaluate the posterior distribution of the unknown parameters and thresholds given the observed data, but this distribution is relatively complex. To capture its characteristics, a sufficiently large number of observations is drawn from it, so that the empirical distribution of these observations is close to the true distribution. The Gibbs sampler (Geman and Geman, 1984) is an excellent candidate for this task because it simulates each block of unknown quantities from its conditional distribution. However, because of the mixed ordered categorical and dichotomous variables and covariates, the conditional distributions given only the observed data are too complex to derive or to simulate from. This motivates augmenting the observed data with the latent matrices in the posterior analysis and working with the joint posterior distribution of the parameters, thresholds, and latent matrices given the observed data. To generate observations from this posterior distribution with the Gibbs sampler, we begin with starting values for all of these unknown quantities, indexed by the superscript $(0)$. The conditional distributions are then used in turn to simulate the values indexed by $(1)$, and so on. More

Model Comparison
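The DIC of a model is defined as the sum of the posterior mean deviance and the effective number of parameters, $\mathrm{DIC} = \overline{D(\theta)} + p_D$, with
$$\overline{D(\theta)} = E_\theta\{-2 \log p(Z \mid \theta) \mid Z\}, \tag{15}$$
$$p_D = E_\theta\{-2 \log p(Z \mid \theta) \mid Z\} + 2 \log p(Z \mid \tilde{\theta}), \tag{16}$$
in which $\tilde{\theta}$ is the Bayesian estimate of $\theta$. Let $\{\theta^{(t)} : t = 1, \ldots, T\}$ be a sample of observations simulated from the posterior distribution. The expectations in Equations (15) and (16) can be estimated as follows:
$$E_\theta\{-2 \log p(Z \mid \theta) \mid Z\} \approx -\frac{2}{T} \sum_{t=1}^{T} \log p(Z \mid \theta^{(t)}). \tag{17}$$
In Bayesian LVMs, the model with the smaller DIC value is selected.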
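The computation of the DIC from posterior draws can be sketched as follows. The sketch assumes the standard definition of Spiegelhalter et al. (2002), in which the DIC is the posterior mean deviance plus the effective number of parameters, and the latter is the posterior mean deviance minus the deviance at the Bayesian estimate. The function name is hypothetical, and the log-likelihood values are assumed to have been evaluated elsewhere at each posterior draw and at the point estimate.

```python
from statistics import fmean


def dic_from_draws(log_lik_draws, log_lik_at_estimate):
    """Return (DIC, mean deviance, effective number of parameters).

    log_lik_draws: log-likelihood evaluated at each posterior draw.
    log_lik_at_estimate: log-likelihood evaluated at the Bayesian estimate.
    """
    mean_deviance = -2.0 * fmean(log_lik_draws)          # Monte Carlo posterior mean deviance
    p_d = mean_deviance + 2.0 * log_lik_at_estimate      # mean deviance minus deviance at estimate
    return mean_deviance + p_d, mean_deviance, p_d


if __name__ == "__main__":
    # Illustrative numbers only, not values from the case study.
    print(dic_from_draws([-100.0, -102.0, -104.0], -101.0))
```

Competing models are then ranked by their DIC values, and the one with the smallest value is preferred.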

A Case Study of Psychological Data
We consider data from two independent samples selected from a natural history study of rural drug users in Ohio (n = 200) and Kentucky (n = 200), USA, conducted between 2003 and 2005 (Booth et al., 2006). Three categories of psychiatric disorders were measured with the BSI-18 scale: somatization (SOM), depression (DEP), and anxiety (ANX). The data consist of sixteen variables and two covariates in each group. Nine of these variables are ordered categorical, measured on a five-point scale (1, not at all; 2, a little bit; 3, moderately; 4, quite a bit; 5, extremely), and the remaining variables are dichotomous, recoded as dummy variables (0, 'not at all' or 'a little bit'; 1, 'moderately' to 'extremely') for the purpose of model demonstration (Wang and Wang, 2012). This real data study gives some idea of the empirical performance of the proposed Bayesian approach, in which the 16 manifest variables are related to the latent variables $(\eta^{(g)}, \xi_1^{(g)}, \xi_2^{(g)})$ of the multi-sample nonlinear LVMs defined in Equation (19) and Equation (20), respectively. Hence, some quadratic and interaction effects of the latent variables are considered. To illustrate the Bayesian methods for analysing linear and nonlinear latent variable models with mixed ordered categorical and dichotomous variables, the models for the groups $g = 1, 2$ are specified in Equations (19) and (20),
where parameters with an asterisk are treated as fixed for identifying the model.

Results and Discussion
The objective of this section is to present results that reveal the empirical performance of the Bayesian estimates and of the DIC for model comparison. For the nonlinear LVMs with covariates, the proposed models are fitted for $g = 1, 2$. The best-fitting model, with the smallest DIC value (12408), uses the censored normal distribution with unequally spaced thresholds. The DIC value for the truncated normal distribution with equally spaced thresholds is 13456. We also observed that the performance in terms of the DIC is not satisfactory, and would be worse, for mixed data using the truncated normal distribution with unequally spaced thresholds.

Conclusions and Recommendations
Multi-sample nonlinear models that involve nonlinear effects of covariates and latent variables are very common in the social and behavioural sciences. The first purpose of this analysis was to use multi-sample nonlinear LVMs with nonlinear covariates and latent variables to obtain estimates of all the parameters. The second purpose was to solve the problem of mixed variables by using hidden continuous normal distributions (censored normal distribution and truncated normal distribution), and to solve the problem of mixed ordered categorical and dichotomous covariates by using a hidden continuous normal distribution (truncated normal distribution with known parameters). The proposed procedures were carried out using two types of thresholds (with equal and unequal distances between categories). However, this assumption is likely to be violated in many practical applications.
In this paper, a Bayesian approach is proposed for analysing multi-sample nonlinear models with mixed variables. In addition to point estimation, we provide statistical methods for obtaining standard deviation estimates and, owing to the complexity of the proposed model, for model comparison using the Deviance Information Criterion (DIC). As we have seen, the difficulties arising from the nonlinear causal relationships among the latent variables and from the discrete nature of the mixed manifest variables are alleviated by data augmentation with MCMC methods. This strategy is very powerful and can be applied to other, more complex models.