The Performance of Bayesian Analysis in Structural Equation Modelling to Construct The Health Behaviour During Pandemic COVID-19

Originating in Wuhan, China, COVID-19 has spread rapidly throughout the world. Epidemiological models are required to provide evidence that helps public health policymakers reduce the spread of COVID-19. Health behaviour is assumed to reduce the spread of the virus. This study aims to construct an acceptable model of health behaviour. To achieve this goal, Bayesian structural equation modelling (SEM) is implemented. The study also aims to evaluate the performance of Bayesian SEM, including the sensitivity, adequacy, and acceptability of the estimated parameters, so that an acceptable model is obtained. The sensitivity of the Bayesian SEM estimator is evaluated by fitting the model under several types of prior and comparing the results. The adequacy of the Bayesian SEM estimates is checked with convergence tests of the corresponding model parameters. The acceptability of the Bayesian approach and its associated algorithm in recovering the true parameters is monitored with a bootstrap simulation study. The Bayesian SEM uses the Gibbs sampler to estimate the model parameters. The method is applied to primary data gathered through an online survey, conducted from March to May 2020 during the COVID-19 pandemic, of individuals living in West Sumatra, Indonesia. Health motivation is found to be significantly related to health behaviour, whereas socio-demographics and perceived susceptibility have no significant effect on health behaviour.


Introduction
The first cases of COVID-19 (coronavirus disease 2019) were found in the city of Wuhan, Hubei Province, China, in December 2019 (Hou et al., 2020). The pneumonia-causing virus was identified as a new coronavirus, named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and the disease was named COVID-19 by the World Health Organization (Zhu et al., 2020). Within two months, the infection spread rapidly to other countries and regions. As of May 4, 2021, there were 154,159,212 cases of COVID-19 worldwide and 3,226,235 related deaths (Worldometers, 2021). In Indonesia, 1,682,004 cases and 45,949 deaths had been confirmed. Several policies have been established to prevent the spread of the pandemic, including in Indonesia. The policies include strict city-wide lockdowns and a ban on foreign tourists entering the country. These policies have considerably influenced people's lives, including their health habits and lifestyle. People are required to wear masks, wash their hands with soap or alcohol-based hand sanitizer, avoid crowded places, and so on. Health behaviour is needed to protect people from illness. Khoso et al. (2016) state that health behaviour is a health-related activity undertaken to maintain good health and prevent potential health hazards.
Health behaviour involves several primary concepts that predict why people take action to prevent, screen for, or control illness conditions; these include susceptibility, seriousness, benefits of and barriers to a behaviour, cues to action, and, most recently, self-efficacy (Zvolensky et al., 2020). Health behaviour arises because individuals expect to avoid disease: they expect that a specific health action may prevent illness, and they are motivated to stay healthy. This expectancy has been further delineated in terms of the individual's estimates of personal susceptibility to an illness and of the likelihood of being able to reduce that threat through personal action (Conner & Norman, 2007; Glanz et al., 2008).
Because of the COVID-19 pandemic, it is important to model the health behaviour of individuals in order to prevent the spread of the disease (Perlman, 2020). Various studies have been carried out to identify the factors of health behaviour. Conner and Norman (2007) identified that individuals have health motivation and perceived susceptibility that affect their health behaviour to avoid illness. Glanz et al. (2008) reported that perceived susceptibility varies across age, gender, and ethnicity, and found that perceived susceptibility could affect an individual's health behaviour.
Health behaviour, socio-demographics, health motivation, and perceived susceptibility cannot be measured directly; they can only be measured through observable variables. Such factors are known as unobservable or latent variables. The structural equation modelling (SEM) approach is particularly suitable for modelling the interrelationships between latent variables and their corresponding indicators (Kelava et al., 2014; Olsson et al., 2000; Yanuar, 2016; Yanuar et al., 2010). Standard SEM assumes that the observations are independent and identically distributed according to a multivariate normal distribution. If this assumption is not met, estimates of the model parameters can be difficult to obtain in the usual way. Therefore, many researchers, such as Asparouhov & Muthén (2021) and Yanuar et al. (2013), have proposed using the Bayesian approach in SEM to overcome these problems. This method uses a Gibbs sampler to draw samples of arbitrary size that summarize the posterior distribution of the parameters of interest (Nam et al., 2018; Y. Thanoon et al., 2016). From these samples, the user can compute point estimates, standard deviations, and interval estimates for inference. The Bayesian approach is appealing because it allows the user to combine prior information with the current data to update knowledge about the parameters of interest (Asparouhov & Muthén, 2021; Cain & Zhang, 2019). Because of this framework, parameter estimates based on Bayesian SEM have been reported to be more precise than those from standard SEM (Garnier-Villarreal & Jorgensen, 2020; Yanuar et al., 2013). Accordingly, the main purpose of this study is to evaluate the performance of Bayesian SEM in constructing a health behaviour model for individuals living in West Sumatra during the COVID-19 pandemic.
The performance of Bayesian SEM is evaluated in terms of its adequacy, acceptability, and sensitivity, since traditional fit indices such as RMSEA, CFI, or TLI are not available when performing Bayesian SEM (Asparouhov & Muthén, 2021; Cain & Zhang, 2019). Little work has been done on modelling health behaviour using Bayesian SEM, particularly when information on socio-demographics, perceived susceptibility, and health motivation is considered. This makes the study especially relevant during the COVID-19 pandemic.

Data
The data in this study were obtained through online questionnaires and house visits between March and May 2020. Online questionnaires were used because of the physical distancing policy and were disseminated via WhatsApp and email. A small number of respondents were reached through house visits, with households chosen randomly from the list of available households held by the enumerators. Respondents eligible to participate in the survey were those aged 18 years or older and living in West Sumatra, representing the adult population of the province. More than 1000 respondents participated in the survey; however, only the 756 respondents with complete information were included in the analysis. The information gathered includes the socio-demographics, perceived susceptibility, health motivation, and health behaviour of individuals living in West Sumatra, a province of Indonesia. The province has an area of 42,012.89 km², with a population of 5,534,472 at the 2020 census. The province is subdivided into twelve regencies and seven cities. Padang is the province's capital and largest city. Table 1 provides the descriptive statistics of the respondents involved in this study.

Factors of Health Behaviour
Health behaviour can take the form of preventive behaviour or the use of health facilities. Khoso et al. (2016) state that health behaviour is influenced by health belief factors, which can shape healthy living behaviour during the COVID-19 pandemic and help avoid transmission of the disease. The health belief model is thus used to explain and predict preventive health behaviour, as well as sick-role and illness behaviour, and it has been widely applied in studies of health behaviour. A person's motivation to undertake a healthy behaviour can be divided into two main categories: individual perceptions and modifying factors. Individual perceptions are factors that affect the perception of illness or disease. They deal with the importance of health to the individual, which in this study is named health motivation. The modifying factors included in this study are socio-demographics and perceived susceptibility. Combining these factors produces a response that often manifests in action, provided it is accompanied by a rational alternative course of action leading to healthy behaviour (Arora & Grey, 2020). Socio-demographics, health motivation, perceived susceptibility, and health behaviour are unobservable variables measured through indicator variables. The following briefly explains each unobservable variable considered in this study and its respective indicators. The indicators used to describe the socio-demographic factor are age, gender, and educational level. The age of the respondents is classified into six levels: 'between 18 and 20 years old', 'between 21 and 30 years old', 'between 31 and 40 years old', 'between 41 and 50 years old', 'between 51 and 60 years old', and 'more than 60 years old', coded as 1 to 6 respectively. Gender is coded 1 for 'male' and 2 for 'female'.
Educational level is coded 1 for 'attend elementary school', 2 for 'attend junior high school', 3 for 'senior high school', and 4 for 'attend college/university'. The indicators for health motivation hypothesized in this study are based on a study by Arora & Grey (2020). They are: avoiding touching the face, avoiding shaking hands, avoiding meeting or standing in long queues, avoiding touching objects in public areas, avoiding taking public transportation (online), avoiding going home, avoiding worshipping in a mosque, church, or other place of worship, and avoiding ordering food online. The responses for each indicator are coded 1 for 'never', 2 for 'seldom', 3 for 'sometimes', 4 for 'often', and 5 for 'always'. Perceived susceptibility, as a latent variable, is measured by indicator variables based on the study by Conner & Norman (2007). The respondents were asked about their worries regarding their personal health, the health of their family members, going outside, going to their village, going to work or school, using public facilities or public transportation, and food availability. The responses for each indicator are coded 1 to 5, denoting 'not worried at all', 'not worried', 'a little worried', 'worried', and 'very worried', respectively.
Many of the health behaviours adopted in response to SARS were similar to those recommended to prevent the spread of COVID-19, including frequent handwashing, social distancing, and self-quarantining (Parekh & Deierlein, 2020; Zvolensky et al., 2020). In this study, the indicators of health behaviour are keeping a distance of 2 m, wearing a mask, using hand sanitizer, washing hands for 20 seconds, and informing others when having symptoms of COVID-19. The responses for each indicator are on a 5-point Likert scale: 'never' coded as 1, 'seldom' as 2, 'sometimes' as 3, 'often' as 4, and 'always' as 5. Figure 1 provides the conceptual model of health behaviour based on the SEM structure constructed in this study. It is hypothesized that health behaviour is affected by socio-demographics, perceived susceptibility, and health motivation. The hypothesized indicators for each latent variable are provided in Table 2.
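As a minimal sketch of the response coding described above, the label-to-code mappings could be written in Python as follows. The dictionary and function names are hypothetical and are not taken from the survey instrument; only the labels and their codes come from the text.

```python
# Hypothetical label-to-code maps; the labels and codes follow the text,
# the names FREQUENCY_CODES / WORRY_CODES are assumptions for illustration.
FREQUENCY_CODES = {"never": 1, "seldom": 2, "sometimes": 3, "often": 4, "always": 5}
WORRY_CODES = {
    "not worried at all": 1,
    "not worried": 2,
    "a little worried": 3,
    "worried": 4,
    "very worried": 5,
}


def encode(responses, codes):
    """Map raw answer labels to ordinal codes; unknown or missing answers become None."""
    encoded = []
    for answer in responses:
        if isinstance(answer, str):
            encoded.append(codes.get(answer.strip().lower()))
        else:
            encoded.append(None)
    return encoded


def complete_cases(rows):
    """Keep only respondents (rows) with no missing coded value."""
    return [row for row in rows if all(value is not None for value in row)]
```

Filtering with `complete_cases` mirrors the restriction of the analysis to respondents with complete information.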

Estimation Method
SEM combines two types of equations simultaneously: measurement equations and structural equations. The measurement equation explains the relationship between the indicator variables and their latent variables, and is formulated as follows:

$$x_i = \Lambda \omega_i + \varepsilon_i, \quad i = 1, \dots, n, \tag{1}$$

where $x_i$ represents the indicator variables, $\omega_i$ the latent variables, $\Lambda$ the factor loadings, and $\varepsilon_i$ the measurement errors. Meanwhile, the structural equation describes the interrelationship among the latent factors. Let the latent variable $\omega_i$ be partitioned into $(\eta_i, \xi_i)$, where $\eta_i$ and $\xi_i$ are the endogenous and exogenous latent variables respectively. The structural equation is formulated by:

$$\eta_i = \Gamma \xi_i + \delta_i, \quad i = 1, \dots, n, \tag{2}$$

where $\Gamma$ is the matrix of loading factors and $\delta_i$ is the structural error.
In this study, we implement Bayesian analysis to estimate the model parameters and construct the best model under the SEM approach when the normality assumption is violated. In the Bayesian method, the ordered categorical data are treated as arising from an underlying continuous normal distribution through a threshold specification. Let $X = (x_1, x_2, \dots, x_n)$ and $Y = (y_1, y_2, \dots, y_n)$ be the matrices of ordered categorical data and latent continuous variables respectively, and let $\Omega = (\omega_1, \omega_2, \dots, \omega_n)$ be the matrix of latent variables. The observed data $X$ are augmented with the latent data $(Y, \Omega)$ in the posterior analysis. In this subsection, Bayesian estimation is applied to obtain the unknown thresholds $\alpha = (\alpha_1, \alpha_2)$, together with joint Bayesian estimates of $\Omega$ and the structural parameter $\theta$, a vector that includes all the unknown parameters in the loading, coefficient, and covariance matrices of the model.
The Bayesian method is applied to derive the joint posterior distribution $[\alpha, \theta, \Omega \mid X]$. Researchers commonly use the maximum likelihood (ML) method for parameter estimation. However, it is difficult to apply ML estimation in this study since high-dimensional integration is required. Prior distributions for $(\Lambda, \Psi_\varepsilon)$ and $\Phi$ are needed to determine the posterior distribution. In the present study, we take the following conjugate prior distributions for these three parameters. Letting $\psi_{\varepsilon k}$ be the $k$th diagonal element of $\Psi_\varepsilon$ and $\Lambda_k$ be the $k$th row of $\Lambda$, we consider:

$$\psi_{\varepsilon k}^{-1} \sim \mathrm{Gamma}(\alpha_{0\varepsilon k}, \beta_{0\varepsilon k}), \tag{3}$$

$$[\Lambda_k \mid \psi_{\varepsilon k}] \sim N(\Lambda_{0k}, \psi_{\varepsilon k} H_{0yk}), \tag{4}$$

$$\Phi^{-1} \sim W_q(R_0, \rho_0), \tag{5}$$

where $\mathrm{Gamma}(\alpha, \beta)$ is the gamma distribution and $W_q(R_0, \rho_0)$ is a $q$-dimensional Wishart distribution. The parameters $\alpha_{0\varepsilon k}$, $\beta_{0\varepsilon k}$, $\Lambda_{0k}$, $\rho_0$, and the positive definite matrices $H_{0yk}$ and $R_0$ are collectively hyperparameters, which are assumed to be described by an uninformative prior distribution (Yanuar et al., 2013).
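To illustrate how a Gibbs sampler alternates between full conditional distributions under conjugate priors of this kind, the following Python sketch uses a deliberately simplified stand-in model: a single normal variable with unknown mean and variance, with a normal prior on the mean (scaled by the variance) and a gamma prior on the precision. It is not the sampler used in this study; the function name, the default hyperparameter values, and the one-dimensional model are all assumptions made for illustration.

```python
import random
import statistics


def gibbs_normal(y, mu0=0.0, h0=100.0, alpha0=2.0, beta0=1.0,
                 n_iter=4000, burn_in=1000, seed=1):
    """Toy Gibbs sampler for y_i ~ N(mu, psi).

    Assumed priors (illustrative only):
      1/psi    ~ Gamma(shape=alpha0, rate=beta0)
      mu | psi ~ N(mu0, psi * h0)
    Returns the post-burn-in draws as a list of (mu, psi) tuples.
    """
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    psi = statistics.variance(y)  # starting value for the variance
    draws = []
    for t in range(n_iter):
        # Full conditional of mu given psi: normal with variance psi * a.
        a = 1.0 / (n + 1.0 / h0)
        mu = rng.gauss(a * (sum_y + mu0 / h0), (psi * a) ** 0.5)
        # Full conditional of the precision 1/psi given mu: gamma.
        shape = alpha0 + (n + 1) / 2.0
        rate = beta0 + 0.5 * (sum((v - mu) ** 2 for v in y) + (mu - mu0) ** 2 / h0)
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        psi = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if t >= burn_in:
            draws.append((mu, psi))
    return draws
```

Posterior means, standard deviations, and interval estimates can then be computed directly from the returned draws, which is the same use that is made of the sampled values in the full SEM.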
This study evaluates the performance of Bayesian analysis in SEM through three aspects: the sensitivity, adequacy, and acceptability of the Bayesian SEM estimates. It is important to test the sensitivity of the Bayesian analysis to the choice of prior (Rahmadita et al., 2018). Several types of prior inputs can be specified and the resulting model estimates compared. The adequacy of the Bayesian SEM estimates is checked with convergence tests of the corresponding model parameters. Convergence is assessed graphically by plotting the sampled values of individual parameters as time series for chains with different starting values, providing a diagnosis based on trace plots and density plots (Lee, 2007; Yanuar et al., 2022). This study also uses the Brooks-Gelman-Rubin (BGR) convergence statistic, denoted by R, which compares the variation between and within multiple chains. The estimated parameters are considered to have converged if the R-value is close to 1. The accuracy of the posterior estimates is inspected by ensuring that the Monte Carlo error (an estimate of the difference between the mean of the sampled values and the true posterior mean) is less than 5% of the sample standard deviation for all parameters (Depaoli & Clifton, 2015). The acceptability of the Bayesian approach and its associated algorithm in recovering the true parameters is monitored with a simulation study that constructs bootstrap confidence intervals (Zhang & Savalei, 2016). In the bootstrap method, new data sets are generated by sampling with replacement from the original data set, and the statistics are then estimated for each new data set. The model is fitted to each of these data sets, and the resulting estimates are used to calculate the 95% confidence interval of every model parameter.
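The bootstrap procedure described above can be sketched in Python as follows. This is a minimal percentile-interval sketch under stated assumptions: `statistic` is a generic function of a data set (for example, a mean) standing in for refitting the Bayesian SEM to each resample, and the function name, the number of resamples, and the seed are hypothetical choices rather than the settings used in this study.

```python
import random
import statistics


def bootstrap_percentile_ci(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample n observations with replacement from the original data.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * (n_boot - 1))]
    upper = estimates[int(round((1.0 - tail) * (n_boot - 1)))]
    return lower, upper


def falls_within(estimate, interval):
    """Check whether a point estimate lies inside a (lower, upper) interval."""
    lower, upper = interval
    return lower <= estimate <= upper
```

A call such as `bootstrap_percentile_ci(scores, statistics.fmean)` returns the interval for the mean of a hypothetical list `scores`; `falls_within` expresses the acceptability check that a parameter estimate lies inside its bootstrap interval.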

Results and Discussion
In the first stage, we include all variables in the model and fit the hypothesized model to the data, but an acceptable model is not achieved. We then exclude socio-demographics from the hypothesized model, since this latent variable has no significant effect on the response, health behaviour. At the second stage, we assess the fit of the revised model and obtain an acceptable model. The results of the model fitting based on Bayesian SEM are presented in Figure 2.

Figure 2. The Proposed Model of Health Behaviour
To test the sensitivity of the Bayesian analysis, we consider three types of prior inputs and compare the models fitted under each of them. To provide good prior information, we assign a small variance to each parameter when setting the hyperparameter values, with all values fixed based on the study by Yanuar et al. (2013). The prior inputs are summarized as follows:
a. Type I prior: the unknown loadings in $\Lambda$ are all taken to be 0.5, and the values corresponding to $\gamma_2$ and $\gamma_3$ are 0.80 and 0.10, respectively.
b. Type II prior: the hyperparameters are equal to 75% of the values given in a.
c. Type III prior: the hyperparameters are equal to 150% of the values given in a.
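The three prior inputs can be sketched in Python as a base set of values and two rescaled copies. The values 0.5, 0.80, and 0.10 come from the text; the dictionary keys, the function names, and the reading of the last two values as prior means of two structural coefficients are assumptions made for illustration.

```python
# Type I prior means from the text; the key names are hypothetical labels.
TYPE_I = {"loading": 0.5, "gamma_2": 0.80, "gamma_3": 0.10}


def scale_prior(prior, factor):
    """Return a copy of the prior inputs multiplied by a constant factor."""
    return {name: round(value * factor, 4) for name, value in prior.items()}


PRIORS = {
    "Type I": TYPE_I,
    "Type II": scale_prior(TYPE_I, 0.75),   # 75% of the Type I values
    "Type III": scale_prior(TYPE_I, 1.50),  # 150% of the Type I values
}


def max_abs_difference(estimates_by_prior):
    """For each parameter, the largest gap between estimates across priors."""
    names = next(iter(estimates_by_prior.values())).keys()
    return {
        name: max(est[name] for est in estimates_by_prior.values())
        - min(est[name] for est in estimates_by_prior.values())
        for name in names
    }
```

In a sensitivity check, `max_abs_difference` would be applied to the posterior estimates obtained under each prior; small gaps relative to the standard errors suggest that the results are robust to the prior inputs.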
The Bayesian estimates of the structural and measurement equations obtained under the three types of prior inputs are provided in Table 3. Based on Table 3, the parameter estimates and standard errors obtained under the three different priors are reasonably close. It can be concluded that the statistics obtained from the Bayesian SEM are robust to the different prior inputs; in other words, the Bayesian SEM implemented here is not sensitive to these three prior inputs. Accordingly, the results obtained using the Type I prior are used as the proposed model based on the Bayesian SEM method in this study. The values provided in Table 3 were estimated using WinBUGS 1.4 (Ntzoufras, 2009). The WinBUGS code is provided in the Appendix.
The adequacy of the Bayesian SEM estimates is evaluated using convergence diagnostics for all the model parameters. Plots of the sequences of sampled values for $\hat{\gamma}_2$ and $\hat{\lambda}_{34}$, selected for illustrative purposes, are provided in Figures 3 (a) and (b). In both plots the sampled values lie within two parallel horizontal lines, which indicates that the algorithm converged in fewer than 15,000 iterations for the two sets of initial values. The density plots in Figure 3 (c) for $\hat{\gamma}_2$ and Figure 3 (d) for $\hat{\lambda}_{34}$ indicate that the sampled values are approximately normally distributed, which suggests that the parameter values produced by the algorithm have converged. The BGR plots for (a) $\hat{\gamma}_3$ and (b) $\hat{\lambda}_{24}$, as selected parameters, are provided in Figure 4. The R-value is close to 1, meaning the estimated parameters have converged. The Monte Carlo errors for all parameters are less than 5% of the sample standard deviation. The plausibility of the proposed model is checked by plotting the estimated residuals against the case number. The residuals are centred at zero and no trends are detected. It can be concluded that the estimated model parameters obtained in this study are adequate and acceptable.
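The two numerical checks used here can be sketched in Python as follows. Note the assumptions: the first function is the classic variance-based Gelman-Rubin ratio, which is related to but not identical with the interval-based BGR diagnostic reported by WinBUGS, and the Monte Carlo error is estimated by batch means with an arbitrarily chosen number of batches. The function names are hypothetical.

```python
import math
import statistics


def gelman_rubin(chains):
    """Variance-based potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def batch_means_mc_error(chain, n_batches=20):
    """Monte Carlo standard error of the chain mean, estimated by batch means."""
    size = len(chain) // n_batches
    batch_means = [
        statistics.fmean(chain[i * size:(i + 1) * size]) for i in range(n_batches)
    ]
    return statistics.stdev(batch_means) / math.sqrt(n_batches)


def mc_error_ok(chain, threshold=0.05):
    """True if the Monte Carlo error is below threshold times the sample SD."""
    return batch_means_mc_error(chain) < threshold * statistics.stdev(chain)
```

A ratio close to 1 from `gelman_rubin` and a `True` result from `mc_error_ok` for every monitored parameter correspond to the two criteria applied in the text.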
The next analysis determines the 95% bootstrap confidence intervals to test the acceptability of the Bayesian approach and its associated algorithm in recovering the true parameters. Table 4 presents the results of the bootstrap simulation study. It is clear from Table 4 that all parameter estimates fall within the 95% bootstrap confidence intervals obtained in this simulation study. These results indicate that the bootstrap method seems to work well in this study. Thus, the estimated posterior means obtained from the Bayesian SEM are considered acceptable, which supports the use of Bayesian SEM to fit the proposed model. The relationship between health behaviour and its exogenous latent variables in the structural equation can be modelled as follows:

$$\hat{\eta} = 0.851\,\xi_2 + 0.086\,\xi_3, \quad R^2 = 0.629. \tag{6}$$

Structural coefficients (standard errors in parentheses): Health Motivation ($\xi_2$) on Health Behaviour ($\eta$), 0.851* (0.062); Perceived Susceptibility ($\xi_3$) on Health Behaviour ($\eta$), 0.086 (0.135). *denotes significant at 5% level; $R^2 = 0.629$.

The estimated structural equation indicates that health motivation ($\xi_2$) has a greater effect on health behaviour ($\eta$) than perceived susceptibility ($\xi_3$). Health motivation is significantly related to health behaviour, which implies that people with high motivation to be healthy tend to have better health behaviour. Perceived susceptibility has no significant effect on health behaviour. Other indicator variables for perceived susceptibility could be incorporated into the model. Meanwhile, the estimated measurement equations, that is, the factor loading coefficients and the associated standard errors for each indicator variable, are provided in Table 3.

Conclusions
Our enthusiasm for SEM is based on its ability to enhance our understanding of health behaviour modelling, as both latent and observed variables are considered. Through the use of SEM, one can better understand how direct and indirect factors affect the health behaviour of the population of West Sumatra during the COVID-19 outbreak. The Bayesian SEM method is quite flexible, as it allows the use of data that violate the normality assumption. In this study, the commonly used approaches of maximum likelihood or robust weighted least squares (RWLS) are not applied. Yanuar et al. (2013) showed that Bayesian SEM results in a better model than classical SEM; therefore, the Bayesian method is used for parameter estimation here. The estimated posterior means obtained from the Bayesian SEM are acceptable according to the sensitivity, convergence, and bootstrap checks carried out in this study. Bayesian SEM is therefore found to be suitable for constructing the health behaviour model of individuals living in West Sumatra. This study found that the significant factor affecting health behaviour is health motivation. The proposed model describes the health behaviours that help combat COVID-19, such as keeping physical distance, wearing a mask, washing hands with soap and running water or using hand sanitizer, avoiding crowded places, and limiting mobility and interaction. Another important way to protect oneself from COVID-19 is to maintain the motivation to stay healthy. In practical terms, the model is useful for epidemiologists and health managers, as it can help them manage public health, reduce the rate of spread of COVID-19 effectively, and streamline government programmes in the health sector.