Bayesian Inference for Double Seasonal Moving Average Models: A Gibbs Sampling Approach

In this paper we use the Gibbs sampling algorithm to develop Bayesian inference for multiplicative double seasonal moving average (DSMA) models. Assuming the model errors are normally distributed and adopting natural conjugate priors, we show that the conditional posterior distributions of the model parameters and the error variance are multivariate normal and inverse gamma, respectively; we then apply Gibbs sampling to approximate the marginal posterior distributions empirically. The proposed Bayesian methodology is evaluated in a simulation study.


Introduction
High frequency time series, observed at small time units, may exhibit multiple seasonal patterns. For example, hourly electricity load data can exhibit intraday and intraweek seasonal patterns. Other examples of high frequency time series containing multiple seasonal patterns include daily hospital admissions, daily usage of water and natural gas, hourly volumes of call center arrivals, hourly traffic on a road, hourly access to computer web sites, and half-hourly demand for public transportation. The notion of modelling multiple seasonalities is not new: it can be traced back to 1971, when Thompson and Tiao (1971) showed that monthly disconnections of the Wisconsin telephone company have annual and quarterly (double) seasonal patterns. Accordingly, seasonal autoregressive moving average (SARMA) models, which are widely applied to time series with a single seasonal pattern, need to be modified and extended to accommodate multiple seasonalities; see, for example, Box et al. (1994) and Taylor (2003). In addition to SARMA models, other techniques have been extended to fit multiple seasonal time series, including neural networks, exponential smoothing methods and innovation state space models. A quick review of these techniques can be found in Feinberg & Genethliou (2005).
In particular, the multiplicative double SARMA (DSARMA) models have attracted considerable interest and have been extensively studied and employed in modeling and forecasting time series that exhibit double seasonal patterns. Taylor (2003) showed that electricity load in England and Wales features daily (within-day) and weekly (within-week) seasonal patterns. Taylor et al. (2006) compared the forecast accuracy of several univariate models, including double SARMA, for electricity demand forecasting in Brazil and in England and Wales. Cruz et al. (2011) compared the forecasting accuracy of a set of methods for day-ahead spot price forecasting in the Spanish electricity market.
Other references include Taylor (2008a, b) and Caiado (2008). Bayesian analysis of SARMA models with a single seasonality is well established, and different approaches have been developed in the literature. One of these is analytical approximation, which simply approximates the posterior and predictive densities by standard closed-form distributions that are analytically tractable; see, for example, Shaarawy and Ismail (1987). However, this approach conditions on the initial values, which wastes observations, and treats the SARMA model as additive rather than multiplicative, which can introduce unnecessary coefficients into the model. To address these limitations, MCMC methods, especially the Gibbs sampling algorithm, have in recent years been proposed to ease Bayesian time series analysis. Ismail (2003a, b) used the Gibbs sampling algorithm to carry out Bayesian analysis of multiplicative seasonal moving average (SMA) and seasonal autoregressive (SAR) models. This work was extended by Ismail and Amin (2014) to the multiplicative SARMA model. The literature on Bayesian analysis of DSARMA models is still sparse. Amin and Ismail (2015) used the Gibbs sampling algorithm to develop a Bayesian analysis of multiplicative double SAR models. In the current paper, we extend this work to develop a Bayesian analysis of multiplicative DSMA models based on the Gibbs sampling algorithm. The initial idea of this work was presented at the 60th ISI World Statistics Congress 2015. The main advantages of the proposed methodology are that the Bayesian analysis is unconditional on the initial values of the errors, and that it treats the DSMA model as multiplicative rather than additive, preserving parsimony.
The remainder of this paper is organized as follows: Section 2 presents multiplicative DSMA models. Section 3 summarizes the posterior analysis and full conditional posterior distributions of the parameters. The implementation details of the proposed algorithm including convergence monitoring are presented in Section 4. The proposed methodology is illustrated in Section 5 using several simulated examples. Finally, the conclusions are given in Section 6.
It should be noted that the DSMA model (1) has an extra term compared with the usual multiplicative single SMA model. The new factor is Ψ_{q2}(B^{s2}), which accommodates the second seasonal pattern. Accordingly, model (1) can be expanded by multiplying out its polynomial factors, giving equation (2). Equation (2) shows that the multiplicative DSMA model can be written as a moving average model with (1 + q)(1 + q1)(1 + q2) − 1 coefficients, some of which are products of nonseasonal and seasonal coefficients. Therefore, the DSMA model is nonlinear in the coefficients θ, Θ, and Ψ, which complicates its Bayesian analysis. In the following sections we explain how to apply the Gibbs sampling algorithm to facilitate the analysis. The DSMA model (2) is invertible if the roots of the polynomials θ_q(B), Θ_{q1}(B^{s1}) and Ψ_{q2}(B^{s2}) lie outside the unit circle. For more details about the properties of SARMA models, see Box et al. (1994).
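The expansion in (2) is mechanical polynomial multiplication. A minimal Python sketch of this step (the function names and the toy orders q = q1 = q2 = 1 with periods s1 = 4, s2 = 12 are ours, chosen only for illustration; the sign convention is θ(B) = 1 − θ1·B − …):

```python
import numpy as np

def seasonal_poly(coefs, season):
    """Coefficient vector of 1 - c_1 B^s - c_2 B^{2s} - ... in powers of B."""
    p = np.zeros(season * len(coefs) + 1)
    p[0] = 1.0
    for i, c in enumerate(coefs, start=1):
        p[i * season] = -c
    return p

def expand_dsma(theta, Theta, Psi, s1, s2):
    """Expand theta(B) * Theta(B^{s1}) * Psi(B^{s2}) into one MA polynomial."""
    full = np.convolve(seasonal_poly(theta, 1), seasonal_poly(Theta, s1))
    return np.convolve(full, seasonal_poly(Psi, s2))

# Toy example: q = q1 = q2 = 1, s1 = 4, s2 = 12.
full = expand_dsma([0.5], [0.4], [0.3], s1=4, s2=12)
# Nonzero coefficients beyond lag 0: (1+q)(1+q1)(1+q2) - 1 = 7, and the
# maximum lag is q + s1*q1 + s2*q2 = 17; cross terms such as the lag-5
# coefficient are products of nonseasonal and seasonal coefficients.
```

The count of nonzero coefficients matches the (1 + q)(1 + q1)(1 + q2) − 1 expression in the text, and the product coefficients (e.g. at lags s1 + 1 and s2 + s1 + 1) are what make the model nonlinear in θ, Θ and Ψ.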

Likelihood Function
Suppose z = (z_1, z_2, ⋯, z_n)′ is a realization of the DSMA model (2). Assuming that ε_t ~ N(0, σ²) and employing a straightforward random-variable transformation from ε to z, the likelihood function is given by (3), where the errors ε_t are computed directly from model (2). It is obvious that this likelihood is a complicated function of θ, Θ, Ψ and ε_0. However, the errors can be estimated recursively from model (2) by replacing the unknown coefficients with consistent estimates θ̂ ∈ R^q, Θ̂ ∈ R^{q1} and Ψ̂ ∈ R^{q2} obtained by the nonlinear least squares (NLS) method. Substituting the estimated errors into the likelihood function (3) results in an approximate likelihood function (6), where the remaining notation is as defined in (3) and Λ is an n × ((1 + q)(1 + q1)(1 + q2) − 1) matrix whose t-th row collects the lagged estimated errors corresponding to the product coefficients.
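The error recursion can be sketched directly. In this illustration (the helper name dsma_errors and the convention that `full` holds the expanded MA coefficients c_0 = 1, c_1, …, c_m of model (2) are ours), the m = q + s1·q1 + s2·q2 pre-sample errors default to zero when no estimate is supplied:

```python
import numpy as np

def dsma_errors(z, full, eps0=None):
    """Recover errors recursively from the expanded model (2): since
    z_t = sum_{k=0..m} c_k eps_{t-k} with c_0 = 1, we have
    eps_t = z_t - sum_{k>=1} c_k eps_{t-k}.
    `full` holds c_0..c_m and `eps0` the m pre-sample errors."""
    full = np.asarray(full)
    m = len(full) - 1
    e = np.concatenate([np.zeros(m) if eps0 is None else np.asarray(eps0),
                        np.zeros(len(z))])
    rev = full[:0:-1]              # [c_m, ..., c_1]
    for t in range(len(z)):
        e[m + t] = z[t] - rev @ e[t:t + m]
    return e[m:]
```

With the true pre-sample errors supplied (here, zeros), the recursion recovers the generating errors exactly; with estimated coefficients it yields the ε̂_t used in the approximate likelihood.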

Prior Specification
Assuming that the parameters θ, Θ, Ψ and ε_0 are independent a priori, given the error variance σ², the joint prior distribution is the product of the individual priors, where m* = q + s1·q1 + s2·q2, N_r(μ, Δ) denotes the r-variate normal distribution with mean μ and variance Δ, and IG(a, b) is the inverse gamma distribution with parameters a and b. Therefore, the joint prior distribution can be written as in (9), where p* = 2q + q1(1 + s1) + q2(1 + s2), i.e. the q + q1 + q2 model coefficients together with the m* initial errors.
The prior distribution (9) is chosen for several reasons, chiefly that it is a conjugate prior and thus facilitates the mathematical calculations. Multiplying the joint prior distribution (9) by the approximate likelihood function (6) results in the joint posterior ξ(θ, Θ, Ψ, σ², ε_0 | z), which may be written as in (11).

Full Conditional Posterior Distributions
The conditional posterior distribution of each DSMA parameter is obtained from the joint posterior distribution (11) by first grouping the terms in the joint posterior that depend on that parameter, and then finding the appropriate normalizing constant that turns this into a proper, closed-form density. In this study all the conditional posteriors obtained for the unknown parameters are normal or inverse gamma distributions, for which standard sampling techniques exist.
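The normal conditionals all share the standard conjugate form for a coefficient block that enters the (linearized) errors linearly. A minimal sketch under that assumption (the notation — response y, design matrix X, prior mean mu0, prior scale D0 — is ours, not the paper's exact expressions; in the DSMA sampler the rows of the design come from the lagged estimated errors in Λ):

```python
import numpy as np

def sample_block(y, X, mu0, D0, sigma2, rng):
    """Draw one coefficient block from its normal full conditional.
    Standard conjugate update: if y = X beta + noise, noise ~ N(0, sigma2 I)
    and beta | sigma2 ~ N(mu0, sigma2 D0), then
    beta | rest ~ N(mu_n, sigma2 D_n) with
    D_n = (D0^{-1} + X'X)^{-1} and mu_n = D_n (D0^{-1} mu0 + X'y)."""
    D0inv = np.linalg.inv(D0)
    Dn = np.linalg.inv(D0inv + X.T @ X)
    mun = Dn @ (D0inv @ mu0 + X.T @ y)
    return rng.multivariate_normal(mun, sigma2 * Dn)
```

With a vague prior (large D0), repeated draws center on the least squares estimate of the block, which is why NLS starting values are natural for this sampler.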

The conditional posterior of σ²
We obtain the conditional posterior of σ² by deriving ξ(σ² | z, θ, Θ, Ψ, ε_0), which we show to be an inverse gamma distribution, where p* is defined in (10). To ease the Gibbs sampling process, the parameter σ² can be sampled through the Chi-square distribution using the transformation (δ*/σ²) ~ χ²(n + p*), where δ* denotes the scale term of the inverse gamma conditional.
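Sampling σ² through this transformation is a one-line operation. A sketch (the name sample_sigma2 and its argument names are ours; the scale delta_star and the degrees of freedom are those of the conditional above):

```python
import numpy as np

def sample_sigma2(delta_star, df, rng, size=None):
    """Draw sigma^2 from its inverse gamma full conditional via the
    chi-square transformation delta_star / sigma^2 ~ chi^2(df)."""
    return delta_star / rng.chisquare(df, size=size)
```

Since E[1/χ²(df)] = 1/(df − 2), the draws have mean delta_star/(df − 2), which gives a quick sanity check on any implementation.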

The conditional posterior of ε_0
We begin by writing explicitly the elements of ε_0 using model (2). The vector ε_0 contains the m* = q + s1·q1 + s2·q2 pre-sample errors and has the m*-variate normal distribution with zero mean and variance σ²·I_{m*}, where I_{m*} is the identity matrix of order m*.

The Proposed Gibbs Sampler
The proposed Gibbs sampling algorithm for the DSMA model can be implemented as follows:
1. Specify starting values θ^(0), Θ^(0), Ψ^(0), (σ²)^(0) and ε_0^(0), and set j = 0. A set of initial estimates of the model parameters can be obtained using the NLS method.

2. Simulate θ^(j+1), Θ^(j+1), Ψ^(j+1), (σ²)^(j+1) and ε_0^(j+1), each from its full conditional posterior distribution given the most recent draws of the other parameters; then set j = j + 1 and repeat this step until convergence is achieved.
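The cycling scheme of the two steps above can be illustrated on a deliberately simpler model. The sketch below runs a Gibbs sampler for a normal model with unknown mean and variance under Jeffreys-type priors (our stand-in, not the DSMA conditionals); the DSMA sampler cycles through its blocks θ, Θ, Ψ, σ² and ε_0 in exactly the same way:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.5, size=500)     # synthetic data with known truth
n = len(y)

mu, sig2 = 0.0, 1.0                    # step 1: starting values, j = 0
chain = []
for j in range(3000):                  # step 2: draw each full conditional
    mu = rng.normal(y.mean(), np.sqrt(sig2 / n))          # mu | sig2, y
    sig2 = np.sum((y - mu) ** 2) / rng.chisquare(n)       # sig2 | mu, y
    chain.append((mu, sig2))
chain = np.array(chain)[500:]          # discard burn-in draws
```

After burn-in, the draws fluctuate around the values that generated the data, which is the behavior the diagnostics in the next section are designed to verify.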
Geweke (1992) proposed two groups of diagnostics to monitor convergence.
1. The first group includes the relative numerical efficiency (RNE) and the numerical standard error (NSE). The RNE estimate indicates the number of draws that would be required to produce the same numerical accuracy if an iid sample were drawn directly from the posterior distribution. The NSE estimate reflects the variation that can be expected if the simulation were to be repeated.
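A common way to estimate these quantities is the batch-means method. A sketch (the function name and the choice of 20 batches are ours; Geweke's original estimator uses a spectral density estimate at frequency zero instead):

```python
import numpy as np

def nse_rne(draws, n_batches=20):
    """Batch-means NSE and RNE for a scalar chain of posterior draws.
    RNE = (iid variance of the mean) / (time-series variance of the mean),
    so RNE near 1 indicates draws that behave like an iid sample."""
    draws = np.asarray(draws)
    n = (len(draws) // n_batches) * n_batches
    bmeans = draws[:n].reshape(n_batches, -1).mean(axis=1)
    nse = bmeans.std(ddof=1) / np.sqrt(n_batches)   # accounts for correlation
    nse_iid = draws[:n].std(ddof=1) / np.sqrt(n)    # assumes iid draws
    return nse, (nse_iid / nse) ** 2
```

For an iid chain the two standard errors agree, so RNE is close to 1; strong positive autocorrelation inflates the batch-means NSE and drives RNE below 1.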

2. The second group of diagnostics includes a test of whether the Gibbs sampler has attained an equilibrium state. This is achieved by testing the equality of the means of the first and last parts of the draws, from which a Chi-squared marginal probability is obtained. Usually, the first part is taken as the first 20% of the draws and the last part as the final 50%.
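The equal-means test reduces to a z-statistic whose square is Chi-squared with one degree of freedom. A sketch (our simplified version: iid standard errors are used for the two segment means, which is reasonable after thinning; Geweke's version uses spectral standard errors):

```python
import math
import numpy as np

def geweke_prob(draws, first=0.2, last=0.5):
    """Chi-squared (1 df) marginal probability for equality of the means
    of the first 20% and the last 50% of the draws."""
    draws = np.asarray(draws)
    a = draws[: int(first * len(draws))]
    b = draws[-int(last * len(draws)):]
    z = (a.mean() - b.mean()) / math.sqrt(a.var(ddof=1) / len(a)
                                          + b.var(ddof=1) / len(b))
    # P(chi2_1 > z^2) equals the two-sided normal tail probability of z.
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A large probability means the equal-means hypothesis cannot be rejected, i.e. no evidence against equilibrium; a chain whose level drifts between the two segments yields a probability near zero.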
LeSage (1999) implemented these convergence diagnostics in MATLAB, and we use them in the following section to monitor the convergence of our proposed Gibbs sampling algorithm for the DSMA model.

Simulation Study
In this section we demonstrate the efficiency of the proposed estimation method through a simulation study of five examples of DSMA models. Table 0 shows the selected DSMA models and the true parameter values used in the simulation. These five examples cover different pairs of seasonal periods, chosen to represent different seasonality patterns and data types.
Once time series datasets of size n = 1,000 are generated from the selected DSMA models, the Bayesian analysis is performed assuming a noninformative prior for the parameters θ, Θ, Ψ and σ², and a normal prior with zero mean and variance σ²·I_{m*} for the initial errors ε_0. To apply the proposed Gibbs sampler, the starting values for θ, Θ, Ψ and σ² are obtained by the NLS method, and the starting values for ε_0 are set to zero. For each dataset, the Gibbs sampler was run for 11,000 iterations; the first 1,000 draws were discarded as burn-in, and every tenth value of the remaining 10,000 draws was recorded, giving an approximately independent sample of 1,000 draws. All posterior estimates of the parameters are then computed directly as sample averages of these 1,000 Gibbs draws.

In the following we discuss the results of the proposed algorithm and investigate its convergence. Table 1 presents the true values and the Bayesian estimates of the parameters for example I, together with a 95% credible interval for each parameter constructed from the 0.025 and 0.975 percentiles of the simulated draws. Table 1 shows that the Bayesian estimates are close to the true values and that each 95% interval contains the true value of its parameter. Sequential plots of the generated parameter sequences, together with the marginal densities, are displayed in Figure 1; the marginal densities are computed nonparametrically using a Gaussian kernel. Figure 1 shows that the proposed algorithm is stable and fluctuates in the neighborhood of the true values, and the marginal densities confirm that the true value of each parameter (indicated by the vertical line) falls inside the constructed 95% interval. The convergence of the proposed algorithm is monitored using the diagnostic measures explained in Section 4.
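The post-processing scheme described above (11,000 iterations, 1,000 burn-in, thinning by 10, then sample averages and percentile intervals) can be sketched in a few lines. Here a correlated AR(1) chain with known mean 0.8 stands in for a Gibbs output (the chain parameters are ours, for illustration only):

```python
import numpy as np

# Build a correlated chain of 11,000 "iterations" mimicking Gibbs output.
rng = np.random.default_rng(7)
chain = np.empty(11_000)
chain[0] = 0.0
for t in range(1, len(chain)):
    chain[t] = 0.8 + 0.9 * (chain[t - 1] - 0.8) + rng.normal(scale=0.1)

# Discard the first 1,000 draws, keep every 10th of the remaining 10,000.
draws = chain[1_000::10]                 # 1,000 near-independent draws

estimate = draws.mean()                  # posterior estimate (sample average)
lo, hi = np.percentile(draws, [2.5, 97.5])   # 95% percentile interval
```

Thinning by 10 cuts the lag-1 autocorrelation of 0.9 down to roughly 0.9^10 ≈ 0.35, which is why the retained sample is treated as approximately independent.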
The autocorrelations and the Raftery and Lewis diagnostics are displayed in Table 2, while Table 3 presents the diagnostic measures of Geweke (1992). Table 2 shows that the draws for each parameter have small autocorrelations at lags 1, 5, 10 and 50, indicating no convergence problem. This conclusion is confirmed by the Raftery and Lewis diagnostics, where the reported Nmin of 994 is close to the 1,000 draws we used and the I-stat value of about 1 is well below the threshold of 5. The Geweke diagnostics in Table 3 also confirm the convergence of the proposed algorithm: the Chi-squared probabilities show that the equal-means hypothesis cannot be rejected, no dramatic differences between the NSE estimates are found, and the RNE estimates are close to 1, indicating the approximately iid nature of the output sample.

Conclusion
In this paper we showed that the conditional posterior distributions of the DSMA model parameters and the error variance are multivariate normal and inverse gamma, respectively. Exploiting the fact that these conditional posteriors are standard distributions, we used the Gibbs sampling algorithm to develop a Bayesian method for estimating the parameters of the multiplicative DSMA model. Specifically, we applied the Gibbs sampling algorithm to approximate the marginal posterior distributions empirically, and several diagnostics showed that convergence of the proposed algorithm was achieved. Accordingly, we computed the posterior estimates of the parameters directly as sample averages of the simulation output. The simulation results confirmed the accuracy of the proposed methodology. Future work may investigate model identification using stochastic search variable selection, outlier detection, and extensions to multivariate double seasonal models.