Improved Inference of Heteroscedastic Fixed Effects Models

Heteroscedasticity is a serious problem that distorts estimation and testing in panel data models (PDMs). Arellano (1987) extended the White (1980) estimator to PDMs with heteroscedastic errors, but it yields erroneous inference for data sets that include high leverage points. In this paper, we attempt to improve the heteroscedasticity-consistent covariance matrix estimator (HCCME) for panel data sets with high leverage points. To draw robust inference for the PDM, we focus on improving the kernel bootstrap estimators proposed by Racine and MacKinnon (2007). A Monte Carlo scheme is used to assess the results.


Introduction
Panel data combine the time series and cross-sectional dimensions: observations are collected on the same group of cross-sectional units over time. In econometric research, panel data have many advantages over pure time series or cross-sectional data. Estimation of the PDM covers two cases, the fixed effects model (FEM) and the random effects model (REM). In the former, the individual-specific effect (heterogeneity) is assumed to be fixed, while in the latter it is assumed to be random. The FEM is frequently used in panel data analysis to deal with individual heterogeneity.
An important assumption of the classical linear regression model (CLRM) is homoscedasticity, i.e. the variance of the error term is constant, so that the errors are identically distributed. Inference is unreliable if this assumption is not met. When homoscedasticity is violated, the ordinary least squares (OLS) estimator remains unbiased and consistent, but it is no longer the best linear unbiased estimator (BLUE). Heteroscedasticity is common in the PDM, for example as unit-specific heteroscedasticity (USH) or unit-time varying heteroscedasticity (UTVH), and it has to be addressed to obtain robust inference.
Early work on heteroscedasticity in the PDM was carried out by Mazodier and Trognon (1978). Eicker (1963) and White (1980) proposed the heteroscedasticity-consistent covariance matrix estimator (HCCME) for non-panel data, which makes it possible to draw asymptotically robust inference. Arellano (1987) extended White's estimator to the PDM, and Uchôa et al. (2014) used another variant of the White estimator for the FEM. However, the HCCME proposed by Cribari-Neto et al. (2007) for cross-sectional data has not yet been applied to panel data. In the current study, this estimator is therefore used to improve inference for the PDM.
Besides the HCCMEs, bootstrap estimators have also been developed to draw correct inference for the PDM. Cameron et al. (2008) and MacKinnon and Webb (2013) used the bootstrap to draw robust inference for the PDM with clustered errors. However, the literature on bootstrap estimators for the PDM is limited, especially when high leverage points are present.
Another approach to robust inference in the linear regression model is the kernel smoothing approach proposed by Racine and MacKinnon (2007). This approach has not yet been studied in the context of the PDM, and the present work attempts to fill this gap.
The paper is organized as follows. Section 2 describes the model and the HCCMEs. Section 3 discusses the HCCME-based quasi-t test statistic. Section 4 presents the bootstrap estimators and Section 5 their kernel-based versions. Section 6 reports the numerical results and Section 7 concludes.

The Model and HCCMEs
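The model of interest is the heteroscedastic FEM, whose basic framework is

$$y_{it} = x_{it}'\beta + \mu_i + \varepsilon_{it}, \qquad i = 1, \dots, n, \quad t = 1, \dots, T, \tag{1}$$

where $\mu_i$ is the unobserved heterogeneity. Model (1) can be written in matrix form by stacking the data over the time dimension:

$$y_i = X_i\beta + e_T\mu_i + \varepsilon_i .$$

Assembling the whole data set cross-sectionally gives

$$y = X\beta + D\mu + \varepsilon, \tag{2}$$

where $D = I_n \otimes e_T$ is the matrix of $n$ dummy variables associated with the cross-sectional units. Here, $e_T$ is a $T \times 1$ vector of ones, the unobserved individual heterogeneity is captured by $\mu = (\mu_1, \dots, \mu_n)'$ and $\otimes$ is the Kronecker product. Additionally, $y$ is the $nT \times 1$ response vector, $X$ is the $nT \times q$ matrix of implicitly fixed regressors ($q < nT$), $\beta$ is the $q \times 1$ vector of unknown parameters and $\varepsilon$ is the $nT \times 1$ error vector.

Model (2) can be estimated by the within group estimator (WGE). This is done by pre-multiplying (2) by the matrix

$$Q = I_{nT} - I_n \otimes \tfrac{1}{T}\, e_T e_T',$$

where $e_T$ is a vector of ones of order $T$ (for details, see Greene, 2003). Model (2) becomes

$$\tilde{y} = \tilde{X}\beta + \tilde{\varepsilon},$$

with $\tilde{y} = Qy$, $\tilde{X} = QX$ and $\tilde{\varepsilon} = Q\varepsilon$, and the WGE is $\hat{\beta} = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{y}$. The variance of the estimator is

$$\mathrm{var}(\hat{\beta}) = \sigma^2 (\tilde{X}'\tilde{X})^{-1} \tag{4}$$

and the hat matrix can be defined as $\tilde{H} = \tilde{X}(\tilde{X}'\tilde{X})^{-1}\tilde{X}'$. The usual covariance matrix of $\hat{\beta}$ can be described as

$$\tilde{\varphi} = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'\,\Omega\,\tilde{X}(\tilde{X}'\tilde{X})^{-1}, \tag{5}$$

where $\Omega = \mathrm{diag}(\sigma_{it}^2)$. Under homoscedasticity, Eq. (5) reduces to the variance given in (4). In the heteroscedastic case, the errors are independent but not identically distributed, so a consistent estimator of $\tilde{\varphi}$ is needed. Arellano (1987) extended the White (1980) estimator to the FEM; it can be defined as

$$\hat{\varphi}_0 = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'\,\hat{\Omega}_0\,\tilde{X}(\tilde{X}'\tilde{X})^{-1}, \qquad \hat{\Omega}_0 = \mathrm{diag}(\hat{\varepsilon}_{it}^2).$$

For the FEM, HC3 is obtained by replacing $\hat{\Omega}_0$ in (5) with

$$\hat{\Omega}_3 = \mathrm{diag}\left\{ \frac{\hat{\varepsilon}_{it}^2}{(1 - \tilde{h}_{it})^2} \right\},$$

where $\tilde{h}_{it}$ is the $it$-th diagonal element of $\tilde{H}$. Uchôa et al. (2014) used HC4 for the FEM with high leverage points. For this estimator,

$$\hat{\Omega}_4 = \mathrm{diag}\left\{ \frac{\hat{\varepsilon}_{it}^2}{(1 - \tilde{h}_{it})^{\delta_{it}}} \right\}, \qquad \delta_{it} = \min\left\{4, \frac{\tilde{h}_{it}}{\bar{h}}\right\} = \min\left\{4, \frac{nT\,\tilde{h}_{it}}{\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{h}_{it}}\right\},$$

where $\bar{h}$ is the mean of the $\tilde{h}_{it}$.

In the available literature, the HC5 version of the HCCME, proposed by Cribari-Neto et al. (2007) for non-panel data to account for the influence of maximal leverage, has not been used for the FEM. In the present work, we propose to use HC5 for the FEM. For this purpose, we use

$$\hat{\Omega}_5 = \mathrm{diag}\left\{ \frac{\hat{\varepsilon}_{it}^2}{\sqrt{(1 - \tilde{h}_{it})^{\alpha_{it}}}} \right\}, \qquad \alpha_{it} = \min\left\{ \frac{\tilde{h}_{it}}{\bar{h}}, \max\left\{4, \frac{k\,\tilde{h}_{\max}}{\bar{h}}\right\} \right\},$$

where $\tilde{h}_{\max}$ is the maximal leverage and $0 < k < 1$ is a predefined constant. The resulting estimators are

$$\hat{\varphi}_l = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'\,\hat{\Omega}_l\,\tilde{X}(\tilde{X}'\tilde{X})^{-1}, \qquad l = 0, 3, \dots, 5,$$

where $l = 0$ indicates HC0, $l = 3$ indicates HC3 and so on.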
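The HCCMEs above can be illustrated with a short numerical sketch. It assumes a balanced panel whose rows are ordered unit by unit, a regressor matrix without an intercept column (the within transformation would annihilate it), and k = 0.7 for HC5, the value suggested by Cribari-Neto et al. (2007). The function names `within_transform` and `hccme` are hypothetical and are not taken from the paper.

```python
import numpy as np

def within_transform(a, n, T):
    """Subtract unit-specific time means (rows must be ordered unit by unit)."""
    a = np.asarray(a, dtype=float).reshape(n, T, -1)
    return (a - a.mean(axis=1, keepdims=True)).reshape(n * T, -1)

def hccme(X, y, n, T, kind="HC5", k=0.7):
    """Within-group estimate and sandwich covariance for HC0, HC3, HC4 or HC5.

    X must not contain an intercept column; k = 0.7 is an assumed default.
    """
    Xt = within_transform(X, n, T)
    yt = within_transform(y, n, T).ravel()
    XtX_inv = np.linalg.inv(Xt.T @ Xt)
    beta = XtX_inv @ Xt.T @ yt
    resid = yt - Xt @ beta
    # Diagonal of the hat matrix Xt (Xt'Xt)^{-1} Xt'
    h = np.einsum("ij,jk,ik->i", Xt, XtX_inv, Xt)
    h_bar = h.mean()
    if kind == "HC0":
        w = np.ones_like(h)
    elif kind == "HC3":
        w = (1.0 - h) ** -2
    elif kind == "HC4":
        delta = np.minimum(4.0, h / h_bar)
        w = (1.0 - h) ** -delta
    elif kind == "HC5":
        alpha = np.minimum(h / h_bar, max(4.0, k * h.max() / h_bar))
        w = (1.0 - h) ** (-alpha / 2.0)   # 1 / sqrt((1 - h)^alpha)
    else:
        raise ValueError(f"unknown estimator: {kind}")
    omega = w * resid ** 2                # diagonal of Omega-hat
    cov = XtX_inv @ (Xt.T * omega) @ Xt @ XtX_inv
    return beta, cov
```

All four variants share the same coefficient estimate and differ only in how strongly the squared residuals of high leverage observations are inflated.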

The HCCME based t-test and the Heteroscedastic-Consistent Covariance Interval Estimator (HCCIE)
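The quasi-t test statistic is based on the normality of the estimated regression parameter, such that

$$\hat{\beta}_r \sim N(\beta_r, \varphi_{rr}).$$

The t-test statistic under the null hypothesis $H_0: \beta_r = \beta_r^{0}$ is

$$t = \frac{\hat{\beta}_r - \beta_r^{0}}{\sqrt{\hat{\varphi}_{rr}}}, \tag{7}$$

where $\hat{\varphi}_{rr}$ is the $r$-th diagonal element of $\hat{\varphi}$ and $r = 0, 1, \dots, q - 1$. The size of the test can be measured as

$$\text{size} = P(\text{reject } H_0 \mid H_0 \text{ is true}), \tag{8}$$

and the power of the test can be measured as

$$\text{power} = P(\text{reject } H_0 \mid H_0 \text{ is false}). \tag{9}$$

For the case of heteroscedastic errors, Statistic (7) can be improved as

$$t_l = \frac{\hat{\beta}_r - \beta_r^{0}}{\sqrt{\hat{\varphi}_{l,rr}}},$$

where $l = 0, 3, \dots, 5$.

Cribari-Neto and Lima (2009) constructed confidence intervals based on the OLS estimator $\hat{\beta}$ and the HCCMEs for linear regression models. Here, we construct the corresponding confidence intervals for the FEM. For large data sets, the HCCIE can be derived in the same way, and the $(1 - \alpha) \times 100\%$ two-sided confidence interval for $\beta_r$ is

$$\hat{\beta}_r \pm z_{1 - \alpha/2}\sqrt{\hat{\varphi}_{l,rr}}, \tag{10}$$

where $z_{1 - \alpha/2}$ is the $(1 - \alpha/2)$ quantile of the standard normal distribution.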
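As a small worked illustration of the quasi-t statistic and the large-sample interval, the sketch below takes a coefficient estimate and one diagonal element of an HCCME as given. The function names and the numbers in the usage lines are hypothetical; the normal quantile is used on the assumption that the sample is large.

```python
from math import sqrt
from statistics import NormalDist

def quasi_t(beta_hat, beta_null, var_hat):
    """Quasi-t statistic: (estimate - null value) / robust standard error."""
    return (beta_hat - beta_null) / sqrt(var_hat)

def hc_interval(beta_hat, var_hat, alpha=0.05):
    """Two-sided (1 - alpha) interval using the standard normal quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(var_hat)
    return beta_hat - half_width, beta_hat + half_width

# Hypothetical inputs: slope estimate 0.7 with robust variance 0.01
t_value = quasi_t(0.7, 0.5, 0.01)
lower, upper = hc_interval(0.7, 0.01)
```

Swapping in the diagonal element of a different HCCME changes only `var_hat`, so the same two functions serve HC0 through HC5.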

The Bootstrap Estimator
Bootstrap estimators have also been proposed to draw robust inference for the heteroscedastic linear FEM. The residual bootstrap (RB), wild bootstrap (WB) and pairs bootstrap (PB) estimators have frequently been used for the PDM (for more details, see Cameron et al., 2008). Cameron et al. (2008) considered different bootstrap procedures for clustered errors, such as the bootstrap-t method and the bootstrap-se method. MacKinnon (2007) discussed bootstrap p-value procedures for non-panel data. The bootstrap schemes are given below.

The RB Estimator (t) (RBE (t))
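The bootstrap scheme for residual resampling is as follows.

1. Create a pseudo-sample of residuals $\varepsilon^*$ by resampling the residuals $\hat{\varepsilon}$ with replacement.

2. Make a bootstrap sample $(y^*_{RB}, \tilde{X})$ as $y^*_{RB} = \tilde{X}\hat{\beta} + \varepsilon^*$, where $\hat{\beta}$ and $\hat{\varepsilon}$ are the WG coefficient and residual, respectively.

3. Regress $y^*_{RB}$ on $\tilde{X}$ to obtain the bootstrap estimate $\hat{\beta}^*$ and its estimated covariance matrix $\hat{\varphi}^*$.

4. Compute the bootstrap quasi-t statistic $t^*_{RB} = (\hat{\beta}^*_r - \hat{\beta}_r) / \sqrt{\hat{\varphi}^*_{rr}}$.

5. Repeat Steps 1 to 4 a large number of times (say, $B$).

6. Reject the null hypothesis at level $\alpha$ if and only if $|\hat{t}|$ exceeds the $(1 - \alpha)$ quantile of $|t^*_{RB,1}|, \dots, |t^*_{RB,B}|$.

The percentile-t two-sided confidence interval at level $(1 - \alpha)$ is

$$\left[ \hat{\beta}_r - t^*_{(1 - \alpha/2)}\sqrt{\hat{\varphi}_{rr}}, \;\; \hat{\beta}_r - t^*_{(\alpha/2)}\sqrt{\hat{\varphi}_{rr}} \right],$$

where $t^*_{(p)}$ denotes the $p$-th quantile of the bootstrap statistics.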
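The residual bootstrap scheme can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the data have already been within-transformed, uses an HC0 sandwich variance inside the t-statistic, centres each bootstrap statistic at the original estimate, and rejects when the absolute statistic exceeds the (1 - alpha) quantile of the absolute bootstrap statistics. All function names are hypothetical.

```python
import numpy as np

def wg_fit(Xt, yt):
    """Least squares on within-transformed data: coefficients, residuals, (X'X)^-1."""
    A = np.linalg.inv(Xt.T @ Xt)
    beta = A @ Xt.T @ yt
    return beta, yt - Xt @ beta, A

def t_stat(Xt, yt, r, beta_ref):
    """Quasi-t statistic for coefficient r (HC0 variance, an assumption here)."""
    beta, e, A = wg_fit(Xt, yt)
    cov = A @ (Xt.T * e ** 2) @ Xt @ A
    return (beta[r] - beta_ref) / np.sqrt(cov[r, r])

def residual_bootstrap_t(Xt, yt, r, beta_null, B=399, seed=0):
    """Return the original statistic and B residual-bootstrap statistics."""
    rng = np.random.default_rng(seed)
    beta, e, _ = wg_fit(Xt, yt)
    t_hat = t_stat(Xt, yt, r, beta_null)
    t_star = np.empty(B)
    for b in range(B):
        e_star = rng.choice(e, size=e.size, replace=True)  # resample residuals
        y_star = Xt @ beta + e_star                        # bootstrap sample
        t_star[b] = t_stat(Xt, y_star, r, beta[r])         # re-estimate, centre at beta-hat
    return t_hat, t_star

def reject(t_hat, t_star, alpha=0.05):
    """Symmetric test: compare |t_hat| with the (1 - alpha) quantile of |t*|."""
    return abs(t_hat) > np.quantile(np.abs(t_star), 1.0 - alpha)
```

Resampling residuals as a single pool treats them as exchangeable, which is why this scheme is usually paired with a heteroscedasticity-robust alternative such as the wild bootstrap.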

The WB Estimator (t) (WBE (t))
The WB was proposed by Liu (1988), following the suggestions of Wu (1986) and Beran (1986). Its scheme is given below.

1. For each $m$, $m = 1, \dots, n$, draw a random number $R_m$ from a population that has zero mean and unit variance.

2. Construct a bootstrap sample $(y^*_{WB}, \tilde{X})$ as

$$y^*_{WB,mt} = \tilde{x}_{mt}'\hat{\beta} + R_m \frac{\hat{\varepsilon}_{mt}}{1 - \tilde{h}_{mt}}, \tag{11}$$

where $\tilde{h}_{mt}$ is the corresponding diagonal element of the hat matrix, and $\hat{\beta}$ and $\hat{\varepsilon}$ are the WG coefficient and residual, respectively.

Steps 3 to 5 are the same as in the scheme of the RBE (t). The weight in (11) is based on HC3.
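One wild bootstrap draw with the HC3-based weight can be sketched as below. The sketch makes two assumptions that the text does not settle: the random numbers follow the Rademacher distribution (values -1 and +1 with equal probability, which has zero mean and unit variance), and one number is drawn per cross-sectional unit and shared across its T periods, which is one reading of the index range in Step 1. The function name is hypothetical, and the data are taken to be within-transformed with rows ordered unit by unit.

```python
import numpy as np

def wild_bootstrap_sample(Xt, yt, n, T, rng):
    """One wild bootstrap sample with residuals rescaled by 1 / (1 - h)."""
    A = np.linalg.inv(Xt.T @ Xt)
    beta = A @ Xt.T @ yt
    resid = yt - Xt @ beta
    h = np.einsum("ij,jk,ik->i", Xt, A, Xt)            # hat-matrix diagonal
    R = np.repeat(rng.choice([-1.0, 1.0], size=n), T)  # one Rademacher draw per unit
    y_star = Xt @ beta + R * resid / (1.0 - h)
    return y_star, beta, resid, h
```

Because each residual stays attached to its own observation and only its sign is randomized, the bootstrap sample preserves the observation-specific error variance.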

The Bootstrap-p Procedures
MacKinnon (2007) described how to compute a bootstrap p-value for the linear regression model.

The RBE (p)
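For the RB, the bootstrap-p technique is as follows:

$$\hat{p}^*_{RB} = \frac{1}{B}\sum_{b=1}^{B} I\left( |t^*_{RB,b}| > |\hat{t}| \right),$$

where $I(\cdot)$ is the indicator function, $t^*_{RB,b}$ is the t-statistic obtained from the $b$-th RB sample and $\hat{t}$ is the t-statistic obtained from the WG. This is the symmetric bootstrap p-value.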

The WBE (p)
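The symmetric bootstrap p-value based on the WB is computed in the same way:

$$\hat{p}^*_{WB} = \frac{1}{B}\sum_{b=1}^{B} I\left( |t^*_{WB,b}| > |\hat{t}| \right),$$

where $t^*_{WB,b}$ is the t-statistic obtained from the $b$-th WB sample.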
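The symmetric bootstrap p-value is simply the share of bootstrap statistics that are larger in absolute value than the original statistic. A minimal sketch, with a hypothetical function name and made-up numbers in the usage line:

```python
def symmetric_bootstrap_pvalue(t_hat, t_star):
    """Proportion of bootstrap statistics with |t*| > |t_hat|."""
    return sum(abs(t) > abs(t_hat) for t in t_star) / len(t_star)

# Hypothetical statistics: two of the four bootstrap values exceed |2.0|
p_value = symmetric_bootstrap_pvalue(2.0, [-3.0, -1.0, 0.5, 2.5])
```

The same function applies to the RB and the WB; only the source of `t_star` differs.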

The Kernel Estimators
The kernel estimator of Racine and MacKinnon (2007) was proposed for non-panel models. In the present work, we propose to use this estimator for the FEM. The scheme of the kernel estimators is given below.

The RB Kernel Estimator
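The kernel estimator is applied to the RBE and can be defined as

$$\hat{p}_{RB}(k) = 1 - \frac{1}{B}\sum_{b=1}^{B} K\left( \frac{|\hat{t}| - |t^*_{RB,b}|}{w} \right),$$

where $K(\cdot)$ is the CDF, $w$ is the bandwidth, $\hat{t}$ is the quasi-t test statistic from the WGE and $t^*_{RB,b}$ is the quasi-t test statistic obtained from the RBE. It is termed the "RBE (k)".

The WB Kernel Estimator
The kernel estimator for the WBE is defined analogously as

$$\hat{p}_{WB}(k) = 1 - \frac{1}{B}\sum_{b=1}^{B} K\left( \frac{|\hat{t}| - |t^*_{WB,b}|}{w} \right),$$

where $K(\cdot)$ is the CDF, $w$ is the bandwidth and $t^*_{WB,b}$ is the quasi-t test statistic obtained from the WBE. It is termed the "WBE (k)".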
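The kernel-smoothed p-value replaces the indicator function of the ordinary bootstrap p-value with a smooth cumulative kernel. The sketch below assumes a standard normal CDF as the kernel and applies it to absolute values, so that it is the smoothed counterpart of the symmetric p-value. The bandwidth `w` is left as an input because its choice is not specified here, and the function name is hypothetical.

```python
from statistics import NormalDist

def kernel_bootstrap_pvalue(t_hat, t_star, w):
    """Smoothed symmetric p-value: 1 - mean of Phi((|t_hat| - |t*_b|) / w)."""
    phi = NormalDist().cdf
    a = abs(t_hat)
    f_hat = sum(phi((a - abs(t)) / w) for t in t_star) / len(t_star)
    return 1.0 - f_hat

# Hypothetical statistics and bandwidth
p_smooth = kernel_bootstrap_pvalue(2.0, [-3.0, -1.0, 0.5, 2.5], w=0.5)
```

As the bandwidth shrinks toward zero, the smoothed value approaches the unsmoothed symmetric bootstrap p-value, so `w` controls how much information is borrowed from bootstrap statistics near the observed one.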

Empirical Results
For the empirical results, we use the same Monte Carlo scheme as in earlier studies such as Li and Stengos (1994), Roy (2002) and Aslam and Pasha (2007). Two data generating processes (DGPs), DGP I and DGP II, are used for the considered model, and heteroscedasticity is assumed to be of additive form. For comparison across the DGPs, the expected total variance is held fixed. The heteroscedasticity parameter λ takes the values 0, 1, 2 and 3, where 0 indicates a homoscedastic unit-specific error and the other values indicate different levels of heteroscedasticity; the variance component that governs the UTVH is assigned the values 2, 4 and 6. An increase in λ increases the degree of heteroscedasticity, and the unit-specific variances are obtained from the additive heteroscedastic design for each given value of λ.

For large samples, the performance of the OLSE is expected to improve. For smaller UTVH, the OLSE does improve, but it is still not better than the WGE.
Under DGP II, the results are given in Tables 2 (a) and 2 (b) for Schemes I and II of sample size, respectively. The results do not differ from those obtained under DGP I. The WGE outperforms the OLSE when the UTVH variance equals 2, but the MSE of the OLSE is smaller than that of the WGE for larger UTVH.
The average length and empirical coverage of the confidence intervals are measured to study the finite sample properties of the estimators under heteroscedasticity. The confidence intervals are constructed as described in (10). We report the experiment for a UTVH variance of 2; experiments with variances of 4 and 6 gave similar results. Empirical coverage under DGP I and DGP II is presented in Fig. 1 (a) and 1 (b), respectively. Under DGP I, the OLSE curve shows under-coverage, while the WGE curve is closer to the nominal coverage (95%). The curves of HC3 and HC5 overlap, and the curve of the WBE (t) is closer to 95%. Similar behaviour is observed in Fig. 1 (b).

Under DGP I, Tables 3 (a) and 3 (b) report the empirical coverage and average length for Schemes I and II, respectively. The OLSE does not perform satisfactorily in Table 3 (a): it shows under-coverage in the homoscedastic as well as the heteroscedastic cases. The empirical coverage of the WGE is closer to the nominal coverage for all degrees of heteroscedasticity, and it outperforms the OLSE. Among the HCCMEs, the best empirical coverage is provided by HC4 and HC5. Among the bootstrap estimators, the WBE (t) exhibits the best empirical coverage in the heteroscedastic cases, which confirms the findings of Liu (1988) for linear regression models. The performance of the estimators in Table 3 (b) is similar to that in Table 3 (a). The performance of the OLSE does not improve as the sample size increases. Again, the WBE (t) performs remarkably well and remains an attractive choice.

Tables 4 (a) and 4 (b) report the empirical coverage and average interval length under DGP II for Schemes I and II, respectively. Table 4 (a) shows that the OLSE confidence interval does not exhibit good empirical coverage. For all degrees of heteroscedasticity (λ = 0, 1, 2 and 3), HC5 shows the best coverage among the HCCMEs. Table 4 (b) shows that the coverage of the WBE (t) confidence interval is closer to the nominal coverage (95%). The estimators behave in Table 4 (b) as they do in Table 4 (a).
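Empirical coverage and average length are simple summaries over the Monte Carlo replications: the share of intervals that contain the true coefficient and the mean interval width. A minimal sketch, with a hypothetical function name and made-up intervals in the usage line:

```python
def coverage_and_length(intervals, true_value):
    """Share of intervals covering true_value and the mean interval length."""
    m = len(intervals)
    hits = sum(lo <= true_value <= hi for lo, hi in intervals)
    total_length = sum(hi - lo for lo, hi in intervals)
    return hits / m, total_length / m

# Hypothetical intervals from four replications, true slope 0.5
coverage, avg_length = coverage_and_length(
    [(0.3, 0.7), (0.6, 0.9), (0.1, 0.5), (0.4, 0.8)], 0.5
)
```

An estimator is preferred when its coverage is close to the nominal level without an unnecessarily large average length.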
In this work, the hypothesis of interest for the slope coefficient is $H_0: \beta = 0.5$ against $H_1: \beta \neq 0.5$. Empirical size is measured according to (8). We report the experiment for a UTVH variance of 2; experiments with variances of 4 and 6 gave similar results. Empirical size under DGP I and DGP II is presented in Fig. 2 (a) and 2 (b), respectively. The OLSE curve reveals high size distortion in Fig. 2 (a). The WGE curve is closer to the nominal level (5%). The curves of the bootstrap and kernel bootstrap estimators are also closer to the nominal size. A similar trend is observed in Fig. 2 (b).

Tables 5 (a) and 5 (b) report the empirical results under DGP I for the estimators described above for Schemes I and II, respectively. Table 5 (a) shows that the empirical sizes of the OLSE are very poor, with high size distortion in both the homoscedastic (λ = 0) and heteroscedastic cases (λ = 1, 2 and 3), although the OLSE shows some improvement in performance. All the other estimators perform remarkably well under heteroscedasticity. HC4 and HC5 show the best null rejection rate (NRR) among the HCCMEs, and HC3 also shows admirable rejection rates. The good performance of HC4 confirms the results of Uchôa et al. (2014). HC5 performs soundly from mild (λ = 1) to severe heteroscedasticity (λ = 3), which justifies our new design for the FEM. The RBE also performs well in terms of NRR. The WB (p) approach provides a remarkable NRR and confirms the results of MacKinnon (2007) for cross-sectional data. Among the bootstrap estimators, the WB performs excellently, and similar results are reported in the literature, for example by Cameron et al. (2008).

Among all the considered estimators, the best NRR in the presence of heteroscedasticity (λ = 1, 2 and 3) is provided by the kernel bootstrap estimators, which makes them an attractive choice for the heteroscedastic PDM at all nominal levels of significance (LOS). This confirms the findings of Racine and MacKinnon (2007) for linear regression models. Our proposed kernel bootstrap estimator performs best among all the estimators under consideration in the presence of heteroscedasticity, which also justifies our proposal for the PDM. The results in Table 5 (b) indicate that the performance of all the estimators is analogous to that in Table 5 (a). The performance of the OLSE is, as expected, very poor. HC5 performs substantially well under all cases of heteroscedasticity (λ = 1, 2 and 3). The WB (k) approach justifies the new formulation by providing an NRR close to each nominal LOS (1%, 5% and 10%). Tables 6 (a) and 6 (b) show the empirical sizes under DGP II for Schemes I and II, respectively. The results in these tables are in line with those under DGP I.

Empirical power is estimated according to (9). Under DGP I, Fig. 3 (a) shows the empirical power curves of all the estimators under consideration for Scheme I. In the homoscedastic case (λ = 0), the power curves of all the estimators are identical except that of the OLSE, which shows high power distortion for smaller UTVH (variance equal to 2). The curves of all the other estimators are identical and they perform equally well, which confirms the results reported by Aslam (2006). Fig. 3 (b) displays the empirical power curves for Scheme II under DGP I. As expected, the power curves become narrower as the sample size increases. The curve of the OLSE does not improve and shows power distortion under homoscedasticity (λ = 0) and heteroscedasticity (λ = 1, 2 and 3).
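The null rejection rate and the empirical power are both rejection frequencies over the Monte Carlo replications: the first is computed from data generated under the null hypothesis, the second from data generated under an alternative. A minimal sketch with a hypothetical function name and made-up p-values:

```python
def rejection_rate(p_values, alpha):
    """Share of replications whose p-value falls below the nominal level."""
    return sum(p < alpha for p in p_values) / len(p_values)

# Hypothetical p-values from five replications, evaluated at three nominal levels
p_values = [0.01, 0.20, 0.04, 0.50, 0.06]
rates = {alpha: rejection_rate(p_values, alpha) for alpha in (0.01, 0.05, 0.10)}
```

Under the null hypothesis, a well-behaved test gives a rate close to each nominal level; under an alternative, a higher rate means higher power.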

Conclusion
We have considered the heteroscedastic FEM and tried to draw robust inference using the quasi-t test based on the HCCMEs and bootstrap estimators. The results above make clear that HC5 performs best among the HCCMEs in the presence of heteroscedasticity and high leverage points.

In model (2), the individual effects enter through a matrix of $n$ dummy variables associated with each cross-sectional unit, and $I_{nT}$ and $I_n$ are the identity matrices of order $nT$ and $n$, respectively.



Fig. 4 (a) and 4 (b) show the power curves for Schemes I and II, respectively, under DGP II. The results under DGP II are similar to those under DGP I.

Figure 4 (b) Empirical power of test at 5% LOS (DGP II; n = 100, T = 3)

Racine and MacKinnon (2007) calculated the p-value for the simple linear regression model; in the current work, it is computed for the FEM, with the t-test statistic taken from the WGE.

Tables 1 (a) and 1 (b) contain the mean and MSE for Schemes I and II under DGP I, respectively. The intercept is excluded in the WG estimation (see Aslam, 2006); it is therefore not reported in these tables, and the discussion focuses only on the slope estimates. Table 1 (a) shows that all the estimators are efficient in the homoscedastic and heteroscedastic cases, except that the OLSE is inefficient for smaller UTVH. For a UTVH variance of 2, the MSE of the OLSE is more than twice that of the WGE. However, the OLSE performs equally well as the UTVH increases, and for larger UTVH (variance equal to 6) its performance improves and it shows a smaller MSE than the WGE. Similar behaviour of the OLSE and WGE is seen in Table 1 (b).

Table 6 (b): NRR of quasi-t test for n = 100, T = 3 under DGP II
Figure 3 (a) Empirical power of test at 5% LOS (DGP I; n = 50, T = 3)

Similar results were reported by Cribari-Neto et al. (2007) for the linear regression model. The kernel bootstrap estimators perform better than the bootstrap estimators in terms of coverage, empirical size and empirical power. It is concluded that the WB (k) outperforms all the HCCMEs and bootstrap estimators in the presence of severe heteroscedasticity, which justifies our new formulation for the PDM.