Improvement estimation on two successive occasions in cluster sampling

In this paper we address the estimation of the sum of mean and the change of mean in mail surveys. The problem is treated for the current occasion in the context of cluster sampling on two successive occasions. The sampling units are clusters, and the observations on the first occasion are regarded as ancillary information for the observations on the second or current occasion. The results are illustrated with an empirical study, which reveals that, under certain conditions, cluster sampling on two occasions is more efficient than simple random sampling on two occasions.


Introduction
In sample surveys, cluster or area sampling is widely practised because it is a low-cost, time-saving device for conducting large-scale and complicated surveys. Its use becomes more desirable when a list of elements is not available, or when units of the population are widely scattered and repeated observations are required on the selected units (Pradhan, 2004). It is well known that cluster sampling is better than simple random sampling when the intra-class correlation within clusters is negative, its lower bound being $-(M-1)^{-1}$, where $M$ denotes the size of the cluster. The relative efficiency of cluster sampling is controlled by both the size of the cluster and the intra-class correlation coefficient; it decreases if the size of the cluster increases substantially (Sukhatme and Sukhatme, 1970). Zarkovich and Krane (1965) demonstrated that the correlation between two characters in cluster sampling, with clusters as sampling units, is expected to be higher than the corresponding correlation coefficient in element sampling. Pradhan (2004) and Kumar (2011) have proposed estimators of the current population mean under the set-up of cluster sampling on two occasions. Sampling on successive occasions was first considered by Jessen (1942) in the analysis of farm data, and the theory was further extended by Patterson (1950), Eckler et al. (1955), Rao and Graham (1964), Gupta (1979), Das (1982), Singh and Singh (2001), Artes Rodriguez and Garcia Luengo (2005), and García Luengo and Oña (2010), among others. Continuing this line of work, we develop the technique of Hansen et al. (1953) to estimate the sum of mean and the change of mean for the current occasion in the context of cluster sampling on two successive occasions. An empirical study is carried out to investigate the performance of the proposed strategy.
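The classical single-occasion comparison behind the statement on intra-class correlation can be written as a short worked equation. This is a sketch assuming $N$ is large and the finite population correction is ignored; the symbols $\rho_c$ (intra-class correlation), $S^2$ (element variance), $\bar{y}_{\mathrm{cl}}$ and $\bar{y}_{\mathrm{srs}}$ are introduced here for illustration only.

```latex
\begin{align*}
V(\bar{y}_{\mathrm{cl}}) &\approx \frac{S^2}{nM}\bigl[1 + (M-1)\rho_c\bigr],
&
V(\bar{y}_{\mathrm{srs}}) &\approx \frac{S^2}{nM}, \\
\mathrm{RE} &= \frac{V(\bar{y}_{\mathrm{srs}})}{V(\bar{y}_{\mathrm{cl}})}
             = \frac{1}{1 + (M-1)\rho_c},
&
-\frac{1}{M-1} &\le \rho_c \le 1 .
\end{align*}
```

Under these assumptions the relative efficiency exceeds one exactly when $\rho_c < 0$, and for a fixed positive $\rho_c$ it falls as $M$ grows.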

Notation
Suppose that the population is composed of $N$ clusters of $M$ elements each. Let $(x_{ij}, y_{ij})$ $(i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M)$ be the values of the characteristic on the first and second occasions, respectively, for the $j$-th unit of the $i$-th cluster. A simple random sample (without replacement) of $n$ clusters is drawn on the first occasion. On the second occasion, a simple random sample of $m = n\lambda$ $(0 < \lambda < 1)$ clusters (i.e. $Mm$ elements) of the $n$ clusters selected on the first occasion is retained (matched), while an independent sample of $u = n\mu = n - m$ $(\mu = 1 - \lambda)$ clusters (i.e. $Mu$ elements) is drawn afresh (unmatched with the first occasion) from the entire population. The characters $x$ and $y$ are supposed to be correlated when they are observed on the same unit repeatedly. Define the following quantities:
The means of the $i$-th cluster on the first and second occasions, respectively.
The cluster population means of $x$ and $y$, respectively.
The population means of $x$ and $y$ per element on the first and second occasions, respectively.
$S_x^2$, $S_y^2$: the population mean squares between elements on the first and second occasions, respectively.
$\rho_x$, $\rho_y$: the intra-class correlation coefficients between elements of a cluster on the first and second occasions, respectively.
$\rho_b$: the simple correlation coefficient between cluster means on both occasions.
The sample means based on a simple random sample of $mM$ units.
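The sampling design described in this section can be sketched in a few lines of Python. This is only an illustration: the function names `two_occasion_cluster_sample` and `cluster_means` are hypothetical, rounding $m = n\lambda$ to an integer is an assumption, and the fresh sample is drawn from the whole population as the text states (so it may occasionally overlap the matched clusters when $N$ is small).

```python
import random


def two_occasion_cluster_sample(N, n, lam, seed=None):
    """Return (first, matched, unmatched) lists of cluster indices.

    N   : number of clusters in the population
    n   : number of clusters sampled on each occasion
    lam : matched fraction lambda, so m = n*lam and u = n - m
    """
    rng = random.Random(seed)
    m = round(n * lam)  # clusters retained on the second occasion (assumed rounding)
    u = n - m           # clusters drawn afresh on the second occasion
    first = rng.sample(range(N), n)      # SRSWOR of n clusters, first occasion
    matched = rng.sample(first, m)       # SRSWOR of m of those n clusters
    unmatched = rng.sample(range(N), u)  # independent SRSWOR from the entire population
    return first, matched, unmatched


def cluster_means(clusters):
    """clusters[i] holds the element values of cluster i; return the cluster means."""
    return [sum(values) / len(values) for values in clusters]
```

For example, `two_occasion_cluster_sample(N=1000, n=20, lam=0.25, seed=1)` returns 20 first-occasion clusters, 5 matched clusters and 15 fresh clusters.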

Estimation of the sum of mean in cluster sampling on two successive occasions
Consider the following minimum variance linear unbiased estimator of the sum, whose expected value is given by
Substituting the values of $b$ and $d$ in equation (1), we obtain
The variance of $z_1$ is given by
Assuming that $N$ is sufficiently large, other covariance terms being zero, the variances and covariance involved in (3) are given by
Minimizing the variance of $z_1$ with respect to $a$ and $c$, the optimum values of $a$ and $c$ are:
Using the optimum values of $a$ and $c$, the estimator $z_1$ reduces to
In case $S_x = S_y$ and $\rho_x = \rho_y$, $z_1$ reduces to
with variance
We note that, for $\rho_b > 0$, equation (4) is minimum for $\mu = 1$, i.e., the variance of $z_1$ is minimized if the clusters on both occasions are independent. In this case,
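As a worked sketch of these steps, assume the estimator is the usual four-term linear combination of the unmatched and matched cluster sample means, with $S_x = S_y = S$, $\rho_x = \rho_y$ and $N$ large. The symbols $\bar{x}_u, \bar{x}_m, \bar{y}_m, \bar{y}_u$, the between-cluster variance $S_b^2$ and the ordering of the coefficients are assumptions made here, not taken from the original. Unbiasedness for the sum then forces $b = 1 - a$ and $d = 1 - c$, and the algebra runs as follows:

```latex
\begin{align*}
z_1 &= a\,\bar{x}_u + (1-a)\,\bar{x}_m + c\,\bar{y}_m + (1-c)\,\bar{y}_u, \\
V(z_1) &= S_b^2\left[\frac{a^2}{u} + \frac{(1-a)^2}{m} + \frac{c^2}{m}
          + \frac{(1-c)^2}{u} + \frac{2(1-a)\,c\,\rho_b}{m}\right],
\qquad
S_b^2 \approx \frac{S^2}{M}\bigl[1 + (M-1)\rho_y\bigr], \\
1 - a_{\mathrm{opt}} &= c_{\mathrm{opt}} = \frac{\lambda}{1 + \mu\rho_b}, \\
V(z_1)_{\mathrm{opt}} &= \frac{2\,(1+\rho_b)}{1+\mu\rho_b}\cdot
          \frac{S^2}{nM}\bigl[1 + (M-1)\rho_y\bigr].
\end{align*}
```

For $\rho_b > 0$ the last expression decreases in $\mu$, so under these assumptions it is smallest at $\mu = 1$, where it equals $2S^2[1 + (M-1)\rho_y]/(nM)$.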

Efficiency of cluster sampling on two occasions
If the samples on both occasions are drawn using SRSWOR, the variance of the optimum estimator $z$, neglecting the finite population correction factor, is given by
where $\rho$ is the simple correlation coefficient between values of units on the first and second occasions. The relative efficiency of $z_1$ compared to $z$ is
Cluster sampling on both occasions provides a more efficient estimate than simple random sampling on both occasions if
Further, $z_1$ is more efficient than $z$ if
which gives the upper limit of $M$. Tables 1-3 have been computed below to show the relative efficiency of cluster sampling on two occasions compared to simple random sampling of elements for some specified values of $\rho_y$, $\rho$, $\rho_b$, $M$ and $\mu$.

0891 1.1170 1.1429 1.1669 1.1893 1.1999 1.2061
0.3 1.0510 1.0780 1.1029 1.1261 1.1477 1.1579 1

Estimation of the change of mean in cluster sampling on two successive occasions
[…] Substituting the values of $b$ and $d$ in equation (5), we obtain
The variance of $\Delta_1$ is given by
Assuming that $N$ is sufficiently large, other covariance terms being zero, the variances and covariance involved in (7) are given by
We wish to choose those values of $a$ and $c$ that minimize $V(\Delta_1)$. Equating the derivatives of $V(\Delta_1)$ with respect to $a$ and $c$ to zero, it follows that the optimum values are:
Using the optimum values of $a$ and $c$, the estimator $\Delta_1$ reduces to
In case $S_x = S_y$ and $\rho_x = \rho_y$, $\Delta_1$ reduces to
with variance
We note that, for $\rho_b > 0$, equation (8) is minimum for $\mu = 0$, i.e., the variance of $\Delta_1$ is minimized if the clusters on both occasions are identical. In this case,
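The steps for the change estimator can be sketched in the same way. Assume again a four-term linear combination of the cluster sample means with $S_x = S_y = S$, $\rho_x = \rho_y$ and $N$ large; the symbols $\bar{x}_u, \bar{x}_m, \bar{y}_m, \bar{y}_u$ and $S_b^2$ are assumptions introduced here. Unbiasedness for the change (second-occasion mean minus first-occasion mean) then requires $b = -(1 + a)$ and $d = 1 - c$:

```latex
\begin{align*}
\Delta_1 &= a\,\bar{x}_u - (1+a)\,\bar{x}_m + c\,\bar{y}_m + (1-c)\,\bar{y}_u, \\
V(\Delta_1) &= S_b^2\left[\frac{a^2}{u} + \frac{(1+a)^2}{m} + \frac{c^2}{m}
          + \frac{(1-c)^2}{u} - \frac{2(1+a)\,c\,\rho_b}{m}\right],
\qquad
S_b^2 \approx \frac{S^2}{M}\bigl[1 + (M-1)\rho_y\bigr], \\
1 + a_{\mathrm{opt}} &= c_{\mathrm{opt}} = \frac{\lambda}{1 - \mu\rho_b}, \\
V(\Delta_1)_{\mathrm{opt}} &= \frac{2\,(1-\rho_b)}{1-\mu\rho_b}\cdot
          \frac{S^2}{nM}\bigl[1 + (M-1)\rho_y\bigr].
\end{align*}
```

For $\rho_b > 0$ the last expression increases in $\mu$, so under these assumptions it is smallest at $\mu = 0$, where it equals $2(1-\rho_b)S^2[1 + (M-1)\rho_y]/(nM)$.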

Efficiency of cluster sampling on two occasions
If the samples on both occasions are drawn using SRSWOR, the variance of the optimum estimator $\Delta$, neglecting the finite population correction factor, is given by
where $\rho$ is the simple correlation coefficient between values of units on the first and second occasions. The relative efficiency of $\Delta_1$ compared to $\Delta$ is
Cluster sampling on both occasions provides a more efficient estimate than simple random sampling on both occasions if
Further, $\Delta_1$ is more efficient than $\Delta$ if
which gives the upper limit of $M$. Tables 4-6 have been computed below to show the relative efficiency of cluster sampling on two occasions compared to simple random sampling of elements for some specified values of $\rho_y$, $\rho$, $\rho_b$, $M$ and $\mu$.
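The efficiency comparison for both the sum and the change can be explored numerically with a minimal sketch. It assumes $S_x = S_y$, $\rho_x = \rho_y$, large $N$ and equal total sample size $nM$ under both designs, so that the optimum variances are proportional to $(1 \pm \rho)/(1 \pm \mu\rho)$ for element sampling and to $[1 + (M-1)\rho_y](1 \pm \rho_b)/(1 \pm \mu\rho_b)$ for cluster sampling. These closed forms are a reconstruction under those assumptions rather than the expressions printed in the original, and the function names `relative_efficiency` and `max_cluster_size` are hypothetical.

```python
def relative_efficiency(rho, rho_b, rho_y, M, mu, target="sum"):
    """Relative efficiency of two-occasion cluster sampling versus SRS of elements.

    Sketch assuming S_x = S_y, rho_x = rho_y and large N.
    target="sum" uses the + sign in the variance factors, target="change" the - sign.
    """
    sign = {"sum": 1.0, "change": -1.0}[target]
    clustering = 1.0 + (M - 1) * rho_y                             # inflation from clustering
    srs_factor = (1.0 + sign * rho) / (1.0 + sign * mu * rho)      # element sampling
    clu_factor = (1.0 + sign * rho_b) / (1.0 + sign * mu * rho_b)  # cluster means
    return srs_factor / (clu_factor * clustering)


def max_cluster_size(rho, rho_b, rho_y, mu, target="sum"):
    """Value of M at which the relative efficiency equals one (requires rho_y > 0)."""
    if rho_y <= 0:
        raise ValueError("rho_y must be positive for an upper limit on M to exist")
    # With M = 1 the clustering factor is 1, leaving only the ratio of the two factors.
    ratio = relative_efficiency(rho, rho_b, rho_y, 1, mu, target)
    return 1.0 + (ratio - 1.0) / rho_y


# Illustrative inputs only, not taken from the paper's tables.
print(relative_efficiency(0.5, 0.2, 0.01, 2, 0.5))  # ≈ 1.0891
```

Cluster sampling is the more efficient design whenever the returned ratio exceeds one, which for positive $\rho_y$ holds for cluster sizes below `max_cluster_size(...)`.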

Conclusions
In sampling on two occasions, we have considered the estimation of the sum of mean and the change of mean for the current occasion when the sampling units are clusters and the observations on the first occasion are regarded as ancillary information for the observations on the second or current occasion. Under certain conditions, cluster sampling on two occasions is more efficient than simple random sampling on two occasions.
The results reveal that, for fixed $\rho_y$ (the intra-class correlation coefficient) and $\rho_b$ (the correlation coefficient between cluster means), the efficiency for the estimation of the sum of mean increases as $\rho$ (the simple correlation coefficient between values of units on the first and second occasions) increases substantially, with $\rho > \rho_b$, and the efficiency