Generalized P-phased Regression Estimators with Single and Two Auxiliary Variables

Multiphase sampling has not been widely utilized in ratio and regression estimation. In the present study we open a new dimension in survey sampling by proposing two generalized p-phase regression estimators, with single and two auxiliary variables, for estimating the population mean. The proposed estimators are the generalized p-phase cases of Hanif et al (2015) and Samiuddin and Hanif (2007) respectively, so both estimators that motivated this work are now special cases of our proposed estimators. We have derived unbiasedness and expressions of the Mean Square Error (MSE), along with families of estimators based upon the p-phase generalization. The expressions of MSE are derived in such a way that they can be used to obtain results for any desired phase. Through an empirical study of the proposed estimators we show many situations in which the MSE can be reduced by increasing the number of phases. Hence, our study opens a new horizon in the field of multiphase sampling, where many challenges remain to be resolved by proposing new estimators for more than two phases.


Introduction
Survey sampling is perhaps the oldest statistical procedure for obtaining accurate and useful estimates under prevailing constraints of time and money. The regression and ratio methods of estimation are its two strongest pillars. Many contributions have been made to ratio and regression estimation in the form of estimators with different structural and functional forms. Two-phase and multiphase sampling are used to estimate the population mean of a finite population under different cases of availability or non-availability of auxiliary information. In the present study we open a new dimension in survey sampling by proposing generalized p-phase regression estimators. The proposed estimators use multiphase sampling procedures for estimation and are an attempt to utilize information from phases beyond the second. Not only can we gather maximum information about the auxiliary variables, but in many cases the Mean Square Error (MSE) is also reduced.
Regression estimators have been used widely, for example by Srivastava (1967), Walsh (1970), Reddy (1973, 74), Gupta (1978), Sahai (1979) and Vos (1980). Attempts to present a regression estimator (RE) beyond the second phase are a new dimension in this context. Hanif et al (2015) presented two estimators in three-phase and four-phase sampling.
The optimizing constants of both estimators were identical, as under:-
The corresponding MSE of (1.8) and (1.9) were:-
MSE = \theta_3 S_y^2 (1 - \rho_{xy}^2) + \theta_1 \rho_{xy}^2 S_y^2    (1.10)
The corresponding expression of MSE of (3.7.2) is as under:-
Based upon the findings in the third and fourth phases, Hanif et al (2015) concluded that the MSE will tend to increase. Earlier, Samiuddin and Hanif (2007) considered the no information case (NIC) to present a regression estimator in two-phase sampling with two auxiliary variables, which was similar in structural formation to (1.8) and (1.9):-
Here, the coefficients were obtained through optimization of the expression of MSE. The expression of MSE of (1.12) was obtained as:-
The optimizing coefficients were:-

P-Phased Sampling Notations
The p-phase sampling plan is simply an extension of the two-phase sampling procedure. Under such a plan, successive subsamples of sizes n_1, n_2, ..., n_p are drawn for the first phase, the second phase and so on up to the p-th phase respectively. The information on the auxiliary and study variables is gathered in the same pattern as in the two-phase case.
The finite population correction factor (FPC) in p-phase simple random sampling without replacement (SRSWOR) will be indexed by phase, i = 1, 2, ..., p, and given as:-
The error term in each of the variables will be represented as under, with the assumption that the absolute errors of the sample means are very small compared with the absolute values of the corresponding population means:-
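The sampling plan and the FPC can be illustrated with a minimal sketch. It assumes the usual form theta_i = 1/n_i - 1/N for the FPC at phase i and that each phase is a subsample of the previous one; the symbol theta, the function names and the sample sizes are hypothetical choices for this illustration, since the defining equation is not reproduced above.

```python
import numpy as np

def fpc_factors(N, sample_sizes):
    """Return theta_i = 1/n_i - 1/N for each phase (assumed form of the FPC)."""
    n = np.asarray(sample_sizes, dtype=float)
    if np.any(n > N) or np.any(np.diff(n) >= 0):
        raise ValueError("need N >= n_1 > n_2 > ... > n_p")
    return 1.0 / n - 1.0 / N

def draw_nested_samples(N, sample_sizes, rng):
    """Draw nested SRSWOR index sets: each phase subsamples the previous phase."""
    current = np.arange(N)
    phases = []
    for n_i in sample_sizes:
        current = rng.choice(current, size=n_i, replace=False)
        phases.append(current)
    return phases

rng = np.random.default_rng(0)
sizes = [500, 300, 150, 60]          # hypothetical n_1 > n_2 > n_3 > n_4
theta = fpc_factors(10_000, sizes)   # increases from phase to phase
phases = draw_nested_samples(10_000, sizes, rng)
```

Under this assumed form the FPC grows as the sample size shrinks, so the difference between the FPCs of consecutive phases is positive whenever each phase uses a smaller sample than the one before it.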
Furthermore, the following results for expectations can be derived using (2.3):-
The expectations of the squares of the error terms will be as under:-
In addition to (2.4) and (2.5), the following results for expectations of cross products will also be utilized:-
Finally, the following results will also be used:-
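For reference, the standard expectation results for nested simple random samples drawn without replacement can be sketched as follows. The notation is assumed (the symbols e, \theta_i, S_y^2, S_x^2 and S_{xy} are labels chosen for this sketch), and the block is not a transcription of (2.3) to (2.7); the paper's own expressions may be arranged differently.

```latex
% Assumed notation: \bar{y}_i = \bar{Y} + e_{y_i}, \quad \bar{x}_i = \bar{X} + e_{x_i},
% \theta_i = 1/n_i - 1/N, with nested subsamples n_1 > n_2 > \dots > n_p.
E(e_{y_i}) = E(e_{x_i}) = 0

E(e_{y_i}^2) = \theta_i S_y^2, \qquad
E(e_{x_i}^2) = \theta_i S_x^2, \qquad
E(e_{y_i} e_{x_i}) = \theta_i S_{xy}

% Cross products between phases carry the factor of the earlier (larger) sample:
E(e_{x_i} e_{x_j}) = \theta_{\min(i,j)} S_x^2, \qquad
E(e_{y_i} e_{x_j}) = \theta_{\min(i,j)} S_{xy}

% Consequence for the difference of consecutive means (i \ge 2):
E\big[(e_{x_{i-1}} - e_{x_i})^2\big] = (\theta_i - \theta_{i-1}) S_x^2
```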

Proposed Estimators
We took motivation from Hanif et al (2015) and Samiuddin and Hanif (2007) to present the generalized p-phased estimators given below.

(a)-Proposed Estimator-I
Taking motivation from Hanif et al (2015) and generalizing their estimators (1.8) and (1.9), we propose the following generalized p-phased regression estimator with a single auxiliary variable:-

Unbiasedness and MSE of proposed Estimator-I
We now prove the unbiasedness of the proposed estimators (3.1) and (3.2) and derive their expressions of MSE.
Considering (3.1) and using (2.3) we get:-
Applying expectation and using (2.4) we obtain:-
From (4.2) we observe that (3.1) is an unbiased estimator of the population mean. Now, to derive the MSE of (3.1), consider (4.1), square both sides and apply expectations:-
Upon using (2.6) the third term on the R.H.S. of (4.3) will vanish, and with results (2.6) and (2.7) we obtain:-
The optimizing coefficients will be determined by minimization of (4.4). For this purpose we proceed as:-
Now putting:-
We get:-
Upon simplification:-
The expression (4.8) will turn out to be positive because the variance it contains is positive and the difference between the FPCs of consecutive phases is also positive for every phase from 2 to p under the condition presented in (2.2). Furthermore, because these are non-zero quantities, the second derivative test will not be inconclusive. Now, we substitute the optimized values obtained in (4.7) into (4.4) to get the final expression of MSE of (3.1).
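The minimization described above can be sketched in LaTeX under stated assumptions. The form of the estimator, the coefficient symbol \alpha_i and the FPC \theta_i = 1/n_i - 1/N are assumptions made for this illustration, since the displayed equations (3.1) and (4.1) to (4.8) are not reproduced here; the sketch is meant only to show why the optimizing coefficients come out identical across phases and why the second derivative is positive.

```latex
% Assumed form of the single-auxiliary p-phase estimator
t_{(p)} = \bar{y}_p + \sum_{i=2}^{p} \alpha_i \,(\bar{x}_{i-1} - \bar{x}_i)

% MSE before optimization; cross terms between different i vanish for nested samples
\mathrm{MSE}(t_{(p)}) = \theta_p S_y^2
  + \sum_{i=2}^{p} (\theta_i - \theta_{i-1})\,\big(\alpha_i^2 S_x^2 - 2\,\alpha_i S_{xy}\big)

% First-order condition for each coefficient
\frac{\partial\,\mathrm{MSE}(t_{(p)})}{\partial \alpha_i}
  = 2\,(\theta_i - \theta_{i-1})\,(\alpha_i S_x^2 - S_{xy}) = 0
  \;\Longrightarrow\; \alpha_i = \frac{S_{xy}}{S_x^2}, \quad i = 2, \dots, p

% Second-order condition: positive because S_x^2 > 0 and \theta_i > \theta_{i-1}
\frac{\partial^2\,\mathrm{MSE}(t_{(p)})}{\partial \alpha_i^2}
  = 2\,(\theta_i - \theta_{i-1})\,S_x^2 > 0
```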
Expressing (4.7) as (4.10) and using it in (4.9):-
On expanding the summation we get:-
Hence, (4.11) provides the generalized p-phased expression of MSE of the proposed estimator (3.1). From the expression (4.11) we observe that our proposed estimator is the generalized case of the Hanif et al (2015) estimators (1.8) and (1.9). Based upon (4.11), the proposed estimator (3.1) can be considered a generalized p-phased regression estimator with a single auxiliary variable, having a family of estimators for p ≥ 2.
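A small numerical sketch can show how a single p-phased expression yields the MSE for any desired phase. It assumes that the minimum MSE has the form theta_p * S_y^2 * (1 - rho^2) + theta_1 * rho^2 * S_y^2 with theta_i = 1/n_i - 1/N, which matches the three-phase pattern of (1.10); the function name and the input values are hypothetical.

```python
import numpy as np

def mse_estimator_one(N, sample_sizes, var_y, rho):
    """Minimum MSE of the single-auxiliary estimator for p = 2, ..., len(sample_sizes).

    Assumed form: theta_p * S_y^2 * (1 - rho^2) + theta_1 * rho^2 * S_y^2,
    with theta_i = 1/n_i - 1/N.
    """
    n = np.asarray(sample_sizes, dtype=float)
    theta = 1.0 / n - 1.0 / N
    mse = theta * var_y * (1.0 - rho**2) + theta[0] * rho**2 * var_y
    # Key the result by the number of phases p; the family starts at p = 2.
    return {p: float(mse[p - 1]) for p in range(2, len(n) + 1)}

# Hypothetical inputs, chosen only to exercise the function.
result = mse_estimator_one(N=1000, sample_sizes=[100, 50, 25], var_y=4.0, rho=0.5)
for p, value in result.items():
    print(f"p = {p}: MSE = {value:.3f}")
```

Only the FPC of the last phase and the FPC of the first phase enter this assumed expression, which is why changing the value of p is enough to move from one member of the family to another.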

Intra-Phase Comparison for Proposed Estimator-I
Considering (4.11) and replacing p by p − 1, we get:-
Now, comparing (4.11) and (4.1.1):-
Condition (4.1.2) suggests that the FPC should decrease at every subsequent phase, which contradicts the basic methodology of multiphase sampling as stated in (2.2). Therefore, the MSE will tend to increase with the number of phases.
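The comparison can be written out as a short worked step. This sketch assumes the MSE has the form \theta_p S_y^2 (1 - \rho_{xy}^2) + \theta_1 \rho_{xy}^2 S_y^2 with \theta_i = 1/n_i - 1/N, consistent with the three-phase expression (1.10); it is not a transcription of (4.1.1) or (4.1.2).

```latex
% Difference between the MSE at phase p and at phase p-1 (assumed form)
\mathrm{MSE}_{p} - \mathrm{MSE}_{p-1}
  = (\theta_p - \theta_{p-1})\, S_y^2\, (1 - \rho_{xy}^2)

% A reduction in MSE would require
\theta_p < \theta_{p-1}
  \;\Longleftrightarrow\; \frac{1}{n_p} < \frac{1}{n_{p-1}}
  \;\Longleftrightarrow\; n_p > n_{p-1}

% which is impossible when each phase is a subsample of the previous one (n_p < n_{p-1}),
% so the difference is non-negative and vanishes only when \rho_{xy}^2 = 1.
```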

Empirical Study of Proposed Estimator-I
Now, we consider the following population to observe the behavior of the MSE of proposed estimator-I.
We can observe that the MSE tends to increase phase by phase. At earlier phases the MSE shows a slow and steady increase, but a rapid increase is observed afterwards. The relative efficiency will always be greater than 100% because of the increasing pattern of MSE. Because the population is small, the curve in Figure 4.2.1 shows a sharp slope. Next, we simulate a population of size N = 200000 from a multivariate normal distribution. The first-phase sample size is taken as 3000, and a decrease of 100 at each phase is considered until p = 30. The results are reflected in Figure 4.2.2. The curve is very smooth and steady in slope, but still upward.
We can conclude that if we have a large population and considerably larger sample sizes, there is negligible danger of losing efficiency. In the current example we can easily observe that up to phase 25 the MSEs of successive phases are close to each other. Therefore, the proposed estimator (3.1) is a fair choice in a sufficiently large population.
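The large-population experiment described above can be approximated with the following sketch. The population size, first-phase sample size, step of 100 and 30 phases follow the text; the means, unit variances, correlation of 0.7, seed and the assumed MSE form theta_p * S_y^2 * (1 - rho^2) + theta_1 * rho^2 * S_y^2 are hypothetical choices for illustration and are not the settings behind Figure 4.2.2.

```python
import numpy as np

rng = np.random.default_rng(2024)
N, P = 200_000, 30
rho_assumed = 0.7                      # hypothetical correlation between y and x
cov = [[1.0, rho_assumed], [rho_assumed, 1.0]]
population = rng.multivariate_normal([50.0, 30.0], cov, size=N)
y, x = population[:, 0], population[:, 1]

var_y = y.var(ddof=1)                  # population S_y^2
rho = np.corrcoef(y, x)[0, 1]          # realized population correlation

n = 3000 - 100 * np.arange(P)          # 3000, 2900, ..., 100
theta = 1.0 / n - 1.0 / N              # assumed FPC at each phase
mse = theta * var_y * (1.0 - rho**2) + theta[0] * rho**2 * var_y

# Relative efficiency of phase p versus phase 2, in percent (MSE_p / MSE_2 * 100).
re_vs_phase_2 = 100.0 * mse[1:] / mse[1]
for p in (2, 10, 25, 30):
    print(f"p = {p:2d}, n_p = {n[p - 1]:4d}, MSE = {mse[p - 1]:.6f}")
```

Plotting `mse[1:]` against the phase number gives an upward curve under these assumptions, because every additional phase uses a smaller sample and therefore a larger FPC.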
Based upon the mathematical findings and the empirical studies, we can conclude that the MSE of (3.1) will tend to increase as the number of phases increases. Such a statement is justified because we have carried out a generalized p-phased analysis of the general case of Hanif et al (2015). They utilized only the third and fourth phases to conclude that the MSE will increase, whereas we not only conducted an intra-phase mathematical comparison but also supported our conclusion with an empirical study allowing a flexible choice of phases.
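The unbiasedness and MSE behavior of a p-phase regression estimator can also be checked by Monte Carlo. The sketch below is a minimal illustration under stated assumptions: the estimator is taken as y_bar_p + beta * sum_{i=2..p} (x_bar_{i-1} - x_bar_i) with the population regression coefficient beta treated as known, the theoretical MSE is taken as theta_p * S_y^2 * (1 - rho^2) + theta_1 * rho^2 * S_y^2, and the population, sample sizes and number of replications are hypothetical.

```python
import numpy as np

def simulate_estimator_one(y, x, sample_sizes, reps, rng):
    """Monte Carlo draws of the assumed p-phase regression estimator."""
    N = len(y)
    beta = np.cov(y, x, ddof=1)[0, 1] / x.var(ddof=1)   # S_xy / S_x^2
    estimates = np.empty(reps)
    for r in range(reps):
        idx = np.arange(N)
        x_means = []
        for n_i in sample_sizes:                        # nested SRSWOR subsamples
            idx = rng.choice(idx, size=n_i, replace=False)
            x_means.append(x[idx].mean())
        y_last = y[idx].mean()                          # study variable at phase p
        adjustment = beta * sum(a - b for a, b in zip(x_means, x_means[1:]))
        estimates[r] = y_last + adjustment
    return estimates

rng = np.random.default_rng(1)
N, sizes, reps = 2000, [400, 200, 100], 2000            # hypothetical settings
pop = rng.multivariate_normal([50.0, 30.0], [[4.0, 1.4], [1.4, 1.0]], size=N)
y, x = pop[:, 0], pop[:, 1]

est = simulate_estimator_one(y, x, sizes, reps, rng)
theta = 1.0 / np.asarray(sizes, dtype=float) - 1.0 / N
rho = np.corrcoef(y, x)[0, 1]
theory = theta[-1] * y.var(ddof=1) * (1 - rho**2) + theta[0] * rho**2 * y.var(ddof=1)
empirical = np.mean((est - y.mean()) ** 2)
print(f"bias = {est.mean() - y.mean():.5f}, empirical MSE = {empirical:.5f}, assumed MSE = {theory:.5f}")
```

Repeating the run with a longer list of sample sizes gives a direct, simulation-based view of how the MSE changes when further phases are added.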

Unbiasedness and MSE of Proposed Estimator-II
First we derive the unbiasedness of (3.2). For this, use (2.3) in (3.2) and apply expectations on both sides:-
Squaring both sides and applying expectations:-
The last term on the R.H.S. of (5.1.3) can be expanded as:-
For i ≠ j we have two possibilities. For i < j, using results (2.5) and (2.6):-
Similarly, for i > j.

Empirical Study for Proposed Estimator-II
To perform the empirical study on proposed estimator-II we use different combinations of correlations between the variables to demonstrate the pattern of MSE. Since we have two auxiliary variables and one study variable, there are eight different combinations of correlations with respect to their signs. These combinations are presented in Table 5.2.1 for moderate and low correlations. The graphical pattern of both cases is shown in Figure 5.2.1. It is evident that for case (ii) the MSE is reduced as the number of phases increases.
As we have just shown, only two types of results are produced for all eight combinations of correlations; therefore, in Table 5.2.3 we present the intra-phase relative efficiencies and the relative efficiencies with reference to the second phase. For case (i) all the reported figures are above 100, which means that there is an increasing pattern in MSE: the efficiency of the estimator reduces as the phase increases, although there is not much difference as we move from one phase to the next. For case (ii) both types of efficiencies are less than 100 up to phase four. A relative efficiency of less than 100 means that the performance of the estimator improves with each additional phase. For example, at phase four in case (ii) the relative efficiency of the current phase versus the second phase is 33.05, which means that the MSE at phase four is just 33% of the MSE produced by phase two. Similarly, the intra-phase efficiency for case (ii) is also better, as the MSE decreases with increasing phase.
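The two kinds of relative efficiency discussed above can be computed as in the following sketch. It follows the convention stated in the text, in which a value below 100 means the MSE at the current phase is smaller than the reference MSE; the function name and the MSE values in the example are hypothetical and are not taken from Table 5.2.3.

```python
def relative_efficiencies(mse_by_phase):
    """Return, for each phase after the second, two percentages:
    'vs_phase_2'  = 100 * MSE_p / MSE_2      (second-phase reference)
    'intra_phase' = 100 * MSE_p / MSE_{p-1}  (previous-phase reference)
    """
    phases = sorted(mse_by_phase)
    if phases[0] != 2:
        raise ValueError("the reference phase 2 must be present")
    base = mse_by_phase[2]
    result = {}
    for previous, current in zip(phases, phases[1:]):
        result[current] = {
            "vs_phase_2": 100.0 * mse_by_phase[current] / base,
            "intra_phase": 100.0 * mse_by_phase[current] / mse_by_phase[previous],
        }
    return result

# Hypothetical MSE values for phases 2, 3 and 4, for illustration only.
example = relative_efficiencies({2: 4.0, 3: 2.0, 4: 1.0})
print(example[4])
```

In this made-up example the MSE halves at every phase, so the intra-phase figure stays at 50 while the figure relative to phase two keeps falling.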
Now, we consider high correlations between the variables and observe what pattern the MSE displays. For this purpose we consider only two combinations, i.e. all positive and all negative. Consider Table 5.2.4.
We have considered four different combinations on the basis of the degree of correlation between the auxiliary variables. Each of high positive and high negative correlation between the study and auxiliary variables is combined with corresponding high and low correlation between the auxiliaries. On the basis of these four cases we computed the MSE of the proposed estimator, which is presented in Table 5.2.5.
From Table 5.2.5 we can observe for combinations of correlation at serial no 1 in Table 7
As a final comment, when a high positive or high negative correlation between the study variable and the auxiliaries is combined with a low positive or negative correlation between the auxiliaries, our proposed estimator will produce better results with increasing phase.
Having discussed the performance of proposed estimator (3.2) at combinations of high correlations, we now examine its performance at low correlations. For this task consider Table 5.2.7.
We observe that for the first three combinations both types of efficiencies are more than 100, which means that the relative efficiency of the estimator declines with increasing phase. For the fourth combination, however, the relative efficiencies are less than 100. In fact, with the increase in phases the MSE rapidly declines and the performance of the estimator improves.

Conclusions and Recommendations
On the grounds of the mathematical results, mathematical comparisons, constructed families and empirical studies of proposed estimators (3.1) and (3.2), we can draw the following conclusions:

1. Our proposed estimators are generalized p-phased estimators, which provide the flexibility to go up to any phase of sampling. Furthermore, for every desired phase we do not have to construct the mathematical expressions from scratch. We have ready-made expressions of MSE and just need to substitute the desired value of p.

2. The proposed estimator-I is generalized p-phased, and the estimators by Hanif et al (2015) are now special cases of the proposed estimator-I for p = 3 and p = 4 respectively.

3. The proposed estimator-II is also generalized p-phased, and the estimators by Samiuddin and Hanif (2007) and Hanif et al (2015), as well as the proposed estimator-I, are now special cases of the proposed estimator-II under different conditions on its parameters.

4. Based upon the results of the empirical study conducted for proposed estimator-I, we can conclude that the MSE of the estimator will have an increasing trend with increasing phases. This conclusion is based upon generalized results, in contrast to the same conclusion drawn by Hanif et al (2015), who utilized only the third and fourth phases.

5. In the case of a large population we observed that the MSEs of (3.1) at successive phases are very close to each other. Therefore, the loss in efficiency will be negligible if we wish to go beyond the second phase. In this way we can get maximum information out of the samples, and the desired principle of repetition can also be achieved under NIC.

6. The empirical study for proposed estimator-II revealed that for all eight possible combinations of correlations between the variables only two types of results of MSE are produced. This is because of the structural formulation of the expression of MSE of (3.2) presented in (5.1.27).

7. For moderate and low correlations the MSE of (3.2) behaves in opposite ways in the two cases. In the all-positive case the MSE tends to increase with the number of phases, whereas in the second case it has a decreasing pattern. Hence, we can conclude that estimator (3.2) will be useful under the situation of case (ii): it will not only reduce the MSE but also enhance the efficiency of the estimation.

8. For the other combinations of correlations we may conclude that estimator (3.2) will perform better, by reducing MSE and increasing efficiency, when there is (i) high positive correlation between the study variable and each auxiliary variable and low positive correlation between the auxiliaries; (ii) high positive correlation between the study variable and each auxiliary variable and low negative correlation between the auxiliaries; (iii) high negative correlation between the study variable and each auxiliary variable and low positive correlation between the auxiliaries. In all other cases there is a smooth, steady and slow increase in MSE per phase.