Preliminary Test Estimators and Phi-divergence Measures in Pooling Binomial Data

Two independent random samples are drawn from two binomial populations with parameters θ₁ and θ₂, respectively. Ahmed (1991) considered a preliminary test estimator, based on the maximum likelihood estimator, for estimating θ₁ when it is suspected that θ₁ = θ₂. In this paper we combine minimum phi-divergence estimators as well as phi-divergence test statistics in order to define preliminary phi-divergence test estimators. These new estimators are compared with the classical estimator as well as the pooled estimator.


Introduction
Pooling data is an old classical problem that has been studied by many authors in different contexts. If we have a random sample of size n from a random variable X and a random sample of size m from a random variable Y (the distributions of X and Y belong to the same family of probability distributions), how shall we estimate the mean E(X) of the random variable X? Shall we merely take the sample mean of the random sample from X, or shall we attempt to combine the means of the two random samples in order to improve our estimation of E(X) in some sense? From a historical point of view, Mosteller (1948) presented for the first time the analysis of pooling univariate normal data with known variance. Kale and Bancroft (1967) extended the study to discrete data by using proper transformations. For unknown variances the problem was studied by Han and Bancroft (1968). The problem for multivariate normal data was studied by Han and Bancroft (1970) with known covariance matrix and by Ahmed (1992) with unknown covariance matrix. Other interesting papers on pooling data are Ahmed (1993, 1997), Ahmed et al. (1989, 1997, 1999), Mehta and Srinivasan (1971), Raghunandanan (1978) and references therein.
It is advantageous to use a linear combination of the two sample means, weighted by their sample sizes, if E(X) coincides with E(Y), i.e., to use the restricted estimator. In many situations it is not clear whether E(X) = E(Y). In order to tackle this uncertainty we can perform a preliminary test and then choose between the restricted and the unrestricted estimator. This line of thought was first proposed by Bancroft (1944). For a wide study of preliminary test estimators in different statistical problems see Saleh (2006) and references therein.
In this paper we focus on the problem of pooling proportions of two independent random samples taken from two possibly identical binomial distributions (see Ahmed (1991)). This author considered a preliminary test based on the restricted maximum likelihood estimator and the classical Pearson test statistic. In this paper, instead of considering the restricted maximum likelihood estimator we shall consider the restricted minimum phi-divergence estimator, and instead of Pearson's test statistic a family of phi-divergence test statistics. Therefore in this paper we introduce a family of preliminary test estimators for the problem of pooling binomial data that contains, as a particular case, the preliminary test estimator considered by Ahmed.
The behaviour of phi-divergence measures in the definition of preliminary test estimators can be seen in Menéndez et al. (2008, 2011), Pardo and Martín (2011) and references therein.
Section 2 is devoted to introducing the family of preliminary test estimators considered in this paper, and in Section 3 we obtain some asymptotic distributional results that are necessary in the next section. Finally, Section 4 is devoted to obtaining the asymptotic bias as well as the asymptotic mean squared errors of the family of estimators introduced in the paper.

Estimation strategies in pooling binomial data based on phi-divergence measures
Let X₁, ..., Xₙ and Y₁, ..., Y_m be two independent random samples of sizes n and m from two Bernoulli random variables with parameters θ₁ and θ₂, respectively. The main problem in which we are interested is the estimation of θ₁ when we suspect that θ₁ = θ₂. The maximum likelihood estimator (MLE) of θ₁ is θ̂₁ = y₁/n (y₁ = number of successes associated with the random sample X₁, ..., Xₙ), and the MLE of θ₁ (restricted maximum likelihood estimator) under the assumption that θ₁ = θ₂ is given by θ̃₁ = (y₁ + y₂)/(n + m) (y₂ = number of successes associated with the random sample Y₁, ..., Y_m). We consider the two following probability vectors:

p̂ = (n/N · y₁/n, n/N · (n − y₁)/n, m/N · y₂/m, m/N · (m − y₂)/m)

and

p(θ) = (n/N · θ, n/N · (1 − θ), m/N · θ, m/N · (1 − θ)), with N = n + m.
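As a small illustrative sketch (our own, not from the paper; the example counts y₁ = 20, n = 50, y₂ = 33, m = 60 are hypothetical), the unrestricted and restricted MLEs can be computed directly:

```python
# Unrestricted MLE of theta_1 and restricted (pooled) MLE under theta_1 = theta_2.
# y1, y2 are success counts; n, m the sample sizes.

def mle_unrestricted(y1, n):
    """MLE of theta_1 based on the first sample alone: y1 / n."""
    return y1 / n

def mle_restricted(y1, n, y2, m):
    """Pooled MLE of the common value theta_1 = theta_2: (y1 + y2) / (n + m)."""
    return (y1 + y2) / (n + m)

print(mle_unrestricted(20, 50))        # 0.4
print(mle_restricted(20, 50, 33, 60))  # 53/110, approximately 0.4818
```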
It is not difficult to see that

θ̃₁ = arg min_{θ ∈ (0,1)} D_{Kullback}(p̂, p(θ)).    (4)

Based on equation (4), if we replace the Kullback-Leibler divergence measure by a more general divergence measure we obtain a new estimator. In the following we shall consider the family of phi-divergence measures defined, in the model under consideration, by

D_φ(p̂, p(θ)) = Σ_{i=1}^{4} p_i(θ) φ(p̂_i / p_i(θ)),    (5)

where φ belongs to the class Φ* of convex functions φ(x), x ≥ 0, satisfying φ(1) = 0 and φ''(1) > 0.


For more details about phi-divergence measures see Pardo (2006). Based on the phi-divergence measures defined in (5), we consider in this paper the restricted family of minimum phi-divergence estimators defined as

θ̃₁^φ = arg min_{θ ∈ (0,1)} D_φ(p̂, p(θ)).    (6)

If we consider in (6) the function φ(x) = x log x − x + 1, we obtain the restricted maximum likelihood estimator (MLE); therefore the restricted MLE can be obtained as a special case of the restricted minimum phi-divergence estimator, or we can say that the restricted minimum phi-divergence estimator is a natural extension of the restricted MLE.
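The special case above can be checked numerically. The following sketch is our own (the grid search, function names, and example counts are assumptions, and the probability vectors use the weights n/N and m/N from the construction above): minimizing the Kullback-Leibler divergence over θ recovers the pooled estimator (y₁ + y₂)/(n + m).

```python
import numpy as np

def p_hat(y1, n, y2, m):
    # Empirical probability vector: sample proportions weighted by n/N and m/N.
    N = n + m
    return np.array([n/N * y1/n, n/N * (n - y1)/n, m/N * y2/m, m/N * (m - y2)/m])

def p_theta(theta, n, m):
    # Model probability vector under the restriction theta_1 = theta_2 = theta.
    N = n + m
    return np.array([n/N * theta, n/N * (1 - theta), m/N * theta, m/N * (1 - theta)])

def kullback_leibler(p, q):
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def min_phi_estimator(y1, n, y2, m, divergence):
    # Restricted minimum phi-divergence estimator by grid search over theta.
    grid = np.linspace(1e-4, 1 - 1e-4, 20000)
    p = p_hat(y1, n, y2, m)
    values = [divergence(p, p_theta(t, n, m)) for t in grid]
    return grid[int(np.argmin(values))]

theta_kl = min_phi_estimator(20, 50, 33, 60, kullback_leibler)
print(theta_kl)  # close to (20 + 33) / 110, approximately 0.4818
```

In practice the minimization would be done with a root finder on the estimating equation; the grid search here just makes the special case visible.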
From a practical point of view, the restricted minimum phi-divergence estimator can be obtained as a solution of the equation ∂D_φ(p̂, p(θ))/∂θ = 0. If we consider the power divergence measures introduced and studied by Cressie and Read (1984), obtained from (5) with the family of functions φ_λ(x) = (λ(λ + 1))⁻¹ (x^{λ+1} − x − λ(x − 1)), λ ≠ 0, −1, we obtain the corresponding minimum power divergence estimator. On the other hand, the classical test statistic for testing

H₀: θ₁ = θ₂    (7)

is

Z_N = (nm/N) (θ̂₁ − θ̂₂)² / (θ̃₁(1 − θ̃₁)),    (8)

with θ̂₂ = y₂/m. Its asymptotic distribution under H₀ is chi-squared with one degree of freedom.
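The classical statistic is Pearson's chi-square for comparing two proportions. A hedged sketch (our own example counts; the two-sample form Z_N = (nm/N)(θ̂₁ − θ̂₂)²/(θ̃₁(1 − θ̃₁)) is assumed) shows it coincides with Pearson's X² on the corresponding 2×2 table:

```python
def z_statistic(y1, n, y2, m):
    # Two-sample statistic for H0: theta_1 = theta_2, using the pooled estimate.
    N = n + m
    t1, t2 = y1 / n, y2 / m
    pooled = (y1 + y2) / N
    return (n * m / N) * (t1 - t2) ** 2 / (pooled * (1 - pooled))

def pearson_2x2(y1, n, y2, m):
    # Pearson's X^2 on the 2x2 table of successes/failures by sample.
    N = n + m
    a, b, c, d = y1, n - y1, y2, m - y2
    return N * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

z = z_statistic(20, 50, 33, 60)
x2 = pearson_2x2(20, 50, 33, 60)
print(z, x2)  # identical values, approximately 2.458
```

Both values sit below the 5% chi-square(1) critical value 3.841, so in this example the preliminary test would not reject H₀.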
If we consider the function φ(x) = (x − 1)²/2, we can see that the test statistic Z_N can be written as Z_N = (2N/φ''(1)) D_φ(p̂, p(θ̃₁)). A first extension of (8) is obtained if we consider a general function φ₁ ∈ Φ*, and a more general extension if we consider the minimum φ₂-divergence estimator θ̃₁^{φ₂} instead of θ̃₁. In this case we have a new family of phi-divergence test statistics defined by

T_N^{φ₁,φ₂} = (2N/φ₁''(1)) D_{φ₁}(p̂, p(θ̃₁^{φ₂})).

It is well known that θ̃₁ has a smaller asymptotic risk under quadratic loss than θ̂₁ when θ₁ = θ₂ holds, but as θ₂ moves away from θ₁, θ̃₁ may be both asymptotically biased and inefficient, while the performance of θ̂₁ remains constant over such a departure. For this reason Ahmed (1991) developed an estimator, a combination of θ̂₁ and θ̃₁, that is less sensitive to departures from H₀: θ₁ = θ₂ because it incorporates a preliminary test of the null hypothesis H₀.
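The divergence form of the classical statistic can be verified numerically. This sketch is our own (example counts and the probability-vector construction with weights n/N and m/N are assumptions): with φ(x) = (x − 1)²/2, the quantity 2N·D_φ(p̂, p(θ̃₁)) reproduces the two-sample Pearson statistic exactly.

```python
import numpy as np

def phi_pearson(x):
    # phi(x) = (x - 1)^2 / 2, for which phi''(1) = 1.
    return 0.5 * (x - 1) ** 2

def divergence_statistic(y1, n, y2, m, phi):
    # T = 2N * D_phi(p_hat, p(theta_tilde)), theta_tilde the pooled estimator.
    N = n + m
    theta = (y1 + y2) / N
    p_hat = np.array([n/N * y1/n, n/N * (n - y1)/n, m/N * y2/m, m/N * (m - y2)/m])
    p_mod = np.array([n/N * theta, n/N * (1 - theta), m/N * theta, m/N * (1 - theta)])
    return 2 * N * np.sum(p_mod * phi(p_hat / p_mod))

def classical_z(y1, n, y2, m):
    # Two-sample Pearson statistic written in terms of proportions.
    N = n + m
    pooled = (y1 + y2) / N
    return (n * m / N) * (y1/n - y2/m) ** 2 / (pooled * (1 - pooled))

t = divergence_statistic(20, 50, 33, 60, phi_pearson)
print(t, classical_z(20, 50, 33, 60))  # both approximately 2.458
```

Swapping `phi_pearson` for another member of Φ* gives the extended family of test statistics, evaluated at the pooled (or minimum φ₂-divergence) estimator.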
This estimator was termed the preliminary test estimator and is defined as

θ̂₁^{pre} = θ̃₁ 1_{(0, χ²_{1,α}]}(Z_N) + θ̂₁ 1_{(χ²_{1,α}, ∞)}(Z_N),

where Z_N was introduced in (8) and χ²_{1,α} denotes the upper α-percentage point of the chi-square distribution with one degree of freedom.
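The decision rule can be sketched in a few lines (our own illustration; the 5% critical value 3.841 for chi-square with one degree of freedom and the example counts are assumptions):

```python
def preliminary_test_estimator(y1, n, y2, m, critical=3.841):
    # Pooled estimate if the preliminary test does not reject H0: theta_1 = theta_2,
    # the single-sample proportion y1/n otherwise.
    N = n + m
    pooled = (y1 + y2) / N
    z = (n * m / N) * (y1/n - y2/m) ** 2 / (pooled * (1 - pooled))
    return pooled if z <= critical else y1 / n

print(preliminary_test_estimator(20, 50, 33, 60))  # Z_N below 3.841 -> pooled, ~0.4818
print(preliminary_test_estimator(10, 50, 45, 60))  # Z_N large -> unpooled, 0.2
```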

Asymptotic distributional results
Our problem is the estimation of θ₁ when it is suspected, but one is not sure, that θ₁ = θ₂. To study the behaviour of the preliminary test estimator, we must pay special attention to the case in which θ₂ is close to θ₁.
For this reason we are going to assume the contiguous alternative hypotheses H_{1,N}: θ₂ = θ₁ + δ N^{-1/2}, δ ∈ ℝ. The null hypothesis H₀: θ₁ = θ₂ given in (7) can be written in an equivalent matrix form, where by I_{2×2} we denote the identity matrix of order 2.

Therefore, by (11) and (12), we obtain the expansions needed below. In the following theorem we present the asymptotic distributional results that are necessary in the following sections.

Theorem 1. Under the contiguous alternative hypotheses H_{1,N}, part d) gives the asymptotic distribution of the phi-divergence test statistic T_N^{φ₁,φ₂}, and the limiting random variables involved are asymptotically independent. Based on (12) and (13), and from (12) and (14), taking into account that the asymptotic variance involved is θ₁(1 − θ₁), the statistic T_N^{φ₁,φ₂} is asymptotically distributed as a noncentral chi-square with one degree of freedom and noncentrality parameter Δ = δ²ν(1 − ν)/(θ₁(1 − θ₁)), with ν = lim_{N→∞} n/N.

  
Asymptotic bias and asymptotic mean squared errors of the estimators

Let θ₁* be an estimator of θ₁ and F the asymptotic distribution function of the random variable √N(θ₁* − θ₁). We understand by the asymptotic bias of θ₁* the quantity B(θ₁*) = ∫ x dF(x). In the following theorem we obtain the asymptotic bias of the preliminary test estimator θ̂₁^{pre}. In its expression, G₃(·; Δ) represents the distribution function of a noncentral chi-square random variable with three degrees of freedom and noncentrality parameter Δ, and the bias can be written in terms of a normal random variable Y with mean Δ^{1/2} and variance 1.

 
We can write the required expectation using the following result: if Z is a normal random variable with mean Δ^{1/2} and variance 1, then E[Z 1_{(0,c)}(Z²)] = Δ^{1/2} G₃(c; Δ). For more details see Judge and Bock (1978) or Saleh (2006). Now let θ₁* be an estimator of θ₁ and F the asymptotic distribution function of √N(θ₁* − θ₁). We understand by the asymptotic mean squared error of θ₁* the quantity M(θ₁*) = ∫ x² dF(x). In the following theorem we obtain the asymptotic mean squared error of θ̂₁^{pre}. Proof. Parts a) and b) are immediate on the basis of (14) and (15). We denote by W a normal random variable with mean Δ^{1/2} and variance 1. Based on this result, and using (14), we have the following decomposition.
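The Judge and Bock (1978) moment identity invoked here, namely E[Z·1(Z² ≤ c)] = μ·P(χ²₃(μ²) ≤ c) for Z ~ N(μ, 1), can be checked by simulation. This sketch is our own (the values μ = 0.5 and c = 3.841 are arbitrary); the noncentral chi-square with three degrees of freedom is sampled directly as a sum of squared normals.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, c, reps = 0.5, 3.841, 1_000_000

z = rng.normal(mu, 1.0, reps)
lhs = np.mean(z * (z**2 <= c))   # Monte Carlo estimate of E[Z * 1(Z^2 <= c)]

# Noncentral chi-square, 3 df, noncentrality mu^2: N(mu,1)^2 + N(0,1)^2 + N(0,1)^2.
chi3 = rng.normal(mu, 1.0, reps)**2 + rng.normal(0, 1, reps)**2 + rng.normal(0, 1, reps)**2
rhs = mu * np.mean(chi3 <= c)    # mu * P(chi2_3(mu^2) <= c)

print(lhs, rhs)  # the two estimates agree to Monte Carlo accuracy
```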

M(θ̂₁^{pre}) = E[(U + W 1_{(0,c)}(Y²))²] = E[U²] + E[W² 1_{(0,c)}(Y²)] + 2 E[U W 1_{(0,c)}(Y²)].
We are going to obtain the expression of E[W² 1_{(0,c)}(Y²)]. Then we get the stated result by applying the following: if Z is a normal random variable with mean Δ^{1/2} and variance 1, then Z² is distributed as a noncentral chi-square random variable with noncentrality parameter Δ and one degree of freedom. For more details see Pardo (2006).

AIx 1 Z
we are denoting the indicator function taking the value 1 if x A  and 0 if x A In this paper we consider the minimum 2  -divergence estimator  2 Then we shall consider the preliminary phi-divergence test estimator based on  By Pardo J. A. et al. (2003) (see also page 246 in Pardo 2006) we have, denoting  Parts a) and b) follow by previous Theorem.We observe that  1 2