Circular Functional Relationship Model with Wrapped Cauchy Errors

This paper extends the simple linear regression model with wrapped Cauchy errors to the functional case, in which both variables are subject to wrapped Cauchy errors. The maximum likelihood estimates are obtained assuming that the ratio between the two error variances is known and that the slope parameter equals one. Closed-form expressions for the maximum likelihood estimators are not available, so the estimates are obtained iteratively from suitable initial values. The quality of the estimates and the accuracy of the model are assessed via simulation; the results show acceptable performance, with estimators that are unbiased, consistent and robust. The sampling variances of the model parameters are obtained via bootstrapping, and confidence intervals are constructed from them. The proposed model is illustrated with an analysis of wind direction data from two cities in the Gaza Strip, Palestine.


Introduction
The functional relationship model belongs to the general class of errors-in-variables models (EIVM), also known as measurement error or random regression models. The study of EIVM was first explored by Adcock (1877, 1878), and Kendall (1951, 1952) later formally distinguished between functional and structural relationships between two variables. Ordinary regression supposes that the explanatory variables are measured without error and that all errors lie in the response variable, whereas the EIVM assumes that errors are present in both the response and the explanatory variables. Hence, there is no distinction between the response and explanatory variables.
The extension of EIVM to the case where both variables are circular has not received much attention. The unreplicated linear functional relationship for two circular variables was first introduced by Hussin (1997), assuming that the errors of both variables are independently distributed as von Mises with equal variances.
Later, Hussin (2003) improved the model by estimating the concentration parameters of the errors for any ratio of the error concentration parameters, using the asymptotic properties of the Bessel function. Hussin (2005) then extended the model to replicated observations and considered the Fisher information matrix of the model parameters. Caires and Wyatt (2003) proposed the simple linear circular functional relationship model in which the slope parameter is fixed at one and the errors are assumed to follow von Mises distributions for any such ratio. Hussin et al. (2010) derived the variances of the model parameters from the asymptotic covariance matrix and developed an outlier detection procedure. Satari et al. (2014) developed a new functional relationship model for circular variables by extending the circular-circular regression model of Downs and Mardia (2002).
The heavy-tailed property of the wrapped Cauchy distribution motivated Abuzaid and Allahham (2015) to propose a simple linear regression model for circular variables in which the circular random error has a wrapped Cauchy distribution with circular mean 0 and a given concentration parameter. The parameter estimates were obtained by maximizing the log-likelihood function via an iterative procedure. The results show that their model is more robust to deviations from the model assumptions than that of Hussin et al. (2004).
The rest of this paper is organized as follows. Section 2 formulates the proposed model and derives the parameter estimators. Section 3 reports extensive simulation studies that investigate the properties of the estimators. Section 4 applies the proposed model to wind direction data from Palestine.

Circular Functional Relationship Model with Wrapped Cauchy Errors
The robustness of the circular regression model proposed by Abuzaid and Allahham (2015) motivates its extension to the functional case, where the relationship is linear. Such a situation arises in the calibration between two instruments for measuring circular variables. The ratio of the error concentration parameters is assumed to be known for the proposed circular functional relationship model.
In the context of functional relationship models there are strong reasons for fixing the slope parameter at one. The first is the desired symmetry of the functional relationship model: the conclusions should be independent of which quantity is chosen to be X or Y. Furthermore, the function does not vary continuously when going from $2\pi$ to 0; for further discussion see Caires and Wyatt (2003).

Parameter estimation:
The probability density function of the error in (1) is given by Equation (2). By introducing a re-parameterization, the probability density function in (2) can be written in the form of Equation (3), and the errors in model (1) can then be formulated as in Equation (4). The log-likelihood function for model (1) is given by Equation (5).
Assuming equality of the error concentration parameters, i.e. a ratio equal to one, there are (n + 2) parameters to be estimated. The maximum likelihood estimates of the two parameters introduced by the re-parameterization are obtained by differentiating the log-likelihood function in (5) with respect to each of them and equating the derivatives to zero, which yields Equations (6) and (7).
a) Estimation of the intercept parameter:
Combining Equations (6) and (7) with the definitions of the two re-parameterized quantities gives a tangent relation, from which the estimate is obtained.

b) Estimation of the concentration parameter:
The estimate can be obtained after straightforward algebra. The maximum likelihood estimates of the corresponding re-parameterized quantities are again obtained by differentiating the log-likelihood function in (5) with respect to each of them and equating the derivatives to zero, which yields Equations (11) and (12).
The first derivative of the log-likelihood function in (5) with respect to $X_i$ is then derived, and the resulting equation can be solved iteratively given some initial guesses for $X_i$. Suppose $\hat{X}_{i0}$ is an initial estimate of $\hat{X}_i$. For a small correction to this initial estimate, the cosine of the correction is approximately 1 and its sine is approximately equal to the correction itself, so Equations (15), (16) and (17) reduce to first-order approximations in the correction and Equation (14) is simplified accordingly. The iteration is started from suitable initial guesses.
The iterative reweighting algorithm for maximum likelihood estimation proceeds step by step; its earlier steps rely on Equations (8) and (21), respectively, and it ends with:
Step 4: Obtain the estimates of the intercept and concentration parameters by solving Equations (9) and (10), respectively.
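As a minimal sketch of the quantity being maximized, the following Python code evaluates the standard wrapped Cauchy log-density and a log-likelihood for a functional model with slope one. It is an illustration under stated assumptions, not the authors' implementation: it assumes the usual functional form in which the observed x and y equal the true X and intercept + X plus independent wrapped Cauchy errors with circular mean 0 and a common concentration, and the names `wrapped_cauchy_logpdf`, `log_likelihood`, `alpha` and `rho` are hypothetical.

```python
import math


def wrapped_cauchy_logpdf(theta, rho):
    """Log-density of a wrapped Cauchy angle with circular mean 0.

    rho is the concentration parameter and must satisfy 0 <= rho < 1.
    """
    return (
        math.log(1.0 - rho ** 2)
        - math.log(2.0 * math.pi)
        - math.log(1.0 + rho ** 2 - 2.0 * rho * math.cos(theta))
    )


def log_likelihood(x, y, X_true, alpha, rho):
    """Log-likelihood of observed angles (x, y) given true angles X_true.

    Assumed model (slope fixed at one, equal error concentrations):
        x_i = X_i + error,  y_i = alpha + X_i + error  (mod 2*pi).
    """
    total = 0.0
    for xi, yi, Xi in zip(x, y, X_true):
        total += wrapped_cauchy_logpdf(xi - Xi, rho)          # error in x
        total += wrapped_cauchy_logpdf(yi - alpha - Xi, rho)  # error in y
    return total
```

Because the density depends on the residuals only through their cosine, the angles do not need to be reduced modulo 2π before the function is called.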

Asymptotic variance of circular functional relationship model parameters:
It is difficult to derive the sampling variance from the expectation of the second partial derivatives of the log-likelihood function of the proposed model in (1). Therefore, the bootstrap method (Chernick 1999) is used, as explained below. Consider n pairs of circular observations $(x_1, y_1), \ldots, (x_n, y_n)$ of two circular variables X and Y with a linear relationship.
Step 1: Select m pairs from the n observations.
Step 2: For the selected m observations, obtain the estimates of the intercept and concentration parameters and label them with the index (1).
Step 3: Repeat Step 1 and Step 2 B times.
Step 4: Obtain the variances of the two parameter estimates from the B bootstrap estimates.

Confidence intervals of model parameters:
The percentile or bootstrap-p method is used to construct the 100(1 − a)% confidence intervals for the parameter estimates, since it is the most widely used method owing to its simplicity and natural appeal. It uses the first three steps described in Subsection 2.2, followed by this step:
Step 4: Arrange the bootstrap estimates in increasing order.
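The bootstrap steps of Subsection 2.2 and the percentile step above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' procedure: pairs are resampled with replacement, the function `circular_mean_difference` is a hypothetical stand-in for the iterative maximum likelihood estimator, and the variance and percentile bounds treat the estimates as ordinary real numbers, which is only reasonable when they do not straddle the ±π boundary.

```python
import math
import random


def circular_mean_difference(pairs):
    """Stand-in estimator (NOT the paper's ML procedure): circular mean of y - x."""
    s = sum(math.sin(y - x) for x, y in pairs)
    c = sum(math.cos(y - x) for x, y in pairs)
    return math.atan2(s, c)


def bootstrap_estimates(pairs, estimator, m, B, rng):
    """Steps 1 to 3: draw m pairs (with replacement, an assumption) B times."""
    return [estimator([rng.choice(pairs) for _ in range(m)]) for _ in range(B)]


def bootstrap_variance(estimates):
    """Sample variance of the bootstrap estimates (linear, not circular)."""
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)


def percentile_interval(estimates, a=0.05):
    """Percentile step: sort the estimates and read off the a/2 and 1 - a/2 positions."""
    ordered = sorted(estimates)
    B = len(ordered)
    lower = ordered[int(math.floor((a / 2.0) * (B - 1)))]
    upper = ordered[int(math.ceil((1.0 - a / 2.0) * (B - 1)))]
    return lower, upper
```

Any function that maps a list of (x, y) pairs to a single estimate can be passed as `estimator`, so the same code applies to either model parameter.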

Simulation Study
The main objective of this section is to assess the accuracy and bias of the parameter estimates of the proposed model (1) via simulation. The results, based on 1000 generated samples for each set of parameter values, are shown in Table 1 and Table 2. The values of X follow the von Mises distribution vM(0.5, 2), and the intercept parameter is fixed at 0. Six sample sizes, n = 10, 30, 50, 70, 100 and 150, are considered. The simulation study is conducted as follows:

1. Two random samples of size n are generated from the wrapped Cauchy distribution with mean 0 and the chosen concentration parameter for the errors.

2. A random sample of size n is generated from the von Mises distribution with mean 0.5 and concentration parameter 2 for the independent variable X.

3. Obtain $x_i$ from the model formula.

4. Obtain the Y variable as given in (1), then obtain $y_i$ from the model formula.

5. Estimate the model parameters using the iterative procedure derived in Section 2.

6. For each combination of sample size n and concentration parameter, the process is repeated s = 1000 times.
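The data-generation steps above can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes the usual functional form in which the observed x and y equal the true X and intercept + X plus independent wrapped Cauchy errors, and it draws a wrapped Cauchy angle by wrapping an ordinary Cauchy variate with scale −ln(ρ). The names `sample_wrapped_cauchy`, `simulate_pairs`, `alpha` and `rho` are hypothetical.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def sample_wrapped_cauchy(rho, rng):
    """Draw one wrapped Cauchy angle with circular mean 0 and 0 < rho < 1.

    A Cauchy variate with scale -ln(rho), wrapped onto the circle, has
    concentration (mean resultant length) rho.
    """
    scale = -math.log(rho)
    return (scale * math.tan(math.pi * (rng.random() - 0.5))) % TWO_PI


def simulate_pairs(n, rho, alpha=0.0, mu=0.5, kappa=2.0, seed=0):
    """Generate n observed pairs (x_i, y_i) for one simulated sample."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        X = rng.vonmisesvariate(mu, kappa)           # true X ~ vM(mu, kappa)
        Y = (alpha + X) % TWO_PI                     # slope fixed at one
        x = (X + sample_wrapped_cauchy(rho, rng)) % TWO_PI
        y = (Y + sample_wrapped_cauchy(rho, rng)) % TWO_PI
        pairs.append((x, y))
    return pairs
```

Repeating `simulate_pairs` with different seeds for each combination of n and concentration gives the replicated samples on which the estimates are computed.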

Bias of the estimators
The simulation results are tabulated in Table 1, which shows that the intercept is estimated well: the bias of its estimator generally decreases as the sample size n or the concentration parameter increases. Similar conclusions may be drawn for the concentration parameter, as the mean of its estimates approaches the true value as the sample size n or the concentration of the circular random errors increases. Also, Table 1 and Table 2 show an inverse relationship between the sample size n and the values of the

Coverage probability of confidence intervals
The coverage probability of a confidence interval is the proportion of the time that the interval contains the true value of interest (Dodge 2003). In this simulation study the intervals are constructed at the 0.95 confidence level, so a good interval should give a coverage probability close to 0.95. The estimation of the parameters was repeated for different sample sizes (n = 30, 50, 100 and 150) and concentration parameters 0.2, 0.4, 0.6, 0.8 and 0.99. The results of the simulations, based on 1000 repetitions, are given in Table 3.
The results in Table 3 show that the observed coverage is approximately equal to the nominal confidence level, which indicates that the confidence intervals attain their intended coverage.
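The coverage computation described above amounts to counting how often the interval contains the true value. A minimal Python sketch is given below; the function name `coverage_probability` is hypothetical, and the intervals are treated as ordinary real intervals, which assumes that none of them wraps around the circle.

```python
def coverage_probability(intervals, true_value):
    """Proportion of (lower, upper) intervals that contain true_value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Four illustrative intervals, two of which contain the value 0.6.
example = [(0.0, 1.0), (0.5, 2.0), (1.5, 3.0), (-1.0, 0.2)]
print(coverage_probability(example, 0.6))  # prints 0.5
```

In a simulation with 1000 repetitions, `intervals` would hold the 1000 bootstrap confidence intervals for one combination of sample size and concentration.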

 
The results of the simulation study show that, for data with highly concentrated errors (concentration parameter of 0.6 or higher), the bias of the intercept estimate is less than 0.3 for small samples (n = 10) and less than 0.15 for moderate and large samples ($n \geq 30$). Regardless of the error concentration and the sample size, the smallest bias is obtained for moderate levels of contamination (0.6). This may be attributed to the nature of circular data: shifting 5% of the generated data by more than $\pi/2$ from their original values may bring them closer to the majority of the data. A similar conclusion can be drawn for the concentration parameter.

Real Data Analysis
As an illustration of the proposed model, this section considers measurements of wind direction collected at two meteorological stations in two main cities of the Gaza Strip, Palestine, namely Gaza and Khan Younis. The data were provided by the Palestinian Meteorological Authority in 2007. They represent the monthly average wind directions recorded every three hours of the day, i.e. at 00:00 (midnight), 03:00, 06:00, 09:00, 12:00 (noon), 15:00, 18:00 and 21:00. Badawi (2013) analyzed the Palestinian wind data with a view to establishing a wind farm that achieves optimum electrical energy output. Recently, Abuzaid and Allahham (2015) modelled the data using the simple circular regression model with wrapped Cauchy errors. Their fitted linear circular regression model is $Y = 0.747 + 0.842X \pmod{2\pi}$, where the dependent variable Y is the wind direction at Gaza and the independent variable X is the wind direction at Khan Younis.

4.1 Fitting the functional circular regression model with wrapped Cauchy error
Since the relationship between the wind directions at Gaza and Khan Younis is reasonably linear, as shown in Figure 1, the simple functional circular regression model (1) is suggested to fit the data. The parameter estimates are obtained by applying the iterative procedure after converting the data into radians. The initial values of the two parameters introduced by the re-parameterization are taken to be 0.3.
Table 4 presents the parameter estimates, their standard errors and the 95% confidence intervals; convergence occurred after 16 iterations. Hence, the estimated relationship for the wind directions of the Gaza and Khan Younis data set is $Y = 0.237 + X \pmod{2\pi}$, where Y is the wind direction at Gaza and X is the wind direction at Khan Younis. The fit is assessed with the goodness-of-fit measure $A^{*}$; the closer $A^{*}$ is to 1, the better the model fits. For this model, $A^{*} = 0.799$.
The residuals were tested for conformity with the wrapped Cauchy distribution via the Kolmogorov-Smirnov test; the test statistics for the errors of the two variables are 0.0401 and 0.0321, with P-values of 0.835 and 0.895, respectively.

Conclusions
A new linear functional relationship model for circular variables with wrapped Cauchy errors has been proposed, motivated by the attractive properties of the wrapped Cauchy distribution.
The maximum likelihood estimates of the parameters have been obtained assuming equality of the concentration parameters of the errors. Because closed-form expressions for the estimates are not available, estimation is carried out iteratively from suitable initial values. The standard errors of the estimates and their confidence intervals are obtained by bootstrapping. Moreover, the proposed angular regression model has been applied to a real data set of wind directions from two cities in the Gaza Strip.


of the true values of the circular variables X and Y, respectively. It is assumed that there is a linear relationship between these two variables with a known slope parameter equal to one, and the full model is written for i = 1, 2, …, n.

Once the B bootstrap estimates are ordered, the 100(1 − a)% confidence intervals of the two parameters are obtained from them. These findings indicate the consistency of the estimators.

Figure 1: Scatter plot of wind directions in Gaza versus Khan Younis

Robustness of the estimates
Robustness of an estimator is a useful property that gives fair assurance that the presence of possible outliers or violations of the model assumptions will not have much effect on the parameter estimates. Robustness is assessed here by a simulation study in which 5% of the generated data are contaminated, starting from observation d.