A General Base of Power Transformation to Improve the Boundary Effect in Kernel Density without Shoulder Condition

In this paper, a general base of power transformation under the kernel method is suggested and applied in line transect sampling to estimate abundance. The suggested estimator performs well at the boundary compared to the classical kernel estimator, without using the shoulder condition assumption. In the simulation results, the transformed estimator shows a smaller mean squared error and a smaller absolute bias than the classical estimator.


Introduction
Line transect sampling is a common technique for estimating population abundance (density). In this method, the study area is divided into non-overlapping parts (strips) with total length $L$, and an observer follows each strip and records the perpendicular distance of each detected animal (object). The perpendicular distances $x_1, x_2, \ldots, x_n$ are used to estimate $f(0)$, the value at zero of their probability density $f(x)$. The kernel estimator of $f(x)$ is

$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right), \tag{1}$$

where $h$ is the bandwidth parameter and $K(\cdot)$ is the kernel function. The kernel "Rosenblatt-Parzen" density estimator is commonly applied in the literature to find a reliable estimate of $f(0)$; the estimator is direct, easy to compute, and lets the sample determine the density value at a chosen value of $x$. Several studies have considered reducing the bias of this kernel estimator. For example, Jones, Linton, and Nielsen (1995) proposed a simple bias reduction method for density estimation, Cheng, Fan, and Marron (1997) investigated the best weight functions for local polynomial fitting at endpoints as a boundary correction, Mack (2002) suggested several techniques to reduce bias, Eidous (2005) proposed nonparametric frequency histogram estimators, Karunamuni and Alberts (2006) used an easily implemented transformation to correct bias around the boundary, Koekemoer and Swanepoel (2008) proposed a semi-parametric kernel density estimator based on transformation, Eidous (2011a) introduced an additive histogram frequency estimator for the case in which the shoulder condition is not valid under the line transect method, Eidous (2012) proposed a new kernel estimator for abundance without the shoulder condition, Wen and Wu (2015) developed an improved transformation-based kernel estimator of densities on the unit interval, and Eidous (2015) improved the histogram estimation of $f(0)$ for line transect data with and without the shoulder condition.
Recently, Albadareen and Ismail (2017) introduced several kernel estimators for $f(0)$, Eidous and Al-Eibood (2018) proposed a bias-corrected histogram estimator for line transect sampling, Albadareen and Ismail (2018) suggested a generalized form of the Epanechnikov kernel function for the adaptive estimation of $f(0)$, and Albadareen and Ismail (2019) proposed a form of power transformation for the adaptive estimation of $f(0)$ when the shoulder condition is violated.
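Assume that a random sample of non-negative perpendicular distances from the line transect method is $x_1, x_2, \ldots, x_n$. When a symmetric kernel function is assumed, Chen (1996) derived the classical reflection estimator of $f(x)$ at $x = 0$ as:

$$\hat{f}(0) = \frac{2}{nh}\sum_{i=1}^{n} K\!\left(\frac{x_i}{h}\right). \tag{2}$$

The bias and variance of the estimator (2), to leading order as $h \to 0$ and $nh \to \infty$, are:

$$\mathrm{Bias}[\hat{f}(0)] = 2h\,f'(0)\int_{0}^{\infty} t\,K(t)\,dt + O(h^{2}), \tag{3}$$

$$\mathrm{Var}[\hat{f}(0)] = \frac{2 f(0)}{nh}\int_{-\infty}^{\infty} K^{2}(t)\,dt + o\!\left(\frac{1}{nh}\right). \tag{4}$$

The asymptotic mean squared error (AMSE) is:

$$\mathrm{AMSE}[\hat{f}(0)] = 4h^{2}\,[f'(0)]^{2}\left(\int_{0}^{\infty} t\,K(t)\,dt\right)^{2} + \frac{2 f(0)}{nh}\int_{-\infty}^{\infty} K^{2}(t)\,dt. \tag{5}$$

In this study, a general base of power transformation of the perpendicular distances under the kernel method is proposed for estimating the population density when the shoulder condition is violated. The asymptotic theoretical properties (bias, variance, and mean squared error) of the estimator are derived and compared to those of the classical reflection kernel estimator. The efficiency results are supported by simulation studies, and the performance of the proposed estimator is compared with that of the classical kernel estimator.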
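The reflection estimator at zero is simple to compute. The following is a minimal sketch, assuming a Gaussian kernel and the form $(2/(nh))\sum_i K(x_i/h)$ for a symmetric kernel; the function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(t):
    """Standard normal density, used here as the kernel K (an assumption)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)


def reflection_estimate_at_zero(distances, h):
    """Reflection kernel estimate of f(0): (2 / (n h)) * sum_i K(x_i / h)."""
    x = np.asarray(distances, dtype=float)
    if x.ndim != 1 or x.size == 0:
        raise ValueError("distances must be a non-empty one-dimensional sequence")
    if np.any(x < 0):
        raise ValueError("perpendicular distances must be non-negative")
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    return 2.0 * gaussian_kernel(x / h).sum() / (x.size * h)
```

For a sample that consists of a single observation at zero and $h = 1$, the function returns $2K(0) = 2/\sqrt{2\pi}$, which gives a quick sanity check of the normalizing constant.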

Methodology
In some cases, the kernel estimator in (2) underestimates $f(0)$ and has a large negative bias under the line transect method (see Eidous (2011b)). Several bias reduction techniques have been suggested in the literature, and one common method is the transformation approach (see Charpentier & Flachaire, 2015; Devroye & Gyorfi, 1985; Marron & Ruppert, 1994). In this article, we propose a power transformation with a general base and apply it to the kernel estimator when $f'(0) \neq 0$. Assuming that the perpendicular distances lie in the range $0 \le x \le w$, the proposed transformation is

$$y = a^{x/w} - 1, \qquad a > 1, \tag{6}$$

where $a$ is a general base of the power transformation. This function transforms the perpendicular distances by a non-decreasing function that produces the estimator $\tilde{f}(0)$. The original estimator in equation (2), which is $\hat{f}(x)$, is applied to the original data, and the transformed estimator, $\tilde{f}(0)$, is obtained from the back-transformation such that:

$$f(x) = f_Y(y)\,\frac{dy}{dx} = f_Y\!\left(a^{x/w} - 1\right)\frac{a^{x/w}\log(a)}{w}, \tag{7}$$

where $f_Y$ denotes the density of the transformed distances, so that $f(0) = f_Y(0)\left(\frac{\log(a)}{w}\right)$, i.e. when $x = 0$ then $y = 0$.
In the line transect method, the estimate of $f(x)$ is required at $x = 0$. For the proposed method (estimation by back-transformation), the equivalent value $\hat{f}_Y(0)\left(\frac{\log(a)}{w}\right)$ is substituted. To obtain the estimate at zero, the kernel estimator in equation (2) can be applied with respect to $y$, and the transformed kernel estimator $\tilde{f}(0)$ is:

$$\tilde{f}(0) = \frac{\log(a)}{w}\cdot\frac{2}{nh}\sum_{i=1}^{n} K\!\left(\frac{y_i}{h}\right), \qquad y_i = a^{x_i/w} - 1. \tag{8}$$

If the kernel function $K(t)$ is a Gaussian density, the estimates obtained from (2) and (8) converge to zero for $x > w$ when $w$ is large (such as $w \ge \max(x_i) + 4h$). The kernel value $K\!\left(\frac{x \mp x_i}{h}\right)$ is negligible when $|x \mp x_i| > 4h$.
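The back-transformation step can be illustrated in code. The following is a minimal sketch, assuming the transformation $y = a^{x/w} - 1$, a Gaussian kernel, and the reflection form of the kernel estimator on the transformed scale, rescaled by $\log(a)/w$; the function names and the default base $a = e$ are choices of this sketch.

```python
import numpy as np


def gaussian_kernel(t):
    """Standard normal density, used here as the kernel K (an assumption)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)


def power_transform(distances, w, a=np.e):
    """Map distances x in [0, w] to y = a**(x / w) - 1, with base a > 1."""
    if a <= 1 or w <= 0:
        raise ValueError("require a > 1 and w > 0")
    x = np.asarray(distances, dtype=float)
    return a ** (x / w) - 1.0


def transformed_estimate_at_zero(distances, h, w, a=np.e):
    """Estimate f(0) as (log(a) / w) times the reflection estimate of f_Y(0)."""
    y = power_transform(distances, w, a)
    f_y_at_zero = 2.0 * gaussian_kernel(y / h).sum() / (y.size * h)
    # Jacobian of the transformation at x = 0 is log(a) / w.
    return f_y_at_zero * np.log(a) / w
```

Note that the bandwidth `h` here is on the transformed scale, so it is generally not the same number as the bandwidth used for the untransformed distances.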
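The bias and variance of $\tilde{f}(0)$, to leading order, are:

$$\mathrm{Bias}[\tilde{f}(0)] = \frac{\log(a)}{w}\,2h\,f_Y'(0)\int_{0}^{\infty} t\,K(t)\,dt + O(h^{2}),$$

$$\mathrm{Var}[\tilde{f}(0)] = \left(\frac{\log(a)}{w}\right)^{2}\frac{2 f_Y(0)}{nh}\int_{-\infty}^{\infty} K^{2}(t)\,dt + o\!\left(\frac{1}{nh}\right) = \frac{\log(a)}{w}\cdot\frac{2 f(0)}{nh}\int_{-\infty}^{\infty} K^{2}(t)\,dt + o\!\left(\frac{1}{nh}\right).$$

The asymptotic mean squared error is obtained by assuming the small terms $O(\cdot)$ and $o(\cdot)$ to be zero:

$$\mathrm{AMSE}[\tilde{f}(0)] = \left(\frac{\log(a)}{w}\right)^{2} 4h^{2}\,[f_Y'(0)]^{2}\left(\int_{0}^{\infty} t\,K(t)\,dt\right)^{2} + \frac{\log(a)}{w}\cdot\frac{2 f(0)}{nh}\int_{-\infty}^{\infty} K^{2}(t)\,dt.$$

It should be noted that $\mathrm{Var}[\tilde{f}(0)] \le \mathrm{Var}[\hat{f}(0)]$ if $\frac{\log(a)}{w} \le 1$. The value of $a$ that produces a smaller theoretical variance therefore satisfies $\frac{\log(a)}{w} \le 1$, i.e. $e^{w} \ge a$, under the constraint $a > 1$. Without loss of generality, the base value $a = e$ will be assumed throughout this article.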
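The variance comparison can be checked with a few lines of arithmetic. The following is a rough sketch, assuming a Gaussian kernel (for which $\int K^2 = 1/(2\sqrt{\pi})$), the same bandwidth $h$ on both scales, and leading-order variance terms of the form $2 f(0)\int K^2/(nh)$ for the classical estimator and $\log(a)/w$ times that for the transformed one. These forms and the function names are assumptions of the sketch.

```python
import math

# Roughness of the standard normal kernel: integral of K(t)**2 over the real line.
GAUSSIAN_ROUGHNESS = 1.0 / (2.0 * math.sqrt(math.pi))


def leading_variance_classical(f0, n, h):
    """Assumed leading variance term of the reflection estimator at zero."""
    return 2.0 * f0 * GAUSSIAN_ROUGHNESS / (n * h)


def variance_ratio(a, w):
    """Assumed ratio Var[transformed] / Var[classical] at equal bandwidth."""
    if a <= 1 or w <= 0:
        raise ValueError("require a > 1 and w > 0")
    return math.log(a) / w


def leading_variance_transformed(f0, n, h, a, w):
    """Assumed leading variance term of the back-transformed estimator."""
    return variance_ratio(a, w) * leading_variance_classical(f0, n, h)
```

Under these assumptions, the base $a = e$ with truncation point $w = 3$ gives a ratio of $1/3$, and the ratio stays at or below one whenever $a \le e^{w}$.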

Simulation
The theoretical asymptotic value in (13) is derived under a large-sample assumption. The simulation study is carried out to compare the proposed estimator $\tilde{f}(0)$ with the classical kernel estimator $\hat{f}(0)$ using different small sample sizes, which are $n = 50$, 100, and 500. The efficiency measurements are the relative bias (RB) and the relative mean error (RME). Two families of models are considered: the Beta model and the negative exponential model (Gates, Marshall, & Olson, 1968). For the negative exponential model, the detection function is $g(x) = e^{-\beta x}$, $\beta > 0$, $0 \le x \le w$, and $f(x) = \beta e^{-\beta x}$, $0 \le x \le w$. The density parameter values $\beta = 1.5$, 2.0, 2.5, and 3.0 are chosen with the truncation point $w = 3.0$ for these models.
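The simulation ingredients for the negative exponential model can be sketched as follows. This is an illustration under stated assumptions: the sampler draws from the exponential density renormalized on $[0, w]$ by inverse-CDF sampling (so the true value at zero becomes $\beta/(1 - e^{-\beta w})$, within about one percent of $\beta$ for the settings above), and RB and RME are taken to be the bias and the root mean squared error divided by the true $f(0)$. Those definitions and all function names are assumptions of this sketch.

```python
import numpy as np


def sample_truncated_exponential(n, beta, w, rng):
    """Draw n distances from the exponential density renormalized on [0, w]."""
    mass = 1.0 - np.exp(-beta * w)          # probability of [0, w] before truncation
    u = rng.uniform(size=n)                 # uniform draws on [0, 1)
    return -np.log1p(-u * mass) / beta      # inverse of the truncated CDF


def true_f_at_zero(beta, w):
    """Value at zero of the truncated exponential density."""
    return beta / (1.0 - np.exp(-beta * w))


def relative_bias(estimates, f0):
    """Assumed definition: (mean of estimates - f0) / f0."""
    return (np.mean(estimates) - f0) / f0


def relative_mean_error(estimates, f0):
    """Assumed definition: sqrt(mean squared error) / f0."""
    estimates = np.asarray(estimates, dtype=float)
    return np.sqrt(np.mean((estimates - f0) ** 2)) / f0
```

A typical use is to draw many samples with `np.random.default_rng(seed)`, apply an estimator of $f(0)$ to each sample, and pass the resulting vector of estimates to `relative_bias` and `relative_mean_error`.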

Bandwidth selection
The efficiency of the kernel estimator depends on the value of the bandwidth. Since the mean squared error of the kernel estimator in equation (2) does not vary much across symmetric kernel functions such as the Gaussian, Epanechnikov, and biweight kernels (Ghosh, 2018), the kernel function $K(t)$ is taken to be the standard normal density for comparison purposes.
For our study, recommended bandwidth approaches are applied to the original data and the transformed data. The two estimators considered are:
• Estimator 1 (Est1): The kernel estimator given by equation (2), applied to the original data with the bandwidth method recommended by Silverman (1986), which minimizes the mean integrated squared error. Although this bandwidth value is computed from a reference density that has a shoulder (i.e. the half-normal density), it would be better to compute $h$ from a reference density without a shoulder (i.e. the negative exponential function), as is assumed for the proposed estimator.
• Estimator 2 (Est2): The proposed estimator given by equation (8).

Table 1 and Table 2 provide the simulation results. The transformed estimator (Est2) shows a smaller absolute relative bias and a smaller relative mean error than the traditional kernel estimator (Est1) under both families. Likewise, the relative mean errors of Est2 decrease as the sample size increases, i.e. Est2 provides a more consistent fit asymptotically, as illustrated in Figure 1.

Figure 1. RB and RME of the simulation results of the Beta and the negative exponential models
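The comparison between Est1 and Est2 can be organized as a small Monte Carlo loop. The sketch below is illustrative only: it uses the negative exponential model renormalized on $[0, w]$, a Gaussian kernel, and a generic rule-of-thumb bandwidth $h = c\,s\,n^{-1/5}$ with $c = 1.06$ computed separately on each scale. That bandwidth rule, the RB and RME definitions, and all names are assumptions of this sketch rather than the settings behind Table 1 and Table 2, so its numbers should not be expected to match the reported ones.

```python
import numpy as np


def kernel_at_zero(values, h):
    """Reflection estimate (2 / (n h)) * sum_i K(v_i / h) with a Gaussian K."""
    v = np.asarray(values, dtype=float)
    k = np.exp(-0.5 * (v / h) ** 2) / np.sqrt(2.0 * np.pi)
    return 2.0 * k.sum() / (v.size * h)


def rule_of_thumb_bandwidth(values, constant=1.06):
    """Assumed rule h = constant * s * n**(-1/5); not the paper's formula."""
    v = np.asarray(values, dtype=float)
    return constant * v.std(ddof=1) * v.size ** (-0.2)


def compare_estimators(beta, w, n, replications, seed=0, a=np.e):
    """Return {name: (RB, RME)} for Est1 (original scale) and Est2 (transformed)."""
    rng = np.random.default_rng(seed)
    mass = 1.0 - np.exp(-beta * w)
    f0 = beta / mass                              # true f(0) of the truncated model
    est1 = np.empty(replications)
    est2 = np.empty(replications)
    for r in range(replications):
        x = -np.log1p(-rng.uniform(size=n) * mass) / beta   # inverse-CDF draw
        y = a ** (x / w) - 1.0                              # power transformation
        est1[r] = kernel_at_zero(x, rule_of_thumb_bandwidth(x))
        est2[r] = np.log(a) / w * kernel_at_zero(y, rule_of_thumb_bandwidth(y))

    def summarize(est):
        rb = (est.mean() - f0) / f0
        rme = np.sqrt(np.mean((est - f0) ** 2)) / f0
        return rb, rme

    return {"Est1": summarize(est1), "Est2": summarize(est2)}
```

Calling `compare_estimators(beta=1.5, w=3.0, n=50, replications=1000)` returns a pair (RB, RME) for each estimator; varying `n` over 50, 100, and 500 mirrors the sample sizes of the simulation design.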

Conclusion
This article proposed an adaptive kernel method to estimate the population abundance (density) at the boundary under the line transect method. A general base of power transformation is suggested to improve the estimation efficiency when the shoulder condition is violated, and the asymptotic bias, variance, and mean squared error of the proposed estimator are derived. The simulation results show that the proposed transformation estimator is more efficient and consistent than the traditional kernel estimator.