Qualitative Robustness in Estimation

Qualitative robustness, the influence function, and the breakdown point are the three main concepts used to judge an estimator from the viewpoint of robust estimation. It is both important and interesting to study the relations among them. This article presents the concept of qualitative robustness as put forward by its first proponents, together with its later development. It illustrates the intricacies of qualitative robustness and its relation to consistency, and tries to remove commonly held misunderstandings about the relation between the influence function and qualitative robustness, citing some examples from the literature and providing a new counter-example. Finally, it proposes a useful finite version and a simulated version of a qualitative robustness index (QRI). To assess the performance of the proposed measures, we compare fifteen estimators of the correlation coefficient using simulated as well as real data sets.


Introduction
Hampel in his Ph.D. thesis (1968) developed three concepts, qualitative robustness (also $\Pi$-robustness), breakdown point and influence function, to assess robustness in estimation, and thus raised rigour in robust estimation to a satisfactory level. He developed qualitative robustness to capture the qualitative side of robustness, gauging distributional robustness; $\Pi$-robustness as a form of qualitative robustness suitable for dependent observations; the breakdown point to quantify the global side of robustness; and the influence function to quantify the infinitesimal side of robustness. In Huber's (1972) words, "Hampel (1968) recognized and sorted out the stability aspect of robustness, in close analogy to stability of a mechanical structure (say of a bridge): (i) the qualitative aspect: a small perturbation should have small effects; (ii) the breakdown aspect: how big can the perturbation be before everything breaks down; (iii) the infinitesimal aspect: the effects of infinitesimal perturbations."
In robust statistics parameters are considered as functionals $T$ whose domain is the set of finite signed measures defined on the sample space $X$, or a subset of it (such as $S_p(X)$, the set of probability measures), and whose co-domain is $R^k$, $R$, a function space, or a set of sets of a metric space. The fundamental concept of robustness is directly or indirectly related to continuity or differentiability of $T$. A natural sample estimator of $T(F)$ is provided by the statistical function $T(F_n)$ based on the sample d.f. $F_n$. It is intuitively obvious that when $T$ is continuous at $F$ and $F_n$ is near $F$, $T(F_n)$ is near $T(F)$. Equicontinuity w.r.t. $n$ is more desirable. Continuity, one of the fundamental concepts of classical analysis, which was generalized to spaces that include the Euclidean spaces during the first two decades of the last century (Fréchet, 1906 (metric space); Hausdorff, 1914 (topological space)), is the backbone of the concept of qualitative robustness of estimators. If $T$ is differentiable at $F$, one can find the differential of $T$ and thereby measure the nearness of $T(F_n)$ to $T(F)$ due to an infinitesimal change in $F$ through $F_n$. The question then arises of how one can define and check continuity and differentiability of $T$.
Continuity of $T$ is related to qualitative robustness, while differentiability is related to quantitative and infinitesimal robustness. It is both important and interesting to study the relations among them, especially the relation between qualitative robustness and the influence function, because both deal with the effect of small perturbations. Hampel discussed and elaborated qualitative robustness at the outset of his thesis, and the breakdown point and influence function in the latter part. His seminal article on qualitative robustness (1971) was published three years before his most-quoted article on the influence function (1974). The breakdown point attracted a wide range of researchers only after the development of its finite version by Donoho (1982) and Donoho and Huber (1983). Qualitative robustness, though no less important than the other two concepts from the viewpoint of robustness, has gained less popularity. Most probably its mathematical complications and the absence of finite versions lie behind this undesirable unpopularity.
The concepts of qualitative robustness and $\Pi$-robustness (a more restrictive concept than qualitative robustness) introduced by Hampel (1968 and 1971) were extended in different directions in the eighties. Huber (1977 and 1981) modified Hampel's definition, suggesting asymptotic equicontinuity of the sampling distributions of the estimators with respect to $n$, on the ground that non-robustness gets worse for large $n$. Rieder (1982) and Lambert (1982) introduced qualitative robustness in hypothesis testing; Boente et al. (1987), following Papantoni-Kazakos and Gray (1979) and Cox (1981), generalized qualitative robustness to stochastic processes. Cuevas (1987 and 1988) adapted some results of Hampel (1971) and Huber (1981) to the context of abstract inference. He showed the incompatibility of consistency and qualitative robustness in the case of kernel density estimators. Cuevas and Romo (1993) and Nasser (2000) applied this concept in nonparametric bootstrapping, and Basu et al. (1998) in Bayesian inference. In the case of M-estimators, Clarke (1983) extended Huber's results, giving sufficient conditions not only for weak continuity (qualitative robustness) but also for Fréchet differentiability at a particular parametric model. In 2001 he enriched his previous results, showing global weak continuity of some well-known M-functionals in a neighbourhood of a parametric model. Fasano et al. (2012) advocated a novel form of weak differentiability to prove consistency, asymptotic normality and qualitative robustness of M-estimates under more general conditions than those required in standard approaches.
Daouia and Ruiz-Gazen (2004), among others, studied qualitative robustness of nonparametric frontier estimators. Hable and Christmann (2011) showed weak continuity of support vector machines, hence their qualitative robustness; combined with the existence and uniqueness of support vector machines, this means they can be treated as the solutions of a well-posed mathematical problem in Hadamard's sense. Mizera (2010) and Krätschmer et al. (2012) developed the basic concept of qualitative robustness in two different directions. Mizera (2010) presented the connection between weak continuity and qualitative robustness in full generality and under minimal assumptions, taking the Prokhorov metric on both the set of models and the set of sampling distributions, while Krätschmer et al. (2012), using a different metric, proposed an index of qualitative robustness to order statistical procedures on a scale from 0 to ∞ in place of the qualitative division into robust and non-robust procedures.
In this article we examine the concept of qualitative robustness and its relation with the influence function, and thereby try to dispel some common related misunderstandings. In Section 2 we provide and discuss the model-based definitions of qualitative robustness put forward by Hampel (1968 and 1971), their later development by Huber (1977 and 1981), their results, and their complications through some new propositions. In Section 3 we discuss the works of Mizera (2010) and Krätschmer et al. (2012) in more detail, while in Section 4 we give the definition of the influence function and some facts regarding its relation with qualitative robustness, with a new counter-example. In Section 5 we propose a finite version and an index of qualitative robustness. Finally, we conclude the work in Section 6.

Gist of Hampel's paper (1971) and Huber's results (1981)
Hampel (1968 and 1971) gave the definition of qualitative robustness (Bahadur and Savage (1956) foreshadowed this idea) and of continuity of $T_n$ when the sample space is $X$, a Polish space with metric $d$, and the parameter space is $R^k$. Both spaces are endowed with their Borel $\sigma$-algebras to make them measurable. He deduced two main theorems, three lemmas and two corollaries to show the relation between qualitative robustness and continuity of $T_n$ in two cases, the first being the general case. Staudte (1980) and Staudte and Sheather (1990) followed the definition and theorems of Hampel (1971). Both Hampel (1968 and 1971) and Huber (1977 and 1981) cited the result of Strassen (1965) to show that the Prokhorov metric intuitively captures both (i) rounding and grouping errors (small errors occurring with large probability) and (ii) gross errors (large errors occurring with low probability). Huber quoted and proved two results due to Prokhorov (1956): (i) the Prokhorov metric metrizes the weak topology on $S_p(X)$, the set of probability measures on $X$, i.e., it encompasses weak convergence, which is the case of idealized approximation of the underlying chance mechanism; and (ii) $S_p(X)$ with this topology is a Polish space.
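For reference, the Prokhorov distance and the coupling interpretation behind Strassen's result can be sketched as follows. The symbols $\pi$, $A^{\varepsilon}$ and the random elements $Y$, $Z$ are notation assumed here, and the second statement is given informally, without the technical qualifications of the original theorem.

$$
\pi(F, G) = \inf\{\varepsilon > 0 : F(A) \le G(A^{\varepsilon}) + \varepsilon \ \text{for all Borel sets } A\}, \qquad A^{\varepsilon} = \{x \in X : d(x, A) < \varepsilon\}.
$$

$$
\pi(F, G) \le \varepsilon \quad \text{roughly when there exist } Y \sim F,\ Z \sim G \ \text{with}\ P\big(d(Y, Z) > \varepsilon\big) \le \varepsilon .
$$

Read this way, a small Prokhorov distance allows every observation to move by at most $\varepsilon$ (rounding and grouping) and, in addition, a fraction of at most $\varepsilon$ of the observations to move arbitrarily far (gross errors).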
Huber's generalization of Hampel's definition. Huber generalized the definition of Hampel on the ground that for "non robust" statistics the modulus of continuity typically gets worse for increasing $n$.
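For orientation, the two definitions can be written side by side. This is a sketch of the formulations as they are usually stated, up to the choice of strict or non-strict inequalities; the symbol $\mathcal{L}_F(T_n)$ (the sampling distribution of $T_n$ under $F$) and the metrics $d_1$ (on models) and $d_2$ (on sampling distributions) are notation assumed here.

$$
\text{Hampel:}\quad \forall \varepsilon > 0\ \exists \delta > 0\ \forall G\ \forall n:\quad d_1(F, G) < \delta \implies d_2\big(\mathcal{L}_F(T_n), \mathcal{L}_G(T_n)\big) < \varepsilon .
$$

$$
\text{Huber:}\quad \forall \varepsilon > 0\ \exists \delta > 0\ \exists n_0\ \forall G\ \forall n \ge n_0:\quad d_1(F, G) < \delta \implies d_2\big(\mathcal{L}_F(T_n), \mathcal{L}_G(T_n)\big) < \varepsilon .
$$

The only difference is the quantifier on $n$: Hampel asks for equicontinuity of the maps $G \mapsto \mathcal{L}_G(T_n)$ for every $n$, Huber only for all sufficiently large $n$.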
… (where $F_j$ is a discrete probability measure whose atoms have … and $T$ is continuous at $F$, and …). Huber (1981) proved that condition (ii) in the theorem is not required if Huber's definition is adopted (discussed below). That the statement holds even for a general c.s.m.s. $X$ is shown in the following subsection.
The theorem is mathematically very elegant but looks very strict from a practical viewpoint.

Huber's results
Theorem 4 (Proposition 6.2 (Huber, 1981)). Assume that … It is only true when $d(x, y)$ is bounded by 1, but there is no mention of this condition. Huber assumed the well-known fact that for any metric space $(X, d)$ there exists a metric …
Comment 5. His proof clearly indicates that this theorem has two parts. a) If $T$ is continuous at $F$, then $\{T_n\}$ is qualitatively robust (in Huber's sense) at $F$. We should note the difference between this result and Hampel's theorem 1: condition (ii) in Hampel's theorem is required to prove that $\{h_n\}$ is equicontinuous at $F$ where $n < n_0$. We demonstrate in the first proposition of the next section that the same holds for a general c.s.m.s. b) If $\{T_n\}$ is qualitatively robust at $F$ and consistent at every $G$ in a neighbourhood of $F$, then $T$ is continuous at $F$. Here, too, a condition is used without being mentioned. All these points are clarified further with an example in the next section. The former result does not hold in Hampel's sense, while the latter holds in both senses.
Comment 6. By Pólya's theorem, Kolmogorov (Kuiper) continuity of $T$ at $F$ implies weak continuity of $T$ at $F$ when $F$ is continuous. So we can infer that Kolmogorov (Kuiper) continuity of $T$ at $F$ implies that $\{T_n\}$ is qualitatively robust (in Huber's sense) when $F$ is continuous (Staudte, 1980). It is well known that the Kolmogorov metric is equivalent to the Kuiper metric. We extend the result in the following subsection to the case $X = R^m$, assuming $F$ is absolutely continuous. So we can rewrite Huber's theorem 6 as follows: Assume … (Parr, 1985). This reformulation might be easier to handle. These results seem very important from the viewpoint of attempts to study continuity and differentiability of $T$ in the same topology.
Staudte and Sheather (1990) gave due credit for the inception of the idea to Hampel (1968 and 1971) and briefly discussed qualitative robustness. For a more detailed discussion they pointed to Staudte (1980), who followed Hampel's definition rather than Huber's and described the same relation between continuity and qualitative robustness as given by Hampel (1968 and 1971). Jureckova and Sen (1996) briefly quoted Huber's definition of qualitative robustness and gave the comment: "The weak continuity of $T(\cdot)$ at $G$ (here $G$ is the true d.f. of $X_1$) and its consistency at $G$ in the sense that $T_n \to T(G)$ almost surely (a.e.) as $n \to \infty$ characterize the robustness of $T_n$ in a nbd of $G$". Our comment 5 and the new result in proposition 3 indicate that this sentence may mislead because of its brevity.

New results
2.2.1. Three New Propositions. This subsection presents three new propositions, some of which are indicated above.
… is qualitatively robust at $F$ in Huber's sense for a general c.s.m.s. w.r.t. the Prokhorov metric.
Proof. The first part of the proof of theorem 2 in Cuevas (1988) yields the result.
In this regard we can quote from Cuevas and Romo (1993), "It is known (Hampel, 1971) that if that T is continuous on some $U(F_0)$ then the sequence $\{T_n\}$ is qualitatively robust at $F_0$." Our discussion demonstrates that the statement is not precise. In Huber's sense, but not in Hampel's, continuity of $T$ over a neighbourhood is equivalent to qualitative robustness of $\{T_n\}$ over that neighbourhood.
… is consistent in a neighbourhood of $F$, and $F$ is absolutely continuous. Then $T$ is Kolmogorov-continuous at $F$ if and only if $\{T_n\}$ is robust at $F$.

Proof.
Necessary part: Ranga Rao's (1962, p. 665) result implies that if $T$ is Kolmogorov-continuous at $F$ then $T$ is weakly continuous at $F$, and by the above proposition $\{T_n\}$ is robust at $F$. Sufficiency part: From comment 2 after Hampel's theorem 2 (in our article), if $\{T_n\}$ is robust at $F$ then $T$ is weakly continuous at $F$, i.e. $T$ is Kolmogorov-continuous at $F$.
Now the following results clarify the intricacies of the above theorems and propositions to some extent: … $T$ is not Kolmogorov-continuous (i.e. not weakly continuous) at $F$, where $F$ is not discrete.

Proof:
a) It is obvious.
The above results demonstrate two things: (i) qualitative robustness does not imply consistency of estimators, and (ii) qualitative robustness and consistency should be the minimum properties of an estimator from the viewpoint of robust estimation. Cuevas (1987 and 1988) generalized some of Hampel's and Huber's results and applied them in areas of abstract inference, such as density estimation and stochastic processes. He never mentioned the fact that all the results of Hampel and Huber can easily be generalized, in almost identical form, to the case of "generalized statistics" (statistics which take values in general complete separable metric spaces). This requires only two modifications: 1) using the metric $d(x, y)$ of the parameter space in place of $\|x - y\|$ and adjusting the definitions accordingly; 2) applying Cantor's Intersection Theorem for general complete metric spaces in proving lemma 2 in Hampel (1971).

Contributions of Mizera (2010) and Krätschmer et al.(2012)
Mizera (2010), starting from Huber's definition of qualitative robustness of a statistic $t_n$ (an estimator or a test statistic), explained the complications of the median in detail in order to present, as well as to generalize, the intricate relationship between qualitative robustness and weak continuity. He generalized Huber's theorem 6.2 in three directions: i) he extended the area of application of the theorem by using the Prokhorov metric in place of the Lévy metric, deriving a uniform Glivenko-Cantelli property (Mizera, 2010, lemma 4); ii) he also extended the definition of weak continuity to accommodate set-valued functionals. Definition of weak continuity. A functional $T$ is called weakly continuous at $P$ if for any ε > 0 there is δ > 0 such that π(P, Q) ≤ δ implies d(θ, τ) < ε for any values θ and τ of $T$ at $P$ and $Q$, respectively.
His main theorem: Theorem 1. Suppose that a procedure $t_n$ is represented by a functional $T$. If $T$ is weakly continuous at $P$, then any lawful version of $t_n$ is qualitatively robust at $P$. He then coined the term "regular functional" and presented theorem 2 in order to illustrate the complications of the converse, i.e. to show how weak consistency and qualitative robustness imply weak continuity.

Definition of regular functional.
A representation of a procedure $t_n$ by a functional $T$ is called regular if (i) it is consistent for every $P$ in the domain of $T$; and (ii) for every $P$ and every $\tau \in T(P)$, there is a sequence $P_\nu$ of empirical probabilities weakly converging to $P$ such that the functional $T$ is univalued at every $P_\nu$ and $T(P_\nu)$ converges to $\tau$.
Theorem 2. Suppose that the representation of a procedure $t_n$ by a functional $T$ is regular. If some lawful version of $t_n$ is qualitatively robust at $P$, then $T$ is weakly continuous (in particular, uniquely defined) at $P$.
Observing that, although all the classical moments are non-robust by Hampel's definition, the higher moments are more affected by outliers than the lower moments, Krätschmer et al. (2012) introduced a new concept of qualitative robustness that applies to a very large class of tail-dependent statistical functionals $T$. The focus of the approach lies in specifying a metric $d$ on the set of probability models for which $T$ becomes a continuous functional at $P$. For $R$ as sample space they used a weighted Kolmogorov-type distance, whereas the sum of the Prokhorov metric and a moment distance was proposed for $R^n$ or any Polish space. Then they established extensions of Hampel's theorem, essentially stating that when $T$ is continuous with respect to $d$ then it is also qualitatively robust in the sense that the Hampel (Huber) condition holds if we choose the Prokhorov metric for $d_2$.
The proofs of these results rely on strong uniform Glivenko-Cantelli theorems in fine topologies. They also examined the sensitivity of tail-dependent statistical functionals w.r.t. infinitesimal contaminations, and proposed a new notion of infinitesimal robustness. The theoretical results were illustrated by means of several examples, including general L- and V-functionals.
Readers would certainly be interested to understand the sentence "Nevertheless, we emphasize that the concept of qualitative robustness depends on the specific choice of the metrics d and d and not just on the topologies generated by them", as it differs from the comments of other prominent researchers on robustness, including the proponents of this field. But the readers' interest is not satisfied, as this point was not illustrated as they pledged.

Definition
The most central concept in Hampel's fundamental contribution to the theory of robustness (Hampel, 1968, 1971 and 1974) is the "influence function" (originally termed the "influence curve"). In his seminal article of 1974 he first gave the definition for the particular case (both the sample space and the range space of the functional are $R$ or subsets of $R$) and then for the general case (sample space $X$, a complete separable metric space (c.s.m.s.), and range space $R^k$). Let $T$ be an $R^k$-valued mapping defined on $D_T(X)$, a finitely full and convex subset of the probability measures on $X$, and let $\delta_x$ denote the atomic probability measure concentrated at any given $x \in X$. Then the vector-valued influence function of $T$ at $F$ (here $F$ is a measure) is defined pointwise by
$$IF(x; T, F) = \lim_{\varepsilon \downarrow 0} \frac{T\big((1 - \varepsilon)F + \varepsilon\delta_x\big) - T(F)}{\varepsilon},$$
wherever the limit exists. Though for a particular $T$ it is generally considered as a function of $x$ and $F$, later, for brevity, it is denoted by … to plot asymptotic variance vs. $F$. The heuristics of the influence function are heuristics, not theorems, but the tendency to use them as theorems is not rare in the literature (Davies, 1993).
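As a finite-sample illustration of the definition, the sensitivity curve replaces the limit by one added observation. The following is a minimal sketch, not the authors' procedure: the function name `sensitivity_curve` and the small base sample are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import statistics


def sensitivity_curve(estimator, sample, x):
    """Finite-sample analogue of the influence function:
    n * (T(x_1, ..., x_{n-1}, x) - T(x_1, ..., x_{n-1})), with n = len(sample) + 1."""
    n = len(sample) + 1
    return n * (estimator(sample + [x]) - estimator(sample))


# Hypothetical, symmetric base sample (mean 0, median 0), fixed for reproducibility.
base = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

for x in (1.0, 10.0, 100.0, 1000.0):
    sc_mean = sensitivity_curve(statistics.mean, base, x)
    sc_median = sensitivity_curve(statistics.median, base, x)
    print(f"x={x:8.1f}  mean: {sc_mean:10.2f}  median: {sc_median:6.2f}")
```

For this sample the curve of the mean grows linearly in the added point, while the curve of the median stops changing once the added point lies beyond the central observations.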
We can easily prove that all moments are Kolmogorov-discontinuous at continuous models, and hence non-robust. Their influence functions are continuous but unbounded. One may then be tempted, like Koenker (2005), to infer wrongly that an unbounded influence function implies non-robustness.
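A short worked example, using the mean as the simplest moment, makes the first two statements concrete. The contaminated model $G_{\varepsilon,x}$ and the symbol $d_K$ for the Kolmogorov distance are notation assumed here, and $F$ is assumed to have a finite mean.

$$
T(F) = \int u \, dF(u), \qquad G_{\varepsilon,x} = (1 - \varepsilon)F + \varepsilon\delta_x ,
$$

$$
d_K(F, G_{\varepsilon,x}) = \sup_t \varepsilon \, \big| \mathbf{1}\{x \le t\} - F(t) \big| \le \varepsilon, \qquad T(G_{\varepsilon,x}) - T(F) = \varepsilon \, \big(x - T(F)\big).
$$

Taking $x = \varepsilon^{-2}$ and letting $\varepsilon \to 0$ makes the distance between the models vanish while the change in $T$ diverges, so the mean is Kolmogorov-discontinuous at $F$. Dividing the second identity by $\varepsilon$ gives the influence function $x - T(F)$, which is continuous in $x$ but unbounded.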

Relation between influence function and qualitative robustness
There exists no direct relation between the influence function and qualitative robustness. The following questions and their answers, mainly by examples, illustrate their relation:
Does a bounded influence function imply weak continuity of the functional? No. It is well known that the efficient L-estimate of the location parameter for the logistic is not robust, and … We also get weak continuity of $T$ if $V$, the associated vector space, is topologized by the Kolmogorov distance and $F$ is continuous (see Nasser, 2000; proposition 4.6.1).
Does a Fréchet differentiable M-functional at $F$, which is Kolmogorov-continuous at $F$, always have a bounded influence function? Yes (Clarke, 1983).
These discussions amply substantiate the comment made at the beginning of the subsection. We should be very cautious in commenting in general on the relation between the influence function and qualitative robustness. Within a particular class of estimators we may have a clear-cut relation between the two.

Finite version
We have already mentioned that the non-availability of a finite-sample version of qualitative robustness is one of the main reasons why it is less popular than the two other concepts, the influence function and the breakdown point. In proposing a definition of a finite version of qualitative robustness, we keep in mind that an estimator with finite breakdown point equal to zero should have an empirically lower QRI, whereas estimators with a high breakdown point should have a higher QRI. We propose two versions of SQRI (SQRI 1 and SQRI 2): … percentile interval, sensitivity curve of each estimator under a variety of situations, and also employed probability plots, box plots and perspective plots to judge their performances. The normal score estimator showed the best performance overall.
We have carried out experiments on simulated as well as real-world problems to apply our proposed SQRI method using 15 estimators of the correlation coefficient. Detailed information on the data sets is given in Appendix A. The results show that the proposed method successfully selects the same best robust estimator as Alam et al. (2008), namely the normal score estimator. The results are given in Table 1 and Table 2 (Appendix B). The data sets are visualized in Figures 1-3 (Appendix B).

A simulated version of qualitative robustness index
To assess the effect of ε% contamination on the sampling distribution of the measures we define the Qualitative Robustness Index, QRI(ε) = … Here $q_i$ is the $i$th quantile of a measure at a model and $q_i^c$ the $i$th quantile of the measure at the model contaminated by ε% contamination. A slight variation of this measure was used in Alam et al. (2010) to quantify the effect of contamination on different types of canonical correlation coefficient at multivariate normal models.
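The idea of comparing quantiles of the sampling distribution with and without contamination can be sketched by simulation. The aggregation formula of the index is not recoverable from the text, so the sketch below assumes a mean absolute difference between corresponding quantiles; the function names (`pearson`, `draw_sample`, `sampling_quantiles`, `quantile_shift`), the outlier location (5, -5), and all default parameters are hypothetical, and `eps` is a fraction rather than a percentage. Whether the index in the text equals this discrepancy or a decreasing transformation of it is likewise an open assumption.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def draw_sample(rng, n, rho, eps):
    """n pairs from a bivariate normal with correlation rho; each pair is replaced,
    with probability eps, by an outlier near (5, -5) (assumed contamination model)."""
    xs, ys = [], []
    for _ in range(n):
        if rng.random() < eps:
            x = 5.0 + rng.gauss(0.0, 0.1)
            y = -5.0 + rng.gauss(0.0, 0.1)
        else:
            x = rng.gauss(0.0, 1.0)
            y = rho * x + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return xs, ys


def sampling_quantiles(estimator, n, rho, eps, reps, probs, seed):
    """Empirical quantiles of the simulated sampling distribution of the estimator."""
    rng = random.Random(seed)
    estimates = sorted(estimator(*draw_sample(rng, n, rho, eps)) for _ in range(reps))
    return [estimates[min(reps - 1, int(p * reps))] for p in probs]


def quantile_shift(estimator, eps, n=30, rho=0.5, reps=500,
                   probs=(0.05, 0.25, 0.5, 0.75, 0.95), seed=0):
    """Mean absolute difference between clean and contaminated quantiles
    (assumed aggregation; larger values mean a less stable sampling distribution)."""
    clean = sampling_quantiles(estimator, n, rho, 0.0, reps, probs, seed)
    dirty = sampling_quantiles(estimator, n, rho, eps, reps, probs, seed)
    return statistics.fmean([abs(q - qc) for q, qc in zip(clean, dirty)])


print(quantile_shift(pearson, eps=0.05))
```

Any of the fifteen correlation estimators could be passed in place of `pearson`, provided it takes the two coordinate lists as arguments, so that the estimators can be ranked under the same contamination scheme.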

Conclusion
"Qualitative robustness is of little help in the actual selection of a robust procedure suited for a particular application. In order to make a rational choice, we must introduce quantitative aspects as well." (Huber, 1981, p. 73). For example, for both location and scale parameters there exist three classes of robust estimators under mild conditions, M-type, L-type and R-type, and each class contains different robust subclasses (Huber, 1981, chapters 3 and 5; Hampel et al., 1986, chapter 2). Nonetheless, we should start from a consistent and qualitatively robust procedure and then seek procedures satisfying extra robustness criteria, such as a high breakdown point, a smooth and bounded influence function, uniform asymptotic normality, etc.

Theorem 2 (Lemma 3 in Hampel (1971)). Let $\{T_n\}$ be robust at $F_0$ and consistent at all …

Comment 4.
a) He proved it taking $X = R$, $d_1$ = Lévy metric and $d_2$ = Prokhorov metric. He used the result $d_2$( …
Fig 1. Scatter plot of 50 sampled schools.
Fig 2. Scatter plot of 44 sampled schools and 5 contaminated samples.

Fig 3. Scatter plot of biochemical data.
… Hampel (1971 and 1986) is qualitatively robust at $F$ in Huber's sense, but the converse is not necessarily true.
2.1.1. Three main results of Hampel (1971 and 1986)
… asymptotically equicontinuous w.r.t. $n$ at $F_0$. The $d_i$ are metrics that induce the weak topology. In Hampel's definition $\{h_n\}$ is equicontinuous w.r.t. … for all $n$, and the $d_i$ are Prokhorov metrics.
Comment 1. It is clear that if $\{T_n\}$ is qualitatively robust at $F$ … To get results in Hampel's line we need continuity of $T$ over the whole …
It is a local robustness property. Various characteristics of an influence function are used to develop various concepts, such as Gross Error Sensitivity (GES), the rejection point (related to the upper limit of the range outside which the influence function vanishes), etc., to delineate definite but different aspects of the local robustness property. As an important by-product of the attempt to quantify the effect of outliers on the estimators, the Change of Variance Function (CVF) has been developed.
The following new counter-example shows that even a two-valued, almost constant influence function does not guarantee the weak continuity of the functional: …