where $\Theta$ is a $C^\infty$ function satisfying $\Theta(p, 0) = 1$ for all $p \in \partial S_f$ (see Appendix B in Biau, Cadre, and Pelletier, 2008).
Denote by $D^2_{e_p}$ the directional differentiation operator of order 2 on $V(\partial S_f, \rho)$ in the direction $e_p$. It will be seen in the proofs in Section 3 that the variance in our central limit theorem is determined by the second-order behavior of $f$ near the boundary of its support. Therefore, to derive this variance we shall need the following set of second-order smoothness assumptions on $f$.
Assumption Set 2
(a) There exists $\rho > 0$ such that, for all $p \in \partial S_f$, the map $u \mapsto f(p + u e_p)$ is of class $C^2$ on $[0, \rho]$.
(b) There exists $\rho > 0$ such that
$$\sup_{p \in \partial S_f} \, \sup_{0 \le u \le \rho} D^2_{e_p} f(p + u e_p) < \infty.$$
(c) There exists $\rho > 0$ such that
$$\inf_{p \in \partial S_f} \, \inf_{0 \le u \le \rho} D^2_{e_p} f(p + u e_p) > 0.$$
For similar smoothness assumptions see Section 2.4 of Mason and Polonik (2009). The imposition
of such conditions appears to be unavoidable to derive a central limit theorem. Note also that Assumption Sets 1 and 2 are the same as the ones used in Biau, Cadre, and Pelletier (2008). In particular, we assume throughout that the density $f$ is continuous on $\mathbb{R}^d$. Thus, we are in the case of a non-sharp boundary, i.e., $f$ decreases continuously to zero at the boundary of its support. The case where $f$ has a sharp boundary requires a different approach (see, for example, Härdle, Park, and Tsybakov, 1995). The analytical assumptions on $f$ (Assumption Set 2) are stipulations on the local behavior of $f$ at the boundary of the support. In particular, the restrictions on $f$ imply that inside the support and close to the boundary the maps $u \mapsto f(p + u e_p)$, with $p \in \partial S_f$, are strictly convex (see the Appendix).
2.2 Main result
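Let
$$\sigma_f^2 = 2^d \int_{\partial S_f} \int_0^\infty \int_{B(0,1)} \Phi(p, t, u)\, du\, dt\, v_\sigma(dp), \tag{2.3}$$
with
$$\Phi(p, t, u) = \exp\left(-\omega_d\, D^2_{e_p} f(p)\, t^2\right) \left[\exp\left(\beta(u)\, D^2_{e_p} f(p)\, \frac{t^2}{2}\right) - 1\right],$$
$\omega_d$ denoting the volume of $B(0, 1)$ and
$$\beta(u) = \lambda\big(B(0, 1) \cap B(2u, 1)\big).$$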
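For orientation, $\beta(u)$ is the volume of the lens formed by two unit balls whose centres are $2|u|$ apart. The Python sketch below is a numerical sanity check only, not part of the paper's argument: it evaluates the closed form quoted in Remark 2.1 (Hall, 1988) with a simple midpoint rule. The function names `unit_ball_volume` and `beta`, and the number of integration steps, are illustrative choices.

```python
import math


def unit_ball_volume(d):
    """Volume omega_d of the unit ball in R^d: pi^(d/2) / Gamma(1 + d/2)."""
    return math.pi ** (d / 2) / math.gamma(1 + d / 2)


def beta(u_norm, d, steps=20000):
    """Closed form for lambda(B(0,1) ∩ B(2u,1)) as a function of |u|.

    The integral of (1 - y^2)^((d-1)/2) over [|u|, 1] is approximated
    with a midpoint rule (an illustrative choice of quadrature).
    """
    u_norm = abs(u_norm)
    if u_norm > 1:
        return 0.0
    h = (1.0 - u_norm) / steps
    total = 0.0
    for k in range(steps):
        y = u_norm + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += (1.0 - y * y) ** ((d - 1) / 2)
    const = 2 * math.pi ** ((d - 1) / 2) / math.gamma(0.5 + d / 2)
    return const * total * h


if __name__ == "__main__":
    for d in (1, 2, 3):
        # beta(0) should agree with omega_d up to quadrature error.
        print(d, beta(0.0, d), unit_ball_volume(d))
```

In dimension $d = 1$ the formula reduces to $2(1 - |u|)$, the length of $[-1, 1] \cap [2u - 1, 2u + 1]$, and in dimension $d = 2$ to the classical lens area $2\arccos|u| - 2|u|\sqrt{1 - u^2}$, which gives two independent checks.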
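Remark 2.1. Let $\Gamma$ be the Gamma function. We note that $\beta(u)$ has the closed expression (Hall, 1988, p. 23)
$$\beta(u) = \begin{cases} \dfrac{2\pi^{(d-1)/2}}{\Gamma\left(\frac{1}{2} + \frac{d}{2}\right)} \displaystyle\int_{|u|}^{1} (1 - y^2)^{(d-1)/2}\, dy, & \text{if } 0 \le |u| \le 1, \\[2ex] 0, & \text{if } |u| > 1, \end{cases}$$
which, in particular, gives
$$\beta(0) = \omega_d = \frac{\pi^{d/2}}{\Gamma\left(1 + \frac{d}{2}\right)}.$$
We are now ready to state our main result.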
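Theorem 2.1. Suppose that both Assumption Sets 1 and 2 are satisfied. If (r.i) $r_n \to 0$, (r.ii) $n r_n^d / (\ln n)^{4/3} \to \infty$ and (r.iii) $n r_n^{d+1} \to 0$, then
$$\left(\frac{n}{r_n^d}\right)^{1/4} \Big(\lambda(S_n \triangle S_f) - E\lambda(S_n \triangle S_f)\Big) \stackrel{D}{\to} N(0, \sigma_f^2),$$
where $\sigma_f^2 > 0$ is as in (2.3).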
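To make the statistic in Theorem 2.1 concrete, the following Python sketch approximates $\lambda(S_n \triangle S_f)$ on a grid in dimension $d = 2$. It is an illustration under assumptions that are not taken from this section: $S_n$ is taken to be the union of the balls $B(X_i, r_n)$, consistent with the counts $f_n$ and $\pi_n$ used in Section 3, and the toy density $f(x) = (6/\pi)(1 - |x|)^2$ on the unit disk, the sample size, the radius and the grid resolution are hypothetical choices. The density vanishes continuously at the boundary with a positive second derivative along the inward normal, in the spirit of Assumption Set 2; it has not been checked against Assumption Set 1.

```python
import math
import random


def sample_point(rng):
    """Draw X from the toy density f(x) = (6/pi) * (1 - |x|)^2 on the unit disk.

    In polar coordinates the radius has density 12 r (1 - r)^2 on [0, 1],
    which is the Beta(2, 3) distribution; the angle is uniform.
    """
    radius = rng.betavariate(2, 3)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return radius * math.cos(angle), radius * math.sin(angle)


def symmetric_difference_volume(points, r_n, grid=80):
    """Grid approximation of lambda(S_n △ S_f) for the unit-disk support.

    S_n is assumed to be the union of the balls B(X_i, r_n). Each grid cell
    is classified by its midpoint.
    """
    half = 1.0 + r_n  # S_n and S_f both lie in [-half, half]^2
    cell = 2.0 * half / grid
    r2 = r_n * r_n
    mismatched = 0
    for i in range(grid):
        x = -half + (i + 0.5) * cell
        for j in range(grid):
            y = -half + (j + 0.5) * cell
            in_support = x * x + y * y < 1.0
            in_estimate = any(
                (x - px) ** 2 + (y - py) ** 2 <= r2 for px, py in points
            )
            if in_support != in_estimate:
                mismatched += 1
    return mismatched * cell * cell


rng = random.Random(0)  # fixed seed so the run is reproducible
sample = [sample_point(rng) for _ in range(300)]
estimate = symmetric_difference_volume(sample, r_n=0.1)
print(estimate)
```

Repeating the last three lines over many seeds and rescaling the centred values by $(n / r_n^d)^{1/4}$ would give a rough empirical picture of the fluctuations described by the theorem, although the sample size and radius used here are far from the asymptotic regime (r.i) to (r.iii).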
Remark 2.2. A referee pointed out that the methods in the paper may be applicable to obtain a central limit theorem for the histogram-based support estimator studied in Baíllo and Cuevas (2006). The reference Cuevas, Fraiman, and Rodríguez-Casal (2007) should be a starting point for such an investigation.
It is known from Cuevas and Rodríguez-Casal (2004) that the choice $r_n = O\big((\ln n / n)^{1/d}\big)$ gives the fastest convergence rate of $S_n$ towards $S_f$ for the Hausdorff metric, that is $O\big((\ln n / n)^{1/d}\big)$. For such a radius choice, the concentration speed of $\lambda(S_n \triangle S_f)$ around its expectation as given by Theorem 2.1 is $O\big(\sqrt{n} / (\ln n)^{1/4}\big)$, close to the parametric rate.

Theorem 2.1 assumes $d \ge 2$ (Assumption 1-a). We restrict ourselves to the case $d \ge 2$ for the sake of technical simplicity. However, the case $d = 1$ can be derived with minor adaptations, assuming $r_n \to 0$, $n r_n / (\ln n)^{4/3} \to \infty$, and $n r_n^{3/2} \to 0$. In fact, the one-dimensional setting has already been explored in the related context of vacancy estimation (Hall, 1984).
As we mentioned in the introduction, the quantity $\lambda(S_n \triangle S_f)$ is closely related to the vacancy $V_n$ (Hall 1985, 1988), which is defined as in (1.2). A close inspection of the proof of Theorem 2.1 reveals that taking the intersection with $S_f$ in the integrals does not affect things too much and, in fact, the asymptotic distributional behaviors of $\lambda(S_n \triangle S_f)$ and $V_n$ are nearly identical. As a consequence, we obtain the following result:
Theorem 2.2. Suppose that both Assumption Sets 1 and 2 are satisfied. If (r.i), (r.ii) and (r.iii) hold, then
$$\left(\frac{n}{r_n^d}\right)^{1/4} \big(V_n - E V_n\big) \stackrel{D}{\to} N(0, \sigma_f^2),$$
where $\sigma_f^2 > 0$ is as in (2.3).
Surprisingly, the limiting variance $\sigma_f^2$ remains as in (2.3). Theorem 2.2 was motivated by a remark by Hall (1985), who pointed out that a central limit theorem for vacancy in the case $n r_n^d \to \infty$ remained open.
3 Proof of Theorem 2.1
Our proof of Theorem 2.1 will borrow elements from Mason and Polonik (2009). Set
$$\epsilon_n = \frac{1}{(n r_n^d)^{1/4}}. \tag{3.1}$$
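Observe that, from (r.ii) and (r.iii), the sequence $(\epsilon_n)$ satisfies (e.i) $\epsilon_n \to 0$ and (e.ii) $\epsilon_n \sqrt{n r_n^d} \to \infty$. For future reference we note that from (r.i) and (r.iii), we get that
$$\frac{r_n}{\epsilon_n} = \left(n r_n^{d+1}\right)^{1/4} r_n^{3/4} \to 0. \tag{3.2}$$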
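Set $E_n = \{x \in \mathbb{R}^d : f(x) \le \epsilon_n\}$. Furthermore, let
$$L_n(\epsilon_n) = \int_{E_n} \big|\mathbf{1}\{f_n(x) > 0\} - \mathbf{1}\{f(x) > 0\}\big|\, dx$$
and
$$\overline{L}_n(\epsilon_n) = \int_{E_n^c} \big|\mathbf{1}\{f_n(x) > 0\} - \mathbf{1}\{f(x) > 0\}\big|\, dx.$$
Noting that, under Assumption Set 1,
$$\lambda(S_n \triangle S_f) = L_n(\epsilon_n) + \overline{L}_n(\epsilon_n),$$
our plan is to show that
$$\left(\frac{n}{r_n^d}\right)^{1/4} \big(L_n(\epsilon_n) - E L_n(\epsilon_n)\big) \stackrel{D}{\to} N(0, \sigma_f^2) \tag{3.3}$$
and
$$\left(\frac{n}{r_n^d}\right)^{1/4} \big(\overline{L}_n(\epsilon_n) - E \overline{L}_n(\epsilon_n)\big) \stackrel{P}{\to} 0, \tag{3.4}$$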
which together imply the statement of Theorem 2.1. To prove a central limit theorem for the random variable $L_n(\epsilon_n)$, it turns out to be more convenient to first establish one for the Poissonized version of it formed by replacing $f_n(x)$ with
$$\pi_n(x) = \sum_{i=1}^{N_n} \mathbf{1}_{B(x, r_n)}(X_i),$$
where $N_n$ is a mean $n$ Poisson random variable independent of the sample $X_1, \ldots, X_n$. By convention, we set $\pi_n(x) = 0$ whenever $N_n = 0$. The Poissonized version of $L_n(\epsilon_n)$ is then defined by
$$\Pi_n(\epsilon_n) = \int_{E_n} \big|\mathbf{1}\{\pi_n(x) > 0\} - \mathbf{1}\{f(x) > 0\}\big|\, dx.$$
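The Poissonization step can be sketched in Python as follows. This is a minimal illustration, not the construction used in the proofs: the toy density $f(x) = (6/\pi)(1 - |x|)^2$ on the unit disk ($d = 2$), the helper names, the grid resolution and the parameter values are all hypothetical, and the $N_n$ points are simply drawn afresh from $f$ after $N_n$ has been generated. The Poisson count is obtained by counting arrivals of a unit-rate Poisson process, since the standard library has no Poisson sampler.

```python
import math
import random


def poisson_sample(mean, rng):
    """Number of arrivals of a rate-1 Poisson process in [0, mean);
    this count is Poisson distributed with the given mean."""
    count, clock = 0, rng.expovariate(1.0)
    while clock < mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def density(x, y):
    """Toy density (6/pi) * (1 - |x|)^2 on the unit disk, 0 elsewhere."""
    rho = math.hypot(x, y)
    return 6.0 / math.pi * (1.0 - rho) ** 2 if rho < 1.0 else 0.0


def sample_point(rng):
    """Radius ~ Beta(2, 3) and uniform angle give a draw from `density`."""
    radius = rng.betavariate(2, 3)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return radius * math.cos(angle), radius * math.sin(angle)


def pi_n(x, y, points, r_n):
    """Poissonized count: number of the N_n points inside B(x, r_n).
    Returns 0 when there are no points, matching the stated convention."""
    r2 = r_n * r_n
    return sum(1 for px, py in points if (x - px) ** 2 + (y - py) ** 2 <= r2)


def poissonized_statistic(n, r_n, rng, grid=80):
    """Grid approximation of Pi_n(eps_n), the integral over E_n = {f <= eps_n}."""
    eps_n = (n * r_n ** 2) ** -0.25  # eps_n = (n r_n^d)^(-1/4) with d = 2
    n_points = poisson_sample(n, rng)
    points = [sample_point(rng) for _ in range(n_points)]
    half = 1.0 + r_n  # the integrand vanishes outside [-half, half]^2
    cell = 2.0 * half / grid
    mismatched = 0
    for i in range(grid):
        x = -half + (i + 0.5) * cell
        for j in range(grid):
            y = -half + (j + 0.5) * cell
            fx = density(x, y)
            if fx > eps_n:  # grid point lies outside E_n
                continue
            if (pi_n(x, y, points, r_n) > 0) != (fx > 0.0):
                mismatched += 1
    return n_points, mismatched * cell * cell


n_points, value = poissonized_statistic(300, 0.1, random.Random(1))
print(n_points, value)
```

The only change relative to a fixed-sample computation is that the number of points is random; this is what makes counts in disjoint regions independent, which is the usual reason for Poissonizing before proving a central limit theorem.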
The proof of Theorem 2.1 is organized as follows. First (Subsection 3.1), we determine the exact asymptotic behavior of the variance of $\Pi_n(\epsilon_n)$. Then (Subsection 3.2), we prove a central limit theorem for $\Pi_n(\epsilon_n)$. By means of a de-Poissonization result (Subsection 3.3), we then infer (3.3). In a final step (Subsection 3.4) we prove (3.4), which completes the proof of Theorem 2.1. This Poissonization/de-Poissonization methodology goes back to at least Beirlant, Györfi, and Lugosi (1994).
3.1 Exact asymptotic behavior of VarΠ