getdoc07fa. 176KB Jun 04 2011 12:04:03 AM

(1)

ELECTRONIC COMMUNICATIONS in PROBABILITY

IDENTIFICATION OF THE RATE FUNCTION FOR LARGE

DEVIA-TIONS OF AN IRREDUCIBLE MARKOV CHAIN

WEI LIU

Department of Mathematics, Wuhan University, 430072-HUBEI, China email: [email protected]

LIMING WU

Laboratoire de Mathématiques Appliquées, CNRS-UMR 6620, Université Blaise Pascal, 63177 Aubière, France. And Institute of Applied Mathematics, Chinese Academy of Sciences, 100190 Beijing, China email: [email protected]

SubmittedMay 26, 2009, accepted in final formNovember 16, 2009 AMS 2000 Subject classification: 60F10, 60J05

Keywords: Large deviations, irreducible Markov processes, Feynman-Kac semigroups Abstract

For an irreducible Markov chain(Xn)n≥0we identify the rate function governing the large deviation

estimation of empirical mean 1 n

Pn−1

k=0f(Xk) by means of the Donsker-Varadhan’s entropy. That allows us to obtain the lower bound of large deviations for the empirical measure 1

Pn−1

k=0δXk in

full generality.

1 Introduction

Large deviations of Markov processes were opened by Donsker-Varadhan[8](1975-1983) under strong regularity conditions. Generalizations of their fundamental works are very numerous and various. In this paper we are interested in general irreducible Markov processes. In this direction the first general results were obtained by Ney and Nummelin[14](1987). de Acosta[5](1988) derived a universal lower bound for bounded additive functionals valued in a separable Banach space, and the boundedness condition was finally removed by de Acosta-Ney[6](1998) : that is a definite work for the lower bound. For related works see also Jain[10](1990) and some recent works [17, 18, 20] by the second named author, and Kontoyiannis-Meyn[11]and Meyn[12]. However the rate function, a basic object describing the exact exponential rate in a large deviation estimation, is expressed by means of theconvergence parameter, a quantity proper to the theory of irreducible Markov processes. It is not related to the Donsker-Varadhan entropy, unlike the work of Deuschel-Stroock[7]in the more classical strong mixing case.

The main purpose of this note is to identify the rate function appearing in the large deviation results in [14, 5, 6] etc by means of the Donsker-Varadhan entropy. The key is to prove the lower semi-continuity of the Cramer functional associated with the convergence parameter of the

(2)

Feynman-Kac semigroup. This result is based mainly on the increasing continuity of the conver-gence parameter due to de Acosta[5].

This paper is organized as follows. In the next section we present necessary backgrounds and calculate the Legendre transform of the Cramer functional. We prove in section 3 the lower semi-continuity of the Cramer functional, which gives us the desired identification of the rate function of de Acosta-Ney[6]and as corollary we obtain the lower bound of large deviations for the occupation measures. Finally in section 4 we present some remarks about the upper bound of Ney-Nummelin[14]and provide a very simple counter-example to the upper bound of large deviations for unbounded additive functionals.

2 Preliminaries and Legendre transform of the Cramer

func-tional

2.1 Preliminaries

We recall some known facts about irreducible Markov chains from Nummelin [15] and Meyn-Tweedie[13].

LetK(x,d y)be a nonnegative kernel on a measurable space(E,B)whereBis countably gener-ated. It is saidirreducibleif there is some non-zero nonnegativeσ-finite measureµsuch that

∀A∈ Bwithµ(A)>0, +∞

n=1

Kn(x,A)>0, forall x∈E. (1) Such measure µ is said to be amaximal irreducible measure of K, if µK ≪ µ. All maximal irreducible measures ofK are equivalent. Belowµis some fixed maximal irreducible measure of K.

A couple(s,ν)wheres≥0 withµ(s):=R

Es(x)dµ(x)>0 andν a probability measure onE is saidK-small, if there is somem₀≥1 and constantc>0 such that

Km0₍_x_,_A₎_≥_cs_⊗_ν₍_x,_A_{) =}_cs₍_x₎_ν₍_A₎_, _x_∈_E,_A_{∈ B} ₍₂₎ A real measurable functions≥0 withµ(s)>0 is said to beK-small, if there is some probability measure such that(s,ν)isK-small. A subsetA∈ Bwithµ(A)>0 is saidK-small if 1_Adoes. By Nummelin[15, Theorem 2.1], any irreducible kernelKhas always such a small couple(s,ν). A non-empty set F ∈ B is saidK-closed, if K(x,Fc_{) =}_{0 for all} _x _∈ _F_{. For every} _K-closed _F_,

µ(Fc_{) =}_{0 by the irreducibility of}_K.

According to Nummelin[15, Definition 3.2, Proposition 3.4 and 4.7], theconvergence parameter ofK, sayR(K), is given by : for everyK-small couple(s,ν),

R(K) =sup

(

r≥0 : +∞

n=0

rnνKns<+∞

)

=sup

(

r≥0 :(∃K-closedF)(∀K-smalls), +∞

n=0

rnKns<+∞,x∈F

)

(3)

Lemma 2.1. For every K-small couple(s,ν), −logR(K) =lim sup

n→∞

1 nlogνK

n_s

= inf

K−cl osed Fsupx∈F lim sup

n→∞

1 nlogK

n_s₍_x₎

=esssup_x_∈_Elim sup n→∞

1 nlogK

n_s₍_x₎_.

(4)

Here and hereafter esssup_x∈Eis taken always w.r.tµ. Its proof is quite easy, so omitted.

2.2 Cramer functional

Let (Xn)n≥0be a Markov chain valued in E, defined on (Ω,F,(Fn),(Px)x∈E). Throughout this paper we assume always that its transition kernelP(x,d y)is irreducible, andµis a fixed maximal irreducible measure ofP.

For every V ∈ rB (the space of all real B-measurable functions on E), consider the kernel PV(x,d y):=eV(x)P(x,d y). We have the following Feynman-Kac formula,

(PV₎n_f₍_x_{) =}_Ex_f₍_X n)exp

n−1 X

k=0

V(X_k) !

, 0≤f ∈rB. (5)

It is obvious thatPV _{is irreducible with the maximal irreducible measure}_µ_. Define now our Cramer functional

Λ(V) =−logR(PV). (6)

SinceR(PV₎_∈_[_0,₊_∞₎_by_[_{15, Theorem 3.2}_]_,_Λ(_V₎_>_{−∞. By (5) and Lemma 2.1, we have}

Λ(V) =lim sup n→∞

1 nlogE

ν_s₍_X

n)exp n−1 X

k=0

V(X_k) !

=esssup_x_∈_Elim sup n→∞

1 nlogE

x_s₍_X n)exp

n−1 X

k=0

V(Xk)

! (7)

for every PV-small couple (s,ν). From the above expression we see by Hölder’s inequality that

Λ:rB →(−∞,+∞]is convex.

We can now recall the universal lower bound of large deviation in de Acosta[5]and de Acosta-Ney

[6].

Theorem 2.2. ([5, 6])Let f :E→B_{be a measurable function with values in a separable Banach}

space(B,k · k). Then for every open subset G ofB_{and every initial measure}ν, lim inf

n→∞

1 nlogPν

1 n

n−1 X

k=0

f(X_k)∈G

≥ −inf z∈GΛ

∗

f(z), (8)

(4)

Λ∗_f(z) =sup y∈B′

(〈y,z〉 −Λf(y)). (9)

Here〈y,z〉denotes the duality bilinear relation betweenB_andB′_.

The main objective is to identifyΛ∗_f by means of the Donsker-Varadhan entropy.

2.3 Legendre transform of the Cramer functional on a weighted space

At first we recall the Donsker-Varadhan’s entropyJ:M₁(E)→[0,+∞], whereM₁(E)is the space of all probability measures on(E,B). For anyν∈ M₁(E),

J(ν) =sup u∈U

E log u

Pudν, (10)

whereU ={u∈bB: inf_x∈Eu(x)>0}(bBbeing the space of all real and bounded measurable functions on(E,B)). Consider the modified Donsker-Varadhan’s entropy ([17])

Jµ(ν) =

J(ν), ν≪µ;

+∞, otherwise. (11)

Let us now fix some reference measurable function Φ : E → R _{such that} Φ ≥ 1 everywhere. Introduce the weighted functions space

bΦB:=

V :E→R_{B −}_measurable; _kV_k

Φ:=sup x∈E

|V(x)|

Φ(x) <+∞

. Let

M_b_,Φ(E):=

(maybe signed) measureνon E;

Φd|ν|<+∞

where|ν|=ν+₊_ν−₍_ν+_,_ν−_{are respectively the positive and negative part of}_ν_{in Hahn-Jordan’s}

decomposition). IfΦ =1, we write simplyM_b(E)forMb,Φ(E). Obviously

ν(V) = Z

V dν=:〈ν,V〉

is a bilinear form onbΦB ×Mb,Φ(E). The main result of this section is

Theorem 2.3. Let Φ ≥ 1 be some reference measurable function on E. Let Λ∗,Φ _: _M

b,Φ(E) →

RS_{₊_∞}_{be the Legendre transform of}_Λ_{on b}

ΦB, i.e.,

Λ∗,Φ₍_ν₎

:=sup〈ν,V〉 −Λ(V); V∈bΦB . (12) Then we have for everyν∈Mb,Φ(E),

Λ∗,Φ(ν) = ¨

Jµ(ν), if ν∈ M1(E)

+∞, otherwise. (13)

(5)

Lemma 2.4. (de Acosta[5, lemma 6.1])Define forν∈ M₁(E), J′(ν) =sup

¨Z

log u

Pudν:u∈ B, u>0ever y wher e, ν({u=∞}) =0, (log( u Pu))

−_∈_L1₍_ν₎ «

. Then J′(ν) =J(ν)for allν∈ M₁(E).

Proof of Theorem 2.3. Step 1: ≥in (13). This can be derived from[17, Proposition B.9], but we will follow the argument below proposed by the referee. At first sinceΛ(V) =0 ifV =0,µ−a.s., one deduces easily thatΛ∗,Φ₍_ν_{) = +}_∞_{for all}_ν_{not absolutely continuous w.r.t.}_µ_.

Fix nowν≪µ. For eachu∈bBletV :=log u

Pu. SinceP

V_u₌_{u, we see that}_Λ(_V₎_≤_{0 by Lemma}

2.1. Hence _Z

log u Pudν≤

V dν−Λ(V)≤Λ∗,Φ₍_ν₎_. Taking the supremum over all 1≤u∈bBand recalling thatJ(ν) = sup

1≤u∈bB

log u

Pudν (Donsker-Varadhan’s formula), we get the desired result.

Step 2: Λ∗,Φ₍_ν₎_≤_J

µ(ν)forν ∈M1(E). The argument below follows quite closely that of de

Acosta[5, Proof of (6.3)].

We can assume that Jµ(ν) < +∞ (trivial otherwise). Then ν ≪ µ. For every V ∈ rB with

V−dν <+∞, andλ >Λ(V), we take u:=

+∞

k=0

e−λk(PV)k_s₍_x₎_,

wheresis aPV-small function. Then we haveν([u= +∞]) =0 and u>s, e−λeVPu=u−s. Noting that

log u

Pu≥loge

−λ+V ₌₋_λ₊_V _{≥ −}_λ₋_V−_∈_L1₍_ν₎_,

by de Acosta’s Lemma 2.4 we have

V dν−J(ν)≤

V dν+ Z

logPu u dν

=λ+ Z

loge

−λ+V_Pu u dν

=λ+ Z

logu−s u dν

< λ. Asλ >Λ(V)is arbitrary we have so proved

V dν≤Jµ(ν) + Λ(V), if V−∈L1(ν), ν∈ M1(E). (14)

It yields the desired claim sinceV ∈bΦB =⇒ V ∈L1(ν)forν∈Mb,Φ(E). Combining Step 1 and Step 2, we obtain (13).

(6)

As an application, we have the following result.

Corollary 2.5. Let V be a nonnegative measurable function. If there existsδ≥0such thatΛ(δV)<

+∞, then

Jµ(ν)<+∞ =⇒

V dν <+∞. Proof. By Theorem 2.3, we have

Jµ(ν)≥

δ(V∧n)dν−Λ(δ(V∧n)), ∀n≥1. Thus we have

V∧ndν≤Jµ(ν) + Λ(δ(V∧n))

≤Jµ(ν) + Λ(δV)<+∞

Lettingn→+∞, we get the result.

3 Identification of the rate function

Let Ln:= 1_n

Pn−1

k=0δXk be the empirical measure. Assuming the existence of invariant probability

measureµ, then for everyν ≪ µ,P

ν(Ln ∈ ·)satisfies the weak∗ LDP on(M1(E),τ) with rate

functionJ_µ (by[17, Theorem B.1, Theorem B.5 and Proposition B.9]), whereτ is the topology

σ(M_b(E),bB)restricted toM₁(E). Then for a measurable function f :E →B_{, inspired by the}

contraction principle, the rate function governing the LDP of 1

n n−1 X

k=0

f(Xk) =

f(x)d Ln(x) should be

J_µf(z) =inf{J_µ(ν):

kfkdν <+∞, ν(f) =z}, z∈B_. ₍₁₅₎

That is not completely exact, but not far. The main result of this paper is

Theorem 3.1. For any measurable function f :E→B_whereB_{is a separable Banach space, the rate}

functionΛ∗_f given in Theorem 2.2 (due to de Acosta and Ney[6]) is exactly the lower semi-continuous regularizationJe_µf of z→J_µf(z).

It is based on the following lower semi-continuity ofΛwhich is of independent interest.

Proposition 3.2. LetΦ:E→[1,+∞)be a measurable function. ThenΛ:bΦB →(−∞,+∞]is lower semi-continuous w.r.t. the weak topologyσ(bΦB,Mb,Φ(E)), or equivalently for any V ∈bΦB,

Λ(V) =sup¦〈ν,V〉 −Jµ(ν); ν∈ M1(E),ν(Φ)<+∞ ©

(7)

AsΛ(V)is the rate of the exponential growth of the Feynmann-Kac semigroup(PV₎n_{, its} identi-fication is a fundamental subject in heat theory : formula of type (16) is the counterpart of the famous Rayleigh principle (for the maximal eigenvalue ofL +V whereL is the generator of a symmetric Markov semigroup) and it follows from the LDP of Donsker-Varadhan (when this last holds) by Varadhan’s Laplace integral lemma.

Proof of Theorem 3.1 assuming Proposition 3.2. LetΦ(x):=kf(x)k+1. For everyy∈B′_{(the dual}

space ofB_),_V(x):=〈y,f(x)〉 ∈bΦB. Then by Proposition 3.2,

Λ_f(y) = Λ(V) =sup

¨Z

〈y,f(x)〉dν−Jµ(ν); ν∈ M1(E),ν(kfk)<+∞ «

=supn〈y,z〉 −J_µf(z); z∈B

= (J_µf)∗(y). Let us prove thatJ_µf is convex onB_{. We have only to prove that} _Jf

µ(z)≤[Jµf(z1) +Jµf(z2)]/2 for

anyz,z₁,z₂∈B_{such that}_z_{= (}_z

1+z2)/2. We may assume thatJµf(zk)<+∞, k=1, 2. Given anyǫ >0, there existν₁andν₂such that

ν_k(kfk)<+∞, ν_k(f) =zk, Jµ(νk)≤Jµf(zk) +ǫ for k=1, 2. Letν=ν1+ν2

2 , thenν(f) =z. BecauseJµ(ν)is convex inν∈ M1(E), we have

Jµ(ν)≤

Jµ(ν1) +Jµ(ν2)

2 ≤

J_µf(z1) +Jµf(z2)

2 +ǫ

Hence

µ(z)≤

µ(z1) +Jµf(z2)

2 +ǫ

which completes the proof of the convexity ofJ_µf, forǫ >0 is arbitrary. Consequently by the famous Fenchel’s theorem in convex analysis,

Λ∗_f = (J_µf)∗∗

is the lower semi-continuous regularizationJef

µ ofz→Jµf.

Proposition 3.2 is mainly based on the following continuity of the convergence parameter due to de Acosta.

Lemma 3.3. (de Acosta [5, Theorem 2.1])For each j ≥1 let Ej ∈ B and assume Ej ↑ E. Let B_j ={A∈ B :A⊂ Ej}. Let K be an irreducible kernel on Ej ∈ Bj and for j≥ 1, let Kj be an irreducilbe kernel on Ej ∈ Bj. Assume that for all x ∈ E,A∈ B,Kj(x,A∩Ej)↑ K(x,A). Then R(Kj)↓R(K).

We also require the following general result.

Lemma 3.4. Letµbe aσ-finite measure on(E,B)andΛ:bΦB →(−∞,+∞]be a convex function such that

(i) If V1=V2, µ−a.e., thenΛ(V1) = Λ(V2);

(8)

ThenΛis lower semi-continuous (l.s.c. in short) w.r.t. the weak topologyσ(b_ΦB,M_b_,_Φ(E))iff for every non-decreasing sequence(V_n)in b_ΦBsuch that V(x) =sup_nV_n(x)∈b_ΦB,

Λ(Vn)→Λ(V).

Proof. The necessity is obvious because ifV_n↑V in b_ΦB, thenV_n→V inσ(b_ΦB,M_b_,_Φ(E))by dominated convergence. Let us prove the sufficiency.

Consider the isomorphismV →V/ΦfrombΦBtobBand define ˜

Λ(U):= Λ(UΦ), ∀U∈bB.

Then ˜Λsatisfies again (i) and (ii) onbB. By the isomorphism above, the l.s.c. ofΛonbΦBw.r.t.

σ(bΦB,Mb,Φ(E))is equivalent to the l.s.c. of ˜ΛonbBw.r.t. σ(bB,Mb(E)). In other words we may assume without loss of generality thatΦ =1.

Because of condition (i),Λis well defined on L∞(µ)(which is the dual ofL1(µ)), and the l.s.c. of

ΛonbBw.r.t.σ(bB,Mb(E))is equivalent to that ofΛonL∞w.r.t.σ(L∞,L1).

Below we prove the l.s.c. ofΛon L∞(µ)w.r.t. σ(L∞,L1). By taking an equivalent measure if necessary we may assume thatµis a probability measure. For anyL∈R_{, since}[Λ≤L]is convex, by the Krein-Smulyan theorem(see[4], page 163), it is closed in L∞w.r.t.σ(L∞,L1)iff

[Λ≤L]\B(R)

is closed for everyR>0, whereB(R):={V ∈L∞(µ); kVk_∞≤R}. SinceB(R)equipped with the weak∗-topologyσ(L∞,L1)is metrizable (see[2, Chap.IV, p.111]), for the desired l.s.c., it remains to prove that ifVn→V inσ(L∞,L1)and supnkVnk∞≤R<+∞, then

lim inf

n→∞ Λ(Vn)≥Λ(V).

Taking a subsequence if necessary, we may assume thatl:=lim_n→∞Λ(Vn)exists in[−∞,+∞]. Asµis a probability measure,V_n→V in the weak topology ofL2₍_µ₎_{. By the Mazur theorem (}_[_22,

Chap. V, §1, Theorem 2]), there exists a sequence(U_n), eachU_nis a convex combinationU_nof {V_k,k≥n}such thatU_n→V inL2₍_µ₎_{. By the convexity of}_Λ_,

lim sup n→∞

Λ(Un)≤_nlim_→∞Λ(Vn) =l.

Now taking a subsequence Unk which converges to V, µ−a.e. and setWk = infj≥kUnj. Then

W_k ↑ W andW = V, µ−a.e.. By condition (ii), Λ(W_k)≤ Λ(U_n_k). By the assumed increasing continuity ofΛwe get finally

l≥lim sup k→∞

Λ(Unk)≥ lim

k→∞Λ(Wk) = Λ(W) = Λ(V).

Proof of Proposition 3.2. By Theorem 2.3, the right hand side (r.h.s.) of (16) is exactlyΛ∗∗, the double Legendre transform ofΛbasing on the duality between bΦB andMb,Φ(E). By Fenchel’s theorem, (16) is equivalent to the l.s.c. ofΛw.r.t.σ(bΦB,Mb,Φ(E)), which holds true by Lemma 3.4 and de Acosta’s Lemma 3.3.

We end this section by two corollaries. We begin with the rate function governing the lower bound of large deviation of the occupation measureL_n:=1

Pn−1

k=0δXk.

(9)

Corollary 3.5. For every initial measureνand anyτ-open subset G∈ A, lim inf

n→∞

nlogPν(Ln∈G)≥ −βinf∈GJµ(β). (17) This corollary was proved in de Acosta[5, Theorem 6.3]under an extra assumption guaranteing Jµ =J. This last assumption is also crucial in Jain’s work[10], but difficult to verify out of the

classical absolutely continuous framework of Donsker-Varadhan[8].

If one uses the Donsker-Varadhan’s entropyJ instead ofJµ in (17), the lower bound (17) would

be wrong. A simple counter-example is: E={0, 1}, P(0, 1) =1/2, P(1, 1) =1. Letν =δ1and

G={δ0}. The l.h.s. of (17) is−∞, butJ(δ0) =log 2.

Notice that the above lower bound was proved by the second named author for much more gen-eral essentially irreducible Markov processes but also with a mild technical condition (see [17, Theorem B.1]).

Proof. Letβ₀be an arbitrary element ofGbut fixed. By the definition of theτ-topology, there is a bounded and measurable function f :E→Rd _{(for some}_d≥1) andδ >0 such that

N :={β∈ M1(E);|β(f)−β0(f)|< δ} ⊂G.

Here| · |is the Euclidian norm onRd_{. Hence by Theorem 2.2,}

lim inf n→∞

nlogPν(Ln∈ N)≥ −|z−infz0|<δ

Λ∗_f(z)

wherez0=β0(f). By Theorem 3.1,

inf

|z−z0|<δΛ

∗

f(z) =_|_z₋inf_z0_|_<δJ f

µ(z)≤Jµ(β0).

Therefore we get

the l.h.s. of (17)≥ −Jµ(β0)

where (17) follows forβ0∈Gis arbitrary.

Corollary 3.6. The following equality holds true: R(P) =exp

inf

ν∈M1(E)Jµ(ν)

. (18)

Proof. It follows from Proposition 3.2.(16) withV=0. A pathological phenomenon may happen : R(P)>1.

Example 3.7. Let(Bt)t≥0be a Brownian Motion on a connected complete and stochastic complete

Riemannian manifoldMwith sectional curvature less than−K(K>0) and letP(x,d y) =P_x(B1∈

d y). Pis irreducible with a maximal irreducible measure given by the Riemann volume measure

µ = d x. It is well known thatkPk_L2_(M_,_{d x)} < 1 ([16]). It is obvious (from (3)) that 1 R(P) ≤ kPk_L2_(M_,_{d x)}, and the converse inequality holds by[20, Lemma 5.3]and the symmetry of P on L2(M,d x). ThusR(P) = 1

kPk_L2_(M_,_{d x)}

(10)

4 Some remarks on the upper bound of large deviations

Theorem 4.1. (Ney and Nummelin[14])Let f :E→Rd _{be measuable. The following upper bound}

of large deviation holds lim sup

n→∞

1 nlogPx

1 n

n−1 X

k=0

f(Xk)∈F,Xn∈C

≤ −inf z∈FΛ

∗

f(z), µ−a.e.x∈E (19) for all compact subsets F ofRd_{, where}

(a) either C is P-small set if f is bounded;

(b) or for allλ >0, there are some constant c(λ)>0andν∈ M₁(E)such that

e−λ|f(x)|P(x,d y)>c(λ)ν(d y), ∀x∈C (20) if f is unbounded.

Furthermore if d=1orΛ(δ|f|)<+∞, then (19) holds for all closed subsets F ofRd_.

Indeed for everyV =〈f(x),y〉where y ∈Rd_{, if} _f _{is bounded then every} _{P-small set} _C _{is also}

PV_{-small; if} _f _{is unbounded and}_C_{satisfies (20), then}_C_{is again}_PV_{-small. Thus by Lemma 2.1,}

Λ_f(y) =esssup_x_∈_Elim sup n→∞

1 nlogE

x₁

C(Xn)exp n−1 X

k=0

〈f,y〉(Xk)

, ∀y∈Rd

where the Theorem 4.1 follows by Gärtner-Ellis theorem (see[9]).

[3]constructed an exponentially recurrent Markov chain for which even the level-1 LDP ofLt(f) for some bounded and measurable f fails.

We now present a counter-example for which (19) does not hold forP-smallCbut unbounded f. Example 4.2. (see[19]) Leth(x)be a strictly increasing differentiable function onR+_{such that}

h(0) =0, sup x∈R+

|h′(x)|<+∞,a:=lim sup x→∞

h(x)

x >lim infx→∞

h(x)

x =:b>0

Assume that(ξk)k≥0is a sequence of independent and identically distributed nonnegative random

variables andP₍_ξ₀_> _t_{) =} _e−h(t)_, _t _≥_{0. Then}_{X_k_{= (}_ξ_k_,_ξ_k+₁₎_}_k_≥₀_{is a Markov chain valued in}

E= (R+)2_{, which is Doeblin recurrent, i.e.,}_E_is_{P-small. Let us show that (19) may be wrong with}

C=Efor some unboundedf.

Indeed let f(Xk):=ξk+1−ξk, ∀k≥1, then 1_nSn(f):= 1_n

k=1f(Xk) =_n1(ξn−ξ0). We have for

allr>0,

e−h((n+1)r)(1−e−h(r)) =P₍_ξ

n+1>(n+1)r,ξ1<r)

≤P₍Sn(f)

n >r)≤P(ξn+1>nr) =e

−h(nr) Consequently for allr>0,

lim sup n→+∞

1 nlogP(

S_n(f)

n )>r) =−br, lim infn→+∞

1 nlogP(

S_n(f)

n )>r) =−ar Then(1

(11)

References

[1] Baxter, J.R., Jain, N.C., Varadhan, S.R.S., Some familiar examples for which the large de-viation principle does not hold.Comm. Pure Appl. Math.44(1991), 911-923. MR1127039 MR1127039

[2] N. Bourbaki,"Espaces Vectoriels Topologiques(Livre V)", Chaps. III-V.Hermann, Paris, 1955.

[3] Bryc, W. and Smolenski, W., On the convergence of averages of mixing sequences.J. Theoret. Probab.6(1993), 473-483. MR1230342 MR1230342

[4] Conway, J.B., A Course in Functional Analysis . Springer, Berlin, 1985. MR0768926 MR0768926

[5] A. de Acosta, Large deviations for vector valued additive functionals of a Markov process: Lower bounds.Ann. Proba.16(1988), 925-960. MR0942748 MR0942748

[6] A. de Acosta and P. Ney, Large deviation lower bounds for arbitrary additive functionals of a Markov chain.Ann. Proba.26(1998), 1660-1682. MR1675055 MR1675055

[7] J.D. Deuschel and D.W. Stroock,Large deviations. Pure and Appl. Math.137, Academic Press, Inc., Boston, MA, 1989. MR0997938 MR0997938

[8] M.D. Donsker, S.R.S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, I-IV.Comm. Pur. Appl. Math. 28(1975), 1-47 and 279-301 ;29(1976), 389-461 ;36(1983), 183-212 . MR0386024

[9] A. Dembo and O. Zeitouni,Large Deviations Techiniques and Applications.38. Springer-Verlag, New York, 1998. MR1619036 MR1619036

[10] N.C. Jain, Large deviation lower bounds for additive functionals of Markov processes.Ann. Proba.18(1990), 1071-1098. MR1062059 MR1062059

[11] I. Kontoyiannis and S.P. Meyn, Spectral Theory and Limit Theory for Geometrically Ergodic Markov Processes.Ann. Appl. Prob.,13(2003), 304-362. MR1952001 MR1952001

[12] S.P. Meyn, Large Deviation Asymptotics and Control Variates for Simulating Large Functions. Ann. Appl. Prob.,16(2006), 310-339. MR2209344 MR2209344

[13] S.P. Meyn and R.L. Tweedie,Markov Chains and Stochastic Stability.Springer-Verlag, London, 1993. MR1287609 MR1287609

[14] P. Ney and E. Nummelin, Markov additive processes (I). Eigenvalue properties and limit theorems; (II). Large deviations.Ann. Proba.15(1987), 561-592, 593-609. MR0885131 and MR0885132 MR0885131

[15] E. Nummelin, General Irreducible Markov Chains and Non-Negative Operators. Cambridge Tracts in Math.,83, Cambridge Univ. Press, 1984. MR0776608 MR0776608

[16] Shing-Tung Yau and Richard Schoen, Lectures on Differential Geometry, in chinese. Higher Education Press in china, Beijing 2004. MR1333601

[17] L. Wu, Uniformly integrable operators and large deviations for Markov proceeses. J. Funct. Anal.172(2000), 301-376. MR1753178 MR1753178

(12)

[18] L. Wu, Some notes on large deviations of Markov processes .Acta Math. Sin. (Engl. Ser.),16 (2000), 369-394. MR1787093 MR1787093

[19] L. Wu, On large deviations for moving average processes. InProbability, Finance and Insur-ance, pp.15-49, the proceeding of a Workshop at the University of Hong-Kong (15-17 July 2002), Eds: T.L. Lai, H.L. Yang and S.P. Yung. World Scientific 2004, Singapour. MR2189197 MR2189197

[20] L. Wu, Essential spectral radius for Markov semigroups (I) : discrete time case.Probab. Th. Rel. Fields128(2004), 255-321. MR2031227 MR2031227

[21] L. Wu, Large and moderate deviations and exponential convergence for stochastic damping Hamiltonian systems.Stochastic Process. Appl.91(2001), 205-238. MR1807683 MR1807683

[22] K. Yosida, Functional Analysis, Third Version. Grundlehren der mathematischen Wis-senschaften 123, Springer-Verlag 1971.

(1)

Proof of Theorem 3.1 assuming Proposition 3.2. LetΦ(x):=kf(x)k+1. For everyy∈B′_{(the dual}

space ofB_),_V(x):=〈y,f(x)〉 ∈bΦB. Then by Proposition 3.2,

Λ_f(y) = Λ(V) =sup

¨Z

〈y,f(x)〉dν−Jµ(ν); ν∈ M1(E),ν(kfk)<+∞ «

=supn〈y,z〉 −J_µf(z); z∈B o

= (J_µf)∗(y). Let us prove thatJ_µf is convex onB_{. We have only to prove that} _Jf

µ(z)≤[Jµf(z1) +Jµf(z2)]/2 for

anyz,z₁,z₂∈B_{such that}_z_{= (z}

1+z2)/2. We may assume thatJµf(zk)<+∞, k=1, 2. Given anyǫ >0, there existν₁andν₂such that

ν_k(kfk)<+∞, ν_k(f) =zk, Jµ(νk)≤Jµf(zk) +ǫ for k=1, 2. Letν=ν1+ν2

2 , thenν(f) =z. BecauseJµ(ν)is convex inν∈ M1(E), we have

Jµ(ν)≤

Jµ(ν1) +Jµ(ν2)

2 ≤

J_µf(z1) +Jµf(z2)

2 +ǫ

Hence

µ(z)≤

µ(z1) +Jµf(z2)

2 +ǫ

which completes the proof of the convexity ofJ_µf, forǫ >0 is arbitrary. Consequently by the famous Fenchel’s theorem in convex analysis,

Λ∗_f = (J_µf)∗∗ is the lower semi-continuous regularizationJef

µ ofz→Jµf.

Proposition 3.2 is mainly based on the following continuity of the convergence parameter due to de Acosta.

We also require the following general result.

Lemma 3.4. Letµbe aσ-finite measure on(E,B)andΛ:bΦB →(−∞,+∞]be a convex function

such that

(i) If V1=V2, µ−a.e., thenΛ(V1) = Λ(V2);

(2)

ThenΛis lower semi-continuous (l.s.c. in short) w.r.t. the weak topologyσ(b_ΦB,M_b_,Φ(E))iff for every non-decreasing sequence(V_n)in b_ΦBsuch that V(x) =sup_nV_n(x)∈b_ΦB,

Λ(Vn)→Λ(V).

Proof. The necessity is obvious because ifV_n↑V in b_ΦB, thenV_n→V inσ(b_ΦB,M_b_,Φ(E))by dominated convergence. Let us prove the sufficiency.

Consider the isomorphismV →V/ΦfrombΦBtobBand define

Λ(U):= Λ(UΦ), ∀U∈bB.

Then ˜Λsatisfies again (i) and (ii) onbB. By the isomorphism above, the l.s.c. ofΛonbΦBw.r.t.

σ(bΦB,Mb,Φ(E))is equivalent to the l.s.c. of ˜ΛonbBw.r.t. σ(bB,Mb(E)). In other words we may assume without loss of generality thatΦ =1.

Because of condition (i),Λis well defined on L∞(µ)(which is the dual ofL1(µ)), and the l.s.c. of ΛonbBw.r.t.σ(bB,Mb(E))is equivalent to that ofΛonL∞w.r.t.σ(L∞,L1).

Below we prove the l.s.c. ofΛon L∞(µ)w.r.t. σ(L∞,L1). By taking an equivalent measure if necessary we may assume thatµis a probability measure. For anyL∈R_{, since}[Λ≤L]is convex, by the Krein-Smulyan theorem(see[4], page 163), it is closed in L∞w.r.t.σ(L∞_,_L1₎_iff

[Λ≤L]\B(R)

is closed for everyR>0, whereB(R):={V ∈L∞(µ); kVk_∞≤R}. SinceB(R)equipped with the weak∗-topologyσ(L∞_,_L1₎_{is metrizable (see}_{[2, Chap.IV, p.111]), for the desired l.s.c., it remains}

to prove that ifVn→V inσ(L∞,L1)and supnkVnk∞≤R<+∞, then lim inf

n→∞ Λ(Vn)≥Λ(V).

Taking a subsequence if necessary, we may assume thatl:=limn→∞Λ(Vn)exists in[−∞,+∞]. Asµis a probability measure,V_n→V in the weak topology ofL2_{(µ). By the Mazur theorem ([22,}

Chap. V, §1, Theorem 2]), there exists a sequence(U_n), eachU_nis a convex combinationU_nof {V_k,k≥n}such thatU_n→V inL2_{(µ). By the convexity of}_Λ,

lim sup n→∞

Λ(U_n)≤ lim

n→∞Λ(Vn) =l.

Now taking a subsequence Unk which converges to V, µ−a.e. and setWk = infj≥kUnj. Then

W_k ↑ W andW = V, µ−a.e.. By condition (ii), Λ(W_k)≤ Λ(U_n_k). By the assumed increasing continuity ofΛwe get finally

l≥lim sup k→∞

Λ(U_n_k)≥ lim

k→∞Λ(Wk) = Λ(W) = Λ(V).

Proof of Proposition 3.2. By Theorem 2.3, the right hand side (r.h.s.) of (16) is exactlyΛ∗∗, the double Legendre transform ofΛbasing on the duality between bΦB andMb,Φ(E). By Fenchel’s

theorem, (16) is equivalent to the l.s.c. ofΛw.r.t.σ(bΦB,Mb,Φ(E)), which holds true by Lemma

3.4 and de Acosta’s Lemma 3.3.

We end this section by two corollaries. We begin with the rate function governing the lower bound of large deviation of the occupation measureL_n:=1

Pn−1

k=0δXk.

(3)

Corollary 3.5. For every initial measureνand anyτ-open subset G∈ A, lim inf

n→∞ 1

nlogPν(Ln∈G)≥ −βinf∈GJµ(β). (17) This corollary was proved in de Acosta[5, Theorem 6.3]under an extra assumption guaranteing Jµ =J. This last assumption is also crucial in Jain’s work[10], but difficult to verify out of the

classical absolutely continuous framework of Donsker-Varadhan[8].

If one uses the Donsker-Varadhan’s entropyJ instead ofJµ in (17), the lower bound (17) would

be wrong. A simple counter-example is: E={0, 1}, P(0, 1) =1/2, P(1, 1) =1. Letν =δ1and

G={δ0}. The l.h.s. of (17) is−∞, butJ(δ0) =log 2.

Proof. Letβ₀be an arbitrary element ofGbut fixed. By the definition of theτ-topology, there is a bounded and measurable function f :E→Rd _{(for some}_d≥1) andδ >0 such that

N :={β∈ M1(E);|β(f)−β0(f)|< δ} ⊂G.

Here| · |is the Euclidian norm onRd_{. Hence by Theorem 2.2,}

lim inf n→∞

nlogPν(Ln∈ N)≥ −|z−zinf0|<δ

Λ∗_f(z)

wherez0=β0(f). By Theorem 3.1,

inf |z−z0|<δ

Λ∗_f(z) = inf |z−z0|<δ

J_µf(z)≤Jµ(β0).

Therefore we get

the l.h.s. of (17)≥ −Jµ(β0)

where (17) follows forβ0∈Gis arbitrary.

Corollary 3.6. The following equality holds true: R(P) =exp

inf

ν∈M1(E)Jµ(ν)

. (18)

Proof. It follows from Proposition 3.2.(16) withV=0. A pathological phenomenon may happen : R(P)>1.

Example 3.7. Let(B_t)_t≥₀be a Brownian Motion on a connected complete and stochastic complete Riemannian manifoldMwith sectional curvature less than−K(K>0) and letP(x,d y) =P_x(B₁∈ d y). Pis irreducible with a maximal irreducible measure given by the Riemann volume measure µ = d x. It is well known thatkPk_L2₍_M_,_{d x}₎ < 1 ([16]). It is obvious (from (3)) that 1

R(P) ≤

kPk_L2₍_M_,_{d x}₎, and the converse inequality holds by[20, Lemma 5.3]and the symmetry of P on

L2(M,d x). ThusR(P) = 1 kPk_L2₍_M_,_{d x}₎

(4)

4 Some remarks on the upper bound of large deviations

Theorem 4.1. (Ney and Nummelin[14])Let f :E→Rd _{be measuable. The following upper bound}

of large deviation holds lim sup

n→∞ 1 nlogPx

1 n

n−1 X

k=0

f(Xk)∈F,Xn∈C

≤ −inf z∈FΛ

∗

f(z), µ−a.e.x∈E (19) for all compact subsets F ofRd_{, where}

(a) either C is P-small set if f is bounded;

(b) or for allλ >0, there are some constant c(λ)>0andν∈ M₁(E)such that

e−λ|f(x)|P(x,d y)>c(λ)ν(d y), ∀x∈C (20) if f is unbounded.

Furthermore if d=1orΛ(δ|f|)<+∞, then (19) holds for all closed subsets F ofRd_.

Indeed for everyV =〈f(x),y〉where y ∈Rd_{, if} _f _{is bounded then every} _{P-small set} _C _{is also}

PV_{-small; if} _f _{is unbounded and}_C_{satisfies (20), then}_C_{is again}_PV_{-small. Thus by Lemma 2.1,} Λ_f(y) =esssup_x∈Elim sup

n→∞ 1 nlogE

x₁

C(Xn)exp n−1 X

k=0

〈f,y〉(X_k)

, ∀y∈Rd

where the Theorem 4.1 follows by Gärtner-Ellis theorem (see[9]).

Baxter et al. (1991)[1]found for the first time a Doeblin recurrent Markov chain which does not verify the level-2 LDP (but it satisfies the level-1 LDP for L_n(f)for bounded f by Theorems 2.2 and 4.1 since the whole spaceEisP-small in such case). Furthermore Bryc and Smolenski (1993) [3]constructed an exponentially recurrent Markov chain for which even the level-1 LDP ofLt(f) for some bounded and measurable f fails.

We now present a counter-example for which (19) does not hold forP-smallCbut unbounded f. Example 4.2. (see[19]) Leth(x)be a strictly increasing differentiable function onR+_{such that}

h(0) =0, sup x∈R+

|h′(x)|<+∞,a:=lim sup x→∞

h(x)

x >lim infx→∞ h(x)

x =:b>0

Assume that(ξk)k≥0is a sequence of independent and identically distributed nonnegative random

variables andP_(ξ₀_> _{t) =} _e−h(t)_, _t _≥_{0. Then}_{X_k_{= (ξ}_k_,_ξ_k₊₁_)}_k≥₀_{is a Markov chain valued in}

E= (R+)2_{, which is Doeblin recurrent, i.e.,}_E_is_{P-small. Let us show that (19) may be wrong with}

C=Efor some unboundedf.

Indeed let f(X_k):=ξ_k₊₁−ξ_k, ∀k≥1, then 1_nSn(f):= 1_n

k=1f(Xk) =1_n(ξn−ξ0). We have for

allr>0,

e−h((n+1)r)(1−e−h(r)) =P_(ξ

n+1>(n+1)r,ξ1<r)

≤P₍Sn(f)

n >r)≤P(ξn+1>nr) =e −h(nr)

Consequently for allr>0, lim sup

n→+∞ 1 nlogP(

S_n(f)

n )>r) =−br, lim infn→+∞ 1 nlogP(

S_n(f)

n )>r) =−ar Then(1

(5)

References

[1] Baxter, J.R., Jain, N.C., Varadhan, S.R.S., Some familiar examples for which the large de-viation principle does not hold.Comm. Pure Appl. Math.44(1991), 911-923. MR1127039 MR1127039

[2] N. Bourbaki,"Espaces Vectoriels Topologiques(Livre V)", Chaps. III-V.Hermann, Paris, 1955. [3] Bryc, W. and Smolenski, W., On the convergence of averages of mixing sequences.J. Theoret.

Probab.6(1993), 473-483. MR1230342 MR1230342