9.3 The pumping lemma for context-free languages
The Pumping Lemma for context-free languages states a property that every context-free language satisfies: in any sentence exceeding a certain length, two sublists of bounded length can be identified that can be duplicated together while the result remains a sentence. The idea behind this property is the following. Context-free languages are described by context-free grammars, and for each sentence in the language there exists a derivation tree. When the derivation tree of a sentence is higher than the number of nonterminals of the grammar, at least one nonterminal occurs twice on some path from the root to a leaf; consequently the part of the tree between these two occurrences can be inserted as often as desired.
As an example of an application of the Pumping Lemma, consider the context-free grammar with the following productions.
S → aAb
A → cBd
A → e
B → fAg
The following parse tree represents the derivation of the sentence acfegdb.
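The parse tree can also be written down as a small data structure. The following Python sketch assumes the productions S → aAb, A → cBd, A → e and B → fAg, which are consistent with the sentence acfegdb. The tuple encoding and the names `tree`, `sentence` and `repeated_on_path` are illustrative choices, not notation from the text.

```python
# A parse tree node is (nonterminal, children); a leaf is a terminal string.
# Assumed productions: S -> aAb, A -> cBd, A -> e, B -> fAg.
tree = ("S", ["a", ("A", ["c", ("B", ["f", ("A", ["e"]), "g"]), "d"]), "b"])


def sentence(t):
    """Concatenate the leaves of a parse tree from left to right."""
    if isinstance(t, str):
        return t
    return "".join(sentence(child) for child in t[1])


def repeated_on_path(t, seen=()):
    """Return a nonterminal that occurs twice on some root-to-leaf path, else None."""
    if isinstance(t, str):
        return None
    label, children = t
    if label in seen:
        return label
    for child in children:
        found = repeated_on_path(child, seen + (label,))
        if found is not None:
            return found
    return None


print(sentence(tree))          # the sentence derived by this tree
print(repeated_on_path(tree))  # the nonterminal that occurs twice on a path
```

In this encoding the nonterminal A occurs twice on the path that leads to the leaf e, which is the repetition the pumping argument relies on.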
If we replace the subtree rooted at the lower occurrence of nonterminal A by the subtree rooted at the upper occurrence of A, we obtain the following parse tree.
This parse tree represents the derivation of the sentence acfcfegdgdb. Thus we ‘pump’ the derivation of sentence acfegdb to the derivation of sentence acfcfegdgdb. Repeating this step once more, we obtain a parse tree for the sentence
acfcfcfegdgdgdb
We can repeatedly apply this process to obtain derivation trees for all sentences of the form
$a(cf)^i e(gd)^i b$
for $i \ge 0$. The case $i = 0$ is obtained if we replace, in the parse tree for the sentence acfegdb, the subtree rooted at the upper occurrence of nonterminal A by the subtree rooted at the lower occurrence of A. The result is a derivation tree for the sentence aeb. This step can be viewed as a negative pumping step.
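The pumping steps can be replayed on strings. The sketch below splits acfegdb into u = a, v = cf, w = e, x = gd, y = b and checks each pumped string with a small hand-written recognizer. The recognizer assumes the productions S → aAb, A → cBd, A → e and B → fAg; the function names are hypothetical.

```python
def pump(u, v, w, x, y, i):
    """Build the string u v^i w x^i y."""
    return u + v * i + w + x * i + y


def derives_A(s):
    """Check whether A derives s, using A -> e | cBd and B -> fAg,
    which combine to A -> e | c f A g d (assumed productions)."""
    if s == "e":
        return True
    return (len(s) >= 5 and s.startswith("cf") and s.endswith("gd")
            and derives_A(s[2:-2]))


def in_language(s):
    """Check whether S derives s, using S -> aAb (assumed production)."""
    return len(s) >= 3 and s[0] == "a" and s[-1] == "b" and derives_A(s[1:-1])


for i in range(4):
    z = pump("a", "cf", "e", "gd", "b", i)
    print(i, z, in_language(z))
```

The case i = 0 corresponds to the negative pumping step, i = 1 gives back the original sentence, and larger values of i repeat the inserted subtree.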
The proof of the following lemma is given in Section 9.4.
Theorem 2: Context-free Pumping Lemma
Let L be a context-free language. Then
there exist $c, d \in \mathbb{N}$ such that
for all $z \in L$ with $|z| > c$
there exist $u, v, w, x, y$ with $z = uvwxy$, $|vx| > 0$ and $|vwx| \le d$ such that
for all $i \in \mathbb{N}$: $u v^i w x^i y \in L$.
The Pumping Lemma is a tool with which we prove that a given language is not context-free. The proof obligation is to show that the property shared by all context-free languages does not hold for the language under consideration.
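Theorem 2 enables us to prove that a language L is not context-free by showing that
for all $c, d \in \mathbb{N}$
there exists $z \in L$ with $|z| > c$ such that
for all $u, v, w, x, y$ with $z = uvwxy$, $|vx| > 0$ and $|vwx| \le d$
there exists $i \in \mathbb{N}$ such that $u v^i w x^i y \notin L$.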
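The innermost part of this proof obligation ("for all decompositions there exists an i") can be explored by brute force on short strings. The following Python sketch enumerates every decomposition of a string z with |vx| > 0 and |vwx| ≤ d and tests a bounded range of values for i. The names `decompositions`, `every_split_breaks` and `in_example_language`, the bound `max_i`, and the assumption that the example grammar generates exactly the strings of the form a(cf)^i e(gd)^i b are all illustrative.

```python
import re
from itertools import combinations_with_replacement


def decompositions(z, d):
    """Yield all (u, v, w, x, y) with z = uvwxy, |vx| > 0 and |vwx| <= d."""
    for p, q, r, s in combinations_with_replacement(range(len(z) + 1), 4):
        # p <= q <= r <= s are the cut points between u, v, w, x and y.
        if s - p <= d and (q - p) + (s - r) > 0:
            yield z[:p], z[p:q], z[q:r], z[r:s], z[s:]


def every_split_breaks(z, d, member, max_i=3):
    """True if every decomposition leaves the language for some i <= max_i."""
    return all(
        any(not member(u + v * i + w + x * i + y) for i in range(max_i + 1))
        for u, v, w, x, y in decompositions(z, d)
    )


def in_example_language(s):
    """Membership in { a (cf)^i e (gd)^i b | i >= 0 } (assumed example language)."""
    m = re.fullmatch(r"a((?:cf)*)e((?:gd)*)b", s)
    return m is not None and len(m.group(1)) == len(m.group(2))


# For the context-free example language the check fails, as expected:
# the decomposition a | cf | e | gd | b survives every tested i.
print(every_split_breaks("acfegdb", 5, in_example_language))
```

A result of True is a genuine witness for the tested z and d, since a breaking i was found for each decomposition. A result of False only says that no breaking i was found up to `max_i`. Either way the check covers a single choice of z and d, whereas a proof has to handle all c and d.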
As an example, we will prove that the language T defined by
$T = \{a^n b^n c^n \mid n > 0\}$
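is not context-free.
Proof: Let $c, d \in \mathbb{N}$.
Take $z = a^r b^r c^r$ with $r = \max(c, d)$; if this gives $r = 0$, take $r = 1$ instead, so that $z \in T$ and $|z| = 3r > c$.
Let u, v, w, x, y be such that $z = uvwxy$, $|vx| > 0$ and $|vwx| \le d$. Note that our choice for r guarantees that $|vwx| \le r$, so substring vwx has one of the following shapes: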
• vwx consists of just a’s, or just b’s, or just c’s.
• vwx contains both a’s and b’s, or both b’s and c’s.
So vwx does not contain a’s, b’s, and c’s at the same time. Take i = 0, then
• If vwx consists of just a’s, or just b’s, or just c’s, then it is impossible to write the string uwy as $a^s b^s c^s$ for some s, since only the number of terminals of one kind is decreased.
• If vwx contains both a’s and b’s, or both b’s and c’s, it lies somewhere on the border between a’s and b’s, or on the border between b’s and c’s. Then the string uwy can be written as
$uwy = a^s b^t c^r$ or $uwy = a^r b^p c^q$
for some s, t, p, q, respectively. Since $|vx| > 0$, at least one of s and t, or of p and q, is less than r. Again this list is not an element of T.
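The case analysis can be sanity-checked by brute force for small values of r and d. The sketch below builds z = a^r b^r c^r, enumerates every decomposition with |vx| > 0 and |vwx| ≤ d, and confirms that uwy (the case i = 0) is never in T as long as d ≤ r. The helper names `in_T`, `decompositions` and `uwy_never_in_T` are hypothetical, and the check is an illustration on finitely many cases, not a replacement for the proof.

```python
from itertools import combinations_with_replacement


def in_T(s):
    """Membership in T = { a^n b^n c^n | n > 0 }."""
    n = len(s) // 3
    return n > 0 and s == "a" * n + "b" * n + "c" * n


def decompositions(z, d):
    """Yield all (u, v, w, x, y) with z = uvwxy, |vx| > 0 and |vwx| <= d."""
    for p, q, r, s in combinations_with_replacement(range(len(z) + 1), 4):
        if s - p <= d and (q - p) + (s - r) > 0:
            yield z[:p], z[p:q], z[q:r], z[r:s], z[s:]


def uwy_never_in_T(r, d):
    """True if, for z = a^r b^r c^r, taking i = 0 leaves T for every decomposition."""
    z = "a" * r + "b" * r + "c" * r
    return all(not in_T(u + w + y) for u, v, w, x, y in decompositions(z, d))


# The proof's choice of r guarantees d <= r; check a few such pairs.
print(all(uwy_never_in_T(r, d) for r in range(1, 6) for d in range(1, r + 1)))
```

The condition d ≤ r matters. For r = 2 and d = 4 the decomposition a | ab | b | c | c of aabbcc gives uwy = abc, which is in T, so the i = 0 argument no longer applies when vwx is long enough to touch all three kinds of terminals.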
Exercise 9.5. Prove that the following language is not context-free:
$\{a^{k^2} \mid k \ge 0\}$
Exercise 9.6. Prove that the following language is not context-free:
$\{a^i \mid i \text{ is a prime number}\}$
Exercise 9.7. Prove that the following language is not context-free:
$\{ww \mid w \in \{a, b\}^*\}$