Constructions on grammars

2.7 Constructions on grammars

  This section introduces some constructions on grammars that are useful when spec- ifying larger grammars, for example for programming languages. Furthermore, it gives an example of a larger grammar that is transformed in several steps.

  The BNF notation, introduced in section 2.2.1, was first used in the early sixties when the programming language ALGOL 60 was defined and until now it is the way of defining programming languages. See for instance the Java Language Grammar. You may object that the Java grammar contains more ”syntactical sugar” than the grammars that we considered thus far (and to be honest, this also holds for the Algol

  60 grammar): one encounters nonterminals with postfixes ’ ?’, ’+’ and ’’. This extended BNF notation, EBNF, is introduced because the definition of a pro-

  EBNF

  gramming language requires a lot of nonterminals and adding (superfluous) nonter- minals for standard construction such as:

  • one or zero occurrences of nonterminal P , (P ?), • one or more occurrences of nonterminal P , (P + ), • and zero or more occurrences of nonterminal P , (P ∗ ),

  decreases the readability of the grammar. In other texts you will sometimes find [P ] instead of P ?, and {P } instead of P ∗ . The same notation can be used for languages, grammars, and sequences of terminal and nonterminal symbols instead of single nonterminals. This section defines the meaning of these constructs.

  We introduced grammars as an alternative for the description of languages. Design- ing a grammar for a specific language may not be a trivial task. One approach is to decompose the language and to find grammars for each of its constituent parts.

  In definition 3 we defined a number of operations on languages using operations on sets. Here we redefine these operations using context-free grammars.

  Definition 10: Language operations

  Suppose we have grammars for the languages L and M , say G L = (T, N L ,R L ,S L ) and G M = (T, N M ,R M ,S M ). We assume that the sets N L and N M are disjoint. Then

  • L∪M is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,

  N=N L ∪N M ∪ {S} and R = R L ∪R M ∪ {S → S L ,S→S M ) • L M is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,

  N=N L ∪N M ∪ {S} and R = R L ∪R M ∪ {S → S L S M } •L ∗ is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,

  N=N L ∪ {S} and R = R L ∪ {S → , S → S L S}. •L + is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,

  N=N L ∪ {S} and R = R L ∪ {S → S L ,S→S L S}

  2 The nice thing about the above definitions is that the set-theoretic operation at the level of languages (i.e. sets of sentences) has a direct counterpart at the level

  of grammatical description. A straightforward question to ask is now: can I also

  Context-Free Grammars

  define languages as the difference between two languages or as the intersection of two languages? Unfortunately there are no equivalent operators for composing grammars that correspond to such intersection and difference operators.

  Two of the above constructions are important enough to also define them as grammar operations. Furthermore, we add a new construction for choice.

  Definition 11: Grammar operations

  Let G = (T, N, R, S) be a context-free grammar and let S 0 be a fresh nonterminal.

  Then

  G ∗ = (T, N ∪ {S 0 }, R ∪ {S 0 → ,S 0 → SS 0 }, S 0 ) star G

  G + = (T, N ∪ {S 0 }, R ∪ {S 0 → S, S 0 → SS 0 }, S 0 ) plus G

  G? = (T, N ∪ {S 0 }, R ∪ {S 0 → ,S 0 → S}, S 0 )

  optional G

  2 The definition of P ?, P + , and P ∗ is very similar to the definitions of the operations on grammars. For example, P ∗ denotes zero or more occurrences of nonterminal P , so Dig ∗ denotes the language consisting of zero or more digits.

  Definition 12: EBNF for sequences

  Let P be a sequence of nonterminals and terminals, then

  L(P ∗ ) = L(Z) with Z → | PZ L(P + ) = L(Z) with Z → P | PZ

  L(P ?) = L(Z) with Z→ |P where Z is a new nonterminal in each definition. 2 Because the concatenation operator for sequences is associative, the operators + and

  ∗ can also be defined symmetrically:

  L(P ∗ ) = L(Z) with Z → | ZP L(P + ) = L(Z) with Z → P | ZP

  There are many variations possible on this theme:

  L(P ∗ Q) = L(Z) with Z → Q | PZ

  2.7.1 SL: an example

  To illustrate EBNF and some of the grammar transformations given in the previous section, we give a larger example. The following grammar generates expressions in

  a very small programming language, called SL.

  Expr → if Expr then Expr else Expr Expr → Expr where Decls Expr → AppExpr

  AppExpr → AppExpr Atomic | Atomic

  Atomic → Var | Number | Bool | (Expr )

  Decls → Decl Decls → Decls ; Decls

  Decl → Var = Expr

  2.7 Constructions on grammars

  where the nonterminals Var , Number , and Bool generate variables, number expres- sions, and boolean expressions, respectively. Note that the brackets around the Expr in the production for Atomic, and the semicolon in between the Decls in the second production forDecls are also terminal symbols. The following ‘program’ is a sentence of this language:

  if true then funny true else false where funny = 7 It is clear that this is not a very convenient language to write programs in.

  The above grammar is ambiguous (why?), and we introduce priorities to resolve some of the ambiguities. Application binds stronger than if, and both application and if bind stronger then where. Using the introduction of priorities grammar trans- formation, we obtain:

  Expr → Expr1 Expr → Expr1 where Decls

  Expr1 → Expr2 Expr1 → if Expr1 then Expr1 else Expr1 Expr2 → Atomic Expr2 → Expr2 Atomic

  where Atomic, and Decls have the same productions as before. The nonterminal Expr2 is left recursive. Removing left recursion gives the following

  productions for Expr2 .

  Expr2 → Atomic | Atomic Z

  Z → Atomic | Atomic Z Since the new nonterminal Z has exactly the same productions as Expr2 , these

  productions can be replaced by

  Expr2 → Atomic | Atomic Expr2 So Expr2 generates a nonempty sequence of atomics. Using the +-notation intro-

  duced in this section, we may replace Expr2 by Atomic + . Another source of ambiguity are the productions for Decls. Decls generates a

  nonempty list of declarations, and the separator ; is assumed to be associative. Hence we can apply the associative separator transformation to obtain

  Decls → Decl | Decl ; Decls or, according to (2.1),

  Decls

  → (Decl ;) ∗ Decl

  which, using an omitted rule for that star-operator ∗, may be transformed into

  Decls

  → Decl (;Decl ) ∗

  Context-Free Grammars

  The last grammar transformation we apply is left factoring. This transformation applies to Expr , and gives

  Expr → Expr1 Z

  Z → | where Decls Since nonterminal Z generates either nothing or a where clause, we may replace Z

  by an optional where clause in the production for Expr .

  Expr → Expr1 (where Decls)? After all these grammar transformations, we obtain the following grammar.

  Expr → Expr1 (where Decls)?

  Expr1 → Atomic + Expr1 → if Expr1 then Expr1 else Expr1

  Atomic → Var | Number | Bool | (Expr )

  Decls → Decl (;Decl ) ∗

  Exercise 2.25 . Give the EBNF notation for each of the basic languages defined in section

  Exercise 2.26 . What language is generated by G? ? Exercise 2.27 . In case we define a language by means of a predicate it is almost trivial to

  define the intersection and the difference of two languages. Show how.

  Exercise 2.28 . Let L 1 = {a n b n c m | n, m ∈ IIN} and L 2 = {a n b m c m | n, m ∈ IIN}.

  • Give grammars for L 1 and L 2 .

  • Is L 1 ∩L 2 context free, i.e. can you give a context-free grammar for this

  language?

Dokumen yang terkait

Analisis Komparasi Internet Financial Local Government Reporting Pada Website Resmi Kabupaten dan Kota di Jawa Timur The Comparison Analysis of Internet Financial Local Government Reporting on Official Website of Regency and City in East Java

19 819 7

ANTARA IDEALISME DAN KENYATAAN: KEBIJAKAN PENDIDIKAN TIONGHOA PERANAKAN DI SURABAYA PADA MASA PENDUDUKAN JEPANG TAHUN 1942-1945 Between Idealism and Reality: Education Policy of Chinese in Surabaya in the Japanese Era at 1942-1945)

1 29 9

Improving the Eighth Year Students' Tense Achievement and Active Participation by Giving Positive Reinforcement at SMPN 1 Silo in the 2013/2014 Academic Year

7 202 3

Improving the VIII-B Students' listening comprehension ability through note taking and partial dictation techniques at SMPN 3 Jember in the 2006/2007 Academic Year -

0 63 87

The Correlation between students vocabulary master and reading comprehension

16 145 49

The correlation intelligence quatient (IQ) and studenst achievement in learning english : a correlational study on tenth grade of man 19 jakarta

0 57 61

An analysis of moral values through the rewards and punishments on the script of The chronicles of Narnia : The Lion, the witch, and the wardrobe

1 59 47

Improping student's reading comprehension of descriptive text through textual teaching and learning (CTL)

8 140 133

The correlation between listening skill and pronunciation accuracy : a case study in the firt year of smk vocation higt school pupita bangsa ciputat school year 2005-2006

9 128 37

Transmission of Greek and Arabic Veteri

0 1 22