Implementation of lookahead

10.2.5 Implementation of lookahead

  Function lookaheadp takes a grammar and a production, and returns the lookahead set of the production. It is defined in terms of four functions. The first three are each defined in a separate subsection below; the fourth is defined in this subsection.

  LL Parsing

  • isEmpty :: (Ord s, Symbol s) => CFG s -> s -> Bool

    Function isEmpty takes a grammar and a nonterminal and determines whether or not the empty string can be derived from the nonterminal in the grammar. (This function was called empty in Definition 3.)

  • firsts :: (Ord s, Symbol s) => CFG s -> [(s,[s])]

    Function firsts takes a grammar and computes the first set of each symbol (the first set of a terminal is the terminal itself).

  • follow :: (Ord s, Symbol s) => CFG s -> [(s,[s])]

    Function follow takes a grammar and computes the follow set of each nonterminal (so it associates a list of symbols with each nonterminal).

  • lookSet :: Ord s =>
               (s -> Bool) ->  -- isEmpty
               (s -> [s]) ->   -- firsts?
               (s -> [s]) ->   -- follow?
               (s, [s]) ->     -- production
               [s]             -- lookahead set

    Note that we use the operator ? (see Section 6.4.2) on the firsts and follow association lists. Function lookSet takes a predicate, two functions that, given a nonterminal, return its first set and its follow set respectively, and a production, and returns the lookahead set of the production. Function lookSet is introduced after the definition of function lookaheadp.
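  The operator ? is defined in Section 6.4.2, which is not reproduced here. As a rough sketch only, assuming that it is a plain lookup in an association list, it could look like the following. The names exFirsts and firstOf, and the choice to return the empty list for a missing key, are assumptions made for this illustration.

```haskell
import Data.Maybe (fromMaybe)

-- Assumed behaviour of (?): look a key up in an association list.
-- Returning [] for a missing key is a choice made for this sketch,
-- not something the text specifies.
(?) :: Eq s => [(s, [s])] -> s -> [s]
table ? key = fromMaybe [] (lookup key table)

-- Hypothetical first-set table for three symbols, for illustration only.
exFirsts :: [(Char, String)]
exFirsts = [('C', "d"), ('D', "d"), ('d', "d")]

-- A left section turns the table into a function of type s -> [s],
-- which is the shape lookSet expects for its second and third argument.
firstOf :: Char -> String
firstOf = (exFirsts ?)
```

  With these definitions, firstOf 'C' evaluates to "d", and a symbol that is missing from the table yields the empty list.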

  Now we define:

    lookaheadp :: (Symbol s, Ord s) => CFG s -> (s,[s]) -> [s]
    lookaheadp grammar =
      lookSet (isEmpty grammar) ((firsts grammar)?) ((follow grammar)?)

  We will illustrate the definition of function lookSet with the grammar exGrammar, which has the following productions:

    S → AaS | B | CB
    A → SC | ε
    B → A | b
    C → D
    D → d

  Consider the production S → AaS. The lookahead set of the production contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from A. But, since the nonterminal symbol A can derive the empty string, the lookahead set also contains the symbol a.

  Consider the production A → SC. The lookahead set of the production contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from S. But, since the nonterminal symbol S can derive the empty string, the lookahead set also contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from C.

  10.2 LL Parsing: Implementation

  Finally, consider the production B → A. The lookahead set of the production contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from A. But, since the nonterminal symbol A can derive the empty string, the lookahead set also contains the set of terminal symbols which can follow the nonterminal symbol B in some derivation.
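  All three examples depend on knowing which nonterminals can derive the empty string. The predicate isEmpty used in this section is defined in its own subsection; the sketch below is not that definition, but a standard fixed-point computation over a hypothetical list-of-pairs encoding of exGrammar. The names Production, exProductions, nullableSet and canBeEmpty are assumptions introduced here.

```haskell
import Data.List (nub)

-- A production is a nonterminal paired with its right-hand side.
type Production = (Char, String)

-- The productions of exGrammar in a hypothetical list-of-pairs encoding.
exProductions :: [Production]
exProductions =
  [ ('S', "AaS"), ('S', "B"), ('S', "CB")
  , ('A', "SC"),  ('A', "")
  , ('B', "A"),   ('B', "b")
  , ('C', "D"),   ('D', "d") ]

-- Grow the set of nonterminals known to derive the empty string until
-- nothing changes: a nonterminal qualifies as soon as one of its
-- right-hand sides consists only of symbols already in the set.
nullableSet :: [Production] -> [Char]
nullableSet prods = go []
  where
    go known
      | length known' == length known = known
      | otherwise                     = go known'
      where
        known' = nub (known ++ [ nt | (nt, rhs) <- prods
                                    , all (`elem` known) rhs ])

-- Membership test against the computed set; terminals are never in it.
canBeEmpty :: Char -> Bool
canBeEmpty = (`elem` nullableSet exProductions)
```

  For exGrammar this finds A first (through its empty alternative), then B (through B → A), then S (through S → B); C and D are never added.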

  The examples show that it is useful to have functions firsts and follow in which, for every nonterminal symbol n, we can look up the terminal symbols which can appear as the first terminal symbol of a sequence of symbols in some derivation from n and the set of terminal symbols which can follow the nonterminal symbol n in a sequence of symbols occurring in some derivation respectively. It turns out that the definition of function follow also makes use of a function lasts which is similar to the function firsts, but which deals with last nonterminal symbols rather than first terminal ones.

  The examples also illustrate a control structure which will be used very often in the following algorithms: we will fold over right-hand sides. While doing so we compute sets of symbols for all the symbols of the right-hand side which we encounter and collect them into a final set of symbols. Whenever such a list for a symbol is computed, there are always two possibilities:

  • either we continue folding and return the result of taking the union of the set obtained from the current element and the set obtained by recursively folding over the rest of the right-hand side,

  • or we stop folding and immediately return the set obtained from the current element.

  We continue if the current symbol is a nonterminal which can derive the empty sequence, and we stop if the current symbol is either a terminal symbol or a nonterminal symbol which cannot derive the empty sequence. The following function makes this statement more precise.

    foldrRhs :: Ord s =>
                (s -> Bool) ->
                (s -> [s]) ->
                [s] ->
                [s] ->
                [s]
    foldrRhs p f start = foldr op start
      where op x xs = f x `union` if p x then xs else []

  The function foldrRhs is, of course, most naturally defined in terms of the function foldr. This function is somewhere in between a general-purpose and an application-specific function (we could easily have made it more general though). In the exercises we give an alternative characterisation of foldrRhs. We will also need a function scanrRhs which is like foldrRhs but accumulates intermediate results in a list. The function scanrRhs is most naturally defined in terms of the function scanr.

    scanrRhs :: Ord s =>
                (s -> Bool) ->
                (s -> [s]) ->
                [s] ->
                [s] ->
                [[s]]
    scanrRhs p f start = scanr op start
      where op x xs = f x `union` if p x then xs else []

  Finally, we will also need a function scanlRhs which does the same job as scanrRhs but in the opposite direction. The easiest way to define scanlRhs is in terms of scanrRhs and reverse.

    scanlRhs p f start = reverse . scanrRhs p f start . reverse

  We now return to the function lookSet.
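  Before moving on to lookSet, it may help to see the three traversal functions at work on a small input. The sketch below repeats the definitions above, with the standard Data.List union standing in for whatever union the text uses, and adds toy inputs. The names canVanish and self are hypothetical and exist only for this illustration.

```haskell
import Data.List (union)

-- Fold over a right-hand side, continuing only past symbols for which p holds.
foldrRhs :: Ord s => (s -> Bool) -> (s -> [s]) -> [s] -> [s] -> [s]
foldrRhs p f start = foldr op start
  where op x xs = f x `union` if p x then xs else []

-- Like foldrRhs, but keeps the intermediate result for every suffix.
scanrRhs :: Ord s => (s -> Bool) -> (s -> [s]) -> [s] -> [s] -> [[s]]
scanrRhs p f start = scanr op start
  where op x xs = f x `union` if p x then xs else []

-- The same job in the opposite direction, via reverse.
scanlRhs :: Ord s => (s -> Bool) -> (s -> [s]) -> [s] -> [s] -> [[s]]
scanlRhs p f start = reverse . scanrRhs p f start . reverse

-- Toy predicate (assumption): only 'X' and 'Y' may derive the empty sequence.
canVanish :: Char -> Bool
canVanish = (`elem` "XY")

-- Toy set function (assumption): every symbol contributes just itself.
self :: Char -> String
self c = [c]
```

  With these toy inputs, foldrRhs canVanish self "$" "XYz" gives "XYz": folding stops at 'z', so the start value "$" is discarded. In contrast, foldrRhs canVanish self "$" "XY" gives "XY$", because every symbol can vanish and the start value is reached. On the input "XzY", scanrRhs yields ["Xz","z","Y$","$"] and scanlRhs yields ["$","X$","z","Yz"].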

    lookSet :: Ord s =>
               (s -> Bool) ->
               (s -> [s]) ->
               (s -> [s]) ->
               (s,[s]) ->
               [s]
    lookSet p f g (nt,rhs) = foldrRhs p f (g nt) rhs

  The function lookSet makes use of foldrRhs to fold over a right-hand side. As stated above, the function foldrRhs continues processing a right-hand side only if it encounters a nonterminal symbol for which p (so isEmpty in the lookSet instance lookaheadp) holds. Thus, the set g nt (follow?nt in the lookSet instance lookaheadp) is only important for those right-hand sides for nt that consist of nonterminals that can all derive the empty sequence. We can now (assuming that the definitions of the auxiliary functions are given) use the lookaheadp instance of lookSet to compute the lookahead sets of all productions.

    look nt rhs = lookaheadp exGrammar (nt,rhs)

    ? look 'S' "AaS"
    dba
    ? look 'S' "B"
    dba
    ? look 'S' "CB"
    d
    ? look 'A' "SC"
    dba
    ? look 'A' ""
    ad
    ? look 'B' "A"
    dba
    ? look 'B' "b"
    b
    ? look 'C' "D"
    d
    ? look 'D' "d"
    d

  It is clear from this result that exGrammar is not an LL(1)-grammar: for example, the productions S → AaS and S → B have the same lookahead set, dba. Let us have a closer look at how these lookahead sets are obtained. We will have to use the functions firsts and follow and the predicate isEmpty for computing intermediate results. The corresponding subsections explain how to compute these intermediate results.
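  The claim that exGrammar is not LL(1) can also be checked mechanically. The sketch below assumes the usual criterion, under which a grammar is LL(1) when the lookahead sets of any two distinct productions of the same nonterminal are disjoint. The table exLookaheads is copied by hand from the session above; the names exLookaheads, conflicts and isLL1 are assumptions introduced here.

```haskell
import Data.List (intersect)

-- Lookahead sets copied from the session: (nonterminal, rhs, lookahead set).
exLookaheads :: [(Char, String, String)]
exLookaheads =
  [ ('S', "AaS", "dba"), ('S', "B", "dba"), ('S', "CB", "d")
  , ('A', "SC",  "dba"), ('A', "",  "ad")
  , ('B', "A",   "dba"), ('B', "b", "b")
  , ('C', "D",   "d"),   ('D', "d", "d") ]

-- All pairs of distinct productions of the same nonterminal whose
-- lookahead sets share at least one symbol, with the shared symbols.
conflicts :: [(Char, String, String)] -> [(Char, String, String, String)]
conflicts table =
  [ (nt, rhs1, rhs2, common)
  | (i, (nt,  rhs1, la1)) <- indexed
  , (j, (nt', rhs2, la2)) <- indexed
  , i < j, nt == nt'
  , let common = la1 `intersect` la2
  , not (null common) ]
  where indexed = zip [(0 :: Int) ..] table

-- Under the assumed criterion, a grammar is LL(1) if there are no conflicts.
isLL1 :: [(Char, String, String)] -> Bool
isLL1 = null . conflicts
```

  For the table above, isLL1 exLookaheads is False. The function conflicts reports five overlapping pairs: three among the productions of S, one between A → SC and the empty alternative of A (shared symbols d and a), and one between B → A and B → b.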

  For the lookahead set of the production S → AaS we fold over the right-hand side AaS. Folding stops at 'a' and we obtain

    firsts? 'A' `union` firsts? 'a'
    ==
    "dba" `union` "a"
    ==
    "dba"

  For the lookahead set of the production A → SC we fold over the right-hand side SC. Folding stops at C since it cannot derive the empty sequence, and we obtain

    firsts? 'S' `union` firsts? 'C'
    ==
    "dba" `union` "d"
    ==
    "dba"

  Finally, for the lookahead set of the production B → A we fold over the right-hand side A. In this case we fold over the complete (one-element) list and we obtain

    firsts? 'A' `union` follow? 'B'
    ==
    "dba" `union` "d"
    ==
    "dba"

  The other lookahead sets are computed in a similar way.

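  All nine lookahead sets can be reproduced with a small self-contained sketch. It does not use the grammar library; the results of isEmpty, firsts and follow are instead written out as hand-made tables (emptyNts, firstsTab, followTab, all hypothetical names). The entries for firsts of S, A and C and for follow of B come from the calculations above, follow of A is read off the session result for the empty alternative of A, and the remaining entries were derived by hand from the productions and should be treated as assumptions. The lookup operator is an assumed stand-in for the one in Section 6.4.2, and the standard Data.List union is used.

```haskell
import Data.List (union)
import Data.Maybe (fromMaybe)

-- Nonterminals of exGrammar that can derive the empty string (hand-made).
emptyNts :: String
emptyNts = "SAB"

-- Hand-made first and follow tables. The follow sets of C and D are never
-- consulted for these productions and are left out.
firstsTab, followTab :: [(Char, String)]
firstsTab =
  [ ('S', "dba"), ('A', "dba"), ('B', "dba"), ('C', "d"), ('D', "d")
  , ('a', "a"), ('b', "b"), ('d', "d") ]
followTab =
  [ ('S', "d"), ('A', "ad"), ('B', "d") ]

-- Assumed lookup operator; a missing key yields [] in this sketch.
(?) :: Eq s => [(s, [s])] -> s -> [s]
table ? key = fromMaybe [] (lookup key table)

foldrRhs :: Ord s => (s -> Bool) -> (s -> [s]) -> [s] -> [s] -> [s]
foldrRhs p f start = foldr op start
  where op x xs = f x `union` if p x then xs else []

lookSet :: Ord s => (s -> Bool) -> (s -> [s]) -> (s -> [s]) -> (s, [s]) -> [s]
lookSet p f g (nt, rhs) = foldrRhs p f (g nt) rhs

-- Counterpart of the session's look, with the tables in place of the grammar.
look :: Char -> String -> String
look nt rhs = lookSet (`elem` emptyNts) (firstsTab ?) (followTab ?) (nt, rhs)
```

  With these tables, look 'S' "AaS" evaluates to "dba" and look 'A' "" to "ad", in agreement with the session shown earlier.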