10.2.7 Implementation of first and last
Function firsts takes a grammar and returns, for each symbol of the grammar (including the terminal symbols), the set of terminal symbols with which a sentence derived from that symbol can start. The first set of a terminal symbol consists of just that terminal symbol.
The first set of each symbol consists of that symbol itself, plus the (first) symbols that can be derived from that symbol in one or more steps. So the first set can be computed by an iterative process, just like the function isEmpty.
Consider the grammar exGrammar again. We start the iteration with
[('S',"S"),('A',"A"),('B',"B"),('C',"C"),('D',"D"),('a',"a"),('b',"b"),('d',"d")]
Using the productions of the grammar we can derive in one step the following lists of first symbols.
[('S',"ABC"),('A',"S"),('B',"Ab"),('C',"D"),('D',"d")]
and the union of these two lists is
[('S',"SABC"),('A',"AS"),('B',"BAb"),('C',"CD"),('D',"Dd"),('a',"a"),('b',"b"),('d',"d")]
In two steps we can derive
[('S',"SAbD"),('A',"ABC"),('B',"S"),('C',"d"),('D',"")]
and again we have to take the union of this list with the previous result. We repeat this process until the list no longer changes. For exGrammar this happens when the list is:
[('S',"SABCDabd"),('A',"SABCDabd"),('B',"SABCDabd"),('C',"CDd"),('D',"Dd"),('a',"a"),('b',"b"),('d',"d")]
Function firsts is defined as the fixedPoint of a step function that iterates the first computation one more step. The fixedPoint starts with the list that contains all symbols paired with themselves.
firsts :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
firsts grammar =
  fixedPoint (firstStepf grammar) (startSingle grammar)
startSingle :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
startSingle grammar = map (\x -> (x,[x])) (symbols grammar)
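The function fixedPoint is not defined in this section. As a reminder of the idea, a minimal sketch could look as follows. The names fixedPointSketch, toySymbols and startSingleSketch, and the toy Grammar type, are assumptions made for this illustration only; they are not the definitions used elsewhere in these notes.

```haskell
import Data.List (nub)

-- Hypothetical stand-in for fixedPoint:
-- apply f repeatedly until the result no longer changes.
fixedPointSketch :: Eq a => (a -> a) -> a -> a
fixedPointSketch f x
  | x' == x   = x
  | otherwise = fixedPointSketch f x'
  where x' = f x

-- Assumed toy grammar type: a start symbol plus a list of productions.
type Grammar = (Char, [(Char, String)])

-- All symbols occurring in the grammar, without duplicates.
toySymbols :: Grammar -> String
toySymbols (s, ps) = nub (s : concat [nt : rhs | (nt, rhs) <- ps])

-- Every symbol paired with the singleton list containing itself,
-- mirroring the role of startSingle.
startSingleSketch :: Grammar -> [(Char, String)]
startSingleSketch g = map (\x -> (x, [x])) (toySymbols g)
```

For the toy grammar S -> aS | b, written as ('S', [('S',"aS"),('S',"b")]), startSingleSketch yields [('S',"S"),('a',"a"),('b',"b")], which has the same shape as the start list of the iteration shown earlier.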
The step function takes the old approximation and performs one more iteration step. At each of these steps we have to add back the start list with which the iteration began.
firstStepf :: (Ord s, Symbol s) =>
              CFG s -> [(s,[s])] -> [(s,[s])]
firstStepf grammar approx =
  (startSingle grammar)
  `combine` (compose (first1 grammar) approx)
combine :: Ord s => [(s,[s])] -> [(s,[s])] -> [(s,[s])]
combine xs = foldr insert xs
  where insert (a,bs) [] = [(a,bs)]
        insert (a,bs) ((c,ds):rest)
          | a == c    = (a, union bs ds) : rest
          | otherwise = (c,ds) : (insert (a,bs) rest)
compose :: Ord a => [(a,[a])] -> [(a,[a])] -> [(a,[a])]
compose r1 r2 = [(a, unions (map (r2?) bs)) | (a,bs) <- r1]
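To see what one iteration step does, the following self-contained sketch runs combine and compose on a tiny relation. The lookup operator ? and the function unions are not defined in this section, so the versions below are assumptions about their behaviour. The type synonym Rel, the Eq constraints (instead of Ord), and the toy data for the grammar S -> aS | b are likewise chosen for illustration.

```haskell
import Data.List (union)

type Rel a = [(a, [a])]

-- Assumed definition of the lookup operator used by compose:
-- the list stored under a key, or [] when the key is absent.
(?) :: Eq a => Rel a -> a -> [a]
r ? x = maybe [] id (lookup x r)

-- Assumed definition of unions: the union of a list of lists.
unions :: Eq a => [[a]] -> [a]
unions = foldr union []

-- Merge the entries of the second relation into the first,
-- taking the union of the lists stored under equal keys.
combine :: Eq a => Rel a -> Rel a -> Rel a
combine xs = foldr insert xs
  where insert (a,bs) [] = [(a,bs)]
        insert (a,bs) ((c,ds):rest)
          | a == c    = (a, union bs ds) : rest
          | otherwise = (c,ds) : insert (a,bs) rest

-- Replace every symbol in the lists of r1 by its list in r2.
compose :: Eq a => Rel a -> Rel a -> Rel a
compose r1 r2 = [(a, unions (map (r2 ?) bs)) | (a,bs) <- r1]

-- Toy data for the grammar S -> aS | b (hypothetical example).
start, direct :: Rel Char
start  = [('S',"S"), ('a',"a"), ('b',"b")]
direct = [('S',"ab")]

-- One iteration step, in the style of firstStepf.
step :: Rel Char
step = start `combine` compose direct start
```

Here compose direct start evaluates to [('S',"ab")]: it has no entries for the terminals, since they have no productions. Combining with the start list restores them, so step is [('S',"abS"),('a',"a"),('b',"b")].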
Finally, function first1 computes the direct first symbols of all productions, taking into account that some nonterminals can derive the empty string, and combines the results for the different nonterminals.
first1 :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
first1 grammar =
  map (\(nt,fs) -> (nt,unions fs))
      (group (map (\(nt,rhs) -> (nt,foldrRhs (isEmpty grammar)
                                             single
                                             []
                                             rhs))
                  (prods grammar)))
where group groups elements with the same first element together:
group :: Eq a => [(a,b)] -> [(a,[b])]
group = foldr insertPair []
insertPair :: Eq a => (a,b) -> [(a,[b])] -> [(a,[b])]
insertPair (a,b) [] = [(a,[b])]
insertPair (a,b) ((c,ds):rest) =
  if a==c then (c,(b:ds)):rest else (c,ds):(insertPair (a,b) rest)
Function single takes an element and returns the set containing just that element, and unions returns the union of a set of sets.
Function lasts is defined using function firsts. Suppose we reverse the right-hand sides of all productions of a grammar. Then the first set of this reversed grammar is the last set of the original grammar. This idea is implemented in the following functions.
reverseGrammar :: Symbol s => CFG s -> CFG s
reverseGrammar =
  \(s,al) -> (s,map (\(nt,rhs) -> (nt,reverse rhs)) al)
lasts :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
lasts = firsts . reverseGrammar
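The reversal idea can be tried out on a small grammar with the self-contained sketch below. It does not use the CFG type or the firsts function of these notes. Instead it assumes a toy representation in which upper-case characters are nonterminals, and it computes only the terminals in each first set, for nonterminals only. All names in it (nullables, firstTerminals, lastTerminals, toy) are hypothetical.

```haskell
import Data.Char (isUpper)
import Data.List (nub, sort)

-- Assumed toy representation: start symbol plus productions;
-- upper-case characters are nonterminals.
type Grammar = (Char, [(Char, String)])

-- Nonterminals that can derive the empty string, found by iteration.
nullables :: Grammar -> String
nullables (_, ps) = go []
  where
    go ns
      | ns' == ns = ns
      | otherwise = go ns'
      where ns' = nub (ns ++ [nt | (nt, rhs) <- ps, all (`elem` ns) rhs])

-- Terminals that can start a sentence derived from each nonterminal.
-- A simplified stand-in for firsts, not the definition given above.
firstTerminals :: Grammar -> [(Char, String)]
firstTerminals g@(_, ps) =
  [(nt, sort (filter (not . isUpper) (reach [nt]))) | nt <- nts]
  where
    nts = nub (map fst ps)
    ns  = nullables g
    -- Symbols that can appear first in a right-hand side,
    -- looking past leading symbols that can derive the empty string.
    heads []     = []
    heads (x:xs) | x `elem` ns = x : heads xs
                 | otherwise   = [x]
    direct nt = concat [heads rhs | (n, rhs) <- ps, n == nt]
    -- Transitive closure of the direct relation.
    reach seen
      | seen' == seen = seen
      | otherwise     = reach seen'
      where seen' = nub (seen ++ concatMap direct seen)

-- Reverse the right-hand side of every production.
reverseGrammar :: Grammar -> Grammar
reverseGrammar (s, ps) = (s, [(nt, reverse rhs) | (nt, rhs) <- ps])

-- Last sets are the first sets of the reversed grammar.
lastTerminals :: Grammar -> [(Char, String)]
lastTerminals = firstTerminals . reverseGrammar

-- Hypothetical example grammar: S -> AB, A -> a | (empty), B -> b
toy :: Grammar
toy = ('S', [('S',"AB"), ('A',"a"), ('A',""), ('B',"b")])
```

For this grammar, firstTerminals toy gives [('S',"ab"),('A',"a"),('B',"b")], because A can derive the empty string and so a sentence derived from S may start with b. In the reversed grammar S -> BA, the symbol B cannot vanish, so lastTerminals toy gives [('S',"b"),('A',"a"),('B',"b")].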