The type ‘Parser’

3.1 The type ‘Parser’

  The goals of this section are:

  • develop a type Parser that is used to give the type of parsing functions; • show how to obtain this type by means of several abstraction steps.

  The parsing problem is (see Section 2.8): Given a grammar G and a string s, de- termine whether or not s ∈ L(G). If s ∈ L(G), the answer to this question may

  be either a parse tree or a derivation. For example, in Section 2.4 you have seen a grammar for sequences of s’s:

  S → SS | s

  A parse tree of an expression of this language is a value of the datatype SA (or a value of S, SA2, SA3, see Section 2.6, depending on what you want to do with the result of the parser), which is defined by

  data SA = BesideA SA SA | SingleA

  A parser for expressions could be implemented as a function of the following type: type Parser = String -> SA

  For parsing substructures, a parser can call other parsers, or call itself recursively. These calls do not only have to communicate their result, but also the part of the input string that is left unprocessed. For example, when parsing the string sss, a parser will first build a parse tree BesideA SingleA SingleA for ss, and only then build a complete parse tree

  BesideA (BesideA SingleA SingleA) SingleA using the unprocessed part s of the input string. As this cannot be done using

  a global variable, the unprocessed input string has to be part of the result of the parser. The two results can be grouped in a tuple. A better definition for the type Parser is hence:

  type Parser = String -> (SA,String) Any parser of type Parser returns an SA and a String. However, for different

  grammars we want to return different parse trees: the type of tree that is returned depends on the grammar for which we want to parse sentences. Therefore it is better to abstract from the type SA, and to make the parser type into a polymorphic type. The type Parser is parametrised with a type a, which represents the type of parse trees.

  type Parser a = String -> (a,String) For example, a parser that returns a structure of type Oak (whatever that is) now

  has type Parser Oak. A parser that parses sequences of s’s has type Parser SA. We might also define a parser that does not return a value of type SA, but instead the number of s’s in the input sequence. This parser would have type Parser Int.

  Parser combinators

  Another instance of a parser is a parse function that recognises a string of digits, and returns the number represented by it as a parse ‘tree’. In this case the function is also of type Parser Int. Finally, a recogniser that either accepts or rejects sentences of a grammar returns a boolean value, and will have type Parser Bool.

  Until now, we have assumed that every string can be parsed in exactly one way. In general, this need not be the case: it may be that a single string can be parsed in various ways, or that there is no way to parse a string. For example, the string ”sss” has the following two parse trees:

  BesideA (BesideA SingleA SingleA) SingleA BesideA SingleA (BesideA SingleA SingleA)

  As another refinement of the type definition, instead of returning one parse tree (and its associated rest string), we let a parser return a list of trees. Each element of the result consists of a tree, paired with the rest string that was left unprocessed after parsing. The type definition of Parser therefore becomes:

  type Parser a = String -> [(a,String)] If there is just one parsing, the result of the parse function is a singleton list. If no

  parsing is possible, the result is an empty list. In case of an ambiguous grammar, the result consists of all possible parsings.

  list of

  This method for parsing is called the list of successes method, described by Wadler

  suc-

  [14]. It can be used in situations where in other languages you would use so-called

  cesses

  backtracking techniques. In the Bird and Wadler textbook it is used to solve combi- natorial problems like the eight queens problem [3]. If only one solution is required rather than all possible solutions, you can take the head of the list of successes.

  lazy

  Thanks to lazy evaluation, not all elements of the list are computed if only the first

  evalua-

  value is needed, so there will be no loss of efficiency. Lazy evaluation provides a

  tion

  backtracking approach to finding the first solution. Parsers with the type described so far operate on strings, that is lists of characters.

  There is however no reason for not allowing parsing strings of elements other than characters. You may imagine a situation in which a preprocessor prepares a list of tokens (see Section 2.8), which is subsequently parsed. To cater for this situation we refine the parser type once more: we let the type of the elements of the input string be an argument of the parser type. Calling it b, and as before the result type

  a, the type of parsers is now defined by:

  type Parser b a = [b] -> [(a,[b])] or, if you prefer meaningful identifiers over conciseness:

  type Parser symbol result = [symbol] -> [(result,[symbol])] We will use this type definition in the rest of this chapter. This type is defined in

  listing 1, the first part of our parser library. The list of successes appears in result type of a parser. Each element of this list is a possible parsing of (an initial part of) the input. We will hardly use the generality provided by the Parser type: the type of the input b (or symbol) will almost always be Char.

  3.2 Elementary parsers

  -- The type of parsers type Parser symbol result = [symbol] -> [(result,[symbol])]

  Listing 1: ParserType.hs

Dokumen yang terkait

Analisis Komparasi Internet Financial Local Government Reporting Pada Website Resmi Kabupaten dan Kota di Jawa Timur The Comparison Analysis of Internet Financial Local Government Reporting on Official Website of Regency and City in East Java

19 819 7

ANTARA IDEALISME DAN KENYATAAN: KEBIJAKAN PENDIDIKAN TIONGHOA PERANAKAN DI SURABAYA PADA MASA PENDUDUKAN JEPANG TAHUN 1942-1945 Between Idealism and Reality: Education Policy of Chinese in Surabaya in the Japanese Era at 1942-1945)

1 29 9

Improving the Eighth Year Students' Tense Achievement and Active Participation by Giving Positive Reinforcement at SMPN 1 Silo in the 2013/2014 Academic Year

7 202 3

Improving the VIII-B Students' listening comprehension ability through note taking and partial dictation techniques at SMPN 3 Jember in the 2006/2007 Academic Year -

0 63 87

The Correlation between students vocabulary master and reading comprehension

16 145 49

The correlation intelligence quatient (IQ) and studenst achievement in learning english : a correlational study on tenth grade of man 19 jakarta

0 57 61

An analysis of moral values through the rewards and punishments on the script of The chronicles of Narnia : The Lion, the witch, and the wardrobe

1 59 47

Improping student's reading comprehension of descriptive text through textual teaching and learning (CTL)

8 140 133

The correlation between listening skill and pronunciation accuracy : a case study in the firt year of smk vocation higt school pupita bangsa ciputat school year 2005-2006

9 128 37

Transmission of Greek and Arabic Veteri

0 1 22