4.4 Step 4: Deciding on the types
We want to write a parser for travel schemes, that is, we want to write a function ts of type
ts :: Parser ? ?
The question marks should be replaced by the input type and the result type, respectively. For the input type we can choose between at least two possibilities: characters, Char, or tokens, Token. The type of tokens can be chosen as follows:
data Token = Station_Token Station | Time_Token Time
type Station = String
type Time = (Int,Int)
We will construct a parser for both input types in the next subsection. So ts has one of the following two types:
ts :: Parser Char ?
ts :: Parser Token ?
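To make the two candidate input types concrete, the following sketch shows one travelling scheme both as a string of characters and as a list of tokens. It is only an illustration under several assumptions that the text above does not fix: the concrete layout of the input (whitespace-separated words, times written as h:mm), single-word station names, the reading of Time as (hours, minutes), and the helper names readTime, tokenize, asChars and asTokens. The deriving clause is added only so that values can be compared and printed.

```haskell
import Data.Char (isDigit)

type Station = String
type Time    = (Int, Int)   -- assumed to mean (hours, minutes)

data Token = Station_Token Station
           | Time_Token Time
           deriving (Eq, Show)

-- Hypothetical helper: recognise a word of the form "h:mm" or "hh:mm".
readTime :: String -> Maybe Time
readTime w =
  case break (== ':') w of
    (h, ':' : m)
      | not (null h), not (null m), all isDigit h, all isDigit m
      -> Just (read h, read m)
    _ -> Nothing

-- Hypothetical tokenizer: assumes whitespace-separated words and
-- single-word station names. Anything that is not a time is a station.
tokenize :: String -> [Token]
tokenize = map toToken . words
  where
    toToken w = maybe (Station_Token w) Time_Token (readTime w)

-- The same scheme as input for a parser over Char ...
asChars :: String
asChars = "Groningen 8:37 9:44 Zwolle"

-- ... and as input for a parser over Token.
asTokens :: [Token]
asTokens = tokenize asChars
```

A parser over Char has to deal with digits, colons and spaces itself, whereas a parser over Token can assume that this work has already been done and only has to check the order of stations and times.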
For the result type we have many choices. If we just want to compute the total travelling time, Int suffices for the result type. If we want to compute the total travelling time, the total waiting time, and a nicely printed version of the travelling scheme, we may do several things:
• define three parsers, with Int (total travelling time), Int (total waiting time),
and String (nicely printed version) as result type, respectively;
• define a single parser with the triple (Int,Int,String) as result type;
• define an abstract syntax for travelling schemes, say a datatype TS, and define
three functions on TS that compute the desired results.
The first alternative parses the input three times, and is rather inefficient compared with the other alternatives. The second alternative is hard to extend if we want to compute something extra, but in some cases it might be more efficient than the third alternative. The third alternative needs an abstract syntax. There are several ways to define an abstract syntax for travelling schemes. The first abstract syntax corresponds to definition (4.1) of grammar TS.
data TS1 = Single1 Station
| Cons1 Station Time Time TS1
where Station and Time are defined above. A second abstract syntax corresponds to the grammar for travelling schemes defined in (4.2).
type TS2 = ([(Station,Time,Time)],Station)
So a travelling scheme is a tuple, the first component of which is a list of triples consisting of a departure station, a departure time, and an arrival time, and the second component of which is the final arrival station. A third abstract syntax corresponds to the second grammar defined in Section 4.1:
data TS3 = Single3 Station
| Cons3 (Station,Time,[(Time,Station,Time)],Time,Station)
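As an illustration of how the three abstract syntaxes differ, the sketch below writes down one and the same two-leg journey in each of them. The journey itself, the reading of Time as (hours, minutes), the reading of the Cons3 components (departure station, departure time, intermediate stops as arrival time, station, departure time, then final arrival time and final station), and the names exampleTS1, exampleTS2, exampleTS3 and toTS2 are assumptions made for the example.

```haskell
type Station = String
type Time    = (Int, Int)   -- assumed to mean (hours, minutes)

data TS1 = Single1 Station
         | Cons1 Station Time Time TS1
         deriving (Eq, Show)

type TS2 = ([(Station, Time, Time)], Station)

data TS3 = Single3 Station
         | Cons3 (Station, Time, [(Time, Station, Time)], Time, Station)
         deriving (Eq, Show)

-- Groningen 8:37 -> 9:44 Zwolle, Zwolle 9:49 -> 10:15 Utrecht
exampleTS1 :: TS1
exampleTS1 =
  Cons1 "Groningen" (8, 37) (9, 44)
    (Cons1 "Zwolle" (9, 49) (10, 15)
      (Single1 "Utrecht"))

exampleTS2 :: TS2
exampleTS2 =
  ( [ ("Groningen", (8, 37), (9, 44))
    , ("Zwolle",    (9, 49), (10, 15)) ]
  , "Utrecht" )

exampleTS3 :: TS3
exampleTS3 =
  Cons3 ("Groningen", (8, 37), [((9, 44), "Zwolle", (9, 49))], (10, 15), "Utrecht")

-- Hypothetical conversion showing that TS1 and TS2 carry the same information.
toTS2 :: TS1 -> TS2
toTS2 (Single1 s)        = ([], s)
toTS2 (Cons1 s d a rest) = ((s, d, a) : legs, final)
  where (legs, final) = toTS2 rest
```

Under these assumptions, toTS2 exampleTS1 yields exampleTS2: both group the times per leg, while exampleTS3 groups them per intermediate station.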
Which abstract syntax should we take? Again, this depends on what we want to do with the abstract syntax. Since TS1 and TS2 keep the departure time and the arrival time of each leg of the journey together, they are convenient to use when computing travelling times. TS3 is useful when we want to compute waiting times, since it keeps the arrival time and the departure time at each intermediate station together. Often we want to exactly mimic the productions of the grammar in the abstract syntax, so if we use (4.1) for the grammar for travelling schemes, we use TS1 for the abstract syntax.
Note that TS1 is a datatype, whereas TS2 is a type. TS1 cannot be defined as a type because of the two alternative productions for TS. TS2 can be defined as a datatype by adding a constructor. Types and datatypes each have their advantages and disadvantages; the application determines which to use. The result type of the parsing function ts may be one of the types mentioned earlier (Int, etc.), or one of TS1, TS2, TS3.
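The following sketch suggests why the choice of abstract syntax matters for the functions defined on it. It assumes that Time means (hours, minutes), that a journey does not cross midnight, and that results are measured in minutes; the names minutes, travelTime1, travelTime2 and waitTime3 and the example values are hypothetical.

```haskell
type Station = String
type Time    = (Int, Int)   -- assumed to mean (hours, minutes)

data TS1 = Single1 Station
         | Cons1 Station Time Time TS1

type TS2 = ([(Station, Time, Time)], Station)

data TS3 = Single3 Station
         | Cons3 (Station, Time, [(Time, Station, Time)], Time, Station)

-- Minutes since midnight.
minutes :: Time -> Int
minutes (h, m) = 60 * h + m

-- Total travelling time on TS1: each Cons1 holds one departure/arrival pair.
travelTime1 :: TS1 -> Int
travelTime1 (Single1 _)        = 0
travelTime1 (Cons1 _ d a rest) = (minutes a - minutes d) + travelTime1 rest

-- Total travelling time on TS2: one triple per leg.
travelTime2 :: TS2 -> Int
travelTime2 (legs, _) = sum [minutes a - minutes d | (_, d, a) <- legs]

-- Total waiting time on TS3: one triple per intermediate stop,
-- holding the arrival time and the next departure time.
waitTime3 :: TS3 -> Int
waitTime3 (Single3 _)                 = 0
waitTime3 (Cons3 (_, _, stops, _, _)) = sum [minutes d - minutes a | (a, _, d) <- stops]

-- Groningen 8:37 -> 9:44 Zwolle, Zwolle 9:49 -> 10:15 Utrecht
exampleTS2 :: TS2
exampleTS2 = ([("Groningen", (8, 37), (9, 44)), ("Zwolle", (9, 49), (10, 15))], "Utrecht")

exampleTS3 :: TS3
exampleTS3 = Cons3 ("Groningen", (8, 37), [((9, 44), "Zwolle", (9, 49))], (10, 15), "Utrecht")
```

For this example, travelTime2 exampleTS2 is 93 and waitTime3 exampleTS3 is 5. Computing the waiting time on TS1 or TS2 would require pairing the arrival time of one leg with the departure time of the next, which is exactly the grouping TS3 already provides.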