10.1 Construction of simple recursive descent parsers
For the kinds of language that satisfy the rules discussed in the last chapter, parser construction turns out to be remarkably easy. The syntax of these languages is governed by production rules of the form

    non-terminal → allowable string

where the allowable string is a concatenation derived from

  - the basic symbols or terminals of the language
  - other non-terminals
  - the actions of meta-symbols such as { }, [ ], and | .

We express the effect of applying each production by writing a procedure (or void function in C++ terminology) to which we give the name of the non-terminal that appears on its left side. The purpose of this routine is to analyse a sequence of symbols, which will be supplied on request from a suitable scanner (lexical analyser), and to verify that it is of the correct form, reporting errors if it is not. To ensure consistency, the routine corresponding to any non-terminal S:

  - may assume that it has been called after some (globally accessible) variable Sym has been found to contain one of the terminals in FIRST(S).
  - will then parse a complete sequence of terminals which can be derived from S, reporting an error if no such sequence is found. (In doing this it may have to call on similar routines to handle sub-sequences.)
  - will relinquish parsing after leaving Sym with the first terminal that it finds which cannot be derived from S, that is to say, a member of the set FOLLOW(S).

The shell of each parsing routine is thus

    PROCEDURE S;
    (* S → string *)
      BEGIN
        (* we assert Sym ∈ FIRST(S) *)
        Parse(string)
        (* we assert Sym ∈ FOLLOW(S) *)
      END S;

where the transformation Parse(string) is governed by the following rules:

(a) If the production yields a single terminal, then the action of Parse is to report an error if an unexpected terminal is detected, or (more optimistically) to accept it, and then to scan to the next symbol.

    Parse(terminal) →
        IF IsExpected(terminal)
          THEN GetSym
          ELSE ReportError
        END

(b) If we are dealing with a "single" production (that is, one of the form A = B), then the action of Parse is a simple invocation of the corresponding routine

    Parse(SingleProduction A) → B

This is a rather trivial case, just mentioned here for completeness. Single productions do not really need special mention, except where they arise in the treatment of longer strings, as discussed below.

(c) If the production allows a number of alternative forms, then the action can be expressed as a selection

    Parse(α1 | α2 | ... | αn) →
        CASE Sym OF
          FIRST(α1) : Parse(α1);
          FIRST(α2) : Parse(α2);
          ......
          FIRST(αn) : Parse(αn)
        END

in which we see immediately the relevance of Rule 1. In fact we can go further to see the relevance of Rule 2, for to the above we should add the action to be taken if one of the alternatives of Parse is empty. Here we do nothing to advance Sym - an action which must leave Sym, as we have seen, as one of the set FOLLOW(S) - so that we may augment the above in this case as

    Parse(α1 | α2 | ... | αn | ε) →
        CASE Sym OF
          FIRST(α1) : Parse(α1);
          FIRST(α2) : Parse(α2);
          ......
          FIRST(αn) : Parse(αn);
          FOLLOW(S) : (* do nothing *)
          ELSE ReportError
        END

(d) If the production allows for a nullable option, the transformation involves a decision

    Parse([ α ]) →
        IF Sym ∈ FIRST(α) THEN Parse(α) END

(e) If the production allows for possible repetition, the transformation involves a loop, often of the form

    Parse({ α }) →
        WHILE Sym ∈ FIRST(α) DO Parse(α) END

Note the importance of Rule 2 here again. Some repetitions are of the form

    S → α { α }

which transforms to

    Parse(α); WHILE Sym ∈ FIRST(α) DO Parse(α) END

On occasions this may be better written

    REPEAT Parse(α) UNTIL Sym ∉ FIRST(α)

(f) Very often, the production generates a sequence of terminals and non-terminals. The action is then a sequence derived from (a) and (b), namely

    Parse(α1 α2 ... αn) →
        Parse(α1); Parse(α2); ... Parse(αn)
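
As a rough illustration, rules (a) to (f) can be applied by hand to a small grammar. The sketch below is written in Python rather than in the Modula-2-like notation used above, and everything in it is an assumption made for the example: the grammar (List and Item), the toy scanner, and the names Parser, accept, get_sym, report_error and is_list are all hypothetical. Error handling is limited to raising an exception at the first error, with no attempt at recovery.

```python
import re


class ParseError(Exception):
    """Raised by report_error; this sketch makes no attempt at recovery."""


class Parser:
    # Hypothetical grammar, chosen only to exercise rules (a) to (f):
    #   List = "[" [ Item { "," Item } ] "]" .
    #   Item = number | ident | List .
    FIRST_ITEM = {"number", "ident", "["}

    def __init__(self, text):
        # Toy scanner: classify every token up front. Punctuation is
        # represented by the character itself; white space is skipped.
        self.tokens = []
        for number, ident, other in re.findall(r"(\d+)|([A-Za-z_]\w*)|(\S)", text):
            self.tokens.append("number" if number else "ident" if ident else other)
        self.tokens.append("eof")
        self.pos = 0
        self.sym = self.tokens[0]      # plays the role of Sym

    def get_sym(self):
        # Scan to the next symbol, staying on "eof" once it is reached.
        if self.pos < len(self.tokens) - 1:
            self.pos += 1
        self.sym = self.tokens[self.pos]

    def report_error(self, message):
        raise ParseError(message)

    def accept(self, expected):
        # Rule (a): accept the expected terminal or report an error.
        if self.sym == expected:
            self.get_sym()
        else:
            self.report_error(f"expected {expected!r}, found {self.sym!r}")

    def parse_list(self):
        # Rule (f): a sequence becomes a sequence of parsing actions.
        self.accept("[")
        if self.sym in self.FIRST_ITEM:    # rule (d): nullable option
            self.parse_item()              # rule (b): call the routine
            while self.sym == ",":         # rule (e): repetition
                self.accept(",")
                self.parse_item()
        self.accept("]")

    def parse_item(self):
        # Rule (c): select an alternative by inspecting Sym.
        if self.sym in ("number", "ident"):
            self.get_sym()
        elif self.sym == "[":
            self.parse_list()
        else:
            self.report_error(f"unexpected {self.sym!r} at start of Item")


def is_list(text):
    """Return True if text is a complete sentence derived from List."""
    parser = Parser(text)
    try:
        parser.parse_list()
        parser.accept("eof")   # nothing may follow the outermost List
    except ParseError:
        return False
    return True


print(is_list("[1, [a, b], []]"))  # True
print(is_list("[1, ]"))            # False
```

Each routine inspects only the single symbol held in sym before committing to an action, which is possible because this small grammar satisfies Rule 1 and Rule 2.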
» compilers and compiler generators 1996