
  X Programming Languages

The area of Programming Languages includes programming paradigms, language implementation, and the underlying theory of language design. Today's prominent paradigms include imperative (with languages like COBOL, FORTRAN, and C), object-oriented (C++ and Java), functional (Lisp, Scheme, ML, and Haskell), logic (Prolog), and event-driven (Java and Tcl/Tk). Scripting languages (Perl and Javascript) are a variant of imperative programming for Web applications. Event-driven programming is useful in Web-based and embedded applications, and concurrent programming serves applications in parallel computing environments. This section also provides a balanced treatment of the underlying theories of language design and implementation such as type systems, semantics, memory management, and compilers.

90 Imperative Language Paradigm  Michael J. Jipping and Kim Bruce
Introduction • Data Bindings: Variables, Type, Scope, and Lifetime • Control Structures • Best Practices • Research Issues and Summary

91 The Object-Oriented Language Paradigm  Raimund Ege
Introduction • Underlying Principles • Best Practices • Language Implementation Issues • Research Issues

92 Functional Programming Languages  Benjamin Goldberg
Introduction • History of Functional Languages • The Lambda Calculus: Foundation of All Functional Languages • Pure Versus Impure Functional Languages • SCHEME: A Functional Dialect of LISP • Standard ML: A Strict Polymorphic Functional Language • Nonstrict Functional Languages • HASKELL: A Nonstrict Functional Language • Research Issues in Functional Programming

93 Logic Programming and Constraint Logic Programming  Jacques Cohen
Introduction • An Introductory Example • Features of Logic Programming Languages • Historical Remarks • Resolution and Unification • Procedural Interpretation: Examples • Impure Features • Constraint Logic Programming • Recent Developments in CLP (2002) • Applications • Theoretical Foundations • Metalevel Interpretation • Implementation • Research Issues • Conclusion

94 Scripting Languages  Robert E. Noonan and William L. Bynum
Introduction • Perl • Tcl/Tk • PHP • Summary

95 Event-Driven Programming  Allen B. Tucker and Robert E. Noonan
Foundations: The Event Model • The Event-Driven Programming Paradigm • Applets • Event Handling • Example: A Simple GUI Interface • Event-Driven Applications

96 Concurrent/Distributed Computing Paradigm  Andrew P. Bernat and Patricia Teller
Introduction • Hardware Architectures • Software Architectures • Distributed Systems • Formal Approaches • Existing Languages with Concurrency Features • Research Issues • Summary

97 Type Systems  Luca Cardelli
Introduction • The Language of Type Systems • First-Order Type Systems • First-Order Type Systems for Imperative Languages • Second-Order Type Systems • Subtyping • Equivalence • Type Inference • Summary and Research Issues

98 Programming Language Semantics  David A. Schmidt
Introduction • A Survey of Semantics Methods • Semantics of Programming Languages • Applications of Semantics • Research Issues in Semantics

99 Compilers and Interpreters  Kenneth C. Louden
Introduction • Underlying Principles • Best Practices • Incremental Compilation • Research Issues and Summary

100 Runtime Environments and Memory Management  Robert E. Noonan and William L. Bynum
Introduction • Runtime Stack Management • Pointers and Heap Management • Garbage Collection • Summary

90 Imperative Language Paradigm

Michael J. Jipping
Hope College

Kim Bruce
Williams College

90.1 Introduction

90.2 Data Bindings: Variables, Type, Scope, and Lifetime
Binding Time • Variables • Types • Scope • Execution Units: Expressions, Statements, Blocks, and Programs

90.3 Control Structures
Conditional Structures • Iterative Structures • Unconstrained Control Structures: Goto and Exceptions • Procedural Abstraction • Data Abstraction

90.4 Best Practices
Data Bindings: Variables, Types, Scope, and Lifetime • Execution Units • Control Structures • Procedural Abstraction • Data Abstraction and Separate Compilation

90.5 Research Issues and Summary

      90.1 Introduction

In the 1940s, John von Neumann pioneered the design of basic computer architecture by structuring computers into two major units: a central processing unit (CPU), responsible for computations, and a data storage unit, or memory. This architecture is demand driven, based on a command and instruction-oriented computing model. The basic unit cycle of execution, typically composed of a single instruction, consists of four steps:

1. Obtain the addresses of the result and operands.
2. Obtain the operand data from the operand location(s).
3. Compute the result data from the operand data.
4. Store the result data in the result location.

Note in this sequence how the separation of the execution unit from the memory unit has structured the sequence. Data must be located and piped from memory, operated on, and transferred back to memory to be available for the next operation. All operations in a von Neumann machine operate this way, in a stepwise, structured manner. The von Neumann model has been the basis of nearly every computer built since the 1940s.

Imperative programming languages are modeled after the von Neumann model of machine execution and were invented to provide abstractions of machine components and actions in order to make it easier to program computers. Abstractions such as variables (which model memory cells), assignment statements (which model data transfer), and other language statements are all abstractions of the basic von Neumann approach. In this chapter we address the fundamental principles underlying imperative programming languages and examine the way the constructs of imperative languages are represented in several languages. We devote special attention to features of more modern imperative programming languages, among them support for abstract data types and newer control constructs such as iterators and exception handling. Examples in this chapter are given in a variety of imperative programming languages, including FORTRAN, Pascal, C, C++, MODULA-2, and Ada 83. In the Best Practices section we explore in more detail the languages FORTRAN IV (chosen for historical reasons), C and C++ (its imperative parts), and Ada 83.

    90.2 Data Bindings: Variables, Type, Scope, and Lifetime

      In this section we discuss some of the fundamental properties of imperative programming languages. In particular, we address issues related to binding time, the properties of variables, types, scope, and lifetime.

      90.2.1 Binding Time

We will find it useful to classify many of the differences in programming languages based on the notion of binding time. A binding is the association of an attribute to a name. The time at which a binding takes place is an important consideration. There are many times when a binding can occur. Some of these follow:

• Language definition: when the language is designed. An example is the binding of the constant name true to the corresponding Boolean value.
• Language implementation: when a compiler or interpreter is written. An example is the binding of the representation of values of various types.
• Compile time: when a program is being translated into machine language. For example, the type of a variable in a statically typed language is bound at compile time. In statically typed languages, overloaded functions are bound at compile time.
• Load time: when the executable machine language image of the program is loaded into memory for execution by the execution unit. The location of global variables is bound at load time.
• Procedure or function invocation time: the time a procedure or function is invoked while a program is being executed. Actual parameters are bound to formal parameters and local variables are bound to locations at procedure invocation time.
• Run time: any time during the execution of a program. A new value can be bound to a variable at run time. In dynamically typed languages, overloaded functions are bound at run time.

As we examine fundamental issues in the definition of imperative programming languages, we will keep in mind the distinctions between languages based on differences in binding time.

      90.2.2 Variables

Imperative languages support computation by executing commands whose purpose is to change the underlying state of the computer on which they are executed. The state of a computer encompasses the contents of memory, together with data that are about to be read from outside the computer and data that have been output.

Variables are central to the definition of imperative languages, as they are objects whose values depend on the contents of memory. A variable is characterized by its attributes, which generally include its name, location in memory, value, type, scope, and lifetime.

Depending on context, the meaning of a variable may be considered to be either its value or its location. For instance, in the assignment statement x := x + 1, the meaning of the occurrence of the variable x to the left of the assignment symbol is its location (sometimes called the l-value of x), whereas the meaning of the occurrence on the right side is its value, that is, the value stored at the location corresponding to x (sometimes called the r-value). The location of global variables is bound at load time, whereas the location of local variables and reference parameters is typically bound at procedure entry. The value of the variable can be changed at any point during execution of the program.

    90.2.3 Types

      Types in programming languages are abstractions which represent sets of values and the operations and relations which are applicable to them. Types can be used to hide the representation of the primitive values of a language, allow type checking at either compile time or run time, help disambiguate overloaded operators, and allow the specification of constraints on the accuracy of computations. Types also can play an important role in compiler optimization.

Types in a programming language include both simple and composite types. The use of simple types such as integer, real, Boolean, and character types allows the user to abstract away from the actual computer representation of these values, which may differ from computer to computer. The operations on simple types may or may not be supported directly by the underlying hardware. For instance, many early microprocessors supported real, or floating-point, operations only in software.

Some languages (e.g., those derived from Pascal) allow programmers to define their own simple enumerated types by simply listing the values of the type. The ordering of elements in this enumeration is significant, as these types typically support successor and predecessor functions as well as ordering relations.

      Later we will discuss mechanisms for supporting abstract data types, another way of constructing types which can be used as though they were primitive to a language.

      Many languages support the creation of subrange types, which allows a programmer to define a new type as a copy of a type with a subset of its values. The new type comes equipped with the same operators as its parent type and is usually compatible with the original type.

Composite or structured data types can be created from simple types using type constructors. Typical composite types include arrays, records (or structures), variant records (or unions), sets, subranges, pointer types, and, in a few languages, function or procedure types. For instance, arrays are typically constructed from two types: a subrange type which provides the set of indices of the array, and another type representing the values stored in the array. Not all languages support all these type constructors. For instance, function and procedure types are provided by MODULA-2 but are not available in Ada 83. Many languages support strings as special types of composite types, for instance, as arrays of characters, but they may also be provided as builtin types.

      Most imperative languages bind types to variables statically. These bindings are usually specified in declarations, but some languages, such as FORTRAN, allow implicit declaration of variables, with the type binding determined by the name of the identifier (e.g., in FORTRAN if the name starts with I through N then the variable is an integer, otherwise real).

An important issue in type-checking programming languages is type equivalence: when do two terms have equivalent types? The two extremes in the definitions of type equivalence are structural and name equivalence:

• Structural equivalence: Two types are said to be structurally (or domain) equivalent if they have the same structure; that is, they are built from the same type constructors and builtin types in the same way.
• Name equivalence: Two types are name equivalent if they have the same name.

The language C uses structural equivalence, whereas Ada 83 uses name equivalence. There is also a range of possibilities between these two extremes. For instance, Pascal and MODULA-2 use declaration equivalence: two types are declaration equivalent if they are name equivalent or they lead back to the same structure declaration by a series of redeclarations.

      Inequivalent types may be compatible in certain situations. For instance, two types are assignment compatible if an expression of one type may be assigned to a variable of another. For instance, in Pascal a subrange of integer is assignment compatible with integer, even though the types are not equivalent.

      An application of these ideas can be found in the rules for determining whether a particular actual parameter may be used in a procedure call for a particular formal parameter. In Pascal, if the formal parameter is a reference parameter then the actual parameter must be a variable of equivalent type. If the formal parameter is a value parameter then the actual parameter must be assignment compatible.

      As mentioned earlier, some languages support the creation of subrange types. The new subrange type is usually assignment compatible with the original type. Because of this compatibility, the new type is called a subtype of the parent in Ada. Another mechanism available in Ada, called derived typing, defines a new type by constructing an exact copy of a type that already exists. However, the resulting new type is distinct and is not type equivalent or even assignment compatible with the existing type.

      The type equivalence rules are the cause of one of the greatest limitations in the use of Pascal. If a formal parameter has an array type, then the actual parameter must have an equivalent type. In particular, the subscript ranges of the two arrays must be identical. Thus, it is impossible to write a procedure in Pascal which can be used to sort different-sized arrays of real numbers. (Actually, the current ANSI standard Pascal provides a special mechanism to allow exceptions to this rule.)

      Ada escapes from this problem by designating some properties of types to be static, while others are dynamic. For example, in a type defined to be a subrange of integers, the underlying static type is integer while the subrange bounds are a dynamic property. Only the static properties of types are considered at compile time by the type checker, whereas restrictions due to dynamic properties are checked at run time.

Consider the following Ada declarations as an example of type bindings:

type COINS is (PENNY, NICKEL, DIME, QUARTER);
subtype SILVER is COINS range NICKEL..QUARTER;
type CHANGE is new COINS;
C1, C2: COINS;
S: SILVER;
CH: CHANGE;

      COINS is an enumerated type, defined by the programmer to allow assignments such as

      C1 := DIME;

      

SILVER is a subrange of COINS that includes only the values NICKEL, DIME, and QUARTER. CHANGE is a derived type taken from COINS.

Because Ada employs name equivalence, only C1 and C2 are equivalent, but S is assignment compatible with them. If Ada used structural equivalence, then variables C1, C2, and CH would be equivalent.

    90.2.4 Scope

      The scope of a binding is the area or section of a program in which that particular binding is effective. The method and extent of scope rules that define a binding scope will, to a large degree, affect the usefulness and applicability of a language. If, for instance, the rules allow the scope of a binding to be determined by the execution path of a program, the language might be more flexible, yet the code becomes harder to understand.

Scope rules are tied tightly to concepts of binding time. Static scope rules determine the scope of a binding at compile time and are based on the lexical structure of the program. Dynamic scope rules determine the scope of a binding at run time. Thus, an occurrence of a variable name in a procedure may refer to one variable the first time it is evaluated yet refer to an entirely different variable the next time, depending on the execution path at run time. Most imperative languages use static scope rules.

with TEXT_IO; use TEXT_IO;
procedure SCOPED is
   package INT_IO is new INTEGER_IO (integer); use INT_IO;
   I, J: integer;
   procedure P is
   begin
      put (J); new_line;
   end P;
begin
   J := 0;
   I := 10;
   declare -- Block 1
      J: integer;
   begin
      J := I;     -- reference point A
      P;
   end;
   put (J); new_line;
   declare -- Block 2
      I: integer;
   begin
      I := 5;
      J := I + 1; -- reference point B
      P;
   end;
   put (J); new_line;
end;

FIGURE 90.1 Scoping rules in Ada.

As an example of scope rules in Ada, consider the code in Figure 90.1. Static scope rules are determined by the program block structure, which does not change while the program runs. Therefore, the call to procedure P prints the variable J defined in the outer, main program, no matter where it is called from. Likewise, the assignment in block 1 at reference point A changes J from the block and not from the main program. Dynamic scope rules, on the other hand, typically follow dynamic call paths to determine variable bindings. If Ada used dynamic scope rules, the first call to P from block 1 would print the value 10 corresponding to the J declared in block 1, whereas the second call to P would print the value 6 corresponding to the J from the main program (block 2 declares no J of its own, so the assignment at reference point B updates the main program's J).
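The static-scope behavior of Figure 90.1 can be mimicked in Python, which is also lexically scoped. This is a loose sketch of the idea rather than Ada semantics; the function names p and block1 are our own stand-ins for procedure P and Block 1:

```python
# Lexical (static) scoping in Python, loosely mirroring Figure 90.1:
# p always resolves J to the module-level binding, regardless of caller.
J = 0

def p():
    return J          # lexically bound to the global J

def block1():
    J = 10            # local J, analogous to Block 1's declaration of J
    return p()        # p still sees the global J under static scoping

print(block1())       # prints 0; under dynamic scoping it would print 10
```
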

    90.2.5 Execution Units: Expressions, Statements, Blocks, and Programs

An expression is a program phrase which returns a value. Expressions are built up from constants and variables using operators. As described earlier, a variable may denote either of two things, depending on context: its location or the value stored at that location. Operators may be builtin, like the arithmetic and comparison operators, or may be user-defined functions.

Reflecting the sequential order of von Neumann computation, an imperative language specifies the order in which operations are evaluated. Typically, evaluation order is determined by precedence rules. A typical precedence rule set for arithmetic expressions might be the following:

1. Subexpressions inside parentheses are evaluated first (according to the precedence rules).
2. Instances of unary negation are evaluated next.
3. Then, multiplication (*) and division (/) operators are evaluated in left to right order.
4. Finally, addition (+) and subtraction (−) operators are evaluated in left to right order.

Although precedence rules are commonly used by imperative languages, some languages use other conventions to avoid them. For example, PostScript uses postfix notation for expressions, while LISP uses prefix notation. APL evaluates all expressions from right to left without regard to precedence, using only parentheses to change the evaluation order.
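Python's arithmetic operators follow essentially the same precedence scheme, so the four rules can be checked directly:

```python
# Checking the precedence rules above in Python
assert (2 + 3) * 4 == 20   # rule 1: parentheses are evaluated first
assert -2 * 3 + 4 == -2    # rule 2: unary negation before * and +
assert 2 + 3 * 4 == 14     # rule 3: * binds tighter than +
assert 10 - 4 - 3 == 3     # rule 4: left to right, i.e., (10 - 4) - 3
```
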

      The fundamental unit of execution in an imperative programming language is the statement. A statement is an abstraction of machine language instructions, grouped together to form a single logical activity. The simplest and most fundamental statement in imperative programming languages is the assignment statement. This statement, typically written in the form x := e or x = e with x a variable (or other expression representing a location) and e an expression, is usually interpreted by evaluating e and copying its value into the location represented by x. This is known as the copy semantics for assignment.

      Less common are languages which use the sharing interpretation of assignment. In these languages, variables generally represent references to objects which contain the actual values. The assignment x := y would then be interpreted as binding the object referred to by y to x rather than its value. Since both variables refer to the same object, they share the same value. If the value of one is changed, the value of the other will also change. This is the sharing semantics for assignment.
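Python is one such language: for mutable objects, assignment binds a second name to an existing object rather than copying its value. A small sketch of sharing versus an explicit copy:

```python
import copy

a = [1, 2]
b = a                  # sharing: b and a now name the same list object
b.append(3)
assert a == [1, 2, 3]  # the change made through b is visible through a

c = copy.copy(a)       # an explicit copy restores copy-like behavior
c.append(4)
assert a == [1, 2, 3]  # a is unaffected by changes to c
```
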

      Declarations and statements may be grouped together to form a block. Procedure and function bodies are represented as blocks, whereas control structures (discussed subsequently) can also be understood as acting on blocks of statements (generally without declarations). The most general form of a block contains a declarative section, which contains the declarations that define the bindings that are effective in the block, and an executable section, which contains the statements over which the binding is to hold, i.e., the scope of the declarations.

In so-called block-structured languages (including most languages descended from ALGOL 60, e.g., Pascal, Ada, and C), blocks may be nested. Within any block, therefore, there can be two kinds of bindings in force: local bindings, which are specified by the declarative sections associated with the block, and nonlocal bindings (also known as global bindings), which are defined by the declarative sections of blocks within which the specific block is nested.

Consider again the code from Figure 90.1. The first two assignments of the main program assign J from the main program the value 0 and I from the main program the value 10. The next assignment assigns the value 10, derived from the global I, to the variable J from the first inner block. When the definition of the second inner block is encountered, the variable I is found in the local scope, while J is found in the outer scope, that of the main program. The value 6 will be printed for J at the end of the main program.

    90.3 Control Structures

By adopting the semantics of the basic execution cycle of a von Neumann architecture, an imperative language adopts a strict sequential ordering for its statements. By default, the next statement to execute is the next physical statement in the program. Control structures in imperative languages provide ways to alter this strict sequential ordering. The most common control structures are conditional structures and iterative structures. Unconstrained control structures are also allowed in most languages through the use of goto statements.

    90.3.1 Conditional Structures

Conditional control structures (also known as selection statements) determine whether or not a block of statements is executed based on the result of one or several tests. These structures fall into one of two classes:

    90.3.1.1 If Statements

All imperative languages include some form of if statement. This control structure provides a test and a single statement or statement block to be executed if the test evaluates to a true value. Optionally, the programmer may provide another block of statements which is executed only if the test evaluates to false. The following is a simple example from Ada:

if (x = 2) then
   y := 3;
else
   y := 6;
end if;

The variable y is set to either 3 or 6 depending on the value of x. In most languages, if statements can be nested within other control structures, including other if statements. However, nested if statements can result in awkward, deeply nested code. Thus, many languages provide a special construct (e.g., elsif in Ada) to represent else if constructs without requiring further nesting. The two Ada examples given next are equivalent semantically, though the first, which uses elsif, is easier to read than the second, which uses nested conditionals:

if (x = 2) then
   y := 3;
elsif (x = 3) then
   y := 15;
elsif (x = 5) then
   y := 18;
else
   y := 6;
end if;

if (x = 2) then
   y := 3;
else
   if (x = 3) then
      y := 15;
   else
      if (x = 5) then
         y := 18;
      else
         y := 6;
      end if;
   end if;
end if;
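The same chain transliterates directly into Python, whose elif plays the role of Ada's elsif; the function name select is our own wrapper for illustration:

```python
def select(x):
    # elif avoids deep nesting, exactly as Ada's elsif does
    if x == 2:
        y = 3
    elif x == 3:
        y = 15
    elif x == 5:
        y = 18
    else:
        y = 6
    return y

assert select(2) == 3
assert select(3) == 15
assert select(7) == 6   # none of the tests succeed: the else branch
```
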

90.3.1.2 Case Statements

This conditional combines case-by-case expression examination with a restricted multiway conditional.

This conditional may be seen to be simply a syntactic convenience, but in many cases its implementation results in a much faster determination at run time of the actual block of code to be executed. Consider the following case statement from Ada:

case y is
   when 2 => y := 3;
   when 3 => y := 15;
   when 15 => y := 18;
   when others => y := 6;
end case;

An expression (y in this case) of an ordinal type occurs after the keyword case. Each when clause contains a guard, which is a list of one or more constants of the same type as the expression. Most languages require that there be no overlap between these guards. The expression after the keyword case is evaluated, and the resulting value is compared to the guards. The block of statements connected with the first matched alternative is executed. If the value does not correspond to any of the guards, the statements in the others clause are executed. Note that the semantics of this example is identical to that of the previous example.

      The case statement may be implemented in the same way as a multiway if statement, but in most languages it will be implemented via table lookup, resulting in a constant time determination of which block of code is to be executed. C’s switch statement differs from the case previously described in that if the programmer does not explicitly exit at the end of a particular clause of the switch, program execution will continue with the code in the next clause.
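The table-lookup implementation can be sketched in Python, which has no case statement with fall-through; a dictionary stands in for the lookup table, and the default argument of get plays the role of the others clause (the name dispatch is ours):

```python
def dispatch(y):
    # dictionary lookup gives constant-time selection, like a compiled case
    table = {2: 3, 3: 15, 15: 18}
    return table.get(y, 6)   # the default 6 acts as "when others"

assert dispatch(2) == 3
assert dispatch(15) == 18
assert dispatch(99) == 6     # no guard matches: the others value
```
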

    90.3.2 Iterative Structures

One of the most powerful features of an imperative language is the specification of iteration, or statement repetition. Iterative structures can be classified as either definite or indefinite, depending on whether the number of iterations to be executed is known before the execution of the iterative command begins:

• Indefinite iteration: The different forms of indefinite iteration control structures differ by where the test for termination is placed and whether the success of the test indicates the continuation or termination of the loop. For instance, in Pascal the while-do control structure places the test before the beginning of the loop body (a pretest), and a successful test determines that the execution of the loop shall continue (a continuation test). Pascal's repeat-until control structure, on the other hand, supports a posttest, which is a termination test; that is, the test is evaluated at the end of the loop, and a success results in termination of the loop.

Some languages also provide control structures which allow termination anywhere in the loop. The following example is from Ada:

loop
   ...
   exit when test;
   ...
end loop;

The exit when test statement is equivalent to if test then exit.
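In a language without exit when, such as Python, the same mid-loop termination is written with an unconditional loop and a conditional break; a minimal sketch:

```python
# loop ... exit when test; ... end loop, rendered with while True / break
i = 0
while True:
    i += 1
    if i >= 5:    # exit when i >= 5
        break
assert i == 5     # the loop body ran five times before the exit
```
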

A few languages also provide a construct to allow the programmer to terminate the execution of the body of the loop and proceed to the next iteration (e.g., C's continue statement), whereas some provide a construct to allow the user to exit from many levels of nested loop statements (e.g., Ada's named exit statements).

• Definite iteration: The oldest form of iteration construct is the definite or fixed-count iteration form, whose origins date back to FORTRAN. This type of iteration is appropriate for situations where the number of iterations called for is known in advance. A variable, called the iteration control variable (ICV), is initialized with a value and then incremented or decremented by regular intervals for each iteration of the loop. A test is performed before each loop body execution to determine if the ICV has gone over a final, boundary value. Ada provides fixed-count iteration as a for loop; an example is shown next.

for i in 1..10 loop
   y := y + i;
   z := z * i;
end loop;

      Here, i is initialized to 1, and incremented by 1 for each iteration of the loop, until it exceeds 10. Note that this type of loop is a pretest iterative structure and is essentially syntactic sugar for an equivalent while loop.
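The Ada loop above transliterates directly to a Python for statement over a range; note that range(1, 11) yields the values 1 through 10, matching 1..10:

```python
# Fixed-count iteration: the ICV i takes the values 1..10 in order
y, z = 0, 1
for i in range(1, 11):
    y = y + i
    z = z * i
assert y == 55        # 1 + 2 + ... + 10
assert z == 3628800   # 10 factorial
```
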

An ambiguity that may arise with a for loop is what value the iteration control variable has after termination of the loop. Most languages specify that the value is formally undetermined after termination of the loop, though in practice it usually contains either the upper limit of the ICV or the first value assigned that passes the boundary value. Ada eliminates this ambiguity by treating the introduction of the control variable as a variable declaration for a block containing only the for loop.

Some modern programming languages have introduced a more general form of for loop called an iterator construct. Iterators allow the programmer to control the scheme for providing the iteration control variable with successive values. The following example is from CLU [Liskov et al. 1977]. We first define the iterator:

string_chars = iter (s : string) yields (char);
   index: int := 1;
   limit: int := string$size (s);
   while index <= limit do
      yield (string$fetch(s, index));
      index := index + 1;
   end;
end string_chars;

which can be used in a for loop as follows:

for c: char in string_chars(s) do
   LoopBody
end;

When the for loop controlled by an iterator is encountered, control is passed to the iterator, which runs until a yield statement is executed. The value associated with the yield statement is used as the initial value of the iterator control variable c, and the body of the loop is executed. Control is then passed back to the iterator, which resumes execution with the statement following the yield. Control is passed to the loop body each time a yield statement is executed and back to the iterator each time the loop body finishes execution. Thus, iterators behave as a restricted form of coroutine, passing control back and forth between the two blocks of code. The loop is terminated when the iterator runs to completion. In the preceding example this will occur when index > limit.
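Python generators descend from exactly this style of iterator: yield suspends the generator, hands a value to the loop body, and resumes after the yield on the next iteration. A transliteration of the CLU string_chars (zero-based indexing is Python's convention, not CLU's):

```python
def string_chars(s):
    # each yield passes one character to the for loop; control then
    # returns here and resumes with the statement after the yield
    index = 0
    limit = len(s)
    while index < limit:
        yield s[index]
        index += 1

chars = [c for c in string_chars("abc")]
assert chars == ["a", "b", "c"]   # loop ends when the generator completes
```
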

    90.3.3 Unconstrained Control Structures: Goto and Exceptions

Unconstrained control structures, generally known as goto constructs, cause control to be passed to the statement labeled by the identifier or line number given in the goto statement. Dijkstra [1968] first questioned the use of goto statements in his famous letter, "Go to statement considered harmful," to the editor of the Communications of the ACM. The controversy over the goto mostly centers on readability of code and handling of the arbitrary transfer of control into and out of otherwise structured sections of program code.

      For example, if a goto statement passes control into the middle of a loop block, how is the loop to be initialized, especially if it is a fixed-count loop? Even worse, what happens when a goto statement causes control to enter or exit in the middle of a procedure or function? The problems with readability arise because a program with many goto statements can be very hard to understand if the dynamic (run time) flow of control of the program differs significantly from the static (textual) layout of the program. Programs with undisciplined use of gotos have earned the name of spaghetti code for their similarity in structure to a plate of spaghetti.

      Although some argue for the continued importance of goto statements, most languages either greatly restrict their use (e.g., do not allow gotos into other blocks) or eliminate them altogether. In order to handle situations where gotos might be called for, other, more restrictive language constructs have been introduced to make the resulting code more easily readable. These include the continue and exit statements (particularly labeled exit statements) referred to earlier.

      Another construct which has been introduced in some languages in order to replace some uses of the goto statement is the exception. An exception is a condition or event that requires immediate action on the part of the program. An exception is raised or signaled implicitly by an event such as arithmetic overflow or an index out of range error, or it can be explicitly raised by the programmer.

The raising of an exception results in a search for an exception handler, a block of code defined to handle the exceptional condition and (hopefully) allow normal processing to resume. The search for an appropriate handler generally starts with the routine that is executing when the exception is raised. If no appropriate handler is found there, the search continues with the routine that called the one containing the exception. The search proceeds through the chain of routine calls until an appropriate handler is found or the end of the call chain is reached without finding one.

If no handler is found, the program terminates; if a handler is found, the code associated with the handler is executed. Different languages support different models for resuming execution of the program. Under the termination model of exception handling, the routine containing the handler terminates after the handler runs, with execution resuming in the caller of that routine. Under the continuation (or resumption) model, execution typically resumes immediately after the statement whose execution caused the exception, at the point where it was raised.

The following is an example of the use of exceptions in Ada (which uses the termination model):

    procedure pop(s: in out stack) is
    begin
        if empty(s) then
            raise emptyStack;
        else
            ...
        end if;
    end pop;

    function balance(parens: string) return boolean is
        pStack: stack;
    begin
        ...
        if ... then
            pop(pStack);
        ...
    exception
        when emptyStack => return false;
    end balance;

      Many variations on exceptions are found in existing languages. However, the main characteristics of exception mechanisms are the same. When an exception is raised, execution of a statement is abandoned and control is passed to the nearest handler. (Here “nearest” refers to the dynamic execution path of the program, not the static structure.) After the code associated with the handler is executed, normal execution of the program resumes.
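The dynamic search for the nearest handler can be sketched in Python (which also uses the termination model); the names here mirror the Ada example and are illustrative:

```python
class EmptyStack(Exception):
    """Raised when pop is applied to an empty stack."""

def pop(stack):
    # No handler in this routine: the search moves dynamically to the caller.
    if not stack:
        raise EmptyStack()
    return stack.pop()

def balance(parens):
    # pop raises the exception, but the nearest handler along the dynamic
    # call chain is the one below, in balance.
    stack = []
    try:
        for ch in parens:
            if ch == "(":
                stack.append(ch)
            elif ch == ")":
                pop(stack)
        return not stack
    except EmptyStack:
        return False   # termination model: balance finishes via the handler

print(balance("(())"))   # True
print(balance("())"))    # False, reached through the EmptyStack handler
```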

      The use of exceptions has been criticized by some as introducing the same problems as goto statements. However, it appears that disciplined use of exceptions for truly exceptional conditions (e.g., error handling) can result in much clearer code than other ways of handling these problems.

We complete our discussion of control structures by noting that, although many control structures exist, only a very few are actually necessary. At one extreme, simple conditionals and a goto statement are sufficient to replace any control structure. On the other hand, it has been shown [Boehm and Jacopini 1966] that sequencing, a two-way conditional, and a while loop are sufficient to replace any control structure. This result has led some to point out that a language has no need for a goto statement; indeed, there are languages that do not have one.

      90.3.4 Procedural Abstraction

      Support for abstraction is very useful in programming languages, allowing the programmer to hide details and definitions of objects while focusing on functionality and ease of use. Procedural abstraction [Liskov and Guttag 1986] involves separating out the details of an execution unit into a procedure and referencing this abstraction in a program statement or expression. The result is a program that is easier to understand, write, and maintain.

The role of procedural abstraction is best understood by considering the relationships between the four levels of execution units described earlier: expressions, statements, blocks, and programs. A statement can contain several expressions; a block contains several statements; a program may contain several blocks. Following this model, a procedural abstraction replaces one execution unit with another one that is simpler. In practice, it typically replaces a block of statements with a single statement or expression.

The definition of a procedure binds the abstraction to a name and to an executable block of statements called the body. These bindings are compile-time, declarative bindings. In Ada, such a binding is made by specifying code such as the following:

    procedure area (height, width: real; result: out real) is
    begin
        result := height * width;
    end;

      The invocation of a procedure creates an activation of that procedure at run time. The activation record for a procedure contains data bound to a particular invocation of a procedure. It includes slots for parameters, local variables, other information necessary to access nonlocal variables, and data to enable the return of control to the caller. In languages supporting recursive procedures, more than one activation record can exist at the same time for a given procedure. In those languages, the lifetime of the activation record is the duration of the procedure activation.
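The coexistence of several activation records for one procedure is easy to see with a recursive definition; a small Python sketch:

```python
def fact(n):
    # Each recursive invocation of fact gets its own activation record,
    # with its own slots for n and result; while fact(3) is computing,
    # three activations of fact coexist on the call stack.
    if n <= 1:
        return 1
    result = n * fact(n - 1)   # the deeper activation's n is a separate slot
    return result              # this activation's n is undisturbed

print(fact(3))   # 6: the activations held n = 3, 2, and 1, each in its own record
```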

      Although scoping rules provide access to nonlocal variables, it is generally preferable to access nonlocal information via parameter passing. Parameter-passing mechanisms can be classified by the direction in which the information flows: in parameters, where the caller passes data to the procedure, but the procedure does not pass data back; out parameters, where the procedure returns data values to the caller, but no data are passed in; and in out parameters, where data flow in both directions.

Formal parameters are specified in the declaration of a procedure. The actual parameters to be used in the procedure activation are specified in the procedural invocation. The parameter passing mechanism creates an association between corresponding formal and actual parameters. The precise information flow that occurs during procedure invocation depends on the parameter passing mechanism.

      The association or mapping of formal to actual parameters can be done in one of three ways. The most common method is positional parameter association, where the actual parameters in the invocation are matched, one by one in a left-to-right fashion, to the formal parameters in the procedural definition.

      

Named parameter association also can be used, where a name accompanies each actual parameter and determines to which formal parameter it is associated. Using this method, any ordering can be used to specify parameter values. Finally, default parameter association can be used, where some actual parameter values are given and some are not. In this case, the unmatched formal parameters are simply given a default value, which is generally specified in the formal parameter declaration.

      Note that in a procedural invocation, the actual parameter for an in parameter may be any expression of the appropriate type, since data do not flow back, but the actual parameter for either an out or an in out parameter must be a variable, because the data that are returned from a procedural invocation must have somewhere to go.

      Parameter passing is usually implemented as being one of copy, reference, and name. There are two copy parameter passing mechanisms. The first, labeled call-by-value, copies a value from the actual to the formal parameter before the execution of the procedure’s code. This is appropriate for in parameters. A second mode, called call-by-result, copies a value from the formal parameter to the actual parameter after the termination of the procedure. This is appropriate for out parameters. It is also possible to combine these two mechanisms, obtaining call-by-value-result, providing a mechanism which is appropriate for in out parameters.

Call-by-reference passes the address of the actual parameter in place of its value. In this way, values are transferred not by copying but by virtue of the formal parameter and the actual parameter referencing the same location in memory. Call-by-reference makes the sharing of values between the formal and the actual a two-way, immediate transfer, because the formal parameter becomes an alias for the actual parameter.

      

Call-by-name was introduced in ALGOL 60 and is the most complex of the parameter passing mechanisms described here. Although it has some theoretical advantages, it is both harder to implement and generally more difficult for programmers to understand. In call-by-name, the actual parameter is re-evaluated every time the formal parameter is referenced. If any of the constituents of the actual parameter expression has changed in value since the last reference to the formal parameter, a different value may be returned at successive accesses of the formal parameter. This mechanism also allows information to flow back to the main program with an assignment to a formal parameter. Although call-by-name is no longer used in most imperative languages, a variant is used in functional languages which employ lazy evaluation (see Chapter 92).
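Call-by-name can be simulated by passing a parameterless function (a thunk) that re-evaluates the actual parameter at every reference. The classic illustration is Jensen's device, a summation routine whose term argument changes value as the index changes; this is a sketch, with a one-element list standing in for the by-name index variable:

```python
def sum_by_name(i_ref, lo, hi, term):
    # Jensen's device: term is a thunk re-evaluated at each reference,
    # and i_ref[0] plays the role of the by-name index variable i.
    total = 0
    for v in range(lo, hi + 1):
        i_ref[0] = v
        total += term()   # each reference re-evaluates the actual parameter
    return total

i = [0]
# The "actual parameter" i*i is re-evaluated as i changes: 1 + 4 + 9 = 14
print(sum_by_name(i, 1, 3, lambda: i[0] * i[0]))   # 14
```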

Several issues crop up when we consider parameters and their use. The first is a problem called aliasing, where the same memory location is referenced with two or more names. Consider the following Ada code:

    procedure MAIN is
        a: integer;
        procedure p(x, y: in out integer) is
        begin
            a := 2;
            x := y + a;
        end;
    begin
        a := 10;
        p(a, a);
        ...
    end;

During the call p(a, a), the actual parameter a is bound to both of the formal parameters x and y.

Because x and y are in out parameters, the value of a will change after the procedure returns. It is not clear, however, which value a will have after the procedure call. If the parameter passing mechanism is call-by-value-result, then the semantics of this program depend on the order in which values are copied back to the caller. If the formals are copied back to the actuals from left to right, the value of a will be 10 after the call: a first receives x (which is 12) and is then overwritten by y (which is still 10). The result with call-by-reference is unambiguous (though perhaps surprising to the programmer): the value of a is 4 after the call. In Ada, a parameter specified to be passed as in out may be passed using either call-by-value-result or call-by-reference. The preceding code provides an example where, because of aliasing, these parameter passing mechanisms give different answers. Ada deems such programs erroneous and considers them not to be legal, even though the compiler may not be able to detect them.
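The two outcomes can be traced with a small Python simulation of the call p(a, a); the environment dictionary and function names are illustrative, not part of any Ada semantics:

```python
import copy

def p_value_result(env):
    # Simulate call-by-value-result: copy in, run the body, copy back.
    x = copy.copy(env["a"])   # copy-in for x: 10
    y = copy.copy(env["a"])   # copy-in for y: 10
    env["a"] = 2              # body: a := 2
    x = y + env["a"]          # body: x := y + a  ->  10 + 2 = 12
    env["a"] = x              # copy back left to right: a gets x (12) ...
    env["a"] = y              # ... then a gets y (10), overwriting it

def p_reference(env):
    # Simulate call-by-reference: x, y, and a all name the same cell.
    env["a"] = 2                     # body: a := 2
    env["a"] = env["a"] + env["a"]   # body: x := y + a  ->  2 + 2 = 4

env1 = {"a": 10}
p_value_result(env1)
print(env1["a"])   # 10

env2 = {"a": 10}
p_reference(env2)
print(env2["a"])   # 4
```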

      Most imperative programming languages support the use of procedures as parameters (Ada is one of the few exceptions). In this case the parameter declaration must include a specification of the number and types of parameters of the procedure parameter. MODULA-2, for example, supports procedure types which may be used to specify procedural parameters. There are few implementation problems in supporting procedure parameters, though the implementation must ensure that nonlocal variables are accessed properly in the procedure passed as a parameter.
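In Python, procedures are first-class values, and a procedure passed as a parameter carries access to its own nonlocal variables with it (a closure); a small sketch, with illustrative names:

```python
def make_counter():
    count = 0
    def step():
        nonlocal count   # a nonlocal variable of the passed procedure
        count += 1
        return count
    return step

def apply_twice(proc):
    # proc is a procedural parameter; its nonlocal count resolves in
    # make_counter's environment, not in apply_twice's.
    proc()
    return proc()

print(apply_twice(make_counter()))   # 2
```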

There are two kinds of procedural abstractions. One kind, usually known simply as a procedure, is an abstraction of a program statement. Its invocation is like a statement, and control passes to the next statement after the invocation. The other kind is called a value-returning procedure, or function. Functions are abstractions for an operand in an expression. They return a value when invoked, and, upon return, evaluation of the expression containing the call continues.