EQUATIONAL LOGIC as a
PROGRAMMING LANGUAGE
Michael J. O’Donnell
The MIT Press Cambridge, Massachusetts
London, England
Publisher’s Note
This format is intended to reduce the cost of publishing certain works in book form and to shorten the gap between editorial preparation and final publication.
Detailed editing and composition have been avoided by photographing the text of this book directly from the author’s prepared copy.
Second printing, 1986

© 1985 by The Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any
electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
This book was composed using the UNIX tools eqn, troff, and ms, set on the APS-5 phototypesetter, and printed and bound in the United States of America.
Library of Congress Cataloging in Publication Data

O'Donnell, Michael J., 1952-
    Equational logic as a programming language.
(MIT Press series in the foundations of computing) Bibliography: p.
Includes index.
1. Programming languages (Electronic computers)
2. Logic, Symbolic and mathematical. 3. Equations, Theory of. I. Title. II. Series.
QA76.7.036 1985 001.64'24 84-29507
ISBN 0-262-15028-X
To Julie and Benjamin
Table of Contents

Preface

1. Introduction
2. Using the Equation Interpreter Under UNIX (ep and ei)
3. Presenting Equations to the Equation Interpreter
4. The Syntax of Terms (loadsyntax) -- 13
   1. Standmath: Standard Mathematical Notation -- 13
   2. LISP.M: Extended LISP Notation -- 14
   3. Lambda: A Lambda Calculus Notation -- 15
   4. Inner Syntaxes (for the advanced user with a large problem) -- 17
5. Restrictions on Equations -- 20
6. Predefined Classes of Symbols -- 22
   1. integer_numerals -- 22
   2. truth_values -- 22
   3. characters -- 22
   4. atomic_symbols -- 22
7. Predefined Classes of Equations -- 24
   1. Functions on atomic_symbols -- 25
   2. Integer Functions -- 25
   3. Character Functions -- 25
8. Syntactic Qualifications on Variables -- 27
9. Miscellaneous Examples -- 30
   1. List Reversal -- 30
   2. Huffman Codes -- 31
   3. Quicksort -- 33
   4. Toy Theorem Prover -- 34
   5. An Unusual Adder -- 39
   6. Arbitrary-Precision Integer Operations -- 41
   7. Exact Addition of Real Numbers -- 46
   8. Polynomial Addition -- 51
   9. The Combinator Calculus -- 54
   10. Beta Reduction in the Lambda Calculus -- 55
   11. Lucid -- 62
10. Errors, Failures, and Diagnostic Aids -- 68
   1. Context-Free Syntactic Errors and Failures -- 69
   2. Context-Sensitive Syntactic Errors and Failures -- 69
   3. Semantic Errors and Failures -- 70
   4. Producing a Lexicon to Detect Inappropriate Uses of Symbols (el) -- 71
   5. Producing a Graphic Display of Equations in Tree Form (es) -- 71
   6. Trace Output (et) -- 73
   7. Miscellaneous Restrictions -- 74
11. History of the Equation Interpreter Project -- 75
12. Low-Level Programming Techniques -- 78
   1. A Disciplined Programming Style Based on Constructor Functions -- 78
   2. Simulation of LISP Conditionals -- 84
   3. Two Approaches to Errors and Exceptional Conditions -- 87
   4. Repairing Overlaps and Nonsequential Constructs -- 90
13. Use of Equations for Syntactic Manipulations -- 98
   1. An Improved Notation for Context-Free Grammars -- 100
   2. Terms Representing the Syntax of Terms -- 112
   3. Example: Type-Checking in a Term Language -- 115
14. Modular Construction of Equational Definitions -- 124
15. High-Level Programming Techniques -- 132
   1. Concurrency -- 132
   2. Nondeterminism vs. Indeterminacy -- 134
   3. Dataflow -- 137
   4. Dynamic Programming -- 145
16. Implementing Efficient Data Structures in Equational Programs -- 151
   1. Lists -- 151
   2. Arrays -- 157
   3. Search Trees and Tables -- 161
17. Sequential and Parallel Equational Computations -- 177
   1. Term Reduction Systems -- 177
   2. Sequentiality -- 180
   3. Left-Sequentiality -- 183
18. Crucial Algorithms and Data Structures for Processing Equations -- 187
   1. Representing Expressions -- 187
   2. Pattern Matching and Sequencing -- 191
      1. Bottom-Up Pattern Matching -- 194
      2. Top-Down Pattern Matching -- 199
      3. Flattened Pattern Matching -- 205
   3. Selecting Reductions in Nonsequential Systems of Equations -- 210
   4. Performing a Reduction Step -- 212
19. Toward a Universal Equational Machine Language -- 220
   1. Reduction Systems -- 223
   2. The Combinator Calculus, With Variants -- 226
   3. Simulation of One Reduction System by Another -- 235
   4. The Relative Powers of S-K, S-K-D, and S-K-A -- 244
   5. The S-K Combinator Calculus Simulates All Simply Strongly Sequential Term Reduction Systems -- 248
   6. The S-K-D Combinator Calculus Simulates All Regular Term Reduction Systems -- 252
   7. The Power of the Lambda Calculus -- 256
   8. Unsolved Problems -- 260
20. Implementation of the Equation Interpreter -- 262
   1. Basic Structure of the Implementation -- 262
   2. A Format for Abstract Symbolic Information -- 266
   3. Syntactic Processors and Their Input/Output Forms -- 270

Bibliography -- 277
Index -- 285
Series Foreword
Theoretical computer science has now undergone several decades of development. The "classical" topics of automata theory, formal languages, and computational complexity have become firmly established, and their importance to other theoretical work and to practice is widely recognized. Stimulated by technological advances, theoreticians have been rapidly expanding the areas under study, and the time delay between theoretical progress and its practical impact has been decreasing dramatically. Much publicity has been given recently to breakthroughs in cryptography and linear programming, and steady progress is being made on programming language semantics, computational geometry, and efficient data structures. Newer, more speculative, areas of study include relational databases, VLSI theory, and parallel and distributed computation. As this list of topics continues expanding, it is becoming more and more difficult to stay abreast of the progress that is being made and increasingly important that the most significant work be distilled and communicated in a manner that will facilitate further research and application of this work.
By publishing comprehensive books and specialized monographs on the theoretical aspects of computer science, the series on Foundations of Computing provides a forum in which important research topics can be presented in their entirety and placed in perspective for researchers, students, and practitioners alike. This volume, by Michael J. O'Donnell, presents an elegant and powerful interpretive system for programming in terms of abstract logical equations. The language is similar to Prolog, in that it is descriptive rather than procedural, but unlike Prolog its semantic description allows an efficient implementation that strictly adheres to the given semantics. The presentation provides the definition of the language, many examples of its use, and discussion of the relevant underlying theory. It is essential reading for anyone interested in the latest ideas about nonprocedural programming and practical programming language semantics.
Michael R. Garey
Preface
This book describes an ongoing equational programming project that started in 1975. Principal investigators on the project are Christoph Hoffmann and Michael O'Donnell. Paul Chew, Paul Golick, Giovanni Sacco, and Robert Strandh participated as graduate students. I am responsible for the presentation at hand, and the opinions expressed in it, but different portions of the work described involve each of the people listed above. I use the pronoun "we" throughout the remainder, to indicate unspecified subsets of that group. Specific contributions that can be attributed to one individual are acknowledged by name, but much of the quality of the work is due to untraceable interactions between several people, and should be credited to the group.
The equational programming project never had a definite pseudocommercial goal, although we always hoped to find genuinely useful applications. Rather than seeking a style of computing to support a particular application, we took a clean, simple, and elegant style of computing, with particularly elementary semantics, and asked what it is good for. As a result, we adhered very strictly to the original concept of computing with equations, even when certain extensions had obvious pragmatic value. On the other hand, we were quite willing to change the application. Originally, we envisioned equations as formal descriptions of interpreters for other programming languages. When we discovered that such applications led to outrageous overhead, but that programs defined directly by equations ran quite competitively with LISP, we switched application from interpreter generation to programming with equations.
We do not apologize for our fanaticism about the foundations of equational
programming, and our cavalier attitude toward applications. We believe that good
mathematics is useful, but not always for the reasons that motivated its creation (non-Euclidean geometry is a positive example, the calculus a negative one). Also, while recognizing the need for programming languages that support important applications immediately, we believe that scientific progress in the principles of programming and programming languages is impeded by too quick a reach for applications. The utility of LISP, for example, is unquestionable, but the very adjustments to LISP that give it success in many applications make it a very imprecise vehicle for understanding the utility of declarative programming. We would rather discover that pure equational programming, as we envision it, is unsuitable for a particular application, than expand the concept in a way that makes it harder to trace the conceptual underpinnings of its success or failure.
Without committing to any particular type of application, we must experiment with a variety of applications, else our approach to programming is pure speculation. For this purpose, we need an implementation. The implementation must perform well enough that some people can be persuaded to use it. We interpret this constraint to mean that it must compete in speed with LISP. Parsers, programming support, and the other baggage possessed by all programming languages, must be good enough not to get in the way, but the main effort should go toward demonstrating the feasibility of the novel aspects, rather than solving well understood problems once again.
The equational programming project has achieved an implementation of an interpreter for equational programs. The implementation runs under Berkeley UNIX* 4.1 and 4.2, and is available from the author for experimental use. The current distribution is not well enough supported to qualify as a reliable tool for
*UNIX is a trademark of AT&T.
important applications, but we hope to produce a stronger implementation in the next few years. Sections 1 through 10 constitute a user's manual for the current implementation. The remainder of the text covers a variety of topics relating to the theory supporting equational programming, the algorithmic and organizational problems solved in its implementation, and the special characteristics of equational programming that qualify it for particular applications. Some sections discuss work in progress. The intent is to give a solid intuition for all the identifiable aspects of the project, from its esoteric theoretical foundations in logic to its concrete implementation as a system of programs, and its potential applications.
Various portions of the work were supported by a Purdue University XL grant, by the National Science Foundation under grants MCS-7801812 and MCS-8217996, and by the National Security Agency under grant 84H0006. The Purdue University Department of Computer Sciences provided essential computing resources for most of the implementation effort. I am grateful to Robert Strandh and Christoph Hoffmann for critical readings of the manuscript, and to AT&T Bell Laboratories for providing phototypesetting facilities. Typesetting was accomplished using the troff program under UNIX.
1. Introduction (adapted from HO82b)
Computer scientists have spent a large amount of research effort developing the semantics of programming languages. Although we understand how to implement Algol-style procedural programming languages efficiently, it seems to be very difficult to say what the programs mean. The problem may come from choosing an implementation of a language before giving the semantics that define correctness of the implementation. In the development of the equation interpreter, we reversed the process by taking clean, simple, intuitive semantics, and then looking for
correct, efficient implementations.
We suggest the following scenario as a good setting for the intuitive semantics of computation. Our scenario covers many, but not all, applications of computing
(e.g., real-time applications are not included).
A person is communicating with a machine. The person gives a sequence of assertions followed by a question. The machine responds with an answer or by
never answering.
The problem of semantics is to define, in a rigorous and understandable way, what it means for the machine's response to be correct. A natural informal definition of correctness is that any answer that the machine gives must be a logical consequence of the person's assertions, and that failure to give an answer must mean that there is no answer that follows logically from the assertions. If the language for giving assertions is capable of describing all the computable functions, the undecidability of the halting problem prevents the machine from always detecting those cases where there is no answer. In such cases, the machine never halts. The style of semantics based on logical consequence leads most naturally to a style of
programming similar to that in the descriptive or applicative languages such as
LISP, Lucid, Prolog, Hope, OBJ, SASL and Functional Programming languages, although Algol-style programming may also be supported in such a way. Computations under logical-consequence semantics roughly correspond to "lazy evaluation"
of LISP [HM76, FW76].
Semantics based on logical consequence is much simpler than many other styles of programming language semantics. In particular, the understanding of logical-consequence semantics does not require construction of particular models through lattice theory or category theory, as do the semantic treatments based on the work of Scott and Strachey or those in the abstract-data-types literature using initial or final algebras. If a program is given as a set of assertions, then the logical consequences of the program are merely all those additional assertions that must be true whenever the assertions of the program are true. More precisely, an equation A=B is a logical consequence of a set E of equations if and only if, in every algebraic interpretation for which every equation in E is true, A=B is also true (see [O'D77] Chapter 2 and Section 14 of this text for a more technical treatment). There is no way to determine which one of the many models of the program assertions was really intended by the programmer: we simply compute for him all the information we possibly can from what we are given. For those who prefer to think of a single model, term algebras or initial algebras may be used to construct one model for which the true equations are precisely the logical consequences of a given set of equations.
We use the language of equational logic to write the assertions of a program. Other logical languages are available, such as the first-order predicate calculus, used in Prolog [Ko79a]. We have chosen to emphasize the reconciliation of strict
adherence to logical consequences with good run-time performance, at the expense
of generality of the language. Current implementations of Prolog do not always discover all of the logical consequences of a program, and may waste much time searching through irrelevant derivations. With our language of equations, we lose some of the expressive power of Prolog, but we always discover all of the logical consequences of a program, and avoid searching irrelevant ones except in cases that inherently require parallel computation. Hoffmann and O’Donnell survey the issues involved in computing with equations in [HO82b]. Section 17 discusses the
question of relevant vs. irrelevant consequences of equations more specifically.

Specializing our computing scenario to equational languages:
The person gives a sequence of equations followed by a question, "What is E?" for some expression E. The machine responds with an equation "E=F," where
F is a simple expression.
For our equation interpreter, the "simple expressions" above must be the normal forms: expressions containing no instance of a left-hand side of an equation. This assumption allows the equations to be used as rewriting rules, directing the replacement of instances of left-hand sides by the corresponding right-hand sides. Sections 2 and 3 explain how to use the equation interpreter to act out the scenario above. Our equational computing scenario is a special case of a similar scenario developed independently by the philosophers Belnap and Steel for a logic of questions and answers [BS76].
The equation interpreter accepts equations as input, and automatically produces a program to perform the computations described by the equations. In order to achieve reasonable efficiency, we impose some fairly liberal restrictions on the form of equations given. Section 5 describes these restrictions, and Sections 6-8
and 10 present features of the interpreter. Section 15 describes the computational
power of the interpreter in terms of the procedural concepts of parallelism, nondeterminism, and pipelining. Typical applications for which the equation interpreter should be useful are:
1. We may write quick and easy programs for the sorts of arithmetic and list-manipulating functions that are commonly programmed in languages such as LISP. The "lazy evaluation" implied by logical-consequence semantics allows us to describe infinite objects in such a program, as long as only finite portions are actually used in the output. The advantages of this capability, discussed in [FW76, HM76], are similar to the advantages of pipelining between coroutines in a procedural language. Definitions of large or infinite objects may also be used to implement a kind of automatic dynamic programming (see Section 15.4).
2. We may define programming languages by equations, and the equation processor will produce interpreters. Thus, we may experiment with the design of a programming language before investing the substantial effort required to produce a compiler or even a hand-coded interpreter.
3. Equations describing abstract data types may be used to produce correct implementations automatically, as suggested by [cs78, Wa76], and imple- mented independently in the OBJ language [FGJM85].
4. Theorems of the form A=B may sometimes be proved by receiving the same answer to the questions "What is A?" and "What is B?" [KB70, HO88] discuss such theorem provers. REVE [Le83, FG84] is a system for developing
theorem-proving applications of equations.
5. Non-context-free syntactic checking, and semantics, such as compiler code generation, may be described formally by equations and used, along with the conventional formal parsers, to automatically produce compilers (see Section
13).
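The lazy-evaluation point in application 1 above has a rough analogue in any language with demand-driven sequences. The following Python sketch (the generator names are invented for illustration) computes only the finite portion of an "infinite list" that the output demands; it is an analogy to logical-consequence semantics, not the interpreter's actual mechanism.

```python
from itertools import islice

def naturals(n=0):
    """An 'infinite list' n, n+1, n+2, ...; elements exist only when demanded."""
    while True:
        yield n
        n += 1

def squares():
    """The infinite list of squares, defined in terms of naturals()."""
    for n in naturals():
        yield n * n

# Only the finite portion needed for the answer is ever evaluated.
first_five = list(islice(squares(), 5))
# first_five == [0, 1, 4, 9, 16]
```

As with pipelined coroutines, the producer runs exactly as far as the consumer requires.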
The equation interpreter is intended for use by two different classes of user, in somewhat different styles. The first sort of user is interested in computing results for direct human consumption, using well-established facilities. This sort of user should stay fairly close to the paradigm presented in Section 2, should take the syntactic descriptions as fixed descriptions of a programming language, and should skip Section 20, as well as other sections that do not relate to the problem at hand. The second sort of user is building a new computing product, which will itself be used directly or indirectly to produce humanly readable results. This sort of user will almost certainly need to modify or redesign some of the syntactic processors, and will need to read Sections 13 and 20 rather closely in order to understand how to combine equationally-produced interpreters with other sorts of programs. The second sort of user is encouraged to think of the equation interpreter as a tool, analogous to a formal parser constructor, for building whichever parts of his product are conveniently described by equations. These equational programs may then be combined with programs produced by other language processors to perform those tasks not conveniently implemented by equations. The aim in using equations should be to achieve the same sort of self-documentation and ease of modification that may be achieved by formal grammars, in solving problems where context-free
manipulations are not sufficiently powerful.
2. Using the Equation Interpreter Under UNIX (ep and ei)
Use of the equation interpreter involves two separate steps: preprocessing and interpreting. The preprocessing step, like a programming language compiler, analyzes the given equations and produces machine code. The interpreting step, which may be run any number of times once preprocessing is done, reduces a given term to normal form.
Normal use of the equation interpreter requires the user to create a directory
containing 4 files used by the interpreter. The 4 files to be created are:

1. definitions - containing the equations;
2. pre.in - an input parser for the preprocessor;
3. int.in - an input parser for the interpreter;
4. int.out - an output pretty-printer for the interpreter.
The file definitions, discussed in Section 3, is usually typed in literally by the user. The files pre.in, int.in and int.out, which must be executable, are usually produced
automatically by the command loadsyntax, as discussed in Section 4.
To invoke the preprocessor, type the following command to the shell ep Equnsdir
where Equnsdir is the directory in which you have created the 4 files above. If no directory is given, the current directory is used. Ep will use Equnsdir as the home for several temporary files, and produce in Equnsdir an executable file named interpreter. Because of the creation and removal of temporary files, the user should avoid placing any extraneous files in Equnsdir. Two of the files produced by ep are not removed: def.deep and def.in. These files are not strictly necessary for
operation of the interpreter, and may be removed in the interest of space
conservation, but they are useful in building up complex definitions from simpler ones (Section 14) and in producing certain diagnostic output (Section 10). To
invoke the interpreter, type the command:
ei Equnsdir
A term found on standard input will be reduced, and its normal form placed on the
standard output.
A paradigmatic session with the equation interpreter has the following form:
mkdir Equnsdir
loadsyntax Equnsdir
edit Equnsdir/definitions using your favorite editor
ep Equnsdir
edit input using your favorite editor
ei Equnsdir <input

The sophisticated user of UNIX may invoke ei from his favorite interactive editor, such as ned or emacs, in order to be able to simultaneously manipulate the input
and output.
In more advanced applications, if several equation interpreters are run in a pipeline, repeated invocation of the syntactic processors may be avoided by invoking the interpreters directly, instead of using ei. For example, if Equ.1, Equ.2, Equ.3 are all directories in which equational interpreters have been compiled, the
following command pipes standard input through all three interpreters:
Equ.1/int.in | Equ.1/interpreter | Equ.2/interpreter | Equ.3/interpreter | Equ.3/int.out
Use of ei for the same purpose would involve 4 extra invocations of syntactic processors, introducing wasted computation and, worse, the possibility that superficial
aspects of the syntax, such as quoting conventions, may affect the results. If
Equ.1, Equ.2, and Equ.3 are not all produced using the same syntax, careful consideration of the relationship between the different syntaxes will be needed to make sense of such a pipe.
After specifying the directory containing definitions, the user may give the size of the workspace to be used in the interpreter. This size defaults to 2^15 - 1 = 32767, the largest that can be addressed in one 16-bit word with a sign bit. The workspace size limits the size of the largest expression occurring as an intermediate step in any reduction of an input to normal form. The effect of the limit is blurred somewhat by sharing of equivalent subexpressions, and by allocation of space for declared symbols even when they do not actually take part in a particular computation. For example, to reduce the interpreter workspace to half of the default, type

ep Equnsdir 16384

The largest workspace usable in the current implementation is 2^31 - 2 = 2147483646. The limiting factor is the Berkeley Pascal compiler, which will not process a constant bigger than 2^31 - 1 = 2147483647, and which produces mysteriously incorrect assembly code for an allocation of exactly that much. On current VAX UNIX implementations, the shell may often refuse to run sizes much larger than the default because of insufficient main memory. In such a case, the user will see a message from the shell saying "not enough core" or "too big".
3. Presenting Equations to the Equation Interpreter
Input to the equation interpreter, stored in the file definitions, must be of the following form:

Symbols
    symbol_descriptor_1;
    symbol_descriptor_2;
    ...
    symbol_descriptor_m.

For all variable_1, variable_2, ... variable_n:
    equation_1;
    equation_2;
    ...
    equation_k.
The principal keywords recognized by the preprocessor are Symbols, For all, and Equations, appearing at the beginning of a line. Equations is an alternative to For all used in the unusual case that there are no variables in the equations. Capitalization of these keywords is optional, and any number of blanks greater than 0 may appear between For and all. The only other standard keywords are include, where, end where, is, are, in, either, or, and end or. The special symbols used by the preprocessor are ":", ";", ".", and ",". Particular term syntaxes (see Section 4) may entail other keywords and special symbols. Blanks are required only where necessary to separate alphanumeric strings. Any line beginning with the comment character is a comment, with no impact on the meaning of a specification.
symbol_descriptors indicate one or more symbols in the language to be
defined, and give their arities. Intuitively, symbols of arity 0 are the constants of
the language, and symbols of higher arity are the operators. A symbol_descriptor
is either of the form

symbol_1, symbol_2, ... symbol_m: arity        (m ≥ 1)

or of the form

include symbol_class_1, ... symbol_class_n        (n ≥ 1)
Syntactically, symbols and symbol_classes are identifiers: strings other than keywords beginning with an alphabetic symbol followed by any combination of alphabetic symbols, base-ten digits, "_", and "-". Identifiers are currently limited to 20 characters, a restriction which will be removed in future versions. A symbol_class indicates the inclusion of a large predefined class of symbols. These classes are discussed in Section 6. Symbols that have been explicitly declared in the Symbols
section are called literal symbols, to distinguish them from members of the
predefined classes.
variables are identifiers, of the same sort as symbols. An equation is either of the form

term_1 = term_2

or of the form

term_1 = term_2 where qualification

or of the form

include equation_class_1, ... equation_class_n
The syntax of terms is somewhat flexible, and is discussed in Section 4. qualifications are syntactic constraints on substitutions for variables, and are discussed in Section 8. equation_classes are identifiers indicating the inclusion of a large number of predefined equations. These classes are discussed in Section 7.
For larger problems, the notation presented in this section will surely not be satisfactory, because it provides no formal mechanism for giving structure to a large definition. Section 14 describes a set of operators that may be applied to one or more equational definitions to produce useful extensions, modifications, and combinations of the definitions. The idea for these definition-constructing operators comes from work on abstract data types by Burstall and Goguen, implemented in the language OBJ [BG77]. Users are strongly encouraged to start using these operators as soon as a definition begins to be annoyingly large. The current version does not implement operators on definitions, so most users will not want to attack
large problems until a more advanced version is released.
The syntax presented in this section is collected in the BNF below.

<program> ::= Symbols <symbol descriptor list>.
              For all <variable list>: <equation list>.

<symbol descriptor list> ::= <symbol descriptor>; ... ; <symbol descriptor>

<symbol descriptor> ::= <symbol list>: <arity> |
                        include <symbol class list>

<symbol class list> ::= <symbol class>, ..., <symbol class>

<symbol class> ::= atomic_symbols | integer_numerals | truth_values | characters
<symbol list> ::= <symbol>, ..., <symbol>

<symbol> ::= <identifier>

<arity> ::= <number>

<variable list> ::= <variable>, ..., <variable>

<variable> ::= <identifier>

<equation list> ::= <equation>; ... ; <equation>

<equation> ::= <term> = <term> |
               <term> = <term> where <qualification> end where |
               include <equation class list>

<equation class list> ::= <equation class>, ..., <equation class>

<qualification> ::= <qualification item list>

<qualification item list> ::= <qualification item>, ..., <qualification item>

<qualification item> ::= <variable> is <qualified term> |
                         <variable list> are <qualified term>

<qualified term> ::= in <symbol class> | <term> |
                     <qualified term> where <qualification> end where |
                     either <qualified term list> end or

<qualified term list> ::= <qualified term> or ... or <qualified term>
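As a worked instance of this grammar, a minimal definitions file for list concatenation might look like the following sketch. The symbols cons, nil, and append are illustrative choices of literal symbols, not predefined names; see Section 9 for the book's own worked examples.

```
Symbols
    cons: 2;
    nil: 0;
    append: 2.

For all x, y, z:
    append(nil(), y) = y;
    append(cons(x, y), z) = cons(x, append(y, z)).
```

The Symbols section declares each literal symbol with its arity, and the two equations then serve as rewriting rules for reducing append terms to normal form.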
4. The Syntax of Terms (loadsyntax)
Since no single syntax for terms is acceptable for all applications of the equation interpreter, we provide a library of syntaxes from which the user may choose the one best suited to his application. The more sophisticated user, who wishes to custom-build his own syntax, should see Section 20 on implementation to learn the
requirements for parsers and pretty-printers. To choose a syntax from the current library, type the command
loadsyntax Equnsdir Syntax
where Equnsdir is the directory containing the preprocessor input, and Syntax is the name of the syntax to be seen by the user. Loadsyntax will create the appropriate pre.in, int.in, and int.out files in Equnsdir to process the selected syntax. If Syntax is omitted, LISP.M is used by default. If Equnsdir is also omitted,
the current directory is used.
In order to distinguish atomic symbols from nullary literal symbols in input to the interpreter, the literal symbols must be written with an empty argument list. Thus, in Standmath notation, a() is a literal symbol, and a is an atomic symbol. In LISP.M, the corresponding notations are a[] and a. This regrettable notational clumsiness should disappear in later versions.
4.1 Standmath: Standard Mathematical Notation
The Standmath syntax is standard mathematical functional prefix notation, with arguments surrounded by parentheses and separated by commas, such as f(g(a,b),c,h(e)). Empty argument lists are allowed, as in f(). This syntax is used as the standard of reference (but is not the default choice); all others are described as special notations for Standmath terms.
4.2 LISP.M: Extended LISP Notation

LISP.M is a liberal LISP notation, which mixes M-expression notation freely with S-expressions [McC60]. Invocation of LISP.M requires declaration of the nullary symbol nil and the binary symbol cons. An M-expression accepted by LISP.M may be in any of the following forms:

atomic_symbol
nil()
(M-expr1 M-expr2 ... M-exprm)   m ≥ 0
(M-expr1 M-expr2 ... M-exprn-1 . M-exprn)   n > 1
function[M-expr1; M-expr2; ... M-exprp]   p ≥ 0

(M-expr1 ... M-exprn-1 . M-exprn)

is a notation for

cons(M-expr1, ... cons(M-exprn-1, M-exprn) ... )

(M-expr1 ... M-exprm)

is a notation for

cons(M-expr1, cons(M-expr2, ... cons(M-exprm, nil()) ... ))

function[M-expr1; ... M-exprn]

is a notation for

function(M-expr1, ... M-exprn)
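The two list translations above are mechanical, and can be sketched in a few lines of Python (this sketch is not part of the book; the tuple encoding ("cons", a, b) for cons(a, b) and ("nil",) for nil() is an assumption made here for illustration):

```python
def desugar_list(items, tail=("nil",)):
    """Expand (e1 e2 ... em) into cons(e1, cons(e2, ... cons(em, nil()) ...)).
    A dotted list (e1 ... en-1 . en) is obtained by passing en as tail."""
    result = tail
    for item in reversed(items):
        result = ("cons", item, result)
    return result

# (a b c) becomes cons(a, cons(b, cons(c, nil())))
print(desugar_list(["a", "b", "c"]))
# (a b . c) becomes cons(a, cons(b, c))
print(desugar_list(["a", "b"], tail="c"))
```

Working from the tail of the list inward mirrors the right-nested structure of the cons translation.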
4.3 Lambda: A Lambda Calculus Notation
Lambda notation is intended for use in experiments with evaluation strategies for the lambda calculus. This notation supports the most common abbreviations con- veniently, while allowing unusual sorts of expressions to be described at the cost of less convenient notation. Because of the highly experimental nature of this syntax, less attention has been given to providing useful error messages. Since this lambda notation was developed to support one particular series of experiments with reduc-
tion strategies, it will probably not be suitable for all uses of the lambda calculus.
\x.E
is a notation for
Lambda(cons(x, nil()), E)
where x must be an atomic symbol representing a variable.
(E F)
is a notation for
AP(E, F)
In principle, the notations above are sufficient for describing arbitrary lambda terms, but for convenience, multiple left-associated applications may be given with
only one parenthesis pair. Thus,
(E1 E2 E3 ... En)   n ≥ 2
is a notation for
AP( ... AP(AP(E1, E2), E3), ... En)
Similarly, many variables may be lambda bound by a single use of "\". Thus,
\x1 x2 ... xn.E   n ≥ 1
is a notation for
Lambda(cons(x1, cons(x2, ... nil() ... )), E)
Notice that the list of variables is given as a LISP list, rather than the more conventional representation as
Lambda(x1, Lambda(x2, ... Lambda(xn, E) ... ))
It is easy to write equations to translate the listed-variable form into the more conventional representation, but the listed form allows reduction strategies to take advantage of nested Lambdas. In order to write equations manipulating lists of variables, it is necessary to refer to a list of unknown length. So,
\x1 x2 ... xn:rem.E   n ≥ 0
is a notation for
Lambda(cons(x1, cons(x2, ... rem) ... ), E)
That is, rem above represents the remainder of the list beyond x1 ... xn. In the special case where n = 0,
\:list.E
is a notation for
Lambda(list, E)
In order to deal with special internal forms, such as de Bruijn notation [deB72],
the form
\i.E
is allowed as a notation for
Lambda(i, E)
where i is an integer numeral. If function symbols other than Lambda and AP must be introduced, a bracketed style of function application may be used, in
which
f[E1; ... En]   n ≥ 0
is a notation for
f(E1, ... En)
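As a cross-check on the abbreviations above, here is a small Python sketch (not from the book) that expands the multi-variable lambda form and the multi-application form into the Lambda/AP/cons representation; the tuple encoding of terms is an assumption made for this sketch:

```python
def lambda_term(variables, body):
    # \x1 x2 ... xn.E  ->  Lambda(cons(x1, cons(x2, ... nil() ...)), E)
    vlist = ("nil",)
    for v in reversed(variables):
        vlist = ("cons", v, vlist)
    return ("Lambda", vlist, body)

def apply_chain(terms):
    # (E1 E2 ... En)  ->  AP( ... AP(AP(E1, E2), E3) ..., En)
    # Applications associate to the left, so we fold from the left.
    result = terms[0]
    for t in terms[1:]:
        result = ("AP", result, t)
    return result

print(lambda_term(["x", "y"], "E"))
print(apply_chain(["f", "a", "b"]))
```

The left fold in apply_chain is exactly the left association described in the text, and lambda_term builds the LISP list of bound variables rather than nesting Lambdas.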
4.4 Inner Syntaxes (for the advanced user with a large problem)
Independently of the surface syntax in which terms are written, it may be helpful to use different internal representations of terms for different purposes. For example, instead of having a number of function symbols of different arities, it is sometimes convenient to use only one binary symbol, AP, representing function application, and to represent all other functions by nullary symbols. Application of a function to multiple arguments is represented by a sequence of separate applications, one for each argument. The translation from standard notation to this applicative notation is often called Currying. For example, the term
f(g(a, b), h(c))
is Curried to
AP(AP(f, AP(AP(g, a), b)), AP(h, c)).
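The Currying translation itself is a short recursion, sketched here in Python (not part of the book; the tuple encoding of terms, with a bare string for a nullary symbol, is an assumption of this sketch):

```python
def curry(term):
    """Translate f(a1, ..., an), encoded as the tuple ("f", a1, ..., an),
    into nested binary applications AP(AP(...AP(f, a1')...), an'),
    Currying each argument recursively."""
    if isinstance(term, str):        # nullary symbol
        return term
    head, *args = term
    result = head
    for a in args:
        result = ("AP", result, curry(a))
    return result

# f(g(a, b), h(c)) Curries to AP(AP(f, AP(AP(g, a), b)), AP(h, c))
t = ("f", ("g", "a", "b"), ("h", "c"))
print(curry(t))
```

One application is introduced per argument, exactly as the text describes.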
Since the performance of the pattern-matching techniques used by the equation interpreter is affected by the internal representation of the patterns, it may be important to choose the best such representation in order to solve large problems. The current version of the system is not particularly sensitive to such choices, but earlier versions were, and later versions may again be so. In order to use an alternate internal representation, type
loadsyntax Equnsdir Outersynt Innersynt
where Outersynt is one of the syntaxes described in Sections 4.1-4.3, and Innersynt is the name of the chosen internal representation. Currently, only two internal representations are available. Standmath is the standard mathematical notation,
so
loadsyntax Equnsdir Outersynt Standmath
is equivalent to
loadsyntax Equnsdir Outersynt
The other internal representation is Curry, described above.
5. Restrictions on Equations
In order for the reduction strategies used by the equation interpreter to be correct according to the logical-consequence semantics, some restrictions must be placed on the equations. The user may learn these restrictions by study, or by trial and error, since the preprocessor gives messages about each violation. Presently, 5 restrictions are enforced:
1. No variable may be repeated on the left side of an equation. For instance, f(y,y) = y is prohibited, because of the 2 instances of y on the left side.
2. Every variable appearing on the right side of an equation must also appear on the left. For instance, f(x) = y is prohibited.

3. Two different left sides may not match the same expression. So the pair of equations

g(0,x) = 0; g(x,1) = 0

is prohibited, because both of them apply to g(0,1).
4. When two (not necessarily different) left-hand sides match two different parts of the same expression, the two parts must not overlap. E.g., the pair of equations

first(pred(x)) = predfunc; pred(succ(x)) = x

is prohibited, since the left-hand sides overlap in first(pred(succ(0))).
5. It must be possible, in a left-to-right preorder traversal of any term, to identify an instance of a left-hand side without traversing any part of the term below that instance. This property is called left-sequentiality. For example,
the pair of equations

f(g(x,a), y) = 0; g(b,c) = 1

is prohibited, since after scanning f(g it is impossible to decide whether to look at the first argument to g in hopes of matching the b in the second equation, or to skip it and try to match the first equation.
Violations of left-sequentiality may often be avoided by reordering the arguments to a function. For example, the disallowed equations above could be replaced by f(g(a,x),y) = 0 and g(c,b) = 1. Left-sequentiality does not necessarily imply that leftmost-outermost evaluation will work. Rather, it means that in attempting to create a redex at some point in a term, the evaluator can determine whether or not to perform reductions within a leftward portion of the term without looking at anything to the right. Left-sequentiality is discussed in more detail in Sections 17 and 18.3.
All five of these restrictions are enforced by the preprocessor. Violations produce diagnostic messages and prevent compiling of an interpreter. The left-sequentiality restriction (5) subsumes the nonoverlapping restriction (4), but later versions of the system will remove the sequentiality constraint. Later versions will also relax restriction (3) to allow compatible left-hand sides when the right-hand sides agree.
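The first two restrictions are purely syntactic, and easy to check mechanically. The following Python sketch (not the book's preprocessor; the tuple term encoding and the convention that a bare string is a variable are assumptions of this sketch) flags violations of restrictions 1 and 2 for a single equation:

```python
def occurrences(term, out):
    # Collect all variable occurrences; in this sketch a bare string is a
    # variable and a tuple ("f", arg1, ...) is a function application.
    if isinstance(term, str):
        out.append(term)
    else:
        for arg in term[1:]:
            occurrences(arg, out)

def check_equation(lhs, rhs):
    left, right = [], []
    occurrences(lhs, left)
    occurrences(rhs, right)
    errors = []
    if len(left) != len(set(left)):
        errors.append("restriction 1: repeated variable on the left side")
    if not set(right) <= set(left):
        errors.append("restriction 2: right-side variable missing from the left")
    return errors

print(check_equation(("f", "y", "y"), "y"))   # violates restriction 1
print(check_equation(("f", "x"), "y"))        # violates restriction 2
```

Restrictions 3 through 5 involve relations between pairs of left-hand sides and are substantially harder to decide; they are the subject of Sections 17 and 18.3.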
6. Predefined Classes of Symbols
It is sometimes impossible to list in advance all of the symbols to be processed by a particular set of equations. Therefore, we allow 4 predefined classes of symbols to be invoked by name. These classes consist entirely of constants, that is, nullary
symbols.
6.1. integer_numerals

The integer_numerals include all of the sequences of base-10 digits, optionally preceded by "-". Numerals are limited to fit in a single machine word: the range -2147483647 to +2147483647 on the current VAX implementation. Later versions will use the operators of Section 14 to provide arbitrary precision integer arithmetic.
6.2. truth_values
The truth_values are the symbols true and false. They are included as a
predefined class for standardization.
6.3. characters
The characters are ASCII characters, presented in single or double quotes. The only operations available are conversions between characters and integer numerals. Later versions will use the operators of Section 14 to provide arbitrarily long character strings, and some useful string-manipulating operations.
6.4. atomic_symbols
The atomic_symbols are structureless symbols whose only detectable relations are
equality and inequality. Every identifier different from true and false, and not
having any arguments, is taken to be an atomic symbol. In order to distinguish nullary literal symbols from atomic symbols, the literal symbols are given null strings of arguments, such as lit() (in Standmath notation) and lit[] (in LISP.M notation). Currently, atomic_symbols are limited to lengths from 0 to 20. Later versions will use the operators of Section 14 to provide arbitrarily long atomic_symbols.
Section 7 describes predefined functions which operate on these classes of sym-
bols.
7. Predefined Classes of Equations
The predefined classes of equations described in this section were introduced to provide access to selected machine instructions, particularly those for arithmetic operations, without sacrificing the semantic simplicity of the equation interpreter, and without introducing any new types of failure, such as arithmetic overflow. Only those operations that are extremely common and whose implementations in machine instructions bring substantial performance benefits are included. The intent is to provide a minimal set of predefined operations from which more powerful operations may be defined by explicitly-given equations. So, every predefined operation described below has the same effect as a certain impractically large set of equations, and the very desirable extensions of these sets of equations to handle multiple-word objects are left to be done by explicitly-given equations in later versions.
For each predefined class of symbols, there are predefined classes of equations defining standard functions for those symbols. Some of the functions produce values in a class other than the class of the arguments. Predefined classes of equations allow a user to specify a prohibitively large set of equations concisely, and allow the implementation to use special, more efficient techniques to process those equations than are used in general. When a predefined class of functions is invoked, all of the relevant function symbols and classes of symbols must be declared as well. We will describe the functions defined for each class of symbols. The associated class of equations is the complete graph of the function. For example, the integer function add has the class of equations containing add(0,0)=0, add(0,1)=1, ..., add(1,0)=1, add(1,1)=2, ... .
7.1. Functions on atomic_symbols
equatom    equ(x,y) = true if x = y, false otherwise
7.2. Integer Functions
multint    multiply(x,y) = x * y
divint     divide(x,y) = the greatest integer <= x/y, if y != 0
modint     modulo(x,y) = x - (y*divide(x,y)) if y != 0, x otherwise
addint     add(x,y) = x + y
subint     subtract(x,y) = x - y
equint     equ(x,y) = true if x = y, false otherwise
lessint    less(x,y) = true if x < y, false otherwise
An expression starting with the function divide will not be reduced at all if the second argument is 0. Thus, the output will give full information about the erroneous use of this function. Similarly, additions and multiplications leading to overflow will simply not be performed. Later versions will perform arbitrary precision arithmetic (see Section 9.6), removing this restriction.
7.3. Character Functions
equchar    equ(x,y) = true if x = y, false otherwise
intchar    char(i) = the ith character in a standard ordering
charint    seqno(x) = the position of x in a standard ordering

An application of char to an integer outside of the range 0 to 2^7 - 1 = 127, or an application of seqno to a string of length other than 1, will not be reduced. Later versions will use the operations of Section 14 to provide useful string-manipulating operations for arbitrarily long character strings.
8. Syntactic Qualifications on Variables
Even with a liberal set of predefined functions, there will arise cases where the set of equations that a user wants to include in his definition is much too large to ever type by hand. For example, in defining a LISP interpreter, it is important to define the function atom, which tests for atomic symbols. The natural set of equations to define this function includes atom(cons(x,y))=false, atom(a)=true, atom(b)=true, ..., atom(aa)=true, atom(ab)=true, ... . We would like to abbreviate this large set of equations with the following two:

atom(cons(x,y)) = false;
atom(x) = true where x is either
        in atomic_symbols or in integer_numerals
    end or end where
Notice that the qualification placed on the variable x is essentially a syntactic,
rather than a semantic, one. In general, we allow equations of the form:
term = term where qualification end where
A qualification is of the form
qualification_item1, ... , qualification_itemm   m ≥ 1
and qualification_items are of the forms
variable is qualification_term
variable1, ... variablen are qualification_term
and qualification_terms are of the forms
in predefined_symbol_class
28 8. Syntactic Qualifications
term
qualification_term where qualification end where
either qualification_term1 or ... qualification_termn end or
Examples illustrating the forms above:

atompair_or_atom(x) = true
    where x is either
        cons(y,z) where y,z are in atomic_symbols end where
        or in atomic_symbols
    end or end where;
atom_int_pair(x) = true
    where x is cons(y,z)
        where y is in atomic_symbols, z is in integer_numerals end where
    end where
If the same variable is mentioned in two different nested qualifications, the innermost qualification applies.
The interpretation of the restrictions on equations in Section 5 is not obvious in the presence of qualified equations. Restrictions 1 and 2, regarding the appearance of variables on left and right sides of equations, are applied to the unqualified equations, ignoring the qualifying clauses. Restrictions 4 and 5, regarding possible interactions between left sides, are applied to the results of substituting variable qualifications for the instances of variables that they qualify. For example, the equation
f(x) = y where x is g(y) end where
is prohibited, because the variable y is not present on the unqualified left side, and
the pair of equations
f(x) = 0 where x is g(y) end where; g(x) = 1;
is prohibited because of the overlap in f(g(a)). In general, a variable occurring in a where clause is local to that clause, so g(x,y) = z where x is y is equivalent to g(x,y) = z, rather than g(x,x) = z. The details of interactions between variable bindings and where clauses certainly need more thought, but fortunately the subtle cases do not occur very often.
9. Miscellaneous Examples
This section contains examples of complete equational programs that do not fit any specific topic, but help give a general feeling for the capabilities of the interpreter. The first ones are primitive, and should be accessible to every reader, but later ones, such as the lambda-calculus example, are intended only for the reader whose specialized interests agree with the topic.
9.1. List Reversal
The following example, using the LISP.M syntax, is chosen for its triviality. The operation of reversal (rev) is defined using the operation of adding an element to the end of a list (addend). A trace of this example shows that the number of steps to reverse a list of length n is proportional to n². Notice that the usual LISP operators car and cdr (first element, remaining elements of a list) are not needed, because of the ability to nest operation symbols on the left-hand sides of equations. This example has no advantage over the corresponding LISP program, other than transparency of notation. It is easy to imagine a compiler that would translate equational programs of this sort into LISP in a very straightforward way.

Symbols

: List constructors
cons: 2;
nil: 0;

: Operators for list manipulation
rev: 1;
addend: 2;
include atomic_symbols.

For all x,y,z:

rev[()] = ();
rev[(x . y)] = addend[rev[y]; x];
addend[(); x] = (x);
addend[(x . y); z] = (x . addend[y; z]).
The following equations redefine list reversal in such a way that the equation interpreter will perform a linear-time algorithm. Just like the naive quadratic time version above, these equations may be compiled into a LISP program in a very straightforward way.
Symbols

cons: 2;
nil: 0;
rev: 1;
apprev: 2;
include atomic_symbols.

For all x,y,z:

rev[x] = apprev[x; ()];

: apprev[x; z] is the result of appending z to the reversal of x.
apprev[(); z] = z;
apprev[(x . y); z] = apprev[y; (x . z)].
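For comparison, the accumulating-parameter idea behind apprev can be sketched in Python (a translation made for this discussion, not part of the book; Python lists stand in for the cons lists):

```python
def apprev(x, z):
    """Append z to the reversal of x, one element per recursive call,
    so the whole reversal takes linear time."""
    if not x:
        return z
    return apprev(x[1:], [x[0]] + z)

def rev(x):
    return apprev(x, [])

print(rev([1, 2, 3]))  # [3, 2, 1]
```

Each call moves one element from x onto the front of the accumulator z, which is exactly why the equational version avoids the quadratic cost of repeated addend calls.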
9.2. Huffman Codes
The following definition of an operator producing Huffman codes [Hu52, AHU83] as binary trees is a little bit clumsier than the list reversals above to translate into LISP, since the operator Huff, a new constructor combining a partially-constructed Huffman tree with its weight, would either be omitted in a representing S-expression, or encoded as an atomic symbol. Either way, the list constructing operator is overloaded with two different intuitive meanings, and the expressions become a bit harder to read.
The following equations produce Huffman codes in the form of binary trees constructed with cons. To produce the Huffman tree for the keys K1, K2, ... Kn with weights w1, w2, ... wn in decreasing numerical order, evaluate the term BuildHuff[(Huff[w1; K1] ... Huff[wn; Kn])].

Symbols

: List construction operators
cons: 2;
nil: 0;

: Huff[w; t] represents the tree t (built from cons) having weight w.
Huff: 2;

: Tree building operators
BuildHuff: 1;
Insert: 2;
Combine: 2;

: Arithmetic and logical symbols and operators
if: 3;
add: 2;
less: 2;
include truth_values;
include integer_numerals;
include atomic_symbols.

For all weight1, weight2, tree1, tree2, x, y, remainder, item:

: if is the standard conditional function, and add, less
: are the standard arithmetic operation and test.
if[true; x; y] = x;
if[false; x; y] = y;
include addint, lessint;

: BuildHuff[list] assumes that its argument is a list of weighted trees, in
: decreasing order by weight, and combines the trees into a single tree
: representing the Huffman code for the given weights.
BuildHuff[(Huff[weight1; tree1])] = tree1;
BuildHuff[(x y . remainder)] =
    BuildHuff[Insert[remainder; Combine[x; y]]]
    where x, y are Huff[weight1; tree1] end where;
: Insert[list; tree] inserts the given weighted tree into the given list of
: weighted trees according to its weight. Insert assumes that the list is in
: decreasing order by weight.
Insert[(); item] = (item);
Insert[(Huff[weight2; tree2] . remainder); Huff[weight1; tree1]] =
    if[less[weight1; weight2];
        (Huff[weight1; tree1] Huff[weight2; tree2] . remainder);
        (Huff[weight2; tree2] . Insert[remainder; Huff[weight1; tree1]])];

: Combine[t1; t2] is the combination of the weighted trees t1 and t2 resulting
: from hanging t1 and t2 from a common root, and adding their weights.
Combine[Huff[weight1; tree1]; Huff[weight2; tree2]] =
    Huff[add[weight1; weight2]; (tree1 . tree2)].
9.3. Quicksort
The following equational program sorts a list of integers by the Quicksort procedure [Ho62, AHU74]. This program may also be translated easily into LISP. The fundamental idea behind Quicksort is that we may sort a list l by choosing a value i (usually the first value in l) and splitting l into the lists l<, l=, and l> of elements <i, =i, and >i, respectively. Then, sort l< and l> (l= is already sorted), and append them to get the sorted version of l. Quicksort sorts a list of n elements in time O(n log n) on the average.

Symbols

: List construction operators
cons: 2;
nil: 0;

: List manipulation operators
smaller, larger: 2;
append: 2;
sort: 1;

: Logical and arithmetic operators and symbols
if: 3;
less: 2;
include integer_numerals, truth_values.

For all i, j, a, b, rem:

sort[()] = ();
sort[(i . rem)] = append[sort[smaller[i; rem]]; append[(i); sort[larger[i; rem]]]];

: smaller[i; a] is a list of the elements of a smaller than or equal to the integer i.
smaller[i; ()] = ();
smaller[i; (j . rem)] = if[less[i; j]; smaller[i; rem]; (j . smaller[i; rem])];

: larger[i; a] is a list of the elements of a larger than the integer i.
larger[i; ()] = ();
larger[i; (j . rem)] = if[less[i; j]; (j . larger[i; rem]); larger[i; rem]];

: append[a; b] is the concatenation of the lists a and b.
append[(); a] = a;
append[(i . rem); a] = (i . append[rem; a]);

: if and less are the standard logical and arithmetic operations.
if[true; a; b] = a;
if[false; a; b] = b;
include lessint.
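The same smaller/larger/append decomposition translates almost mechanically into Python (a sketch made for this discussion, not the book's program):

```python
def smaller(i, a):
    # Elements of a that are <= i (the book uses less[i; j] to exclude j > i).
    return [j for j in a if not i < j]

def larger(i, a):
    # Elements of a that are > i.
    return [j for j in a if i < j]

def sort(l):
    if not l:
        return []
    i, rem = l[0], l[1:]
    return sort(smaller(i, rem)) + [i] + sort(larger(i, rem))

print(sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```

Note that elements equal to the pivot i land in smaller, so duplicates of the pivot are kept, matching the l= remark above.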
9.4. A Toy Theorem Prover
The fact that sorting a list l with an equational program is equivalent to proving sort[l] = l' based on certain assumptions, leads one to consider the similarity between sorting by exchanges, and proving equalities based on commutativity and associativity. Given the axioms x+y = y+x and (x+y)+z = x+(y+z), we quickly learn to recognize that for any two additive expressions, E1 and E2, containing the same summands possibly in a different order, E1 = E2. One way to formalize that insight is to give a procedure that takes such E1 and E2 as inputs, checks whether they are indeed equal up to commutativity and associativity, and produces the proof of equality if there is one. The proof of equality of two terms by commutativity and associativity amounts to a sorting of the summands in one or both terms, with each application of the commutativity axiom corresponding to an interchange.

In the following program, compare[a; b] takes two additive expressions a and b, and produces a proof that a=b from commutativity and associativity, if such a proof exists. Additive expressions are constructed from numbered variables, with v[i] representing the ith variable vi, and the syntactic binary operator plus. Proofs are represented by lists of expressions, with the convention that each expression in the list transforms into the next one by one application of commutativity or associativity to some subexpression. Since equality is symmetric, proofs are correct whether they are read forwards or backwards. The proof of a=b starts by proving a=a', where a' has the same variables as a, combined in the standard form vi1+(vi2+( ... +(vin-1+vin) ... )), with ij ≤ ij+1. This proof of a=a' is the value of stand[a]. A similar proof of b=b' is produced by stand[b]. Finally, stand[a] is concatenated with the reversal of stand[b]. If the standard forms of a and b are not syntactically identical, then there is a false step in the middle of the concatenated proofs, and that step is marked with the special operator falsestep[].
The interesting part of the procedure above is the stand operation, proving equality of a term with its standard form. That proof works recursively on an expression a+b, first standardizing a and b individually, then applying the following transformations to combine the standardized a and b into a single standard form.
1. (vi+a)+(vj+b) associates to vi+(a+(vj+b)) when i ≤ j

2. (vi+a)+(vj+b) commutes to (vj+b)+(vi+a), then associates to vj+(b+(vi+a)) when i > j

3. (vi+a)+vj associates to vi+(a+vj) when i ≤ j

4. (vi+a)+vj commutes to vj+(vi+a) when i > j

5. vi+(vj+b) associates to (vi+vj)+b, commutes to (vj+vi)+b, then associates to vj+(vi+b) when i > j

6. vi+vj commutes to vj+vi when i > j
In cases 1-3 above, more transformations must be applied to subexpressions. In the following program, the merge operator performs the transformations described above.
Symbols

: Constructors for lists
cons: 2;
nil: 0;
include integer_numerals;

: Constructors for additive expressions
plus: 2;
v: 1;

: Errors in proofs
falsestep: 0;

: Operators for testing and proving equality under commutativity, associativity
compare: 2;
stand: 1;
merge: 1;
plusp: 2;
appendp: 2;

: Standard list and arithmetic operators
if: 3;
and: 2;
less: 2;
addend: 2;
equ: 2;
include truth_values.
For all a, b, c, d, i, j, rem, rem1, rem2:

compare[a; b] = appendp[stand[a]; stand[b]];

stand[v[i]] = (v[i]);
stand[plus[a; b]] = merge[plusp[stand[a]; stand[b]]];

merge[(a b . rem)] = (a . merge[(b . rem)]);
merge[(a)] = merge[a];

: Case 6
merge[plus[v[i]; v[j]]] =
    if[less[j; i];
        : commute vi and vj
        (plus[v[i]; v[j]] plus[v[j]; v[i]]);
        : else no change
        (plus[v[i]; v[j]])];

: Case 5
merge[plus[v[i]; plus[v[j]; b]]] =
    if[less[j; i];
        (plus[v[i]; plus[v[j]; b]]
            : associate vi with vj
            plus[plus[v[i]; v[j]]; b]
            : commute vi and vj
            plus[plus[v[j]; v[i]]; b] .
            : associate vi with b
            plusp[(v[j]); merge[plus[v[i]; b]]]);
        : else no change
        plusp[(v[i]); merge[plus[v[j]; b]]]];
merge[plus[plus[v[i]; a]; v[j]]] =
    if[less[j; i];
        : Case 4
        (plus[plus[v[i]; a]; v[j]] .
            : commute vi+a and vj
            plusp[(v[j]); merge[plus[v[i]; a]]]);
        : Case 3
        (plus[plus[v[i]; a]; v[j]] .
            : associate a with vj
            plusp[(v[i]); merge[plus[a; v[j]]]])];

merge[plus[plus[v[i]; a]; plus[v[j]; b]]] =
    if[less[j; i];
        : Case 2
        (plus[plus[v[i]; a]; plus[v[j]; b]]
            : commute vi+a and vj+b
            plus[plus[v[j]; b]; plus[v[i]; a]] .
            : associate b with vi+a
            plusp[(v[j]); merge[plus[b; plus[v[i]; a]]]]);
        : Case 1
        (plus[plus[v[i]; a]; plus[v[j]; b]] .
            : associate a with vj+b
            plusp[(v[i]); merge[plus[a; plus[v[j]; b]]]])];
: plusp[p; q] transforms proofs p of E1=F1 and q of E2=F2 into a proof
: of plus[E1; E2] = plus[F1; F2].
plusp[(a); rem] = plusp[a; rem];
plusp[(a b . rem1); (c . rem2)] = (plus[a; c] . plusp[(b . rem1); (c . rem2)]);
plusp[a; ()] = ()
    where a is either v[i] or plus[b; c] end or end where;
plusp[a; (b . rem)] = (plus[a; b] . plusp[a; rem])
    where a is either v[i] or plus[b; c] end or end where;
: appendp[p; q] appends proofs p and the reversal of q, coalescing the last lines
: if they are the same, and indicating an error otherwise.
appendp[(a); (b)] = if[equ[a; b]; (a); (a falsestep[] b)];
appendp[(a b . rem); c] = (a . appendp[(b . rem); c]);
appendp[(a); (b c . rem)] = addend[appendp[(a); (c . rem)]; b];

: addend[l; a] adds the element a to the end of the list l.
addend[(); a] = (a);
addend[(a . rem); b] = (a . addend[rem; b]);

: equ is extended to additive terms, as a test for syntactic equality.
equ[v[i]; v[j]] = equ[i; j];
equ[plus[a; b]; plus[c; d]] = and[equ[a; c]; equ[b; d]];
equ[v[i]; plus[c; d]] = false;
equ[plus[a; b]; v[j]] = false;
include equint;

: if, and, less are standard operators.
if[true; a; b] = a;
if[false; a; b] = b;
and[a; b] = if[a; b; false];
include lessint.
9.5. An Unusual Adder
The following example gives a rather obtuse way to add two numbers. The intent of the example is to demonstrate a programming technique supported by the equation interpreter, but not by LISP, involving the definition of infinite structures. We hope that this silly example will clarify the technique, while more substantial examples in Sections 9.7, 15.3 and 15.4 will show its value in solving problems elegantly. The addition program below uses an infinite list of infinite lists, in which the ith member of the jth list is the integer i+j. In order to add two nonnegative integers, we select the answer out of this infinite addition table. The outermost evaluation strategy guarantees that only a finite portion of the table will actually be produced, so that the computation will terminate.
Symbols

: List constructors.
cons: 2;
nil: 0;
include integer_numerals;

: List utilities.
element: 2;
first: 1;
tail: 1;
inclist: 1;

: Standard arithmetic operators.
add, subtract, equ: 2;
if: 3;
include truth_values;

: Unusual integer list, addition table and operator.
intlist: 0;
addtable: 0;
weirdadd: 2.

For all i, j, l, x:

first[(x . l)] = x;
tail[(x . l)] = l;
element[i; l] = if[equ[i; 0]; first[l]; element[subtract[i; 1]; tail[l]]];
weirdadd[i; j] = element[i; element[j; addtable[]]];
addtable[] = (intlist[] . inclist[addtable[]]);
intlist[] = (0 . inclist[intlist[]]);
inclist[i] = add[i; 1] where i is in integer_numerals end where;
inclist[(i . l)] = (inclist[i] . inclist[l]);
include addint, subint, equint.
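The lazy-table technique has a natural analogue in Python generators, which also produce only the portion of an infinite structure that is actually demanded (a sketch made for this discussion, not part of the book):

```python
from itertools import count, islice

def addtable():
    # An infinite stream of infinite streams: the j-th row is j, j+1, j+2, ...
    return (count(j) for j in count(0))

def element(i, stream):
    # Select the i-th member of a stream, forcing only a finite prefix.
    return next(islice(stream, i, None))

def weirdadd(i, j):
    # The i-th member of the j-th row of the table is i + j.
    return element(i, element(j, addtable()))

print(weirdadd(2, 3))  # 5
```

As with the outermost evaluation strategy above, only finitely many rows and entries are ever generated, so the selection terminates.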
9.6. Arbitrary-Precision Integer Operations
The equation interpreter implementation provides the usual integer operations as primitives, when these operations are applied to integers that may be represented in single precision, and when the result of the operation is also single precision. In order to provide arbitrary precision integer operations, we extend these primitive sets of equations with some additional explicit equations.

The following equations define arbitrary-precision arithmetic on positive integers in a straightforward way. A large base is chosen, for example base 2^15 = 32768, and the constructor extend is used to represent large numbers, with the understanding that extend(x,i) represents x*base+i, for 0 ≤ i < base. longadd and longmult are the binary operators for addition and multiplication. Addition follows the grade school algorithm of adding digits from right to left, keeping track of a carry. Multiplication also follows the usual algorithm for hand calculation, adding up partial products produced by multiplying one digit of the second multiplicand with the entire first multiplicand.
Symbols

: Constructors for arbitrary-precision integers.
extend: 2;
include integer_numerals;

: Base for arithmetic.
base: 0;

: Single precision arithmetic operators, conditional, and test.
add: 2;
multiply: 2;
divide: 2;
modulo: 2;
equ: 2;
if: 3;
include truth_values;

: Arbitrary-precision arithmetic operators.
longadd: 2;
longmult: 2;

: Operators used in defining the arithmetic operators.
sum: 3;
acarry: 3;
carryadd: 3;
prod: 3;
mcarry: 3;
carrymult: 3.

For all x, y, z, i, j, k:

base() = 32768;

longadd(x, y) = carryadd(x, y, 0);
carryadd(i, j, k) =
    if(equ(acarry(i, j, k), 0),
        sum(i, j, k),
        extend(acarry(i, j, k), sum(i, j, k)))
    where i, j are in integer_numerals end where;
carryadd(extend(x, i), j, k) = carryadd(extend(x, i), extend(0, j), k)
    where j is in integer_numerals end where;
carryadd(i, extend(y, j), k) = carryadd(extend(0, i), extend(y, j), k)
    where i is in integer_numerals end where;
carryadd(extend(x, i), extend(y, j), k) =
    extend(carryadd(x, y, acarry(i, j, k)), sum(i, j, k));
sum(i, j, k) = modulo(add(i, add(j, k)), base());
acarry(i, j, k) = divide(add(i, add(j, k)), base());

longmult(x, j) = carrymult(x, j, 0)
    where j is in integer_numerals end where;
longmult(x, extend(y, j)) =
    longadd(carrymult(x, j, 0), extend(longmult(x, y), 0));
carrymult(i, j, k) =
    if(equ(mcarry(i, j, k), 0),
        prod(i, j, k),
        extend(mcarry(i, j, k), prod(i, j, k)))
    where i is in integer_numerals end where;
carrymult(extend(x, i), j, k) =
    extend(carrymult(x, j, mcarry(i, j, k)), prod(i, j, k));
prod(i, j, k) = modulo(add(multiply(i, j), k), base());
mcarry(i, j, k) = divide(add(multiply(i, j), k), base());
include addint, divint, equint, modint, multint.
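The digit-by-digit carry propagation of carryadd can be checked against a direct Python transcription (a sketch made for this discussion, not part of the book; the pair (hi, lo) stands in for extend(hi, lo), and a plain int below the base stands for itself):

```python
BASE = 32768  # 2**15, as in the equations above

def carryadd(x, y, k):
    # x, y are plain digits or pairs (hi, lo); k is a single-digit carry.
    xhi, xlo = x if isinstance(x, tuple) else (0, x)
    yhi, ylo = y if isinstance(y, tuple) else (0, y)
    digit = (xlo + ylo + k) % BASE    # sum(i, j, k)
    carry = (xlo + ylo + k) // BASE   # acarry(i, j, k)
    if xhi == 0 and yhi == 0:
        return digit if carry == 0 else (carry, digit)
    return (carryadd(xhi, yhi, carry), digit)

def longadd(x, y):
    return carryadd(x, y, 0)

def value(n):
    # Interpret the representation as an ordinary integer, for checking.
    return n if isinstance(n, int) else value(n[0]) * BASE + n[1]

print(value(longadd(32767, 1)))
print(value(longadd((1, 5), (2, 7))))
```

Each recursive call handles one digit position and passes the carry leftward, exactly as the grade school algorithm described in the text.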
The simple equational program above has several objectionable qualities. First, in order to make the operations sum, acarry, prod, and mcarry really work, we must choose a base much smaller than the largest integer representable in sin- gle precision. In particular, to allow evaluation of prod and mcarry in all cases, the base used by the program may not exceed the square root of full single preci- sion. This use of a small base doubles the sizes of multiple-precision integer representations. By redefining the offending operators, we may allow full use of single precision, but only at the cost of a substantial additional time overhead. For
example, sum would have to be defined as
sum(i, j, k) = addmod(i, addmod(j, k, base()), base());
addmod(i, j, k) = if(less(i, subtract(k, j)),
    add(i, j),
    add(subtract(max(i, j), k), min(i, j)));
and prod would be even more complex, because of the possibility of zeroes. Even if
we accept the reduced base for arithmetic, extra time is required to add or multiply
two single-precision numbers with a single-precision result, since we must check for the nonexistent carry. Finally, it is distasteful to introduce new operators longadd
and longmult when our intuitive idea is to extend add and multiply.
In order to avoid these objections, we need slightly better support from the predefined operations add and multiply. When faced with an expression add(a, b), where a and b are single-precision numerals but their sum requires double precision, the current version of the equation interpreter merely declines to reduce. In order to support multiple-precision arithmetic cleanly and efficiently, the implementation must be modified so that add and multiply produce results whenever their arguments are single-precision numerals, even if the results require double precision. The double-precision results will be represented by use of the extend operator in the program above. The only technical problem to be solved in providing this greater support is the syntactic one: what status does the extend operator have - need it be declared by the user? This syntactic problem is a special case of a very general need for modular constructs, including facilities to combine sets of equations and to hide certain internally meaningful symbols from the user. Rather than solve the special case, we have postponed this important
improvement until the general problem of modularity is solved (see Section 14).
Given an appropriate improvement to the predefined arithmetic operations, arbitrary precision may be provided by the following equational program. lowdigit picks off the lowest order digit of an extended-precision numeral, highdigits produces all but the lowest order digit.
Symbols
: Constructors for arbitrary-precision integers.
extend: 2;
include integer_numerals;
: Arithmetic operators.
add: 2; multiply: 2;
: Operators used in defining the arithmetic operators.
lowdigit: 1; highdigits: 1.
For all x, y, z, i, j, k:
add(extend(x, i), extend(y, j)) = add(extend(add(x, y), 0), add(i, j));
add(extend(x, i), j) = extend(add(x, highdigits(add(i, j))), lowdigit(add(i, j)))
    where j is in integer_numerals end where;
add(i, extend(y, j)) = extend(add(y, highdigits(add(i, j))), lowdigit(add(i, j)))
    where i is in integer_numerals end where;
lowdigit(extend(x, i)) = i;
highdigits(extend(x, i)) = x;
multiply(x, extend(y, j)) = add(multiply(x, j), extend(multiply(x, y), 0));
multiply(extend(x, i), j) = add(multiply(i, j), extend(multiply(x, j), 0))
    where j is in integer_numerals end where;
include addint, multint.
This improved equational program answers all of the objections to the first one, and is substantially simpler. Notice that, whenever an operation is applied to single-precision arguments, yielding a single-precision result, only the predefined equations for the operation are applied, so there is no additional time overhead for those operations. Negative integers may be handled through a negate operator, or by negating every digit in the representation of a negative number. The second solution wastes one bit in each digit, but is more compact for single-precision negative numbers, and avoids additional time overhead for operations on single-precision
negative numbers.
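The behavior of the extend representation can be mirrored in ordinary code. The following Python sketch (all names and the helper to_int are ours, for illustration only; the equational syntax itself is not executable here) represents a number either as a single digit, an int in [0, B), or as a pair extend(high, low) standing for high*B + low, and follows the carryadd equations digit by digit.

```python
# Sketch of the extend-based representation: a number is either a single
# digit (an int in [0, B)) or a pair extend(high, low) meaning high*B + low.
# Names are illustrative, not the book's interpreter.

B = 32768  # the value of base(); small enough that digit sums stay exact

def extend(high, low):
    return ("extend", high, low)

def to_int(n):
    # Interpret a representation as an ordinary Python integer (for checking).
    if isinstance(n, int):
        return n
    _, high, low = n
    return to_int(high) * B + to_int(low)

def longadd(x, y, carry=0):
    # Mirrors carryadd: add digit by digit, propagating a carry of 0 or 1.
    if isinstance(x, int) and isinstance(y, int):
        total = x + y + carry
        return total if total < B else extend(total // B, total % B)
    if isinstance(x, int):
        x = extend(0, x)          # pad the shorter operand, as in the
    if isinstance(y, int):        # second and third carryadd equations
        y = extend(0, y)
    _, xh, xl = x
    _, yh, yl = y
    s = xl + yl + carry
    return extend(longadd(xh, yh, s // B), s % B)
```

For example, longadd(extend(1, 32767), 1) carries through the low digit to produce extend(2, 0), i.e. 65536.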
9.7. Exact Addition of Real Numbers
Another interesting example of programming with infinite lists involves exact computations on the constructive real numbers [Bi67, Br79, My72]. In principle, a constructive real number is a program enumerating a sequence of rational intervals converging to a real number. Explicit use of these intervals is quite clumsy compared to the more intuitive representation of reals as infinite decimals. Unfortunately, addition is not computable over infinite decimals. Suppose we try to add 0.99··· to 1.00···. No matter how many digits of the two summands we have seen, we cannot decide for sure whether the sum should be 1.··· or 2.···. If the sequence of 9s in the first summand ever drops to a lower digit, then the sum must be of the form 1.···; if the sequence of 0s in the second summand ever rises to a higher digit, then the sum must be of the form 2.···. As long as the 9s and 0s continue, we cannot reliably produce the first digit of the sum. Ironically, in exactly the case where we can never decide whether to use 1.··· or 2.···, either one would be right, since 1.99··· = 2.00···. One aspect of the problem is that conventional infinite decimal notation allows multiple representations of certain numbers, such as 1.99··· = 2.00···, but requires a unique representation of others, such as 0.11···. The solution is to generalize the notation so that every number has multiple representations, by allowing individual digits to be negative as well as positive. This idea was proposed for bit-serial operations on varying-precision real numbers [Av61, At75, Ol79].
Let the infinite list of integers (d_0 d_1 d_2 ···) be used to represent the real number Σ_{i≥0} d_i*10^(-i). d_0 is the integer part, and there is an implicit decimal point between d_0 and d_1. In conventional base 10 notation, each d_i for i ≥ 1 would be limited to the range [0,9]. Suppose that the range [-9,+9] is used instead ([-5,+5] suffices, in fact, but leads to a clumsier program). As a result, every real number has more than one representation. In particular, the intervals corresponding to finite decimals overlap, so that every real number is in the interior of arbitrarily small intervals. For a conventional finite decimal a, let I_{0,9}(a) denote the interval of real numbers having conventional representations beginning with a. Similarly, I_{-9,9}(a) denotes the interval of real numbers having extended representations beginning with a, where a is a finite decimal with digits from -9 to 9.
The problem with the conventional notation is that certain real numbers do not lie in the interiors of any small intervals I_{0,9}(a), but only on the endpoints. When generating the decimal expansion of a real number x, it is not safe to specify an interval with x at an endpoint, since an arbitrarily small correction to x may take us out of that interval. For example, 1.1 is only an endpoint of the intervals I_{0,9}(1.1) = [1.1, 1.2], I_{0,9}(1.10) = [1.1, 1.11], I_{0,9}(1.100) = [1.1, 1.101], etc., and the smallest interval I_{0,9}(a) with 1.1 in its interior is I_{0,9}(1) = [1, 2]. On the other hand, 1.1 is in the interior of each interval I_{-9,9}(1.1) = [1.0, 1.2], I_{-9,9}(1.10) = [1.09, 1.11], I_{-9,9}(1.100) = [1.099, 1.101], etc., because the larger range of digits stretches these intervals to twice the width of the conventional ones, yielding enough overlaps of intervals to avoid the singularities of the conventional notation.
The notation described above is a fixed point notation. Infinite floating point decimals may also be defined, allowing d_0 to be restricted to the range [-9,+9] as well. Such an extension of the notation makes the programs for arithmetic operations more complex, but does not introduce any essential new ideas. Conventional computer arithmetic on real numbers truncates the infinite representation of a real to some finite precision. The equation interpreter is capable of handling infinite lists, so, except for the final output, it may manipulate exact real numbers.
In order to program addition of infinite-precision real numbers, as described above, we mimic a program in which the two input numbers are presented to a process called addlist, which merely adds corresponding elements in the input lists to produce the output list. Notice that the output from addlist represents the real sum of the two inputs, but has digits in the range [-18,+18]. The output from addlist goes to a process called compress, which restores the list elements to the range -9 to +9. The output from compress is the desired result. The function add is defined by the composition of addlist and compress. Notice that, while addlist produces one output digit for every input pair, compress must see more than one input digit in order to produce a single output digit. Looking at d_i and d_{i+1}, where d_i has already been compressed to the range [-8,+8], compress adjusts d_{i+1} into the range [-8,+8] by adding or subtracting 10, if necessary, and compensating by adjusting d_i by ±1. Notice that it is important to first place a digit in [-8,+8], so that it may be used in the adjustment of the next digit and stay in [-9,+9].
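The two processes just described have a direct rendering in conventional code. The following Python sketch models infinite digit streams as generators; all names (repeat_digits, take, and so on) are ours, chosen for illustration.

```python
# Illustrative rendering of addlist and compress on infinite digit streams,
# modelled as Python generators. Names and helpers are ours, not part of
# the equational program.

def repeat_digits(prefix, cycle):
    # The digit stream beginning with prefix and repeating cycle forever.
    for d in prefix:
        yield d
    while True:
        for d in cycle:
            yield d

def take(n, stream):
    return [next(stream) for _ in range(n)]

def addlist(xs, ys):
    # Add corresponding digits; result digits lie in [-18, +18].
    while True:
        yield next(xs) + next(ys)

def compress(ds):
    # Restore digits to [-9, +9]. Each digit is held back until the next
    # one has had its chance to adjust it by +/-1, as described above.
    i = next(ds)
    while True:
        j = next(ds)
        if j < -8:
            yield i - 1
            i = j + 10
        elif j > 8:
            yield i + 1
            i = j - 10
        else:
            yield i
            i = j

def add(xs, ys):
    return compress(addlist(xs, ys))
```

With this sketch, adding 0.999··· to 1.000··· produces the digits 2, 0, 0, ··· one at a time, never getting stuck on the ambiguous carry.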
In order to use the addition procedure, we need to provide some interesting definitions of infinite-precision real numbers, and also a function called standard to produce the output in conventional base-10 with finite precision. standard takes two arguments: a single integer i, and an infinite-precision real r. The result is the standard base-10 representation of r to i significant digits. Notice that standard
may require i+1 digits of input in order to produce the first digit of output.
Symbols
: List constructors
cons: 2; nil: 0;
: List manipulation operators
first: 1; tail: 1;
: Real arithmetic operator
add: 2;
: Other operators needed in the definitions
addlist: 2; compress: 1; stneg, stpos: 3; revneg, revpos: 2; rotate: 1;
: Input and output operators
repeat: 2; standard: 2;
: Standard arithmetic and logical operators and symbols.
if: 3; equ, less, greater: 2;
include integer_numerals, truth_values.
For all x, y, i, j, k, l, a, b:
: add is extended to infinite-precision real numbers.
add[(i . x); (j . y)] = compress[addlist[(i . x); (j . y)]];
: addlist adds corresponding elements of infinite lists.
addlist[(i . x); (j . y)] = (add[i; j] . addlist[x; y]);
: compress normalizes an infinite list of digits in the range [-18, +18]
: into the range [-9, +9].
compress[(i j . x)] =
    if[less[j; -8]; (subtract[i; 1] . compress[(add[j; 10] . x)]);
    if[greater[j; 8]; (add[i; 1] . compress[(subtract[j; 10] . x)]);
    (i . compress[(j . x)])]];
: repeat[x; y] is the infinite extension of the decimal expansion x
: by repeating the finite sequence y.
repeat[(i . x); y] = (i . repeat[x; y]);
repeat[(); y] = (first[y] . repeat[(); rotate[y]]);
: rotate[y] rotates the first element of the list y to the end.
rotate[(i)] = (i);
rotate[(i j . x)] = (j . rotate[(i . x)]);
: standard[i; a] is the normal base 10 expansion of the first i digits of a.
standard[j; (i . a)] =
    if[equ[j; 0]; 0;
    if[equ[i; 0]; (0 . standard[subtract[j; 1]; a]);
    if[less[i; 0]; stneg[j; (i . a); ()];
    stpos[j; (i . a); ()]]]];
: stneg[i; a; b] translates the first i digits of a into normal base 10
: notation, backwards, and appends b, assuming that a is part of a
: negative number.
: stpos[i; a; b] does the same thing for positive numbers.
stneg[j; (i . a); b] =
    if[equ[j; 0];
    revneg[b; ()];
    stneg[subtract[j; 1]; a; (i . b)]];
stpos[j; (i . a); b] =
    if[equ[j; 0];
    revpos[b; ()];
    stpos[subtract[j; 1]; a; (i . b)]];
: revneg[a; b] reverses the finite decimal expansion a, borrowing and
: carrying so as to make each digit nonpositive, finally appending the list b.
: revpos[a; b] does the same, making each digit nonnegative.
revneg[(); b] = b;
revneg[(i . a); b] =
    if[less[i; 1]; revneg[a; (i . b)];
    revneg[(add[first[a]; 1] . tail[a]); (add[i; -10] . b)]];
revpos[(); b] = b;
revpos[(i . a); b] =
    if[less[i; 0]; revpos[(add[first[a]; -1] . tail[a]); (add[i; 10] . b)];
    revpos[a; (i . b)]];
: first, tail, if, add, subtract, equ, less, are standard list,
: conditional, and arithmetic operators.
first[(i . a)] = i; tail[(i . a)] = a;
if[true; x; y] = x; if[false; x; y] = y;
greater[i; j] = less[subtract[0; i]; subtract[0; j]]
    where i, j are in integer_numerals end where;
include addint, subint, equint, lessint.
In this example, producing output in standard form was much more difficult than performing the addition. Other arithmetic operations, however, such as multiplication, are much more difficult to program.
9.8. Polynomial Addition
Polynomial addition is a commonplace program in LISP, with polynomials represented by lists of coefficients. The equation interpreter allows polynomial sums to be computed in the same notation that we normally use to write polynomials, with no distinction between the operator add that applies to integer numerals, the operator add that applies to polynomials, and the operator add used to construct polynomials. In LISP, the first would be PLUS, the second a user-defined function, perhaps called POLYPLUS, and the third would be encoded by a particular use of cons.
In effect, the equational programs shown below for "adding polynomials" are really just simplifying polynomial expressions into a natural canonical form. The Horner-rule form for a polynomial of degree n in the variable X is
c0 + X*(c1 + X*(··· + X*cn ···)). The list (c0 c1 ··· cn), typically used to represent the same polynomial in LISP, is a very simple encoding of the Horner-rule form. In the equation interpreter, we may use the Horner-rule form literally. The resulting program simplifies terms of the form π1+π2, where each of π1 and π2 is in Horner-rule form, to an equivalent Horner-rule form. Notice that the symbol add in the following programs may be an active symbol or a static constructor for polynomials, depending on context. Also, notice that the variable X over which the polynomials are expressed is not a variable with respect to the equational program, but an atomic_symbol.

Symbols

add: 2; multiply: 2;
include integer_numerals, atomic_symbols.

For all i, j, a, b:

add(add(i, multiply(X, a)), add(j, multiply(X, b))) =
    add(add(i, j), multiply(X, add(a, b)))
    where i, j are in integer_numerals end where;
add(i, add(j, multiply(X, b))) = add(add(i, j), multiply(X, b))
    where i, j are in integer_numerals end where;
add(add(i, multiply(X, a)), j) = add(add(i, j), multiply(X, a))
    where i, j are in integer_numerals end where;
include addint.
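The three equations above transcribe almost line for line into conventional code. In the following Python sketch (the nested-tuple representation and the name padd are ours), a polynomial is either an integer constant or a term ("+", c, ("*", "X", p)) meaning c + X*p.

```python
# Horner-rule polynomials as nested terms: an integer constant, or
# ("+", c, ("*", "X", p)) meaning c + X*p. A direct transcription of the
# three equations above; representation and names are ours.

def padd(p, q):
    if isinstance(p, int) and isinstance(q, int):
        return p + q                            # predefined integer addition
    if isinstance(p, int):                      # i + (j + X*b) = (i+j) + X*b
        _, j, xb = q
        return ("+", p + j, xb)
    if isinstance(q, int):                      # (i + X*a) + j = (i+j) + X*a
        _, i, xa = p
        return ("+", i + q, xa)
    _, i, (_, _, a) = p                         # (i + X*a) + (j + X*b)
    _, j, (_, _, b) = q                         #   = (i+j) + X*(a+b)
    return ("+", i + j, ("*", "X", padd(a, b)))
```

Like the equational program, this sketch leaves high-order zero coefficients in place: adding 1+X*(2+X*3) to 1+X*(2+X*(-3)) yields the term for 2+X*(4+X*0).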
The program above is satisfyingly intuitive, but does not remove high-order 0 coefficients. Thus, (1+X*(2+X*3))+(1+X*(2+X*(-3))) reduces to 2+X*(4+X*0) instead of the more helpful 2+X*4. Getting rid of the high-order zeroes is tricky, since the natural equations X*0 = 0 and a+0 = a suffer from overlaps with the
other equations. One solution, shown below, is to check for zeroes before constructing a Horner-rule form, rather than eliminating them afterwards.

Symbols
add: 2;
multiply: 2;
if: 3;
equ: 2;
and: 2;
include integer_numerals, atomic_symbols, truth_values.
For all i, j, a, b, c, d:

add(add(i, a), add(j, b)) = add(add(i, j), add(a, b))
    where i, j are in integer_numerals, a, b are multiply(c, d) end where;
add(i, add(j, b)) = add(add(i, j), b)
    where i, j are in integer_numerals, b is multiply(c, d) end where;
add(add(i, a), j) = add(add(i, j), a)
    where i, j are in integer_numerals, a is multiply(c, d) end where;
add(multiply(X, a), multiply(X, b)) =
    if(equ(add(a, b), 0), 0, multiply(X, add(a, b)));
equ(add(i, multiply(X, a)), add(j, multiply(X, b))) = and(equ(i, j), equ(a, b))
    where i, j are in integer_numerals end where;
equ(i, add(j, multiply(X, b))) = false
    where i, j are in integer_numerals end where;
equ(add(i, multiply(X, a)), j) = false
    where i, j are in integer_numerals end where;
if(true, a, b) = a; if(false, a, b) = b; and(a, b) = if(a, b, false);
include addint, equint.
It is amusing to consider other natural forms of polynomials, such as the power-series form, c0*X^0 + c1*X^1 + ··· + cn*X^n. This corresponds to the representation of polynomials as lists of exponent-coefficient pairs. For dense polynomials, the exponents waste space, but for sparse polynomials the omission of internal zeroes may make up for the inclusion of exponents, as in 1+X^n for large n. A nice equational programming challenge is to produce an elegant program for addition of polynomials in power-series form.
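The underlying algorithm for that challenge is an ordered merge. The following Python sketch (representation and the name psum are ours; the equational formulation is left as the exercise) stores a polynomial as a list of (exponent, coefficient) pairs in increasing exponent order, with zero coefficients omitted.

```python
# One plausible realization of the power-series representation: a
# polynomial as a list of (exponent, coefficient) pairs in increasing
# exponent order, zero coefficients omitted. A sketch, not the exercise's
# intended equational solution.

def psum(p, q):
    out = []
    i = j = 0
    while i < len(p) and j < len(q):
        (e1, c1), (e2, c2) = p[i], q[j]
        if e1 < e2:
            out.append((e1, c1)); i += 1
        elif e2 < e1:
            out.append((e2, c2)); j += 1
        else:
            if c1 + c2 != 0:          # cancelled terms simply disappear
                out.append((e1, c1 + c2))
            i += 1
            j += 1
    return out + p[i:] + q[j:]
```

Note how cancellation falls out naturally: equal exponents whose coefficients sum to zero produce no pair at all, so no separate zero-removal pass is needed.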
9.9. The Combinator Calculus
Weak reduction in the combinator calculus [CF58, St72] is a natural sort of computation to describe with equations. The following equations use the Lambda syntax of Section 4.3 to allow the abbreviation

(a1 a2 ··· an)

for the expression

AP(AP( ··· AP(a1, a2), ··· ), an)
The symbol Lambda from Section 4.3 is not used in the combinator calculus.

Symbols

AP: 2;
S, K, I: 0;
include atomic_symbols.

For all x, y, z:
(S x y z) = (x z (y z));
(K x y) = x;
(I x) = x.
This example, and the polynomial addition of Section 9.8, differ from the first ones in that the only symbol that can construct complex expressions, AP, appears (implicitly) at the head of left-hand sides of equations. In many interesting sys- tems of terms, there are one or more symbols that do not appear at the heads of left-hand sides, so that they may be used to construct structures that are stable with respect to reduction. These stable structures may be analyzed and rearranged by other operators. For example, in LISP, the symbol cons is a constructor, and an expression made up only of cons and atomic symbols (i.e., an S-expression), is always in normal form. It is helpful to notice the existence of constructors when they occur, but the example above illustrates the usefulness of allowing systems
without constructors. The use of constructors is discussed further in Section 12.1.
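For concreteness, the weak reduction the three equations specify can be sketched as a small evaluator in Python (the encoding and names are ours): an application AP(f, a) is the pair (f, a), and the combinators are the strings "S", "K", "I". Reduction is leftmost-outermost, matching the interpreter's strategy.

```python
# A weak-reduction evaluator for S, K, I, sketched in Python. AP(f, a) is
# encoded as the tuple (f, a); combinators are bare strings.

def step(t):
    # Return a term and a flag saying whether a redex fired.
    if isinstance(t, tuple):
        f, a = t
        if f == "I":                               # (I x) = x
            return a, True
        if isinstance(f, tuple) and f[0] == "K":   # (K x y) = x
            return f[1], True
        if (isinstance(f, tuple) and isinstance(f[0], tuple)
                and f[0][0] == "S"):               # (S x y z) = (x z (y z))
            x, y, z = f[0][1], f[1], a
            return ((x, z), (y, z)), True
        f2, changed = step(f)                      # otherwise reduce inside,
        if changed:                                # leftmost first
            return (f2, a), True
        a2, changed = step(a)
        return (f2, a2), changed
    return t, False

def normalize(t):
    changed = True
    while changed:
        t, changed = step(t)
    return t
```

For example, normalize applied to the encoding of (S K K x) returns "x", illustrating that S K K behaves as the identity.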
9.10. Beta Reduction in the Lambda Calculus

This example should only be read by a user with previous knowledge of the lambda calculus. The reader needs to read de Bruijn's article [deB72] in order to understand the treatment of variables. The object is to reduce an arbitrary lambda term to normal form by a sequence of β-reductions. A number of rather tricky problems are encountered, but some of the usual problems encountered by other implementations of β-reduction are avoided by use of the equation interpreter. This example uses the Lambda notation of Section 4.3.
The lambda calculus of Church [Ch41] presents several sticky problems for the design of an evaluator. First, the problem of capture of bound variables appears to require the use of the α-rule
\x.E → \y.E[y/x]
to change bound variables. There is no simple way to generate the new variables needed for application of the α-rule in an equational program. In more conventional languages, variable generation is simple, but its presence clutters up the program, and causes the outputs to be hard to read.
De Bruijn [deB72] gives a notation for lambda terms in which an occurrence of a bound variable is represented by the number of lambda bindings appearing between it and its binding occurrence. This notation allows a simple and elegant solution of the technical problem of capture, but provides an even less readable output. We represent each bound variable by a term var[x, i], where x is the name of the variable and i is the de Bruijn number. The first set of equations below translates a lambda term in traditional notation into this modified de Bruijn notation. De Bruijn notation normally omits the name of the bound variable immediately after an occurrence of lambda, but we retain the name of the variable for readability. We write the de Bruijn form of a lambda binding as \:x.E (the x appears directly as an argument to the lambda) to distinguish it from the traditional notation \x.E (in which the first argument to lambda is the singleton list containing x).
Symbols
: Operators constructing lambda expressions
Lambda: 2; AP: 2;
: Constructors for lists
cons: 2; nil: 0;
: var[x, i] represents the variable with de Bruijn number i, named x.
var: 2;
: bindvar carries binding instances of variables to the corresponding bound : instances, computing de Bruijn numbers on the way.
bindvar: 3;
: Arithmetical and logical operators and symbols
if: 3;
equ: 2;
add: 2;
include atomic_symbols, truth_values, integer_numerals.
For all x, y, E, F, i, j:
: Multiple-argument lambda bindings are broken into sequences of lambdas.
\x y . rem.E = \x.\y . rem.E;
: Single-argument lambda bindings are encoded in de Bruijn notation.
\x.E = \:x.bindvar[E, x, 0];
: bindvar[E, x, i] attaches de Bruijn numbers to all free instances of the
: variable x in the lambda-term E, assuming that E is embedded in exactly i
: lambda bindings within the binding instance of x.
bindvar[x, y, i] = if[equ[x, y], var[x, i], x]
    where x is in atomic_symbols end where;
bindvar[var[x, j], y, i] = var[x, j];
bindvar[(E F), y, i] = (bindvar[E, y, i] bindvar[F, y, i]);
bindvar[\:x.E, y, i] = \:x.bindvar[E, y, add[i, 1]]
    where x is in atomic_symbols end where;
: if is the standard conditional function, equ the standard equality test,
: and add the standard addition operator on integers.
if[true, E, F] = E; if[false, E, F] = F;
include equatom, addint, equint.
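The translation performed by bindvar can be sketched in conventional code as follows. In this Python rendering (the tuple encoding and all names are ours), a term is a bare string for a variable, ("lam", x, body) for \x.body, ("ap", f, a) for an application, and ("var", x, i) for a converted variable with de Bruijn number i.

```python
# A sketch of the bindvar translation. Like the equations above, it does
# not treat shadowed variable names specially; names are ours.

def bindvar(E, y, i):
    # Attach de Bruijn numbers to free instances of y in E, which sits
    # under exactly i lambda bindings inside y's binding instance.
    if isinstance(E, str):
        return ("var", E, i) if E == y else E
    if E[0] == "var":
        return E
    if E[0] == "ap":
        return ("ap", bindvar(E[1], y, i), bindvar(E[2], y, i))
    _, x, body = E                     # ("lam", x, body)
    return ("lam", x, bindvar(body, y, i + 1))

def debruijn(E):
    # Convert every binding \x.body to the modified de Bruijn form.
    if isinstance(E, str) or E[0] == "var":
        return E
    if E[0] == "ap":
        return ("ap", debruijn(E[1]), debruijn(E[2]))
    _, x, body = E
    return ("lam", x, debruijn(bindvar(body, x, 0)))
```

For example, \x.\y.(x y) converts to \:x.\:y.(var[x,1] var[y,0]): x is separated from its binder by one intervening lambda, y by none.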
In order to perform evaluation of a lambda term in de Bruijn notation, the
transformation described above must be done logically prior to the actual β-reduction steps. In principle, equations for de Bruijn notation and β-reduction could be combined into one specification, but it seems to be rather difficult to avoid overlapping left-hand sides in such a combined specification (see Section 5, restriction 4). At any rate, it makes logical sense to think of the transformation to de Bruijn notation as a syntactic preprocessing step, rather than part of the semantics of β-reduction. Therefore, we built a custom syntactic preprocessor for the β-
reduction equations. After executing
loadsyntax Lambda
int.in (the syntactic preprocessor for the command ei) is the shell script:
#! /bin/sh
SYSTEM/Syntax/Outersynt/Lambda/int.in SYSTEM |
SYSTEM/Syntax/Common/int.in.trans SYSTEM |
SYSTEM/Syntax/Common/int.in.fin SYSTEM ;
where SYSTEM is the directory containing equational interpreter system libraries,
differing in different installations. We edited int.in to look like
#! /bin/sh
SYSTEM/Syntax/Outersynt/Lambda/int.in SYSTEM |
SYSTEM/Syntax/Common/int.in.trans SYSTEM |
SYSTEM/Syntax/Common/int.in.fin SYSTEM |
DEBRUIJN/interpreter ;
where DEBRUIJN is the directory in which we constructed the transformation to
de Bruijn notation. We did not change pre.in (the syntactic preprocessor for ep)
or int.out (the pretty-printer for ei). Even with the elegant de Bruijn notation, two technical problems remain.
First, the official definition of β-reduction:
(\x.E F) → E[F/x]
cannot be written as a single equation, since the equation interpreter has no notation for syntactic substitution (see [Kl80a] for a theoretical discussion of term rewriting systems with substitution). The obvious solution to this problem is to introduce a symbol for substitution, and define its operation recursively with equations. A nice version of this solution is given by Staples [St79], along with a proof that leftmost-outermost evaluation is optimal for his rules. For notational economy, we take advantage of the fact that the lambda term \x.E may be used to represent syntactic substitution, so that no explicit symbol for substitution is required. Combining this observation with Staples' rules, we produced the following recursive version of β-reduction:
(\x.x G) → G
(\x.y G) → y    where x and y are different variables
(\x.(E F) G) → ((\x.E G) (\x.F G))
(\x.\y.E G) → \y.(\x.E G)

These rules may be translated straightforwardly into the de Bruijn notation, and
written as equations, using a conditional function and equality test to combine the
first two rules into one equation. In the de Bruijn form, occurrences of lambda
must be annotated with integers indicating how many other instances of lambda have been passed in applications of the fourth rule above. Otherwise, there would be no way to recognize the identity of the two instances of the same variable in the first rule. Initially and finally, all of these integer labels on lambdas will be 0; only in intermediate steps of substitution will they acquire higher values.
Unfortunately, the left-hand side of the third rule overlaps itself, violating restriction 4 of Section 5. To avoid this overlap, we introduce a second application operator, IAP, to distinguish applications that are not the heads of rules. The third rule above is restricted to the case where E is applied to F by IAP. Since \x.(E F) is applied to G by the usual application operator, AP, there is no overlap. Interestingly, Staples introduced essentially the same restriction in a different notation because, without the restriction, leftmost-outermost reduction is not optimal. This technique for avoiding overlap is discussed in Section 12.4. [OS84] develops these ideas for evaluating lambda-terms more thoroughly, but not in the notation of the equation interpreter. All of the observations above lead to the following equations.
Symbols
: Constructors for lists
cons: 2; nil: 0;
: Constructors for lambda-terms
: IAP represents an application that is known to be inert (cannot become the
: head of a redex as the result of reductions in the subtree below).
Lambda: 2; AP: 2; IAP: 2;
incvars: 2;
: Arithmetical and logical operators
if: 3;
add: 2;
equ: 2;
less: 2;
include atomic_symbols, truth_values, integer_numerals.
For all x, y, z, E, F, G, i, j:
: Detect inert applications and mark them with IAP.
(x E) = IAP[x, E]
    where x is either var[y, i] or IAP[F, G] or in atomic_symbols end or end where;
: \x:i.E represents a lambda expression that has passed by i other instances
: of lambda. It is necessary to count such passings in order to recognize
: instances of the bound variable corresponding to the x above. Only active
: instances of lambda, that is, ones that are actually applied to something,
: are given an integer tag of this sort.
(\:x.E F) = (\x:0.E F) where x is in atomic_symbols end where;
(\y:i.var[x, j] E) =
    if[equ[i, j], E, if[less[i, j], var[x, add[j, -1]], var[x, j]]]
    where i is in integer_numerals end where;
(\y:i.x E) = x
    where x is in atomic_symbols, i is in integer_numerals end where;
(\y:i.IAP[E, F] G) = ((\y:i.E G) (\y:i.F G))
    where i is in integer_numerals end where;
(\y:i.\:z.E F) = \:z.(\y:add[i, 1].E incvars[F, 0])
    where i is in integer_numerals end where;
incvars[var[x, i], j] = if[less[i, j], var[x, i], var[x, add[i, 1]]];
incvars[x, i] = x where x is either in atomic_symbols or in integer_numerals or in truth_values or in character_strings end or end where;
incvars[IAP[E, F], i] = IAP[incvars[E, i], incvars[F, i]];
incvars[\x:t.E, i] = \x:incvars[t, i].incvars[E, add[i, 1]];
if[true, x, y] = x; if[false, x, y] = y;
include equint, addint, lessint.
Certain other approaches to the lambda calculus, such as the evaluation strategy in LISP, avoid some of the notational problems associated with overlapping left-hand sides by using an evaluation operator. Essentially, such techniques reduce eval[E] to the normal form of E, rather than reducing E itself. Such a solution could be programmed with equations, but it introduces two more problems, both of which exist in standard implementations of LISP. Notice that the outermost evaluation strategy used by the equation interpreter exempted us from worrying about cases, such as (\x.y (\x.(x x) \x.(x x))), which have a normal form, but also an infinite sequence of reductions. Implementations of the lambda calculus using an evaluation operator must explicitly program leftmost-outermost evaluation, else they will compute infinitely on such examples without producing the normal form. Also, in terms, such as \x.(\y.y z), whose normal forms contain lambdas (in this case, the normal form is \x.z), it is very easy to neglect to evaluate the body of the unreduced lambda binding. Using rules that reduce lambda terms directly puts the onus on the equation interpreter to make sure that these rules are applied
wherever possible.
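For comparison, the substitution that the rules above unfold one step at a time can be written directly in conventional code. The following Python sketch is standard textbook de Bruijn machinery, not the book's program (names are ours): an int is a variable index, ("lam", body) an abstraction, ("ap", f, a) an application. Because normalization is leftmost-outermost, it terminates on terms like (\x.y Ω) whenever a normal form exists.

```python
# Pure de Bruijn terms: int = variable index, ("lam", body), ("ap", f, a).
# shift plays the role that incvars plays in the equations above.

def shift(t, d, cutoff=0):
    # Add d to every variable free at this depth (index >= cutoff).
    if isinstance(t, int):
        return t + d if t >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("ap", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, v, s):
    # Replace variable v in t by s, decrementing the variables above v.
    if isinstance(t, int):
        if t == v:
            return shift(s, v)
        return t - 1 if t > v else t
    if t[0] == "lam":
        return ("lam", subst(t[1], v + 1, s))
    return ("ap", subst(t[1], v, s), subst(t[2], v, s))

def whnf(t):
    # Reduce just enough to expose a lambda (or variable) at the head.
    while isinstance(t, tuple) and t[0] == "ap":
        f = whnf(t[1])
        if isinstance(f, tuple) and f[0] == "lam":
            t = subst(f[1], 0, t[2])
        else:
            return ("ap", f, t[2])
    return t

def normalize(t):
    # Leftmost-outermost normalization, including bodies of lambdas.
    t = whnf(t)
    if isinstance(t, int):
        return t
    if t[0] == "lam":
        return ("lam", normalize(t[1]))
    return ("ap", normalize(t[1]), normalize(t[2]))
```

Applied to the encoding of (\x.y Ω), where Ω = (\x.(x x) \x.(x x)), normalize discards the divergent argument and returns the index of y, and it also normalizes under binders, so \x.(\y.y z) reduces to \x.z.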
9.11. Lucid
Lucid is a programming language designed by Ashcroft and Wadge [AW76,
AW77] to mimic procedural computation with nonprocedural semantics. Early
attempts to construct interpreters [Ca76] and compilers [Ho78] encountered serious difficulties. The following set of equations, adapted from [HO82b], produces a Lucid interpreter directly, using the Standmath syntax of Section 4.1. A trivial Lucid program, itself consisting of equations, is appended to the end of the equations that define Lucid. Evaluation of the expression output() produces the result of running the Lucid program. Even though convenience and performance considerations require the eventual production of a hand-crafted Lucid interpreter, such as the one in [Ca76], the ability to define and experiment with the Lucid language in the
simple and relatively transparent form below would certainly have been helpful in
the early stages of Lucid development.
: Equations for the programming language Lucid, plus a Lucid : program generating a list of integers.
Symbols
: Lucid symbols
NOT: 1; OR: 2; add: 2; equ: 2; if: 3;
first: 1; next: 1; asa: 2; latest: 1; latestinv: 1;
fby: 2;
include integer_numerals, truth_values;
: symbols in the Lucid program
intlist: 0;
output: 0.

For all W, X, Y, Z:
: Definitions of the Lucid operators
NOT(true) = false;
NOT (false) = true;
NOT(fby(W, X)) = fby(NOT(first(W)), NOT(X));
NOT(latest(X)) = latest(NOT(X));
OR(true, X) = true; OR(false, X) = X;
OR(fby(W, X), fby(Y, Z)) = fby(OR(first(W), first(Y)), OR(X, Z));
OR(fby(W, X), latest(Y)) =
    fby(OR(first(W), latest(Y)), OR(X, latest(Y)));
OR(latest(X), fby(Y, Z)) =
    fby(OR(latest(X), first(Y)), OR(latest(X), Z));
OR(fby(W, X), false) = fby(W, X);
OR(latest(X), latest(Y)) = latest(OR(X, Y));
OR(latest(X), false) = latest(X);
if(true, Y, Z) = Y; if(false, Y, Z) = Z;
if(fby(W, X), Y, Z) = fby(if(first(W), Y, Z), if(X, Y, Z));
if(latest(X), Y, Z) = latest(if(X, Y, Z));
first(X) = X
    where X is either in truth_values
    or in integer_numerals end or end where;
first(fby(X, Y)) = first(X); first(latest(X)) = latest(X);
next(X) = X
    where X is either in truth_values
    or in integer_numerals end or end where;
next(fby(X, Y)) = Y; next(latest(X)) = latest(X);
asa(X, Y) = if(first(Y), first(X), asa(next(X), next(Y)));
latestinv(X) = X
    where X is either in truth_values
    or in integer_numerals end or end where;
latestinv(fby(X, Y)) = latestinv(X); latestinv(latest(X)) = X;
add(fby(W, X), fby(Y, Z)) =
    fby(add(first(W), first(Y)), add(X, Z));
add(fby(W, X), latest(Y)) =
    fby(add(first(W), latest(Y)), add(X, latest(Y)));
add(latest(X), fby(Y, Z)) =
    fby(add(latest(X), first(Y)), add(latest(X), Z));
add(fby(W, X), Y) = fby(add(first(W), Y), add(X, Y))
    where Y is in integer_numerals end where;
add(X, fby(Y, Z)) = fby(add(X, first(Y)), add(X, Z))
    where X is in integer_numerals end where;
add(latest(X), latest(Y)) = latest(add(X, Y));
add(latest(X), Y) = latest(add(X, Y))
    where Y is in integer_numerals end where;
add(X, latest(Y)) = latest(add(X, Y)) where X is in integer_numerals end where;
equ(fby(W, X), fby(Y, Z)) = fby(equ(first(W), first(Y)), equ(X, Z));
equ(fby(W, X), latest(Y)) =
    fby(equ(first(W), latest(Y)), equ(X, latest(Y)));
equ(latest(X), fby(Y, Z)) =
    fby(equ(latest(X), first(Y)), equ(latest(X), Z));
equ(fby(W, X), Y) = fby(equ(first(W), Y), equ(X, Y))
    where Y is in integer_numerals end where;
equ(X, fby(Y, Z)) = fby(equ(X, first(Y)), equ(X, Z))
    where X is in integer_numerals end where;
equ(latest(X), latest(Y)) = latest(equ(X, Y));
equ(latest(X), Y) = latest(equ(X, Y))
    where Y is in integer_numerals end where;
equ(X, latest(Y)) = latest(equ(X, Y))
    where X is in integer_numerals end where;
include addint, equint;
A trivial Lucid program:

intlist0 = fby(0, add(intlist0, 1));
output0 = fby(first(intlist0),
    fby(first(next(intlist0)), first(next(next(intlist0)))));
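The stream behavior these equations implement can be sketched operationally. The following Python sketch is ours, not part of the interpreter: streams are modeled as zero-argument generator functions, Lucid's next is renamed nxt to avoid shadowing Python's builtin, and the latest operator and the nested time dimension are omitted.

```python
from itertools import islice

def fby(x, y):
    """Stream whose first value is the first value of x, and whose
    remaining values are all of y (Lucid's fby)."""
    def stream():
        yield next(x())
        yield from y()
    return stream

def first(x):
    """Constant stream repeating the first value of x."""
    def stream():
        v = next(x())
        while True:
            yield v
    return stream

def nxt(x):
    """x with its first value dropped (Lucid's next)."""
    def stream():
        it = x()
        next(it)
        yield from it
    return stream

def add(x, y):
    """Pointwise addition of two streams."""
    def stream():
        for a, b in zip(x(), y()):
            yield a + b
    return stream

def const(n):
    def stream():
        while True:
            yield n
    return stream

# intlist0 = fby(0, add(intlist0, 1)): the stream 0, 1, 2, ...
def intlist0():
    return fby(const(0), add(intlist0, const(1)))()

print(list(islice(intlist0(), 5)))  # -> [0, 1, 2, 3, 4]
```

As in the equational version, the recursion in intlist0 is productive because fby yields its first value before demanding the rest.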
The equational program given above differs from other Lucid interpreters in one significant way, and two superficial ways. First and most significant, the OR operator defined above is not as powerful as the OR operator in Lucid, because it fails to satisfy the equation OR(X, true) = true when X cannot be evaluated to a truth value. The weakening of the OR operator is required by restrictions 3 and 5 of Section 5. These restrictions will be relaxed in later versions of the equation interpreter, allowing a full implementation of the Lucid OR. Second, the variable INPUT, which in Lucid is implicitly defined to be the sequence of values in the input file of the Lucid program when it runs, is not definable until run time, so it cannot be given in the equations above. In order to mimic the input behavior of Lucid, the equation interpreter would have to be used with a syntactic preprocessor to embed given inputs within the term to be evaluated. Interactive input would require an interactive interface to the equation interpreter. Such an interactive interface does not exist in the current version, but is a likely addition in later versions (see Section 15.3). Finally, of the many primitive arithmetic and logical operations of Lucid, only add, equ, OR, and if have been given above. To include other such operations requires duplicating the equations distributing primitive operations over fby and latest. With a large set of primitives, these equations would become unacceptably unwieldy. A truly satisfying equational implementation of Lucid would have to encode primitive operations as nullary symbols, and use an application operator similar to the one in the Curry inner syntax of Section 4.4 in order to give only one set of distributive equations.
10. Errors, Failures, and Diagnostic Aids
In the interest of truth in software advertising, exceptional cases in the equation interpreter are divided into two classes: errors and failures. Errors are definite mistakes on the part of the user resulting from violations of reasonable and conceptually necessary constraints on processing. Failures are the fault of the interpreter itself, and include exhaustion of resources and exceeding of arbitrary limits. Each message on an exceptional case is produced on the UNIX standard error file, begins with the appropriate word "Error" or "Failure", and ends with an identifying message number, intended to help in maintenance. An attempt is made to explain the error or failure so that the user may correct or avoid it. The eventual goal of the project is that the only type of failure occurring in the reduction of a term to normal form will be exhaustion of the total space resources. Currently, the interpreter will fail when presented with individual input symbols that are too long, but it will not fail due to overflow of a value during reduction. There are also some possible failures in the syntactic preprocessing and output pretty-printing steps that result in messages from yacc (the UNIX parser generator) rather than from the equational interpreter system. These failures apparently are all the result of overflow of some allocated space, particularly the yacc parsing stack. Occasionally, running of a large problem, or of too many problems simultaneously, will cause an overflow of some UNIX limits, such as the limit on the number of processes that may run concurrently.
Because of the layered modular design of the interpreter, different sorts of errors may be reported at different levels of processing, and, regrettably, in slightly different forms. For the preprocessor (ep), the important levels are 1) context-free syntactic analysis, 2) context-sensitive syntactic analysis, and 3) semantic processing. For the interpreter (ei), only levels 1 and 3 are relevant. Sections 10.1 through 10.3 describe the sorts of messages produced at each of these levels.
10.1. Context-Free Syntactic Errors and Failures
Context-free syntactic errors in preprocessor input may involve the general syntax of definitions, described in Section 3, or one of the specific syntaxes for terms described in Section 4. Context-free errors in interpreter input may only involve a specific term syntax. Error messages relating to a specific term syntax always include the name of the syntax being used. Error detection is based on the parsing strategy used by yacc [Jo78]. Each error message includes a statement of the syntactic restriction most likely to cause that sort of parsing failure. The parser makes no attempt to recover from an error, so only the first syntactic error is likely to be reported. It is possible that an error in a term will be detected as an error in the general syntax of definitions, and vice versa. Error messages are particularly opaque when the wrong syntactic preprocessor was loaded by the last invocation of loadsyntax, so the user should always pay attention to the name of the syntax in use. Yacc failures are possible in the syntactic preprocessing, either from parser stack overflow, or from an individual symbol being too long.
10.2. Context-Sensitive Syntactic Errors and Failures
Context-sensitive errors are only relevant to preprocessor input. They all involve inconsistent use of symbols. The five types of misuse are: 1) repeated declaration of the same symbol; 2) use of a declared symbol with the wrong arity; 3) attempt to include a class of symbols or equations that does not exist; 4) repetition of a variable symbol on the left-hand side of an equation; 5) appearance of a variable on the right-hand side of an equation that does not appear on the left.
Context-sensitive syntactic preprocessing may fail due to exhaustion of space resources, or to an individual symbol being too long for the current version. The second sort of failure will be avoided in later versions. In order to produce a lexicon presenting all of the symbols used in an equational program, see Section 10.4 below.
10.3. Semantic Errors and Failures
The only semantic failure in the interpreter is exhaustion of total space resources. Other semantic errors and failures are only relevant to preprocessor input. The simplest such error is use of a symbol from one of the classes integer_numerals, atomic_symbols, truth_values, or characters, without a declaration of that class. In future versions, these errors will be classified as context-sensitive syntactic errors. The more interesting errors are violations of restrictions 3, 4, and 5 from Section 5. Violations of these restrictions always involve nontrivial overlayings of parts of left-hand sides of equations. In addition to describing which restriction was violated, and naming the violating equations, the preprocessor tries to report the location of the overlap by naming the critical symbol involved. This is probably the weakest part of the error reporting, and future versions will try to provide more graphic reports for semantic errors. Notice that restriction 5 (left-sequentiality) will be removed in later versions. To specify the offending equations, the preprocessor numbers all equations, including predefined classes (counting 1 for each class), and reports equations by number. In order to be sure of the numbering used by the preprocessor, and in order to get a more graphic view of the terms in the tree representation used by the preprocessor, the user should see Section 10.5 below.
10.4. Producing a Lexicon to Detect Inappropriate Uses of Symbols (el)

After executing

ep Equnsdir

the user may produce a lexicon listing in separate categories 1) all declared literal symbols, 2) all declared literal symbols not appearing in equations, 3) all atomic symbols appearing in equations, 4) all characters appearing in equations, 5) all truth values appearing in equations. Empty categories are omitted, and symbols within a category are given in alphabetical order. A lexicon is produced on the standard output by typing

el Equnsdir

el stands for equational lexicon. The lexicon is intended to be used to discover accidental misspellings and omissions that may cause a symbol to belong to a category other than the one intended. Each lexicon is headed by the date and time of the last invocation of ep. Changes to definitions after the given date and time will not be reflected in the lexicon.
10.5. Producing a Graphic Display of Equations In Tree Form (es)
In order to understand the semantic errors described in Section 10.3, it is useful to see a set of equations in the same form that the preprocessor sees. Not only is this internal form tree-structured, rather than linear, but there may be literal symbols appearing in the internal form that are only implicit in the given definitions, such
as the symbol cons, which appears implicitly in the LISP.M expression (a b c).
The user may also use the tree-structured form of the terms in his equations to verify that the matching of parentheses and brackets in his definitions agrees with his original intent. To generate a tree-structured display of equations on the standard
output, type
es Equnsdir
es stands for equation show. Unfortunately, the more mnemonic abbreviations are already used for other commands. es may only be used after running ep on the same directory. The output from es lists the equations in the order given by the user, with the sequential numbers used in error and failure reports from the preprocessor. Each term in an equation is displayed by listing the symbols in the term in preorder, and using indentation to indicate the tree structure. Variables on the left-hand sides of equations are replaced by descriptions of their ranges, in pointed brackets (<>), and variables on the right-hand sides are replaced by the addresses of the corresponding variables on the left-hand sides. Representations of predefined classes of equations are displayed, as well as equations given explicitly by the user. For example, the following definitions
Symbols
f: 2; g: 2; h: 1;
include atomic_symbols. For all x, y, z:
f(g(x, y), a) = h(y) where x is in atomic_symbols end where;
include equatom.
produce the listing:
10.5. Producing a Graphic Display 73
Listing of equational definitions processed on Apr 19 at 15:43
1: f
     g
       <atomic_symbol>
       <anything>
     a
   h
     variable 1 2

2: equ
     <atomic_symbol>
     <atomic_symbol>
   equ
     variable 1
     variable 2
Notice that, on the left-hand side of equation 1, the variable x is replaced by <atomic_symbol>, and the variable y is replaced by <anything>, representing the fact that any term may be substituted for y. On the right-hand side, y is
replaced by
variable 1 2
indicating that the corresponding y on the left-hand side is the 2nd son of the 1st son of the root of the term. The date and time at the top refer to the time of invocation of ep. The user should check that this time agrees with his memory.
Changes to definitions after the given date and time are not reflected in the display.
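The indented preorder display described above can be mimicked in a few lines. This hypothetical Python sketch (the term representation and the function name are invented here, not taken from the interpreter) produces the same kind of listing for the left-hand side of equation 1:

```python
# A term is a (symbol, [children]) pair; show() lists symbols in preorder,
# using two spaces of indentation per tree level, as es does.

def show(term, depth=0, out=None):
    lines = out if out is not None else []
    sym, children = term
    lines.append("  " * depth + sym)
    for child in children:
        show(child, depth + 1, lines)
    return lines

# Left-hand side of equation 1 above: f(g(<atomic_symbol>, <anything>), a)
lhs = ("f", [("g", [("<atomic_symbol>", []), ("<anything>", [])]), ("a", [])])
print("\n".join(show(lhs)))
```

The printed result is the indented column of symbols shown in the listing above: f, then g and its two range descriptions, then a.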
10.6. Trace Output (et)

A primitive form of trace output is available, which displays for each reduction step the starting term, the redex, the number of the equational rule applied, and the reductum. In order to produce trace output, invoke the equation interpreter with the option t, as

ei Equnsdir t <input
where Equnsdir is the directory containing the equational definitions, and input is a file containing the term to be reduced. Since ei uses purely positional notation for its parameters, Equnsdir may not be omitted. The invocation of ei above produces a file Equnsdir/trace.inter containing a complete trace of the reduction of the input term to normal form. To view the trace output on the screen, type

et Equnsdir

(Equnsdir defaults to .). et stands for equational trace. The trace listing is headed by the date and time of the invocation of ei resulting in that trace. The user should check that the given time agrees with his memory.
10.7. Miscellaneous Restrictions
Literal symbols are limited to arities no greater than 10, and all symbols are limited to lengths no greater than 20 in the current version.
11. History of the Equation Interpreter Project
The theoretical foundations of the project come from the dissertation "Reduction Strategies in Subtree Replacement Systems," presented by Michael O'Donnell at Cornell University in 1976. The same material is available in the monograph Computing in Systems Described by Equations [O'D77]. There, the fundamental restrictions 1-4 on the left-hand sides of equations in Section 5 were presented, and shown to be sufficient for guaranteeing uniqueness of normal forms. In addition, outermost reduction strategies were shown to terminate whenever possible, and conditions were given for the sufficiency of leftmost-outermost reductions. A proof of optimality for a class of reduction strategies was claimed there, but shown incorrect by Berry and Lévy [BL79, O'D79]. Huet and Lévy later gave a correct treatment of essentially the same optimality issue [HL79].
In the theoretical monograph cited above, O'Donnell asserted that "a good programmer should be able to design efficient implementations of the abstract computations" described and studied in the monograph. In 1978, Christoph Hoffmann and O'Donnell decided to demonstrate that such an implementation is feasible and valuable. The original intent was to use the equations for formal specifications of interpreters for nonprocedural programming languages. For example, the equations that McCarthy gave to define LISP [McC60] could be given, and the equation processor should automatically produce a LISP interpreter exactly faithful to those specifications. Preliminary experience indicated that such applications were severely handicapped in performance. On the other hand, when essentially the same computation was defined directly by a set of equations, the equation interpreter was reasonably competitive with conventional LISP. So, the emphasis of the project changed from interpreter generation to programming directly with equations.
From early experience, the project goal became the production of a usable interpreter of equations with very strict adherence to the semantics given in Section 1, and performance reasonably competitive with conventional LISP interpreters. The specification of such an interpreter was given in [HO82b], and the key implementation problems were discussed there. Since the natural way of defining a single function might involve a large number of equations, the second goal requires that the interpreter have little or no runtime penalty for the number of equations given. Thus, sequential checking for applicability of the first equation, then the second, etc. was ruled out, and pattern matching in trees was identified as the key algorithmic problem for the project. The overhead of pattern matching appears to be the aspect of the interpreter that must compete with the rather slight overhead of maintaining the recursion stack in LISP. Some promising algorithms for tree pattern matching were developed in [HO82a].
In 1979 Giovanni Sacco, a graduate student, produced the first experimentally usable version of the interpreter in CDC Pascal, and introduced some table-compression techniques which, without affecting the theoretical worst case for pattern matching, improved performance substantially on example problems. Hoffmann ported Sacco's implementation to the Siemens computer at the University of Kiel, Germany in 1980. Hoffmann and O'Donnell used Sacco's implementation for informal experiments with LISP, the Combinator Calculus, and the Lambda Calculus. These experiments led to the decision to emphasize programming with equations over interpreter generation. These experiments also demonstrated the inadequacy of any single notation for all problems, and motivated the library of syntaxes provided by the current version. Another graduate student,
Paul Golick, transferred the implementation to UNIX on the VAX, and rewrote the run-time portion of the interpreter (ei in the current version) in 1980. During 1982 and Spring of 1983, O'Donnell took over the implementation effort and produced the current version of the system. The final year of work involved informal experiments with three different pattern matching techniques, and reconfiguration of the implementation to allow easy substitution of different concrete syntaxes.
Experience with the interpreter comes from the interpreter implementation itself, from two projects done in the advanced compiler course at Purdue University, and from a course in Logic Programming at the Johns Hopkins University. O'Donnell used the equation interpreter to define the non-context-free syntactic analysis for itself, gaining useful informal experience in the applicability of the interpreter to syntactic problems. In 1982, Hoffmann supervised a class project which installed another experimental pattern-matching algorithm in the interpreter, and used the equation interpreter to define a Pascal interpreter. In 1983, Hoffmann supervised another class project using the equation interpreter to define type checking in a Pascal compiler. These two projects generated more information on the suitability of various pattern-matching algorithms, and on the applicability of equations to programming language problems. In 1983, O'Donnell assigned students in a Logic Programming course to a number of smaller projects in equational programming. One of these projects found the first natural example of a theoretical combinatorial explosion in one of the pattern-matching algorithms.
12. Low-Level Programming Techniques
Compared to the syntactic restrictions of conventional languages like PASCAL, the restrictions on equations described in Section 5 are a bit subtle. We believe that the additional study needed to understand the restrictions is justified for several reasons. First, the restrictions are similar in flavor to those imposed by deterministic parsing strategies such as LR and LALR, and perhaps even a bit simpler. The trouble taken to satisfy the restrictions is rewarded by the guarantee that the resulting program produces the same result, independently of the order of evaluation. This reward should become very significant on parallel hardware of the future, where the trouble of insuring order-independence in a procedural program may be immense. Finally, there are disciplined styles of programming with equations that can avoid errors, and techniques for correcting the errors when they occur. A few such techniques are given in this section; we anticipate that a sizable collection will result from a few years of experience.
12.1. A Disciplined Programming Style Based on Constructor Functions
In many applications of equational programming, the function symbols may be partitioned into two classes:
1. constructor symbols, used to build up static data objects, and
2. defined symbols, used to perform computations on the data objects.
For example, in LISP M-expressions, the atomic symbols, nil, and the binary symbol cons are constructors, and all metafunction symbols are defined symbols. Technically, a constructor is a symbol that never appears as the outermost symbol on the left-hand side of an equation, and a defined symbol is one that does appear as the outermost symbol on a left-hand side. The constructor discipline consists of never allowing a defined symbol to appear on a left-hand side, except as the outermost symbol. An equational program that respects the constructor discipline clearly satisfies the nonoverlapping restriction 4 of Section 5.
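The discipline as defined above is easy to check mechanically. A minimal Python sketch, assuming a term is a (symbol, children) pair and a program is a list of (lhs, rhs) pairs (a representation chosen here for illustration, not the interpreter's):

```python
# Defined symbols are those heading some left-hand side; the constructor
# discipline forbids them anywhere below the root of a left-hand side.

def defined_symbols(program):
    return {lhs[0] for lhs, _ in program}

def respects_constructor_discipline(program):
    defined = defined_symbols(program)
    def inner_ok(term):
        # no defined symbol may appear below the root of a left-hand side
        sym, children = term
        return sym not in defined and all(inner_ok(c) for c in children)
    return all(all(inner_ok(c) for c in lhs[1]) for lhs, _ in program)

# Example 12.1.1, first program: f(g(a, x)) = g(a, f(x)); g(b, x) = a.
var = ("<x>", [])
bad = [
    (("f", [("g", [("a", []), var])]), ("g", [("a", []), ("f", [var])])),
    (("g", [("b", []), var]), ("a", [])),
]
print(respects_constructor_discipline(bad))  # -> False: g occurs inside a lhs
```

Note that, as the text goes on to observe, this check catches only violations of restriction 4; two left-hand sides could still match the same term.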
Example 12.1.1
The following set of equations, in standard mathematical notation, does not respect the constructor discipline, although it does not contain an overlap.
Symbols
f: 1; g: 2;
include atomic_symbols. For all x:
f(g(a, x)) = g(a, f(x));
g(b, x) = a.
The symbol g is a defined symbol, because it appears outermost on the left-hand side of the second equation, but g also appears in a nonoutermost position on the left-hand side of the first equation. On the other hand, the following set of equations accomplishes the same result, but respects the constructor discipline. For all x:
f(h(x)) = g(a, f(x));
g(a, x) = h(x);
g(b, x) = a.
Here, h, a, and b are constructors, f and g are defined symbols. Neither f nor g appears on a left-hand side except as the outermost symbol. The occurrences of f
and g on the right-hand sides are irrelevant to the constructor discipline.
□
The constructor discipline avoids violations of the nonoverlapping restriction 4, but it does not prevent violations of restriction 3, which prohibits two different left-hand sides from matching the same term. For example, f(a, x) = a and f(x, a) = a violate restriction 3, although the defined symbol f does not appear on a left-hand side except in the outermost position.
When the constructor discipline is applied, the appearance of a defined symbol in a normal form is usually taken to indicate an error, either in the equations or in the input term. Many other research projects, particularly in the area of abstract data types, require the constructor discipline, and sometimes require that defined symbols do not appear in normal forms [GH78]. The latter requirement is often called sufficient completeness.
It is possible to translate every equational program satisfying restrictions 1-4
(i.e., the regular term reduction systems) into an equational program that respects the constructor discipline. The idea, described by Satish Thatte in [Th85], is to create two versions, f and f', of each defined symbol that appears in a nonoutermost position on a left-hand side. f remains a defined symbol, while f' becomes a constructor. Every offending occurrence of f (i.e., nonoutermost on a left-hand side) is replaced by f'. In addition, equations are added to transform every f that heads a subterm not matching a left-hand side into f'.

Example 12.1.2

Applying the procedure described above to the first equational program in Example 12.1.1 yields the following program. For all x:

f(g'(a, x)) = g(a, f(x));
g(a, x) = g'(a, x);
g(b, x) = a.

□

In the worst case, this procedure could increase the size of the program quadratically, although worst cases do not seem to arise naturally. At any rate, the constructor discipline should probably be enforced by the programmer as he programs, rather than added on to a given program. In Section 12.4 we show how to use a similar procedure to eliminate overlaps.
The constructor discipline is rather sensitive to the syntactic form that is actually used by the equation interpreter.

Example 12.1.3

Consider the following program, given in Lambda notation.
Symbols AP: 2;
include atomic_symbols.
For all x, y: (REFLECT (CONS x y)) = (CONS (REFLECT x) (REFLECT y));
(REFLECT x) = x where x is in atomic_symbols end where. Recall that this is equivalent to the standard mathematical notation: For all x, y:
AP(REFLECT, AP(AP(CONS, x), y)) = AP(AP(CONS, AP(REFLECT, x)), AP(REFLECT, y));
AP(REFLECT, x) = x where x is in atomic_symbols end where.
This program does not respect the constructor discipline, as the defined symbol AP appears twice in nonoutermost positions in the left-hand side of the first equation. As long as no inputs will contain the symbols REFLECT or CONS except applied (using AP) to precisely one or two arguments, respectively, the same results may be obtained by the following un-Curried program in standard mathematical notation.
For all x, y:
REFLECT(CONS (x, y)) = CONS(REFLECT(x), REFLECT(y));
REFLECT(x) = x where x is in atomic_symbols end where.
The last program respects the constructor discipline.
□
Example 12.1.4
Weak reduction in the combinator calculus [Sc24, St72] may be programmed in Lambda notation as follows.
Symbols
AP: 2; include atomic_symbols.
For all x, y, z:
(S x y z) = (x z (y z)); (K x y) = x.
As in the first program of Example 12.1.3, the constructor discipline does not hold, because of the implicit occurrences of the defined symbol AP in nonoutermost positions of the first left-hand side. The left-hand sides may be un-Curried to
S(x, y, z)
K(x, y)
The latter program respects the constructor discipline, with S and K being defined symbols, and no constructors mentioned in the left-hand sides. The right-hand sides cannot be meaningfully un-Curried without extending the notation to allow variables standing for functions.

□
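Weak reduction for S and K can be sketched in the Curried representation the example starts from. The following Python is illustrative only (the term representation and function names are ours, and leftmost-outermost rewriting stands in for the interpreter's actual pattern-matching machinery); it mirrors (S x y z) = (x z (y z)) and (K x y) = x:

```python
# Terms: strings for atoms, or ('AP', operator, operand) pairs.

def ap(f, a):
    return ('AP', f, a)

def spine(t):
    """Return (head, [arguments]) of a left-nested application."""
    args = []
    while isinstance(t, tuple):
        args.insert(0, t[2])
        t = t[1]
    return t, args

def step(t):
    """One outermost reduction step, or None if none applies."""
    if not isinstance(t, tuple):
        return None
    head, args = spine(t)
    if head == 'S' and len(args) >= 3:        # (S x y z) = (x z (y z))
        new = ap(ap(args[0], args[2]), ap(args[1], args[2]))
        for extra in args[3:]:
            new = ap(new, extra)
        return new
    if head == 'K' and len(args) >= 2:        # (K x y) = x
        new = args[0]
        for extra in args[2:]:
            new = ap(new, extra)
        return new
    # otherwise try to reduce inside the operator, then the operand
    s = step(t[1])
    if s is not None:
        return ('AP', s, t[2])
    s = step(t[2])
    if s is not None:
        return ('AP', t[1], s)
    return None

def normalize(t, limit=100):
    for _ in range(limit):
        s = step(t)
        if s is None:
            return t
        t = s
    return t

# (S K K a) reduces to a, since S K K behaves as the identity combinator.
print(normalize(ap(ap(ap('S', 'K'), 'K'), 'a')))  # -> a
```

The limit parameter guards against the nontermination that is always possible in the combinator calculus.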
One is tempted to take a symbol in Curried notation as a defined symbol when it appears leftmost on a left-hand side of an equation. Unfortunately, this natural attempt to extend the constructor discipline systematically to Curried notation fails to guarantee the nonoverlapping property.

Example 12.1.5

In the following program, given in Lambda notation, the symbol P appears only leftmost in left-hand sides of equations.
Symbols AP: 2;
include atomic_symbols. For all x, y, z:
(P x y) = Q;
(P x y z) = R.
The two left-hand sides overlap, however, and (P Q Q Q) has the two different normal forms (Q Q) and R.
□
Informally, the overlap violation above appears to translate to a violation of restriction 3 in an un-Curried notation. Formalization of this observation would require a treatment of function symbols with varying arities. The appropriate formalization for this case is not hard to construct, but other useful syntactic transformations besides Currying may arise, and might require totally different formalisms to relate them to the constructor discipline.
Because of the sensitivity of the constructor discipline to syntactic assumptions, and because the enforcement of this discipline may lead to longer and less clear equational programs, the equation interpreter does not enforce such a discipline. Whenever a particular problem lends itself to a solution respecting the constructor discipline, we recommend that the programmer enforce it on himself, and document the distinction between constructors and defined symbols. So far, most examples of equational programs that have been run on the interpreter have respected the constructor discipline, and the examples of nonoverlapping equations not based on constructors have been few, and often hard to construct. So, experience to date fails to give strong support for the utility of the greater generality of nonoverlapping equations. We expect that future versions of the interpreter will enforce even weaker restrictions, based on the Knuth-Bendix closure algorithm [KB70], and that substantial examples of programs requiring this extra generality will arise. Further research is required to adapt the Knuth-Bendix procedure, which was designed for reduction systems in which every term has a normal form, to nonterminating systems.
12.2. Simulation of LISP Conditionals
The effort expended in designing and implementing the equation interpreter would be wasted if the result were merely a syntactic variant of LISP. For the many problems, and portions of problems, however, for which LISP-style programming is
appropriate, a programmer may benefit from learning how to apply an analogous style to equational programming. The paradigm of LISP programming is the recursive definition of a function, based on a conditional expression. The general form, presented in informal notation, looks like
f[x] = if P1[x] then E1
       else if P2[x] then E2
       ...
       else if Pn[x] then En
       else En+1
where P1[x] is usually "x is nil", or occasionally "x is atomic", and P2, ..., Pn require more and more structure for x. In order to program the same computation for the equation interpreter, each line of the conditional is expressed as a separate equation. The conditions P1, ..., Pn are expressed implicitly in the structure of the arguments to f on the left-hand sides of the equations, and occasionally in syntactic restrictions on the variables. Since there is no order for the equations, the effect of the order of conditional clauses must be produced by letting each condition include the negation of all previous conditions. As long as the conditions deal only with the structure of the argument, rather than computed qualities of its value, this translation will produce a more readable form than LISP syntax, and the incorporation of negations of previous conditions will not require expansion of the size of the program. The else clause must be translated into an equation that applies precisely in those cases where no other condition holds. Expressing this condition explicitly involves some extra trouble for the programmer, but has the benefit of clarifying the case analysis, and illuminating omissions that might be more easily overlooked in the conditional form. If the programmer accidentally provides two equations that could apply in the same case, the interpreter detects a violation of restriction 3. If he neglects to cover some case, the first time that a program execution encounters such a case, the offending application of f will
appear unreduced in the output, displaying the omission very clearly.
Example 12.2.1

Consider the following informal definition of a function that flattens a binary tree into a long right branch, with the same atomic symbols hanging off in the same order.
flat[x] = if x is atomic then x
          else if car[x] is atomic then cons[car[x]; flat[cdr[x]]]
          else flat[cons[car[car[x]]; cons[cdr[car[x]]; cdr[x]]]]
The actual LISP program, using the usual abbreviations for compositions of car
and cdr, follows.
(DEF '(FLAT (LAMBDA (X)
    (COND ((ATOM X) X)
          ((ATOM (CAR X)) (CONS (CAR X) (FLAT (CDR X))))
          (T (FLAT (CONS (CAAR X) (CONS (CDAR X) (CDR X)))))))))
The same computation is described by the following equational program, using LISP.M notation.

Symbols
flat: 1;
cons: 2;
nil: 0;
include atomic_symbols.
For all x:
flat[x] = x where x is in atomic_symbols end where;
flat[(x . y)] = (x . flat[y]) where x is in atomic_symbols end where;
flat[((x . y) . z)] = flat[(x . (y . z))].
□
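For comparison, the same flattening computation can be rendered directly in Python. This is a sketch with a tree representation chosen here (strings for atoms, ('cons', left, right) tuples for pairs), not code from the book:

```python
def flat(x):
    """Flatten a binary tree into a right-leaning list of the same atoms,
    following the three cases of the equational program above."""
    if not isinstance(x, tuple):                 # x is atomic
        return x
    car, cdr = x[1], x[2]
    if not isinstance(car, tuple):               # car[x] is atomic
        return ('cons', car, flat(cdr))
    # rotate: ((a . b) . c)  ->  (a . (b . c)), then retry
    return flat(('cons', car[1], ('cons', car[2], cdr)))

t = ('cons', ('cons', 'a', 'b'), 'c')
print(flat(t))  # -> ('cons', 'a', ('cons', 'b', 'c'))
```

Each Python branch corresponds to one equation; the structural conditions of the left-hand sides become the isinstance tests.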
When conditions in a LISP program refer to predicates that must actually be computed from the arguments to a function, rather than to the structure of the arguments, the programmer must use a corresponding conditional function in the equational program. The translation from LISP syntax for a conditional function to the LISP.M notation for the equation interpreter is utterly trivial.
12.3. Two Approaches to Errors and Exceptional Conditions
The equation interpreter has no built-in concept of a run-time error. There are failures at run-time, due to insufficient space resources, but no exceptional condition caused by application of a function to inappropriate arguments is detected. We designed the interpreter this way, not because we believe that run-time error detection is undesirable, but rather because it is completely separated from the other fundamental implementation issues. We decided to provide an environment in which different ways of handling errors and exceptions may be tried, rather than committing to a particular one. If errors are only reported to the outside world, then the reporting mechanism is properly one for a syntactic postprocessor. Certain normal forms, such as car[Q] in LISP, are perfectly acceptable to the equation interpreter, but might be reported as errors when detected by a postprocessor. Total support of this approach to errors may require a mechanism for halting evaluation before a normal form is reached, but that mechanism will be provided in future versions as a general augmentation of the
interface (see Section 15.3), allowing the interpreter to act in parallel with other
components. No specific effort should be required for error detection.
If a programmer wishes to detect and react to errors and exceptional conditions within an equational program, two basic strategies suggest themselves. In the first strategy, exception detection is provided by special functions that inspect a structure to determine that it may be manipulated in a certain way, before that manipulation is attempted. In the other strategy, special symbols are defined to represent erroneous conditions. Reaction to exceptions is programmed by the way these special symbols propagate through an evaluation. We chose to provide a setting in which many strategies can be tested, rather than preferring one.
Example 12.3.1
Consider a table of name-value pairs, implemented as a list of ordered pairs. The table is intended to represent a function from some name space to values, but occasionally certain names may be accidentally omitted, or entered more than once. In a program for a function to look up the value associated with a given name, it may be necessary to check that the name occurs precisely once. The following programs all use LISP.M notation. The first applies the strategy of providing checking functions. Efficiency has been ignored in the interest of clarifying the fundamental issue.
Symbols
cons: 2; nil: 0; occurs: 2; legmap: 2; lookup: 2; add: 2; equ: 2;
if: 3;
include atomic_symbols, integer_numerals, truth_values.
For all m, n, l, v:
occurs[m; ()] = 0;
occurs[m; ((n . v) . l)] = if[equ[m; n]; add[occurs[m; l]; 1]; occurs[m; l]];
legmap[m; l] = equ[occurs[m; l]; 1];
lookup[m; ((n . v) . l)] = if[equ[m; n]; v; lookup[m; l]];
if[true; m; n] = m; if[false; m; n] = n;
include equint, addint, equatom.
When lookup[m; l] is used in a larger program, the programmer will have to test legmap[m; l] first, if there is any chance that m is not associated uniquely in l. Essentially the same facility is provided by the following program, which applies the strategy of producing special symbols to represent errors.
Symbols
cons: 2;
nil: 0;
lookup: 2;
undefined, overdefined: 0;
if: 3; equ: 2;
include atomic_symbols, truth_values.
For all m, n, l, v:
lookup[m; ()] = undefined[];
lookup[m; ((n . v) . l)] = if[equ[m; n]; if[equ[lookup[m; l]; undefined[]]; v; overdefined[]]; lookup[m; l]];
if[true; m; n] = m; if[false; m; n] = n;
include equatom.
□
Either strategy may be adapted to produce more or less information about the precise form of an error or exceptional occurrence. There appears to be no technical reason to prefer one to the other. The choice must depend on the programmer's
taste. It is probably foolish to mix the two strategies within one program.
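For comparison, the two strategies can be paraphrased in Python (a sketch only; the function names and the list-of-pairs encoding of the table are assumptions, not part of the interpreter):

```python
# Strategy 1: a checking function, tested before lookup is attempted.
def legmap(m, table):
    """True when m is associated with exactly one value in table."""
    return sum(1 for n, _ in table if n == m) == 1

def lookup(m, table):
    """Return the first value paired with m; meaningless unless legmap holds."""
    for n, v in table:
        if n == m:
            return v

# Strategy 2: special tokens representing errors propagate through the result.
UNDEFINED, OVERDEFINED = 'undefined', 'overdefined'

def lookup2(m, table):
    if not table:
        return UNDEFINED
    (n, v), rest = table[0], table[1:]
    if n == m:
        return v if lookup2(m, rest) == UNDEFINED else OVERDEFINED
    return lookup2(m, rest)
```

With the table [('a', 1), ('b', 2), ('a', 3)], legmap reports that only 'b' is mapped legitimately, while lookup2 returns the token 'overdefined' for 'a' and 'undefined' for a name that is absent.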
12.4. Repairing Overlaps and Nonsequential Constructs
When a set of logically correct equations is rejected by the equation interpreter because of overlapping left-hand sides, there are two general techniques that may succeed in removing the overlap, without starting from scratch. The first, and simplest, is to generalize the equation whose left-hand side applies outermost in the offending case, so that the overlap no longer involves explicitly given symbols, but only an instance of a variable.
Example 12.4.1
Consider lists of elements in LISP.M notation, where a special element missing, different from nil, is to be ignored whenever it appears as a member of a list. Such a missing element would allow recursive deletion of elements from a list in an especially simple way. The equation defining the behavior of missing may easily overlap with other equations.
Symbols
cons: 2; nil: 0;
missing: 0;
: mirror[l] is l concatenated with its own reversal.
mirror: 1;
include atomic_symbols.
For all l, l1, x:
(missing[] . l) = l;
mirror[l] = append[l; reverse[l]]
where l is either () or (x . l1)
end or end where.
The two equations in the fragment above overlap in the term mirror[(missing[] . l)]
This overlap may be avoided by deleting the where clause on the second equation, and allowing the mirror operation to apply to elements other than lists, perhaps with meaningless results. Of course, the first equation will certainly overlap with other equations defining, for example, reverse and append, and these overlaps will require other avoidance techniques. □
The technique of generalization, shown above, only works when unnecessarily restrictive equations have led to overlap. More often, overlaps may be removed by restricting the equation whose left-hand side appears innermost in the offending case, either by giving more structure to the left-hand side, or by adding extra symbols to differentiate different instances of the same operation. The second method seems to be unavoidable in some cases, but it regrettably decreases the readability and generality of the program.
Example 12.4.2
Consider an unusual setting for list manipulation, based on an associative concatenation operator cat, instead of the constructor cons of LISP. An atomic symbol a is identified with the singleton list containing only a. The following program, given
in standard mathematical notation, enforces the associativity of cat by always associating to the right, and defines reversal as well.
Symbols
cat: 2; reverse: 1;
include atomic_symbols.
For all x, y, z:
cat(cat(x, y), z) = cat(x, cat(y, z)) where x is in atomic_symbols end where;
reverse(cat(x, y)) = cat(reverse(y), reverse(x));
reverse(x) = x where x is in atomic_symbols end where.
The restriction of x in the first equation to be an atomic symbol prevents a self-overlap of the form cat(cat(cat(A,B),C),D), but there is still an overlap between the first and second equations in the form reverse(cat(cat(A,B),C)). The same effect may be achieved, without overlap, by restricting the variable x to atomic symbols in the second equation as well. This correction achieves the right output, but incurs a quadratic cost for reverse, because of the reassociation of cats implied by it. See Section 16.1 for a more thorough development of this novel approach to list manipulation, achieving the linear reversal cost that was probably intended by
the program above.
□
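The normal forms the corrected system computes (cat right-associated, reverse pushed to the leaves) can be sketched in Python. The tuple encoding and the function norm are assumptions for illustration, and the sketch normalizes unconditionally rather than enforcing the where clauses; note how reversal repeatedly reassociates cats, which is the source of the quadratic cost discussed above:

```python
# Strings model atomic symbols; ('cat', x, y) models concatenation.
def is_atom(t):
    return isinstance(t, str)

def norm(t):
    """Normalize a term: right-associate cat and push reverse to the leaves."""
    if is_atom(t):
        return t
    op, *args = t
    if op == 'reverse':
        x = args[0]
        if is_atom(x):
            return x                                  # reverse(a) = a, a atomic
        _, y, z = x                                   # x = cat(y, z)
        return norm(('cat', ('reverse', z), ('reverse', y)))
    x, y = map(norm, args)                            # op == 'cat'
    if not is_atom(x):                                # cat(cat(a,b),c) = cat(a,cat(b,c))
        _, a, b = x
        return norm(('cat', a, ('cat', b, y)))
    return ('cat', x, y)
```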
Example 12.4.3
In order to approach an implementation of the lambda calculus, one might take the Curried notation in which binary application is the only operation, and add a syntactic substitution operator. The following program defines substitution, and for
the sake of simplicity only the identity function, using standard mathematical notation. subst(x, y, z) is intended to denote the result of substituting x for each occurrence of y in z.
Symbols
AP: 2; I: 0; subst: 3; if: 3; equ: 2;
include atomic_symbols, truth_values.
For all w, x, y, z:
AP(I, x) = x;
subst(w, x, y) = if(equ(x, y), w, y) where y is in atomic_symbols end where;
subst(w, x, AP(y, z)) = AP(subst(w, x, y), subst(w, x, z));
if(true, x, y) = x; if(false, x, y) = y;
include equatom.
The first and third equations overlap in the form subst(A,B,AP(I,C)). In order to avoid this overlap, change nonoutermost occurrences of the symbol AP on left-hand sides (there is only one in this example, in the third equation) into a new symbol, IAP (Inert APplication). Two additional equations are required to convert AP to IAP when appropriate. For all w, x, y, z:
AP(I, x) = x;
subst(w, x, y) = if(equ(x, y), w, y) where y is in atomic_symbols end where;
subst(w, x, IAP(y, z)) = AP(subst(w, x, y), subst(w, x, z));
AP(x, y) = IAP(x, y) where x is in atomic_symbols end where;
AP(IAP(x, y), z) = IAP(IAP(x, y), z);
if(true, x, y) = x; if(false, x, y) = y;
include equatom.
Notice that IAP is never used on the right-hand side of the subst equations. Essentially, the use of IAP enforces innermost evaluation in those cases where left-hand sides used to overlap.
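The effect of the inert-application device can be modeled in Python (an illustrative encoding, not the interpreter itself): applications become IAP nodes once their function part is in normal form, so subst only ever traverses inert nodes, and an AP(I, x) redex is reduced before subst can descend into it.

```python
# Strings model atomic symbols; ('IAP', f, a) models an inert application.
def ap(f, a):
    """Model AP: reduce AP(I, x) = x; otherwise the application becomes inert."""
    if f == 'I':
        return a
    return ('IAP', f, a)

def subst(w, x, t):
    """Substitute w for each occurrence of the atomic symbol x in t."""
    if isinstance(t, str):                 # atomic symbol
        return w if t == x else t
    _, f, a = t                            # an inert application
    return ap(subst(w, x, f), subst(w, x, a))
```

On the formerly overlapping term subst(A, B, AP(I, C)), the AP(I, C) redex is forced first, so the two rules never compete.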
The technique of adding symbols to avoid overlap, illustrated in Example 12.4.3, is essentially the same idea as that used by Thatte [Th85] to translate nonoverlapping equations into the constructor discipline (see Section 12.1). Example 16.3.3 shows how the same technique was used independently by Hoffmann, who thought of overlap as a potential inconsistency in a concurrent program, and used locks to
avoid the inconsistency.
When a logically correct set of equations is rejected by the equation interpreter because of a failure of left-sequentiality, the first thing to try is reordering the arguments to offending functions. If the program is part of a polished product, the reordering may be accomplished in syntactic pre- and postprocessors. If the trouble of modifying the syntactic processors is too great, the user, regrettably,
must get accustomed to seeing the arguments in the new order.
Example 12.4.4
Consider a program generalizing the LISP functions car and cdr to allow an arbitrarily long path to a selected subtree. select[t; p] is intended to select the subtree of t reached by the path p from the root of t. Paths are presented as lists of
atomic symbols L and R, representing left and right branches, respectively.
Symbols
cons: 2; nil: 0;
select: 2;
include atomic_symbols.
For all x, y, p:
select[x; ()] = x;
select[(x . y); (L . p)] = select[x; p];
select[(x . y); (R . p)] = select[y; p].
Left-sequentiality fails for the program above because, after seeing the symbol select, there is no way to decide whether to inspect the first argument, seeking a cons, or the second argument, seeking (). Only after seeing the second argument, and determining whether or not it is (), can the interpreter know whether or not the first argument is relevant. Left-sequentiality is restored merely by reversing the arguments to select. For all x, y, p:
select[(); x] = x;
select[(L . p); (x . y)] = select[p; x];
select[(R . p); (x . y)] = select[p; y].
□
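The repaired, path-first ordering corresponds directly to this Python sketch (binary trees as nested pairs, and paths as lists of 'L'/'R' steps, are assumed encodings):

```python
def select(path, tree):
    """Follow path (a list of 'L'/'R' steps) down tree (nested pairs)."""
    if not path:                           # select[(); x] = x
        return tree
    step, rest = path[0], path[1:]
    left, right = tree
    return select(rest, left if step == 'L' else right)
```

Because the path argument comes first, the choice of equation is decided entirely by the first argument, just as left-sequentiality demands.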
Failures of left-sequentiality may also be repaired by artificially forcing the interpreter to inspect an argument position, by replacing the variable with a disjunction of all possible forms substituted for that variable. This technique degrades
the clarity of the program, and risks the omission of some possible form. In the
worst case, the forced inspection of an ill-defined argument could lead to an unnecessary infinite computation.
Example 12.4.5
The first program of Example 12.4.4 may also be repaired by replacing the first equation,
select[x; ()] = x;
with the two equations,
select[x; ()] = x where x is in atomic_symbols end where;
select[(x . y); ()] = (x . y);
or, equivalently, by qualifying the variable x to be
either (y . z) or in atomic_symbols end or
□
Whenever permutation of arguments restores left-sequentiality, that method is preferred to the forced inspection. In some cases, however, argument permutation fails where forced inspection succeeds.
Example 12.4.6
The parallel or function may be defined by
Symbols
or: 2;
include truth_values.
For all x:
or(true, x) = true;
or(x, true) = true;
or(false, false) = false.
Left-sequentiality fails because of the first two equations. If the second equation is changed to
or(false, true) = true;
then left-sequentiality is restored. The or operator in the modified equations is sometimes called the conditional or. The results are the same as long as evaluation of the first argument to or terminates, but arbitrarily much effort may be wasted evaluating the first argument when the second is true. In the worst case,
evaluation of the first argument might never terminate, so the final answer of true
would never be discovered.
□
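The contrast can be simulated in Python by passing the second argument as a thunk (a deliberately rough model of delayed evaluation): conditional or never forces the second argument when the first is true, but it must finish evaluating the first argument even when the second would already decide the answer.

```python
def cond_or(a, b):
    """Conditional (left-sequential) or: b is a thunk, forced only if a is false."""
    return True if a else b()
```

Here cond_or(True, lambda: 1 // 0) returns True without ever forcing the failing thunk; a nonterminating computation in the first argument, by contrast, would never return at all.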
13. Use of Equations for Syntactic Manipulations
While computer programming in general remains something of a black art, certain special problem areas have been reduced to a disciplined state where a competent person who has learned the right techniques may be confident of success. In particular, the use of finite-state lexical analyzers and push-down parsers, generated automatically from regular expressions and context-free grammars, has reduced the syntactic analysis of programming languages, and other artificially designed languages, to a reliable discipline [AU72]. The grammatical approach to syntactic analysis is also beneficial because the grammatical notation is reasonably self-documenting - sufficiently so that the same grammar may be used as input to an automatic parser generator, and as an appendix to a programmer's manual, providing a reliable standard of reference to settle subtle syntactic issues that are not explained sufficiently in the text of the manual. Beyond context-free manipulations, the best-known contender for formal generation of language processors is the attribute grammar [Kn68, AU72]. An attribute grammar is a context-free grammar in which each nonterminal symbol may have attribute values associated with it, and each production is augmented with a description of how attributes of nonterminals in that production may be computed from one another. Although they have proved very useful in producing countless compilers, interpreters, and other language processors, attribute grammars have not provided a transparency of notation comparable to that of context-free grammars, especially when, as is usually the case, actual computation of one attribute from another is described in a conventional programming language, such as C.
Linguists who seek formal descriptions of natural languages have tried to
boost the power of context-free grammars with transformational grammars
[Ch65]. A transformational grammar, like an attribute grammar, contains a context-free grammar, but the context-free grammar is used to produce a tree, which is then transformed by schematically presented transformation rules. The parse tree produced by the context-free grammar is called the surface structure, and the result of the transformations is called the deep structure of an input string of symbols. A number of different formal definitions of transformational grammar have been proposed, all of them suffering from complex mechanisms for controlling the way in which tree transformations are applied. We propose the equation interpreter as a suitable mechanism for the transformational portion of transformational grammars. By enforcing the confluence property, the equation interpreter finesses the complex control mechanisms, and returns to a notation that has the potential for self-documenting qualities analogous to those of context-free grammars. The concepts of surface structure, which captures the syntactic structure of the source text, without trivial lexical details, and deep structure, which is still essentially syntactic, but which captures the structure of the syntactic concepts described by the source text, rather than the structure of the text itself, appear to be very useful ones in the methodology of automatic syntactic analysis. The analysis into surface and deep structures should be viewed as a refinement of the idea of abstract syntax - syntax in tree form freed of purely lexical issues - which has already proved a
useful organizing concept for language design [McC62, La65].
In this section, we propose that the concept of abstract syntax be made an explicit part of the implementation of syntactic processors, as well as a design concept. Rather than extend the traditional formalisms for context-free grammars, we develop a slightly different notation, clearly as strong in its power to define sets of strings, and better suited to the larger context of regular/context-free/equational
processing.
13.1 An Improved Notation for Context-Free Grammars
The tremendous success of context-free grammars as understandable notations for syntactic structure comes from their close connection to natural concepts of type structure. By making the connection even closer, we believe that clarity can be improved, and deeper structural issues separated more cleanly from superficial lexical ones. The first step is to start, not with source text, which is a concrete realization of a conceptual structure, but with the conceptual structure itself. By "conceptual structure," we mean what a number of programming language designers have called abstract syntax [McC62, La65], and what Curry and Feys called formal objects or Obs [CF58]. We do not intend meanings or semantic structures: rather abstract, but syntactic, mental forms that are represented straightforwardly by source texts. Just as programming language design should start with the design of the abstract syntax, and seek the most convenient way to represent that abstract syntax in text, the formal description of a language should first define the abstract structure of the language, then its concrete textual realization. Abstract syntaxes are defined by type assignments.
Definition 13.1.1
Let Σ be an alphabet on which an abstract term is to be built, and let Γ₀ be a set of primitive types used to describe the uses of symbols in Σ.
A flat type over Γ₀ is either a primitive type t ∈ Γ₀, or
s₁ × ⋯ × sₙ → t, where s₁, …, sₙ, t ∈ Γ₀.
The set of all flat types over Γ₀ is called Γ.
A type assignment to Σ is a binary relation τ ⊆ Σ × Γ.
For f ∈ Σ, when τ is understood, f : t₁, …, tₙ means that f τ tᵢ for i ∈ [1, n]. Usually τ is a function, so n = 1.
The typed language of Σ and τ, denoted Σ_τ, is the set of all terms built from symbols in Σ, respecting the types assigned by τ. Formally,
If (f, t) ∈ τ for t ∈ Γ₀, then f is in the typed language of Σ and τ, and f has type t (i.e., f is a constant symbol of type t).
If E₁, …, Eₙ are in the typed language of Σ and τ, with types s₁, …, sₙ ∈ Γ₀, and f : s₁ × ⋯ × sₙ → t, then f(E₁, …, Eₙ) is in the typed language of Σ and τ, with type t.
τ is extended so that E τ t, also written E : t, if t is the type of an arbitrary term E ∈ Σ_τ. □
f(E₁, …, Eₙ) denotes an abstract term, or tree, with f at the head or root, and subterms E₁, …, Eₙ attached to the root in order. In particular, f(E₁, …, Eₙ) does not denote a particular string of characters including parentheses and commas.
Abstract syntaxes are merely the typed languages of typed alphabets as described above. These languages might naturally be called regular tree languages, because they can be recognized by nondeterministic finite automata running from the root to the leaves of a tree, splitting at each interior node to cover all of the sons (a different next state may be specified for each son), and accepting when every leaf yields an accepting state [Th73]. All of the technical devices of this section are found in earlier theoretical literature on tree automata and grammars [Th73], but we believe that the particular combination here is worthy of consideration for practical purposes.
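Recognizing the typed language of a type assignment is a simple root-to-leaves traversal, the deterministic special case of the tree automata just mentioned. In the Python sketch below, the dictionary encoding and the particular symbols are hypothetical illustrations, not notation from this book:

```python
# A hypothetical type assignment: symbol -> (argument types, result type);
# constants have an empty argument list.
TAU = {
    'plus':    (('S', 'S'), 'S'),
    'numeral': (('N',), 'S'),
    'digit3':  ((), 'N'),
}

def type_of(term):
    """Return the type of a term (symbol, subterm, ...) under TAU, or None
    if the term is not in the typed language."""
    head, *args = term
    if head not in TAU:
        return None
    arg_types, result = TAU[head]
    if len(args) != len(arg_types):
        return None
    for child, expected in zip(args, arg_types):
        if type_of(child) != expected:
            return None
    return result
```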
Type assignments are almost context-free grammars; they merely lack the terminal symbols. Each primitive type t may be interpreted as a nonterminal symbol Nₜ in the context-free grammar, and the type assignment f : s₁ × ⋯ × sₙ → t corresponds to the production Nₜ → Nₛ₁ ⋯ Nₛₙ. Notice that the type assignment has one piece of information lacking in the context-free production: the name f. Names for context-free productions clarify the parse trees, and make issues such as ambiguity less muddy. The terminal symbols, giving the actual strings in the context-free language, are given separately from the type assignment. This separation has the advantage of allowing different concrete notations to be associated with the same abstract syntax for different purposes (e.g., internal storage, display on various output devices; the translation from a low-level assembly code to binary machine code might even be represented as another notation for the assembly code). Separation also clarifies the distinction between essential structure and superficial typography. We do not suggest that typography is unimportant compared to the structure, merely that it is helpful to know the difference.
The following definition introduces all of the formal concepts needed to define context-free languages in a way that separates essential issues from typographical ones, and that makes the relationship between strings and their abstract syntax trees more explicit and more flexible than in traditional grammatical notation. The idea is to define the abstract syntax trees first, by introducing the symbols that may appear at nodes of the trees, and assigning a type to each. Legitimate abstract syntaxes are merely the well-typed terms built from those symbols. Next, each symbol in the abstract syntax is associated with one or more notational schemata, showing how that symbol is denoted in the concrete syntax, as a string of characters. Auxiliary symbols may be used to control the selection of a notational
schema when a single abstract symbol has several possible denotations. This con- trol facility does not allow new sets of strings to be defined, but it does allow a given set of strings to be associated with abstract syntax trees in more ways than can be done by traditional grammar-driven parsing.
The design of this definition was determined by three criteria.
1. The resulting notation must handle common syntactic examples from the com- piler literature in a natural way.
2. The translations defined by the notation must be closed under homomorphic (structure-preserving) encodings of terms. For example, if it is possible to make terms built out of the binary function symbol f correspond to strings in a particular way, then the same correspondence must be possible when f(x, y) is systematically encoded as apply(apply(f, x), y), an encoding called Currying. If this criterion were not satisfied, a user might have to redesign the abstract syntax in order to achieve typographical goals. The whole point of
the new notation is to avoid such interference between levels.
3. A natural subset of the new notation must equal the power of syntax-directed translation schemata [Ir61, LS68, AU72].
Definition 13.1.2
Let cat be a binary symbol indicating concatenation. cat(α, β) is abbreviated αβ. Let V = {x₁, x₂, …} be a set of formal variables, with V ∩ Σ = ∅. Let Δ be a finite alphabet of auxiliary symbols, with S ∈ Δ a designated start symbol. A notational specification for a type assignment τ to abstract alphabet Σ in concrete alphabet Ω with auxiliary alphabet Δ is a binary relation η ⊆ (((Σ ∪ V)_τ′ ∪ {empty}) × Δ) × ({cat} ∪ Ω* ∪ (V × Δ))_γ, where τ′ is τ augmented so that each variable in V has every type in Γ₀ (i.e., a variable may occur anywhere in a term), and γ is the type assignment such that each word in Ω*, and each annotated variable in V × Δ, is a single nullary symbol of type STRING, and cat is of type STRING × STRING → STRING. empty denotes the empty term, without even a head symbol. Without loss of generality, the variables in the left-hand term of the η relation are always x₁, …, xₙ, in order from left to right. Notational specifications are described by productions of the form
E^A is denoted F,
indicating that ⟨E, A⟩ η F. Within F, the auxiliary symbols are given as superscripts on the variables. When the same pair ⟨E, A⟩ is related by η to several expressions F₁, …, F_m, the m elements of the relation are described in the form E^A is denoted F₁ or ⋯ or F_m. Multiple superscripts indicate that any of the superscripts listed may be used. If only one element of Δ may appear on a particular symbol, then the superscript may be omitted.
A context-free notational specification is one in which the η relation is restricted so that when ⟨E, A⟩ η F, no variable occurs more than once in F.
A simple notational specification is a context-free notational specification in which the η relation is further restricted so that when ⟨E, A⟩ η F, the variables x₁, …, xₙ occur in order from left to right in F, possibly with omissions, but without repetitions.
Each notational specification η defines an interpretation η̄ ⊆ (Σ_τ × Δ) × Ω*, associating trees (abstract syntax) with strings of symbols (concrete syntax). The interpretation is defined by
⟨E, A⟩ η a ⇒ ⟨E, A⟩ η̄ a, when E contains no variables.
⟨E[x₁, …, x_m], A⟩ η a[⟨x₁, B₁⟩, …, ⟨xₙ, Bₙ⟩] &
⟨E₁, B₁⟩ η̄ β₁ & ⋯ & ⟨E_m, B_m⟩ η̄ β_m &
⟨empty, B_{m+1}⟩ η̄ β_{m+1} & ⋯ & ⟨empty, Bₙ⟩ η̄ βₙ
⇒ ⟨E[E₁, …, E_m], A⟩ η̄ a[β₁, …, βₙ].
F η̄ a abbreviates ⟨F, S⟩ η̄ a, where S is the start symbol.
The string language of a notational specification η at type t is {a ∈ Ω* | ∃F ∈ Σ_τ, (F, t) ∈ τ & F η̄ a}. □
The formal definition above is intuitively much simpler than it looks. Notational specifications give nondeterministic translations between terms and strings in a schematic style using formal variables. The intent of a single production is that the terms represented by the variables in any instance of that production should be translated first, then those translations should be combined as shown in the production. In most cases, the left-hand side of a production is of the simple form f(x₁, …, xₙ). This form is enough to satisfy design criterion 1 above. The more complex terms allowed on left-hand sides of productions are required to satisfy criterion 2, and the empty left-hand sides are required for criterion 3. Greater experience with the notation is required in order to judge whether the added generality is worth the trouble in understanding its definition.
The string languages of context-free notational specifications and simple notational specifications are precisely the context-free languages, but notational specifications offer more flexibility than context-free grammars for defining tree-string relations. Since the relation between the parse tree and a string has a practical importance far beyond the set of parsable strings, this added flexibility appears to be worth the increase in complexity of the specifications, even if it is never used to define a non-context-free language. The type-notation pairs defined here are more powerful than the syntax-directed translation schemata of Aho
and Ullman [AU72] when the abstract-syntax trees are encoded as strings in postorder. Context-free notational specifications are equivalent in power to syntax-directed translation schemata, and simple notational specifications are equivalent in power to simple syntax-directed translation schemata and to pushdown transducers. Independently of the theoretical power involved, the notation of this section should be preferred to that of syntax-directed translation schemata, since it makes the intuitive tree structure of the abstract syntax explicit, rather than encoding it into a string. Notice that auxiliary symbols are equivalent to a restricted use of inherited attributes in attribute grammars. Since these attributes are chosen from a finite alphabet, they do not increase the power of the formalism as a language definer.
A notational specification is generally not acceptable unless it is complete, that is, every abstract term in the typed language has at least one denotation. Completeness is easy to detect automatically. Unfortunately, ambiguity, that is, the existence of two or more denotations for the same abstract term, is equivalent to ambiguity in context-free grammars, which is undecidable. For generating parsers, ambiguity is deadly, since the result of parsing is not well-defined. Just as with context-free grammars, we should probably enforce stronger decidable sufficient conditions for nonambiguity. For unparsers, ambiguity is usually undesirable, but it might be occasionally useful when there is no practical need to distinguish between two abstract terms.
Every context-free grammar decomposes naturally into a type assignment and a simple context-free notational specification, with a trivial Δ, and η a function. The natural decomposition is unique except for ordering of the arguments to each function symbol, and choice of names for the function symbols. Similarly, every
context-free type-notation pair defines a unique natural context-free grammar.
Example 13.1.1
Consider a context-free grammar for arithmetic expressions.
S → N
S → S+S
S → S*S
S → (S)
N → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
N → N0 | N1 | ⋯ | N9
numeral: N → S
plus: S × S → S
times: S × S → S
paren: S → S
digit0: N ⋯ digit9: N
extend0: N → N ⋯ extend9: N → N
and the context-free notational specification
numeral(x₁) is denoted x₁
plus(x₁, x₂) is denoted x₁+x₂
times(x₁, x₂) is denoted x₁*x₂
paren(x₁) is denoted (x₁)
digit0 is denoted 0 ⋯ digit9 is denoted 9
extend0(x₁) is denoted x₁0 ⋯ extend9(x₁) is denoted x₁9
The type-notation pair derived above is not the most intuitive one for arithmetic expressions, because the paren symbol has no semantic content, but is an artifact of the form of the grammar. To obtain a more intuitive notational specification, for the type assignment that omits paren, delete the paren line from the notational
specification above, and replace the plus and times lines by the following.
plus(x₁, x₂) is denoted x₁+x₂ or (x₁+x₂)
times(x₁, x₂) is denoted x₁*x₂ or (x₁*x₂)
The auxiliary alphabet Δ is still trivial, but the notational relation η is no longer a function.
□
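Unparsing with respect to such a notational specification is one template per abstract symbol. In the Python sketch below, the template dictionary is a hypothetical rendering of the is denoted clauses (keeping only the first alternative of each), not notation from this book:

```python
# Each abstract symbol maps to a template; {0}, {1} mark subterm positions.
NOTATION = {
    'plus':    '{0}+{1}',
    'times':   '{0}*{1}',
    'paren':   '({0})',
    'numeral': '{0}',
    'digit1': '1', 'digit2': '2', 'digit3': '3',
}

def unparse(term):
    """Denote a term (symbol, subterm, ...) by filling its template."""
    head, *args = term
    return NOTATION[head].format(*map(unparse, args))
```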
The auxiliary symbols in Δ are not strictly necessary for defining an arbitrary context-free language, but they allow technical aspects of the parsing mechanism to be isolated in the notational specification, instead of complicating the type assignment. For example, the extra nonterminal symbols often introduced into grammars to enforce precedence conditions should not be treated as separate types, but merely as parsing information attached to a single type.
Example 13.1.2
The grammar of Example 13.1.1 is ambiguous. For example, 1+2*3 may be parsed as plus(1, times(2, 3)), or times(plus(1, 2), 3). The usual way to avoid the ambiguity in the context-free grammar is to expand the set of nonterminal symbols as follows.
S → S+T
S → T
T → T*R
T → R
R → (S)
R → N
N → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
N → N0 | N1 | ⋯ | N9
This grammar gives precedence to * over +, and associates sequences of the same
operation to the left. The direct translation of this grammar yields the following type assignment and notational specification.
plus: S × T → S
summand: T → S
times: T × R → T
multiplicand: R → T
paren: S → R
numeral: N → R
digit0: N ⋯ digit9: N
extend0: N → N ⋯ extend9: N → N
plus(x₁, x₂) is denoted x₁+x₂
summand(x₁) is denoted x₁
times(x₁, x₂) is denoted x₁*x₂
multiplicand(x₁) is denoted x₁
paren(x₁) is denoted (x₁)
numeral(x₁) is denoted x₁
digit0 is denoted 0 ⋯ digit9 is denoted 9
extend0(x₁) is denoted x₁0 ⋯ extend9(x₁) is denoted x₁9
As well as paren, we now have the semantically superfluous symbols summand and multiplicand, and types T and R that are semantically equivalent to S. In order to keep the abstract syntax of the second part of Example 13.1.1, while avoiding ambiguity, the parsing information given by these semantically superfluous symbols and types should be encoded into auxiliary symbols S, T, R as follows.
plus: S × S → S
times: S × S → S
numeral: N → S
digit0: N ⋯ digit9: N
extend0: N → N ⋯ extend9: N → N

plus(x₁, x₂)^S is denoted x₁^S+x₂^T or (x₁^S+x₂^T)
plus(x₁, x₂)^{T,R} is denoted (x₁^S+x₂^T)
times(x₁, x₂)^{S,T} is denoted x₁^T*x₂^R or (x₁^T*x₂^R)
times(x₁, x₂)^R is denoted (x₁^T*x₂^R)
numeral(x₁)^{S,T,R} is denoted x₁
digit0 is denoted 0 ⋯ digit9 is denoted 9
extend0(x₁) is denoted x₁0 ⋯ extend9(x₁) is denoted x₁9
□
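Used as an unparser, the auxiliary symbols S, T, R behave as printing contexts: a production is chosen by the context in which a subterm is printed, and parentheses appear exactly where the context demands them. A Python sketch of this behavior (the encoding is illustrative, and numerals are abbreviated to plain strings):

```python
def unparse(term, ctx='S'):
    """Print term in context ctx, parenthesizing as the auxiliary symbols dictate."""
    if isinstance(term, str):              # a numeral, printed the same everywhere
        return term
    head, left, right = term
    if head == 'plus':                     # left operand in context S, right in T
        s = unparse(left, 'S') + '+' + unparse(right, 'T')
        return s if ctx == 'S' else '(' + s + ')'
    s = unparse(left, 'T') + '*' + unparse(right, 'R')   # head == 'times'
    return s if ctx in ('S', 'T') else '(' + s + ')'
```

unparse(('times', ('plus', '1', '2'), '3')) prints the parentheses forced by precedence, while unparse(('plus', '1', ('times', '2', '3'))) omits them.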
Non-context-free notational specifications allow matching of labels at the beginning and end of bracketed sections. For example, to define procedure definitions in a conventional programming language so that the name of a procedure appears at the end of its body, as well as in the heading, use the following notational specification:
procedure(x₁, x₂, x₃) is denoted PROCEDURE x₁(x₂); x₃ END(x₁)
Given current software that is available off-the-shelf, the best thing to do with a context-free type-notation pair is probably to convert it into a context-free grammar, and apply the usual parsing techniques. The new notation does not show to greatest advantage under such usage, since the tricks that are required to avoid parsing conflicts may complicate the grammar by introducing otherwise unnecessary auxiliary symbols. It is probably not a good idea to try to parse with non-context-free type-notation pairs, although unparsing is no problem.
The ideal application of type-notation pairs depends on future availability of structure-editor generators, as an alternative to parsers [MvD82, Re84]. Structure editors are fruits of the observation that source text was developed for particular input techniques, such as the use of punched cards, involving substantial off-line preparation before each submission. During the off-line work, a user needs to work with a presentation of his program that is simultaneously machine-readable and human-readable. With highly interactive input techniques, based on video terminals, there is no more requirement that what the user types correspond character by character to what he sees on the screen. Structure editors let short sequences of keystrokes produce simple manipulations of tree structures (i.e., abstract syntaxes), and instantly display the current structure on the screen. The process that must be automated is unparsing, a process technically much simpler than parsing, and immune to the problems of nondeterminism that arise in general context-free parsing. Type assignments are ideal for defining the internal structures to be manipulated by structure editors. The notational components still need strengthening to deal with the two-dimensional nature of the video display. Unparsing is easy even for non-context-free type-notation pairs.
13.2. Terms Representing the Syntax of Terms
When equations are used to perform syntactic processing, we often need a way to distinguish different levels of meaning. For example, there is no way to write an equational program translating an arbitrary list of the form (F A₁ ··· Aₙ) to F[A₁; ···; Aₙ]. We might be tempted to write translate[(x . y)] = x[transargs[y]], but F is a nullary symbol in the first instance, and an n-ary function symbol in the second. Also, the use of the variable x in place of a function symbol on the right-hand side is not allowed. Yet, the translation described above is a very natural part of a translation of LISP programs into equational programs. The problem is that the symbol translate is defined above as if it operates on the objects denoted by S-expressions, when, in fact, it is supposed to operate on the expressions themselves. Consider the further trouble we would have with a syntactic processor that
counts the number of leaves in an arithmetic expression. We might write
count_leaves(add(x, y)) = add(count_leaves(x), count_leaves(y)).
Technically, this equation overlaps with the defining equations for add. Intuitively,
it is a monster.
To achieve sufficient transformational power for the LISP example, and to avoid the confusion of the add example, we need a notation for explicitly describing other notations -- terms denoting terms. As long as we accept some equations on terms, we are in trouble letting terms stand for themselves. Rather, we need
explicit functions whose use is to construct terms. One natural set of functions for this purpose uses list notation based on nil and cons, as in LISP, plus atomic_symbols as names of symbols within the terms being described, unary functions litsym, atomsym, intnum, truthsym, and char to construct symbols of various types from their names, and multiap to apply a function symbol syntactically to a list of argument terms.
Example 13.2.1
Using the notation defined above, we may translate LISP S-expressions to the functional terms that they often represent. We must translate multiap[litsym[cons]; (atomsym[F] . ···)] to multiap[litsym[F]; (···)]. This may be accomplished naturally by the following equational program.
Symbols

: Constructors for terms
nil: 0;
cons: 2;
litsym, atomsym, intnum, truthsym, char: 1;
multiap: 2;
include atomic_symbols;

: Translating operators
translate: 1;
transargs: 1.

For all x, y:
translate[litsym[x]] = litsym[x];
translate[atomsym[x]] = atomsym[x];
translate[intnum[x]] = intnum[x];
translate[truthsym[x]] = truthsym[x];
translate[char[x]] = char[x];
translate[multiap[litsym[cons]; (atomsym[x] . y)]] = multiap[litsym[x]; transargs[y]];
transargs[()] = ();
transargs[(x . y)] = (translate[x] . transargs[y]).
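For comparison, here is the same translation sketched in Python. The representation is a free choice made for illustration only: symbols such as litsym[x] become tagged tuples, and the cons/nil list structure of the equational program is flattened into Python lists.

```python
# Sketch of translate/transargs.  Terms are tagged tuples:
#   ('litsym', name), ('atomsym', name), ('intnum', n), ...
#   ('multiap', fun, args)  with args a Python list.

def translate(term):
    tag = term[0]
    if tag in ('litsym', 'atomsym', 'intnum', 'truthsym', 'char'):
        return term                      # symbol names translate to themselves
    if tag == 'multiap' and term[1] == ('litsym', 'cons'):
        head, rest = term[2][0], term[2][1:]
        assert head[0] == 'atomsym'      # the list (F A1 ... An), F atomic
        return ('multiap', ('litsym', head[1]), transargs(rest))
    raise ValueError('not a translatable S-expression')

def transargs(args):
    return [translate(a) for a in args]
```

The translation turns the name F, which occurs as an atomic symbol inside a list, into a function symbol applied to the translated arguments, exactly the step that could not be written directly as an equation.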
Similarly, the leaf counting program becomes

Symbols

: Constructors for terms
nil: 0;
cons: 2;
litsym, intnum: 1;
multiap: 2;
include atomic_symbols;

: Counting operator
count: 1;

: Arithmetic operator
add: 2.

For all x, y:
count[intnum[x]] = 1;
count[multiap[litsym[add]; (x y)]] = add[count[x]; count[y]].
□
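The leaf counter is equally short in the Python sketch, using the same illustrative tuple representation as before; here Python's + stands in for the equations defining add.

```python
# Count intnum leaves in the explicit syntactic form of an
# arithmetic expression built from add.

def count(term):
    if term[0] == 'intnum':
        return 1                         # an intnum leaf counts 1
    if term[0] == 'multiap' and term[1] == ('litsym', 'add'):
        x, y = term[2]
        return count(x) + count(y)       # add the counts of the arguments
    raise ValueError('unexpected term')
```

Because the equations operate on the quoted form multiap[litsym[add]; ...], there is no overlap with the equations that define add itself.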
The examples above are not appealing to the eye, but the potential for confusion is so great that precision seems to be worth more than beauty in this case. Special denotations involving quote marks might be introduced, but they should be taken as abbreviations for an explicit form for denoting syntax, such as the one described
above.
In order to apply an equational program to perform a syntactic manipulation, the term to be manipulated should be in an explicitly syntactic form. Yet, the initial production of that term as input, and the final output form for it, are unlikely to be explicitly syntactic themselves. For example, one use of the translation of S-expressions to functional forms is to take an S-expression, translate it into a functional form, evaluate the functional form, then translate back to an S-expression.
The user who presents the initial S-expression only wants to have it evaluated, so
he should not need to be aware of the syntactic translations, and should be allowed to present the S-expression in its usual form. In order to evaluate the functional form, it must not be given as explicit syntax, else it will not be understood by an evaluator of functional forms. From these considerations, we seem to need transformers in and out of explicit syntactic forms. These translators cannot be defined by the equation interpreter, without getting into an infinite regress, since the transformation of a term into its explicit syntactic form is itself a syntactic transformation. So, the implementation of the equation interpreter includes two programs called syntax and content. syntax transforms a term into its explicit
syntactic form, and content transforms an explicit syntactic form into the term that
it denotes. Thus, the syntax of (A . B) is

multiap[litsym[cons]; (atomsym[A] atomsym[B])],

and the content of

multiap[litsym[f]; (atomsym[A] intnum[22] atomsym[B])] is f[A; 22; B].
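The two directions can be sketched in Python, again with terms as tagged tuples; the representation, and the restriction to atomic symbols and integers, are simplifying assumptions made for this illustration.

```python
# syntax: a term  ->  its explicit syntactic form
# content: an explicit syntactic form  ->  the term it denotes
# Ordinary terms are ('f', arg, ...) tuples, with atomic symbols as
# strings and integers as themselves.

def syntax(term):
    if isinstance(term, int):
        return ('intnum', term)
    if isinstance(term, str):                    # an atomic symbol
        return ('atomsym', term)
    f, args = term[0], term[1:]
    return ('multiap', ('litsym', f), [syntax(a) for a in args])

def content(rep):
    if rep[0] == 'intnum':
        return rep[1]
    if rep[0] == 'atomsym':
        return rep[1]
    assert rep[0] == 'multiap' and rep[1][0] == 'litsym'
    return (rep[1][1],) + tuple(content(a) for a in rep[2])
```

The two functions are inverses on this representation: content(syntax(t)) returns t, mirroring the pairing of the syntax and content programs in the implementation.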
13.3. Example: Type-Checking in a Term Language
A typical problem in non-context-free syntactic processing is type checking, where the types of certain symbols are given by declarations lexically distant from the occurrences of the symbols themselves. While the most popular application of type checking occurs in processing conventional programming languages, such as Pascal, the essential ideas can be understood more easily by considering a simpler language, consisting of declarations followed by a single term. An advanced compiler class taught by Christoph Hoffmann constructed a full type checker for
Pascal in 1983. Substantial extra effort was required to enforce declare-before-use
and similar restrictions included in the design of Pascal mostly to simplify conventional compiler implementations.
We present the term language in the type-notation form of Section 13.1, and assume that some mechanism is already available to translate from concrete syntax to abstract syntax. The type S denotes a list of declarations paired with a term, E denotes a term, D a declaration, A an atomic symbol, and T a type. DL, EL, and TL denote lists of declarations, terms, and types, respectively. S, E, and T are also used as auxiliary symbols, corresponding intuitively to their uses as types. The auxiliary symbol P indicates a type occurring within the left-hand side of a functional type, and so needing to be parenthesized if it is a functional type itself; EC and DC indicate nonempty lists of terms and declarations, respectively.
typed_term: DL×E → S
cons: D×DL → DL, E×EL → EL, T×TL → TL
nil: DL, EL, TL
declaration: A×T → D
type: A → T
function: T×TL → T
term: A → E
multiap: E×EL → E
typed_term(dl,e) is denoted dl^DC . e
cons(d,dl)^DC is denoted d; dl^DC or d dl^N
declaration(a,t) is denoted a:t^T
type(a)^T,P is denoted a
function(t,tl)^T is denoted tl^TC → t
function(t,tl)^P is denoted (tl^TC → t)
cons(t,tl)^TC is denoted t^P×tl^TC or t^P tl^N
term(a)^E,P is denoted a
multiap(e,el)^E is denoted e^P(el^EC)
multiap(e,el)^P is denoted (e^P)(el^EC)
cons(e,el)^EC is denoted e,el^EC or e el^N
nil^N is denoted by the empty string
The operator multiap above applies a function, which may be given by an arbitrarily complex term, to a list of arguments. term constructs a primitive term, and type a primitive type, from an atomic symbol that is intended to be the name of the term or type. function(t,tl) represents the type of functions whose arguments are of the types listed in tl, and whose result is of type t. The denotational specifications above produce the minimum parenthesization needed to avoid ambiguity. The empty list is denoted by the empty string. A typical element of the
language above is given by the concrete form
f: t×t→t; g: (t×t→t)×t→t→t; a: t.
f(a, (g(f,a))(a))
and the abstract form, given in LISP.M notation,

typed_term[
  (declaration[f; function[type[t]; (type[t] type[t])]]
   declaration[g;
     function[
       function[type[t]; (type[t])];
       (function[type[t]; (type[t] type[t])] type[t])]]
   declaration[a; type[t]]);
  multiap[
    term[f];
    (term[a]
     multiap[
       multiap[term[g]; (term[f] term[a])];
       (term[a])])]]
Although the presentation of the abstract syntax above is not very readable for even moderate sized declarations and terms, it has the flexibility needed to describe a wide class of computations on the terms. The substantial ad hoc extensions to the concrete syntax that would be required to denote portions of well-formed
expressions, and to present equations with variables, would end up being more
confusing than the general abstract form. In particular, the distinction between semantic function application, involving the operators performing the type checking, and syntactic function application in the term being checked, requires some departure from conventional concrete notation. So, even though it is a very poor
display notation, the abstract syntax may be a good internal notation for defining
computations.
The following equational program uses the LISP.M notation to define type checking for the typed terms described above. We assume that equations are given defining operations on symbol tables, as described in Section 16.3.
: check[typed_term[dl; e]] evaluates to true if the term e is type correct
: with respect to the declarations in dl.

Symbols:

: constructors for typed terms
: declaration[a; t] declares the atomic symbol a to have type t
: type[a] is a primitive type named by the atomic symbol a
: function[t; tl] is the type of functions with argument types given by the
:   list tl, and value of type t
: term[a] is a primitive term named by the atomic symbol a
: multiap[e; el] is a term with head function symbol e applied to the list
:   of arguments el
typed_term: 2; declaration: 2; type: 1; function: 2; term: 1; multiap: 2;

: standard list constructors
cons: 2; nil: 0;

: type manipulating operations
typeof: 2; typelist: 2; resulttype: 1; argtypes: 1;

: primitive symbol table operations
entertable: 3; emptytable: 0; lookup: 2;

: special type-checking operations
check: 1; buildtable: 1; looklist: 2; checkargs: 2; typecheck: 2;

: standard logical symbols
equ: 2; and: 2; include truth_values;

: atomic symbols used to identify type, constant, and function symbols
include atomic_symbols.

For all d, dl, e, el, t, t1, t2, tl, tl1, tl2, a, a1, a2, st, b:

: To check a typed term, build a symbol table from the declarations,
: then check the term against the symbol table.
check[typed_term[dl; e]] = typecheck[buildtable[dl]; e];

: The symbol table is built by the natural iteration through the list of
: declarations.
buildtable[()] = emptytable[];
buildtable[(declaration[a; t] . dl)] = entertable[buildtable[dl]; a; t];

: Final type checking goes recursively through the term. Whenever a
: function application is encountered, the type of the function is computed and
: checked for consistency with the types of the arguments.
typecheck[st; ()] = true;
typecheck[st; (e . el)] = and[typecheck[st; e]; typecheck[st; el]];
typecheck[st; term[a]] = true;
typecheck[st; multiap[e; el]] =
  and[and[typecheck[st; e]; typecheck[st; el]];
      checkargs[argtypes[typeof[st; e]]; typelist[st; el]]];
typelist[st; ()] = ();
typelist[st; (e . el)] = (typeof[st; e] . typelist[st; el]);
typeof[st; term[a]] = lookup[st; a];
typeof[st; multiap[e; el]] = resulttype[typeof[st; e]];
resulttype[function[t; tl]] = t;
argtypes[function[t; tl]] = tl;
checkargs[(); ()] = true;
checkargs[(); (t . tl)] = false;
checkargs[(t . tl); ()] = false;
checkargs[(t1 . tl1); (t2 . tl2)] = and[equ[t1; t2]; checkargs[tl1; tl2]];

: Assume that equations are given for entertable, lookup.
: The standard equality test on atomic symbols is extended to type
: expressions by the natural recursion.
equ[type[a1]; type[a2]] = equ[a1; a2];
equ[function[t1; tl1]; function[t2; tl2]] = and[equ[t1; t2]; checkargs[tl1; tl2]];
equ[function[t; tl]; type[a]] = false;
equ[type[a]; function[t; tl]] = false;
include equatom;

: and is the standard boolean function.
and[false; b] = false;
and[true; b] = b.
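The same computation can be sketched in Python. The representation is a simplification chosen for illustration: the symbol table is a dict, types are tagged tuples ('type', a) or ('function', result, argtypes), and checkargs collapses into list equality, which is sound here because structural equality of the tuples coincides with equ.

```python
# Type checker for the term language: terms are ('term', a) or
# ('multiap', fun, [args]).

def typeof(st, term):
    if term[0] == 'term':
        return st[term[1]]               # lookup[st; a]
    return typeof(st, term[1])[1]        # resulttype of the head's type

def typecheck(st, term):
    if term[0] == 'term':
        return True
    fun, args = term[1], term[2]
    if not (typecheck(st, fun) and all(typecheck(st, a) for a in args)):
        return False
    fun_type = typeof(st, fun)
    if fun_type[0] != 'function':        # applying a non-function
        return False
    return fun_type[2] == [typeof(st, a) for a in args]   # checkargs

def check(declarations, term):
    return typecheck(dict(declarations), term)            # buildtable
```

Run on the example of the text, with f: t×t→t, g: (t×t→t)×t→t→t, a: t, the term f(a, (g(f,a))(a)) checks as type correct, while f applied to a single argument does not.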
The equations above may easily be augmented to report conflicts and omissions in the declarations. If type checking is to be followed by some sort of semantic processing, such as interpreting or translating the term, it may be useful to
attach types (or other information from a symbol table) to the symbols in a term, so that the semantic processing does not need the symbol table. This technique was used in the implementation of the equation interpreter itself: each variable symbol in a left-hand side term is annotated to indicate the allowable substitutions for that variable, and each variable in a right-hand side term is annotated to show its address on the corresponding left-hand side. These annotations provide precisely the information needed by the semantic portion of the interpreter. To illustrate the technique, we present equations annotating each symbol in a typed term with its type.
: typenotes[st; e] annotates each symbol in the term e with the type assigned to it
: by the symbol table st. Type conflicts are marked for debugging purposes.

Symbols:

: constructors for types and terms used as in the previous program
type: 1; function: 2; term: 1; multiap: 2;

: extra constructors for annotated terms
: aterm[a; t] is a primitive term named by the atomic symbol a of type t
: typeconflict[e] marks the term e as having a type conflict in the
:   application of its head function symbol to inappropriate arguments
aterm: 2; typeconflict: 1;

: standard list constructors
cons: 2; nil: 0;

: type manipulating operations
typeof: 2; typelist: 2; resulttype: 1; argtypes: 1;

: primitive symbol table operations
lookup: 2;

: special type-checking operations
looklist: 2; checkargs: 2; typenotes: 2;

: standard logical symbols
equ: 2; and: 2; if: 3; include truth_values;

: atomic symbols used to identify type, constant, and function symbols
include atomic_symbols.

For all d, dl, e, el, t, t1, t2, tl, tl1, tl2, a, a1, a2, st, b, x, y:

: Annotation goes recursively through the term. Whenever a function
: application is encountered, the type of the function is computed and checked
: for consistency with the types of the arguments.
typenotes[st; ()] = ();
typenotes[st; (e . el)] = (typenotes[st; e] . typenotes[st; el]);
typenotes[st; term[a]] = aterm[a; typeof[st; term[a]]];
typenotes[st; multiap[e; el]] =
  if[checkargs[argtypes[typeof[st; e]]; typelist[st; el]];
     multiap[typenotes[st; e]; typenotes[st; el]];
     typeconflict[multiap[typenotes[st; e]; typenotes[st; el]]]];

: Assume that equations are given for entertable, lookup.
: Equations for the remaining operations are the same as in the previous
: program.
if[true; x; y] = x;
if[false; x; y] = y.
14. Modular Construction of Equational Definitions
The equational programming language, although its concepts are far from being primitive operations on conventional computing machines, is not really a high-level language. Rather, it is the assembly language for an unusual abstract machine. The problem is that the set of equations constituting a program has no structure to help in organizing and understanding it. In order to be suitable for solving any but rather small problems, the equational programming language needs constructs that allow decomposition of a solution into manageable components, or modules, with semantically elegant combining operations. In particular, different sets of equations, presented separately, must be combinable other than by textual concatenation, and the combining operation must protect the programmer from accidental coincidence of symbols in different components. Furthermore, sets of equations must be parameterized to allow variations on a single concept (e.g., recursion over a tree structure) to be produced from a single definition, rather than being generated individually. In this chapter, we define a speculative set of combining operations for equational programs, and discuss possible means of implementation. None of these features has been implemented for the equational programming language, although some similarly motivated features are implemented in OBJ [BG77, Go84]. Libraries of predefined equations cannot be integrated well into the equation interpreter until some good structuring constructs are implemented.
In order to combine sets of equations in coherent ways, we need to design the combining operations in terms of the meanings of equational programs, rather than their texts. Based on the scenario of Section 1, the meaning of an equational program should be characterized by three things:
1. the language of terms over which equations are given;
2. a class of models of that language;
3. the subset of terms that are considered simple, or transparent, enough to be allowed as output.

Items 1 and 2 are standard concepts from universal algebra. For equational programming, the information in 2 may be given equivalently as a congruence relation on terms, but extensions to other logical languages might need the greater generality of classes of models. In any case, the use of classes of models is better in keeping with the intuitive spirit of our computing scenario.

Definition 14.1
Let Σ be a ranked alphabet, ρ(a) the rank of a ∈ Σ, Σ# the set of terms over Σ. A model of Σ is a pair <U,ψ>, where U, the universe, is any set, and ψ is a mapping from Σ to functions over U with ψ(a): U^ρ(a) → U. ψ extends naturally to Σ# by ψ(f(E₁, ···, Eₘ)) = ψ(f)(ψ(E₁), ···, ψ(Eₘ)). A model <U,ψ> satisfies an equation E=F, written <U,ψ> |= E=F, if ψ(E) = ψ(F). A set W of models satisfies a set E of equations, written W |= E, if <U,ψ> |= E=F for all <U,ψ> ∈ W and E=F ∈ E. The set of models of a set E of equations, written Mod(E), is {<U,ψ> | <U,ψ> |= E}. The set of normal forms of a set E of equations, written Norm(E), is {E ∈ Σ# | for all F=G ∈ E, F is not a subterm of E}. A computational world is a triple <Σ,W,N>, where Σ is a ranked alphabet, W is a class of models of Σ, and N ⊆ Σ#. N is called the output set.
The world of a set of equations E over Σ, written World(E), is
<Σ, Mod(E), Norm(E)>. □
The function World assigns meanings to equational programs by letting Σ be the ranked alphabet defined by the Symbols section, and taking World(E) where E is the set of all instances of equations in the program. The computing scenario of Section 1 may be formalized by requiring, on input E, an output F ∈ Norm(E) such that Mod(E) |= E=F. In principle, we would like separate control of the models and the output terms, but the pragmatic restriction that output terms must be precisely the normal forms -- terms with no instances of left-hand sides of equations -- limits us to constructs that generate both of these components of program meanings in ways that make them compatible with the evaluation mechanism of reduction. Notice that uniqueness of normal forms means that there may not be two different terms F₁, F₂ ∈ Σ# with W |= F₁=F₂; i.e., there must exist at least one model in W in which ψ(F₁) ≠ ψ(F₂).
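Membership in Norm(E) is easy to state as code: a term is a normal form exactly when no left-hand side matches any of its subterms. The Python sketch below is an illustration with a representation chosen freely, nested tuples with the operator name first, and pattern variables as bare strings.

```python
# Normal-form test: no instance of any left-hand side occurs in the term.

def match(pattern, term, binding):
    """Try to extend binding so that pattern instantiates to term."""
    if isinstance(pattern, str):               # a variable matches anything,
        binding.setdefault(pattern, term)      # but consistently
        return binding[pattern] == term
    return (isinstance(term, tuple) and pattern[0] == term[0]
            and len(pattern) == len(term)
            and all(match(p, t, binding)
                    for p, t in zip(pattern[1:], term[1:])))

def is_normal_form(term, lhss):
    if any(match(l, term, {}) for l in lhss):
        return False
    return all(is_normal_form(a, lhss) for a in term[1:])
```

With the single (hypothetical) left-hand side add(zero, x), the term add(zero, zero) is not a normal form, while add(succ(zero), zero) is.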
Explicit presentation of an equational program is one way to specify a computational world; the purpose of this section is to explore others. Because this aspect of our study of equational logic programming is quite new and primitive, we do not try to build a convenient notation for users yet. Rather, we explore constructs that are both meaningful and implementable, and leave the design of good syntax for them to the future. The OBJ project has gone much farther in the user-level design of constructs for structured definition of equational programs, and the constructs proposed here are inspired by that work [BG77, Go84]. Our problem is to determine how well such constructs can be implemented in a form consistent with our computational techniques. In particular, we prefer implementations in which all of the work associated with the structuring constructs is done during the preprocessing of equations, so that the resulting interpreter is of the same sort that we derive directly from a set of equations.
The most obvious way of constructing a computational world is to combine the information in two others.

Definition 14.2
Given two ranked alphabets Σ₁ ⊆ Σ₂, and a model <U,ψ> of Σ₂, we may restrict this model naturally to a model of Σ₁ by restricting ψ to Σ₁. When two classes of models, W₁ and W₂, are given over different alphabets, Σ₁ and Σ₂, the intersection W₁∩W₂ is the set of all models of Σ₁∪Σ₂ whose restrictions to Σ₁ and Σ₂ are in W₁ and W₂, respectively. In the case where Σ₁=Σ₂, this specializes naturally to the conventional set-theoretic intersection. Given two computational worlds, <Σ₁,W₁,N₁> and <Σ₂,W₂,N₂>, the sum <Σ₁,W₁,N₁> + <Σ₂,W₂,N₂> is <Σ₁∪Σ₂, W₁∩W₂, N₁∪N₂>. □

The sum described above is similar to the enrich operation of [BG77], except that enrich requires one of its arguments to be given by explicit equations. The sum of computational worlds corresponds roughly to concatenation of two sets of equations. Even this simple combining form cannot be implemented by simple-minded textual concatenation, because a variable in one program may be a defined symbol in the other. If an equational program is syntactically cooked into a form where each instance of a variable is marked unambiguously as such, then concatenation of cooked forms will often accomplish the sum of worlds. The equation interpreter is implemented in such a way that the cooked form is created explicitly, and may provide the basis for a very simple implementation of the sum. For greater efficiency, we would like to implement the sum at the level of the pattern-matching tables
that drive the interpreter. Such an implementation depends critically on the choice
of pattern matching technique (see Section 18.2).
The concatenation of syntactically cooked programs described above succeeds only when the combined equations satisfy the restrictions of Section 5. Certainly, this need not be the case, since the sum of computational worlds with unique normal forms may not have unique normal forms. For example, combining the singleton sets {a=b} and {a=c} fails to preserve uniqueness of normal forms. This problem is inherent in the semantics of the sum, and can only be solved completely by an implementation that deals with indeterminate programs. Unfortunately, there are cases where the semantics of the sum does maintain uniqueness of normal forms, but the concatenation of equations fails nonetheless.
Example 14.1
Consider the following two sets of equations:
1. {f(g(x))=a, a=b}
2. {g(c)=c, b=a}

Each of these sets individually has unique normal forms, and every reduction sequence terminates. Their sum has uniqueness of normal forms, but not finite termination. Notice that, in the sum, the unique normal form of f(g(c)) is f(c), but there is also an infinite reduction f(g(c)) = a = b = a = ···. The combined set of equations cannot be processed by the equation interpreter, because of the overlap between f(g(x)) and g(c). □
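The behavior of Example 14.1 can be explored mechanically. The Python sketch below, an illustration only, rewrites at arbitrary positions and does a bounded breadth-first search from f(g(c)): the only normal form reached is f(c), while a and b rewrite to each other forever.

```python
# The combined rules of Example 14.1, as pattern/result pairs.
RULES = [
    (('f', ('g', 'x')), ('a',)),    # f(g(x)) = a
    (('a',), ('b',)),               # a = b
    (('g', ('c',)), ('c',)),        # g(c) = c
    (('b',), ('a',)),               # b = a
]

def match(pat, term, b):
    if isinstance(pat, str):                       # a pattern variable
        b.setdefault(pat, term)
        return b[pat] == term
    return (len(pat) == len(term) and pat[0] == term[0]
            and all(match(p, t, b) for p, t in zip(pat[1:], term[1:])))

def substitute(pat, b):
    return b[pat] if isinstance(pat, str) else \
        (pat[0],) + tuple(substitute(p, b) for p in pat[1:])

def rewrites(term):
    """All results of one reduction step, at any position."""
    out = []
    for lhs, rhs in RULES:
        b = {}
        if match(lhs, term, b):
            out.append(substitute(rhs, b))
    for i in range(1, len(term)):                  # rewrite inside arguments
        for sub in rewrites(term[i]):
            out.append(term[:i] + (sub,) + term[i + 1:])
    return out

def reachable(term, steps=10):
    seen, frontier = {term}, [term]
    for _ in range(steps):
        frontier = [t2 for t in frontier for t2 in rewrites(t)]
        seen.update(frontier)
    return seen
```

Note that reaching the normal form requires contracting the inner redex g(c) first; contracting the outermost redex f(g(c)) leads into the a, b cycle, which is why outermost strategies and overlapping rules interact badly here.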
This failure to achieve an implementation of the sum in all semantically natural
cases is a weakness of the current state of the equation interpreter. Of course, one
may fiddle with the semantics of the sum to get any implementation to work, but we would rather seek implementation techniques that approach closer to the semantic ideal.
The sum allows for combining equational programs, but by itself is not very useful. We need a way of hiding certain symbols in a program, so that sums do not produce surprising results due to the accidental use of the same symbol in both summands. Other changes in notation will be required so that a module is not restricted to interact only with other modules using the same notational conventions. Both of these needs are satisfied by a single semantic construct.
Definition 14.3
Given a model <U₁,ψ₁> over the alphabet Σ₁, a model <U₂,ψ₂> over the alphabet Σ₂, and a relation δ ⊆ Σ₁# × Σ₂#, <U₁,ψ₁> δ <U₂,ψ₂> if

for all E₁,F₁ ∈ Σ₁# and E₂,F₂ ∈ Σ₂#: E₁ δ E₂ & F₁ δ F₂ & ψ₁(E₁)=ψ₁(F₁) implies ψ₂(E₂)=ψ₂(F₂)

For a set W of models over Σ₁, δ[W] is the set of all models over Σ₂ that are in the δ relation to some member of W.
For a subset N ⊆ Σ₁#, δ[N] is the set of all terms in Σ₂# that are not in the δ relation to any term not in N.
Given a computational world <Σ₁,W,N>, an alphabet Σ₂, and a relation δ ⊆ Σ₁# × Σ₂#, the syntactic transform σ[<Σ₁,W,N>, δ, Σ₂] is <Σ₂, δ[W], δ[N]>.
□
The syntactic transform defined above may accomplish hiding of symbols, by letting Σ₂ ⊆ Σ₁, with δ the equality relation restricted to Σ₂#. A change of notation, for example letting f(x,y,z) represent g(x,h(y,z)), is accomplished by a δ relating each term E to the result E′ of replacing every subterm of the form g(F₁,h(F₂,F₃)) with the form f(F₁,F₂,F₃). The syntactic transform is similar to
the derive operator of [BG77].
The syntactic transform for symbol hiding may be implemented by renaming the symbols in Σ₁ − Σ₂ in such a way that they can never coincide with names produced elsewhere. Other syntactic transforms require different implementations depending on the characteristics of the δ relation. Suppose δ is a partial function, defined by a set of equations suitable for the equation interpreter. Add to the equations defining δ the equation

δ(v) = v where v is a variable symbol.

The resulting equational program may be used to transform equations for <Σ₁,W,N> into equations defining σ[<Σ₁,W,N>, δ, Σ₂]. If the transformation eliminates all instances of symbols not in Σ₂, and produces equations satisfying the restrictions of Section 5, then we have an implementation of the syntactically transformed world. We have not yet determined how easy or hard it is to produce definitions of useful δs allowing for this sort of transformation.
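Symbol hiding by renaming is simple to sketch. In the hypothetical Python fragment below, every symbol outside the exported interface is consistently replaced by a fresh name containing a character that ordinary symbols cannot, so that renamed symbols from different modules can never collide.

```python
import itertools

_counter = itertools.count()

def hide(equations, exported):
    """Rename non-exported symbols apart.  Equations are (lhs, rhs)
    pairs of nested tuples with the symbol name in position 0;
    variables are bare strings and are left alone."""
    table = {}

    def fresh(sym):
        if sym not in table:                      # rename consistently
            table[sym] = '%s#%d' % (sym, next(_counter))
        return table[sym]

    def rename(term):
        if isinstance(term, str):                 # a variable
            return term
        head = term[0] if term[0] in exported else fresh(term[0])
        return (head,) + tuple(rename(a) for a in term[1:])

    return [(rename(l), rename(r)) for l, r in equations]
```

A hidden symbol keeps a single fresh name throughout the module, so the equations still mean the same thing; only sums with other modules can no longer capture it.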
If δ is a one-to-one correspondence between some subset of Σ₁# and a subset of Σ₂#, and if δ and δ⁻¹ can be programmed nicely, then the syntactic transform may be implemented by applying δ⁻¹ to the input, applying an existing program for <Σ₁,W,N> to that result, and finally applying δ. If δ is defined by an equational program, and if the result of reversing each equation satisfies the restrictions of Section 5, then we may use those reversals as a program for δ⁻¹. Alternatively, given equational programs for δ and δ⁻¹, we may try to verify that they are inverses by applying the program for δ⁻¹ to the right-hand sides of equations in the program for δ, to see if we can derive the corresponding left-hand sides.
Further study is required to determine whether either of these techniques works
often enough to be useful. In any case, the last two techniques produce something more complex than a single equational program, raising the question of how to continue combining their results by more applications of sum and syntactic transform.
15. High-Level Programming Techniques
This section treats high-level programming concepts that fit naturally on the evaluation mechanism of the equation interpreter. The current version of the interpreter does not provide any special syntax to support these techniques, but future work may lead to convenient syntaxes for applying them.
15.1. Concurrency
Although the current implementation of the equation interpreter runs on conventional sequential computing machines, some qualities of equational logic show interesting potential for implementation on parallel machines of the future. Even with a sequential implementation, the ability to express programming concepts based on concurrent computations gives a conceptual advantage. There are three distinguishable sources of concurrent programming power in a reduction-based evaluation for equational logic, two of which are provided by the current implementation.
The simplest source of concurrent programming power arises from the possibility of evaluating independent subexpressions concurrently. Roughly speaking, this means that several, or all, of the arguments to a function may be evaluated simultaneously (in general, evaluation of a single argument may be partial instead of complete). This potential concurrency arises simply from the absence of side-effects in equational evaluation, and is the same as the potential concurrency in Functional Programming languages [Ba78]. A genuinely parallel implementation of the equational interpreter might realize a substantial advantage from such concurrency. There is also a definite conceptual advantage to the programmer in knowing that order of evaluation of arguments is irrelevant to the final result. In
most cases, however, no style of programming is supported that could not translate
quite simply into sequential evaluation in some particular order.
Concurrent evaluation of independent subexpressions becomes an essential feature when it is possible to reach a normal form without completing all of the evaluations of subexpressions, and when there is no way a priori to determine how much evaluation of which subexpressions is required. The most natural example of this behavior is in the parallel or, defined by the equations:
or(true, x) = true; or(x, true) = true;
or(false, false) = false.
Intuitively, it seems essential to evaluate the arguments of the or concurrently, since one of them might evaluate to true while the other evaluation fails to terminate. In the presence of other equations defining other computable functions, there is no way to predict which of the arguments will evaluate to a truth value. Translation of the intuitively concurrent evaluation of an expression containing ors into a sequential computation requires a rather elaborate mechanism for explicitly interleaving evaluation steps on the two arguments of an or. The current implementation of the equation interpreter does not support the essential concurrency involved in the parallel or, due to restriction 5 of Section 5 (left-sequentiality). Section 19 presents evidence that the parallel or cannot be satisfactorily simulated by the sequentializable definitions allowed by the current implementation. Future versions will support the parallel or, and similar definitions. The basis of the implementation is a multitasking simulation of concurrency on a sequential machine. We do not know yet how small the overhead of the multitasking can be, and that stage of the project awaits a careful study of the concrete
134 15. High-Level Techniques
details involved. The general problems of supporting this sort of concurrency on
conventional sequential machines are discussed in Section 18.3.
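The interleaving mechanism alluded to above can be sketched in a lazy language such as Haskell (which, like the current interpreter, is itself left-sequential, so the concurrency must be simulated explicitly). In the sketch below, each argument's evaluation is modelled as a stream of steps, and por alternates one step on each side; all names are illustrative, not part of the interpreter:

```haskell
-- A computation is modelled as a lazy stream of steps:
-- Nothing means "still working", Just b means "finished with result b".
type Comp = [Maybe Bool]

runs :: Int -> Bool -> Comp         -- terminates after n steps
runs n b = replicate n Nothing ++ [Just b]

diverges :: Comp                    -- never produces an answer
diverges = repeat Nothing

-- Parallel or by explicit interleaving: take one step on each side in
-- alternation, answering true as soon as either side does.
por :: Comp -> Comp -> Maybe Bool
por (Just True : _)  _ = Just True
por (Just False : _) q = await q    -- left is false: the answer is right's
por (Nothing : p)    q = por q p    -- swap sides after each step
por []               q = await q

await :: Comp -> Maybe Bool         -- run one side to completion
await (Just b : _)  = Just b
await (Nothing : r) = await r
await []            = Nothing
```

For example, por diverges (runs 2 True) yields Just True, although evaluating the left argument to completion first would never terminate.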
The final source of concurrent behavior is easily overlooked, but is arguably the most useful of them all. This sort of concurrency results from the outermost, or "lazy", evaluation strategy, which allows nested subexpressions to be reduced concurrently. Roughly, this means that evaluation of a function value may go on concurrently with evaluation of the arguments. Every piece of partial information about the arguments may immediately result in partial, or even complete, evaluation of a function value. As with the parallel or, this sort of concurrency translates into sequential behavior only at the cost of a careful interleaving of steps from the intuitively concurrent components of a computation. Unlike the unconstrained interleaving of the parallel or, the interleaving of nested evaluation steps is highly constrained by the need to generate enough information about arguments to allow an evaluation step on a function of those arguments. Nested concurrency is supported by the current version of the interpreter, as it depends only on the outermost evaluation strategy. It allows a style of programming based on pipelined coroutines, or more generally on dataflow graphs, as shown in Section 15.3.
15.2. Nondeterminism vs. Indeterminacy
Dijkstra argues eloquently [Di76] that programmers should not be required to specify details that are inessential to solving a given problem. Not only is inessential detail wasteful of programming time, it may have an adverse effect on the clarity of the program, by requiring a reader to separate the essential issues from the inessential. On this consideration, Dijkstra proposes a nondeterministic programming language, based on guarded commands. Not only are the computation steps nondeterministic, the final answer of a guarded-command program may not be uniquely determined by the inputs. This indeterminacy in the output is sometimes desirable in addition to nondeterminism in the computation steps, since many problems admit several acceptable answers.
The equation interpreter supports some of the advantages of nondeterministic specification of evaluation steps. Although the current implementation chooses, rather arbitrarily, to evaluate from left to right, the semantics of the language allow for any order of evaluation that finds a normal form. The presence or absence of real nondeterminism in the implementation is never visible to a user in the results of evaluation, since the restrictions of Section 5 guarantee uniqueness of the normal form, independently of order of evaluation. This guarantee of uniqueness is helpful in solving problems with uniquely determined answers, but it prevents taking full advantage of the simplifications allowed by nondeterministic and indeterminate programs for those problems with several acceptable answers. In particular, it prevents a satisfying implementation of guarded commands [Di76] by equations. The ideal facility would allow equational definitions with multiple normal forms, but recognize special cases where uniqueness is guaranteed.
There are no plans to introduce indeterminacy into the equation interpreter, not because of anything fundamentally repugnant about indeterminacy, but because the only simple way that we know to relax the uniqueness of normal forms causes other problems in the semantics of equational programming. The uniqueness of normal forms in equations satisfying the restrictions of Section 5 is a consequence of the Church-Rosser, or confluence, property. This property says that, whenever an expression A may reduce to two different expressions B and C, then there is another expression D to which both B and C reduce. The confluence property is required, not only to guarantee uniqueness of normal forms, but also to guarantee that some normal form will be found whenever one exists. The problem arises when some expression A reduces to normal form, and also has an infinite reduction sequence. The confluence property guarantees that, no matter how far we pursue the infinite path, it is still possible to reduce to the normal form. Without the confluence property, we may have such an A, and an expression B appearing on its infinite reduction sequence, but B may not be reducible to normal form. Yet, by the semantics of equational logic, B is equal to A, and therefore to any normal form of A. Figure 15.2.1 shows this situation graphically.
[Figure: an expression A reduces both to a normal form and along an infinite sequence through B; without confluence, no forward reduction from B reaches the normal form.]

Figure 15.2.1
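The situation of Figure 15.2.1 can be made concrete with a toy reduction system, sketched here in Haskell (the three-term system is invented for illustration): a reduces both to the normal form c and to b, while b reduces only to itself, so b is equal to c, yet no forward reduction from b finds a normal form.

```haskell
import Data.List (nub)

data T = A | B | C deriving (Eq, Show)

-- One-step reducts.  The rules a -> c, a -> b, b -> b violate confluence:
-- b and c have no common reduct.
step :: T -> [T]
step A = [C, B]
step B = [B]
step C = []          -- c is a normal form

-- All terms reachable by forward reduction (finitely many here).
reachable :: T -> [T]
reachable t = go [t] []
  where
    go [] seen = seen
    go (x : xs) seen
      | x `elem` seen = go xs seen
      | otherwise     = go (xs ++ step x) (x : seen)

-- Normal forms reachable by forward reduction only.
normalForms :: T -> [T]
normalForms t = nub [s | s <- reachable t, null (step s)]
```

Here normalForms A is [C], but normalForms B is empty: B has a normal form by equational reasoning, yet only backward reduction could find it.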
So, in a system of equations without the confluence property, there may be an expression B with a normal form, but the only way to find that normal form may require backward reductions. All of the helpful theory used to find an efficient forward reduction to normal form fails to apply to backward reductions. It seems quite unlikely that a weaker condition than confluence can be designed to allow indeterminacy, but guarantee reductions to normal forms, so an indeterminate equation interpreter probably requires a substantial new concept in evaluation techniques. Even if such conditions are found, equational programming is probably the wrong setting for indeterminacy, since it would be impossible to have, for example, an expression α equal to each of the normal forms a and b, and an expression β equal to each of the normal forms b and c, without also having α = c and β = a. The computation techniques of reduction sequences could be applied to asymmetric congruence-like relations in order to produce indeterminacy, but such relations are not well established in logic, as is the symmetric congruence relation of equality.
15.3. Dataflow
The nested concurrency provided by outermost evaluation allows a very simple translation of dataflow programs into equational programs. A dataflow program [KM66] consists of a directed graph with an optional entry and an exit. The entry, if any, represents an input sequence of values, the exit an output sequence. Each edge in the graph represents a sequence of values communicated between two processes, represented by the nodes. In the semantics of a dataflow graph, the processes at the nodes are restricted to view their incoming sequences in order, and to produce each outgoing sequence in order, although several proposals for implementing them use numerical tags to simulate the order, and allow the actual time sequence to be different. A common kind of node, called a copy node, contains a process that merely copies its single incoming edge to each of its outgoing edges. It is easier for our purposes to group a copy node and its edges into one multiedge with a single head and several tails.
Every dataflow graph in which the node processes are all determinate as functions of their incoming sequences (i.e., they cannot make arbitrary or timing-dependent steps) may be translated easily into an equational program running on the equation interpreter. The basic idea is to use a convenient list notation, such as LISP's, and replace each edge in the dataflow graph by a constant symbol, whose value will turn out to be the (possibly infinite) sequence of values transmitted through that edge. Each node becomes a set of function symbols, one for each outgoing edge, with arity equal to the number of incoming edges. Equations are written to define each node function. Finally, the connection structure of the dataflow graph is realized by a single defining equation for each of the edge constants. That is, if f is the function symbol representing a node with incoming edges a and b, and outgoing edge c, include the equation c[] = f[a[]; b[]]. In some cases structural equations may be condensed with some of the equations defining node processes. In particular, tree-shaped subgraphs of the dataflow graph may be condensed into single expressions, using edge names only to break loops. A single example should suffice to illustrate the technique.
Example 15.3.1

Consider the dataflow graph of Figure 15.3.1. The labels on edges and nodes indicate the corresponding constant and function symbols.

[Figure 15.3.1: node f receives the edges input and b and produces edge a; node g receives a and produces b; node h receives a and b and produces output.]

Assuming that equations are given to define f, g, h, the following equations give the structure of the graph:

a[] = f[input[]; b[]];
b[] = g[a[]];
output[] = h[a[]; b[]].

These equations may be condensed by eliminating either a or b (but not both), yielding

b[] = g[f[input[]; b[]]];
output[] = h[f[input[]; b[]]; b[]].

or

a[] = f[input[]; g[a[]]];
output[] = h[a[]; g[a[]]].

□
In order to make use of the equational translation of a dataflow program, we also require a definition of input, and some mechanism for selecting the desired part of output, unless it is known to be finite (and even short). The idea of translating dataflow graphs into equations comes from Lucid [WA85].
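In a lazy functional language the same translation applies directly: each edge becomes a lazy list, each node a function on lists, and the structural equations become mutually recursive definitions. The sketch below, in Haskell, uses arbitrarily chosen node functions f, g, and h (they are not taken from the text) on a loop-shaped graph like that of Example 15.3.1:

```haskell
-- Each edge carries a lazy (possibly infinite) list of values; each node is
-- a function from incoming streams to an outgoing stream.
f :: [Int] -> [Int] -> [Int]
f ins bs = zipWith (+) ins (0 : bs)   -- a one-value delay on b breaks the loop

g :: [Int] -> [Int]
g = map (* 2)

h :: [Int] -> [Int] -> [Int]
h = zipWith (+)

input :: [Int]
input = [1 ..]

-- The structural equations of the graph, one definition per edge:
a, b, output :: [Int]
a = f input b
b = g a
output = h a b
```

Outermost (lazy) evaluation computes exactly as much of each edge's stream as the consumer of output demands; without the one-value delay inside f, the cycle through a and b would deadlock, just as in a dataflow graph.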
For a more substantial example of dataflow programming, consider the prime number sieve of Eratosthenes. This elegant version of the prime number sieve is based on a dataflow program by McIlroy [Mc68, KM77]. The basic idea is to take the infinite list (2 3 4 ...), and remove all multiples of primes, in order to produce the infinite list of primes (the same one used to produce the multiples of primes that must be removed to produce itself). Figure 15.3.2 shows the dataflow graph corresponding to this idea, with each edge and node labelled by its corresponding symbol in the equational program.
[Figure: the prime-sieve dataflow graph, with a loop through the sieve and cons nodes, and edges labelled primes and intlist[3].]

Figure 15.3.2
The following equational program captures the concept of the dataflow graph above, varying the notation a bit for clarity. Of course, in the equational program, it is essential to evaluate, not the infinite lists themselves, but an expression producing some finite sublist.
Example 15.3.2
: The following definitions are intended to allow production of lists of
: prime numbers. The list of the first i prime numbers is firstn[i; primes[]].
Symbols
: List construction and manipulation operators
cons: 2; nil: 0; first: 1; tail: 1; firstn: 2;
: Logical and arithmetic operators
if: 3;
add, subtract, multiply, modulo, equ, less: 2;
: Operators associated with the prime sieve
intlist: 1;
sieve: 2;
fact: 2;
primes: 0;
: Primitive domains
include integer_numerals, truth_values.
For all i, j, q, r:
: first[q] is the first element in the list q.
: tail[q] is the list of all but the first element in the list q.
first[(i . q)] = i;
tail[(i . q)] = q;
: firstn[i; q] is the list of the first i elements in the list q.
firstn[i; q] = if[equ[i; 0]; (); (first[q] . firstn[subtract[i; 1]; tail[q]])];
: if is the standard conditional function.
if[true; i; j] = i;
if[false; i; j] = j;
include addint, multint, subint, modint, equint, lessint;
: intlist[i] is the infinite list (i i+1 i+2 ...).
intlist[i] = (i . intlist[add[i; 1]]);
: The definitions of sieve and fact assume that r is given in increasing order,
: that r contains all of the prime numbers, and that r contains nothing less than 2.
: sieve[q; r] is the infinite list of those elements of the infinite list q that are
: not multiples of anything on the infinite list r.
sieve[(i . q); r] = if[fact[i; r]; sieve[q; r]; (i . sieve[q; r])];
: fact[i; r] is true iff the infinite list r contains a nontrivial factor of i.
fact[i; (j . r)] = if[less[i; multiply[j; j]]; false;
    if[equ[modulo[i; j]; 0];
        true;
        fact[i; r]]];
: primes[] is the infinite list of prime numbers, (2 3 5 7 11 13 17 ...).
primes[] = (2 . sieve[intlist[3]; primes[]]).
The correct behavior of this dataflow program for primes depends on the not-so-obvious fact that there is always a prime between n and n². If not, the loop through the sieve and cons nodes would deadlock.
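The same self-referential sieve transcribes almost literally into Haskell's lazy lists. This is a sketch following the names of the equational program, not the interpreter's own notation:

```haskell
-- The infinite list of primes, defined in terms of itself: the primes
-- already produced are used to filter out composites from (3 4 5 ...).
primes :: [Int]
primes = 2 : sieve [3 ..] primes

-- sieve q r keeps those elements of q with no nontrivial factor on r.
sieve :: [Int] -> [Int] -> [Int]
sieve (i : q) r
  | fact i r  = sieve q r
  | otherwise = i : sieve q r
sieve [] _ = []

-- fact i r is True iff the increasing list r contains a factor of i;
-- the search stops once j * j exceeds i, so only finitely many primes
-- are demanded, which is what keeps the loop from deadlocking.
fact :: Int -> [Int] -> Bool
fact i (j : r)
  | i < j * j      = False
  | i `mod` j == 0 = True
  | otherwise      = fact i r
fact _ [] = False
```

As in the equational program, one evaluates only finite prefixes, e.g. take 10 primes.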
While the outermost evaluation strategy guarantees that the internal computations of an equational program will satisfy the semantics of an associated dataflow graph, there is still an interesting issue related to input to and output from the interpreter. At the abstract level, input is any expression, and output is a corresponding normal form. It is semantically legitimate to think of the input expression being provided instantaneously at the beginning of the computation, and the output expression being produced instantaneously at the end. An implementation following that idea strictly will not allow a dataflow behavior at the input and output interfaces of the interpreter. The all-at-once style of input and output forms a barrier to the composition of several equational programs, since it forces the possibly very large expressions transmitted between them to be produced totally before the receiving program may start. In the worst case, an infinite term may be produced, even though only a finite part of it is required by the next step.
The current version of the equation interpreter incorporates a partial dataflow interface on output, but none on input. The output pretty-printers do not support incremental output, so a user may only benefit by using the internal form of output, or writing his own incremental pretty-printer. Interestingly, incremental output from the interpreter was easier to code than the all-at-once form. An output process drives the whole evaluation, by traversing the input expression. When it reaches a symbol that is stable -- that can clearly never change as a result of further reduction -- it outputs that symbol, and breaks the evaluation problem into several independent problems associated with the arguments to the stable symbol. Section 17.2 gives a careful definition of stability, and Section 18.2 describes how it is detected. When the output process finds an unstable symbol, it initiates evaluation, which proceeds only until that particular symbol is stable. The control structure described above is precisely the right one for outermost evaluation, even if incremental output is not desired.
In the computation described above, the interpreter program must contain an a priori choice of traversal order for expressions -- leftmost being the natural order given current typographical conventions. Leftmost production of an output expression would support natural incremental output of lists, using the conventional LISP notation. Since the essential data structure of the equation interpreter is trees, not lists, and there is no reason to expect all applications to respect a leftmost discipline, a future version of the interpreter should support a more flexible output interface. Probably the right way to achieve such an interface is to think of an interactive question-answer dialogue between the consumer of output and the interpreter. The precise design of the question-answer language requires further study, but it is likely to be something like the following. The output consumer possesses at least one, and probably several, cursors that point to particular subexpressions of the normal form expression coming from the interpreter. The output consumer may issue commands of the form "move cursor x to a", where a may be the father, or any specified son, of x's current position. The interpreter does precisely enough evaluation to discover the new symbol under the cursor after each motion, and reports that symbol on its output. Thus, the order of traversal may be determined dynamically as a result of the symbols seen so far.
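A minimal model of such a cursor dialogue can be sketched in Haskell, where lazy trees make "precisely enough evaluation" automatic: answering a what-symbol-is-here question forces only the constructors along the cursor's path. The tree type and the example expression below are hypothetical:

```haskell
-- An expression tree: a symbol with a (possibly infinite) list of sons.
data Tree = Node String [Tree]

-- An infinite expression: ints n stands for the list (n n+1 n+2 ...),
-- built from the binary symbol cons.
ints :: Int -> Tree
ints n = Node "cons" [Node (show n) [], ints (n + 1)]

-- A cursor position is a path of son indices from the root.  Laziness
-- ensures that only the nodes along the path are ever constructed.
symbolAt :: [Int] -> Tree -> String
symbolAt []       (Node s _)    = s
symbolAt (i : is) (Node _ sons) = symbolAt is (sons !! i)
```

For instance, symbolAt [1, 1, 0] (ints 0) answers "2" after forcing only three cons nodes of the infinite expression; a consumer could issue such questions in any order, realizing the dynamically chosen traversal described above.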
The question-answer output interface described above may also be used for the input interface, with the equation interpreter producing the questions, and some input process providing the answers. Such an interface will certainly allow connection of several equational programs in a generalized sort of pipeline. Another sort of input interface is likely to be more valuable, and also more difficult to construct. In order for equational programs to be used as interactive back ends behind structure editors, a purpose for which the equational programming style appears to be nicely suited, as argued in Section 13, an incremental input interface must be provided in which the input process (essentially the user himself) determines the order in which the input expression is produced. Worse, existing portions of the input term may change. Changing input is substantially more difficult to accommodate than incrementally produced input. Besides requiring an efficient mechanism for incremental reevaluation, avoiding the reevaluation of unchanged portions of an expression, some adjustment of the output interface is required to notify the output consumer of changes as they occur. Apparently, the output consumer must be able to specify a region of interest, and be notified of whatever happens in that region of the expression. Further study is required to find a convenient representation of a region of interest, taking into account that the topology of the expression is subject to change, as well as the symbol appearing at a particular node.
The incremental evaluation problem for equational programs appears to be substantially more difficult than that for attribute grammars [RTD83]. The elegance of Teitelbaum's and Reps' optimal reevaluation strategies for attribute grammars gives that formalism a strong edge in the competition for beyond-context-free processing. Work is in progress to close that gap. The same factors that make incremental evaluation more difficult for equational programs will also make a good solution more valuable than that for attribute grammars. The optimal reevaluation strategy for attribute grammars treats attributes, and the functions that compute one attribute from another, as primitives. The reevaluation strategy only determines which attributes should be recomputed; it does not treat the problem of incremental reevaluation of an individual attribute with significant structure. Yet, one of the most critical performance issues for attribute reevaluation is avoidance of multiple copies and excess reevaluation of components of highly structured attributes, especially symbol tables. A lot of work on special cases of reevaluation of large, structured attributes is underway, but we are not aware of any thoroughly general approach. A good incremental evaluation strategy for equational programs will inherently solve the total problem, since the expression model of data underlying the equation interpreter makes all structure in data explicit, rather than allowing large structures to be treated as primitives.
All of the preceding discussion of input and output ignores the possibility of sharing of identical subexpressions in an expression. The equation interpreter could not achieve acceptable performance in many cases without such sharing. Perhaps such sharing should be accommodated in the treatment of input and output as well, but careful thought is required to do so without distasteful complexity.
15.4. Dynamic Programming
Dynamic programming may be viewed as a general technique for transforming an inefficient recursive program into a more efficient one that stores some portion of the graph of the recursively defined function in an array, in order to avoid recomputation of function values that are used repeatedly. In a typical application of dynamic programming, the user must completely specify the way in which the graph of the function is to be arranged in an array, and the order in which the graph is to be computed. The latter task may be handled automatically by the equation interpreter. To illustrate this automation of part of the dynamic programming task, we give equations for the optimal matrix multiplication problem of [AHU74]. Instead of defining only a small finite part of the graph of the cost function, we define the infinite graph, and the outermost evaluation strategy of the equation interpreter guarantees that only the relevant part of the infinite graph is actually computed.
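The same automation is available in Haskell, where a lazily filled array plays the role of the function's graph: the programmer writes only the recursive cost equations, and lazy evaluation determines the order in which table entries are computed, computing each at most once. (Here the index range is finite and known in advance; the equational program below instead defines an infinite graph. The names are illustrative.)

```haskell
import Data.Array

-- Least cost of multiplying out the matrices whose dimension sequence is ds,
-- where matrix i is ds!!(i-1) by ds!!i, and an i x j by j x k product
-- costs i*j*k.
matChain :: [Int] -> Int
matChain ds = table ! (0, m)
  where
    m = length ds - 1                     -- number of matrices
    d = listArray (0, m) ds
    -- The table is defined directly by the recursive equations; laziness
    -- fills in each entry on first demand and remembers it thereafter.
    table = array ((0, 0), (m, m))
              [ ((i, j), cost i j) | i <- [0 .. m], j <- [0 .. m] ]
    cost i j
      | j <= i + 1 = 0                    -- at most one matrix: no cost
      | otherwise  = minimum
          [ table ! (i, k) + table ! (k, j) + d ! i * d ! k * d ! j
          | k <- [i + 1 .. j - 1] ]
```

For the dimension sequence [10, 20, 50, 1, 100], matChain returns 2200, the cost of the optimal association ((M1(M2 M3))M4).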
Example 15.4.1
The following equations solve the optimal matrix multiplication problem from The Design and Analysis of Computer Algorithms, by Aho, Hopcroft and Ullman, section 2.8. Given a sequence of matrix dimensions, (d0 d1 ... dm), the problem is to find the least cost for multiplying out a sequence of matrices M1 × M2 × ... × Mm, where Mi is a d(i-1) × di matrix, assuming that multiplying an i×j matrix by a j×k matrix to get an i×k matrix costs i×j×k. There is an obvious, but exponentially inefficient, recursive solution, expressed in a more liberal notation than that allowed by the equation interpreter:

cost[(d0 ... dm)] = min{ cost[(d0 ... di)] + cost[(di ... dm)] + d0×di×dm | 0 < i < m }
cost[(d0 d1)] = 0
The problem with the recursive solution above is that certain values of the function cost are calculated repeatedly in the recursion. Dynamic programming yields a polynomial algorithm, by storing the values of cost in an array, and computing each required value only once. In the following equations, the function cost is represented by an infinite-dimensional infinite list giving the graph of the function:
costgraph[()] =
(0 (cost[(1)] (cost[(1 1)] (cost[(1 1 1)] ... ) (cost[(1 1 2)] ... ) ... )
            (cost[(1 2)] (cost[(1 2 1)] ... ) (cost[(1 2 2)] ... ) ... )
            ... )
   (cost[(2)] (cost[(2 1)] (cost[(2 1 1)] ... ) ... )
            ... )
   ... )

That is, cost[(d0 ... dm)] is the first element of the list which is element dm+1 of element d(m-1)+1 of ... element d0+1 of costgraph[()]. cost[(i)] is always 0, but explicit inclusion of these 0s simplifies the structure of costgraph. costgraph[a], for a ≠ (), is the fragment of costgraph[()] whose indexes are all prefixed by a.

Symbols
: operators directly related to the computation of cost
cost: 1; costgraph: 1; costrow: 2; reccost: 1; subcosts: 2;
: list-manipulation, logical, and arithmetic operators
cons: 2; nil: 0; min: 1; index: 2; length: 1; element: 2;
equ: 2;
less: 2;
subtract: 2;
multiply: 2;
include integer_numerals, truth_values.
For all a, b, i, j, k, x, y:
cost[a] = index[a; costgraph[()]];
: costgraph[a] is the infinite graph of the cost function for arguments starting
: with the prefix a.
costgraph[a] = (reccost[a] . costrow[a; 1]);
: costrow[a; i] is the infinite list
: (costgraph[ai] costgraph[a i+1] ... )
: where ai is a with i added on at the end.
costrow[a; i] =
    (costgraph[addend[a; i]] . costrow[a; add[i; 1]]);
: reccost[a] has the same value as cost[a], but is defined by the recursive equations
: from the header.
reccost[(i j)] = 0; reccost[(i)] = 0; reccost[()] = 0;
reccost[(i j . a)] = min[subcosts[(i j . a); length[a]]]
    where a is (k . b) end where;
: subcosts[a; i] is a finite list of the recursively computed costs of (d0 ... dm),
: fixing the last index removed at i, i-1, ... 1.
subcosts[a; i] =
    if[equ[i; 0]; ();
        (add[add[cost[firstn[add[i; 1]; a]]; cost[aftern[add[i; 1]; a]]];
             multiply[multiply[first[a]; element[add[i; 1]; a]]; last[a]]]
         . subcosts[a; subtract[i; 1]])];
: Definitions of list-manipulation operators, logical and arithmetical operators.
min[(i)] = i;
min[(i . a)] = if[less[i; min[a]]; i; min[a]]
    where a is (k . b) end where;
index[(); (x . b)] = x;
index[(i . a); x] = index[a; element[add[i; 1]; x]];
length[()] = 0;
length[(x . a)] = add[length[a]; 1];
element[i; (x . a)] = if[equ[i; 1]; x; element[subtract[i; 1]; a]];
firstn[i; a] = if[equ[i; 0]; ();