perm filename CGOL.ITS[LIB,LSP] blob sn#316147 filedate 1978-02-11 generic text, type T, neo UTF8
	     CGOL - an Algebraic Notation For MACLISP users

			    V. R. Pratt



	MACLISP programmers who feel comfortable with ALGOL-like
notation, that is, an algebraic style in which one might write a
matrix multiply routine as
	for i in 1 to n do
	  for k in 1 to n do
	    (ac := 0;
	     for j in 1 to n do
		ac := ac + a(i,j)*b(j,k);
	     c(i,k) := ac)
can now write LISP programs in just such a notation.  This notation is
essentially transparent to the MACLISP system, and files containing
CGOL code (possibly mixed in with standard code) can be read by the
interpreter and compiled by the compiler just as though they were
written in straight LISP notation. 


    1.1  Representational Issues

	LISP S-expressions ("abstract" objects in a domain that
contains atoms and is closed under the pairing function CONS) require
an internal and an external representation ("concrete" objects).  The
former is for the convenience of the programmer, the latter for that
of the machine.  The functions (READ) and (PRINT) provide maps between
the two concrete representations. 

	We use the term "LISP" principally to denote the abstract
objects; out of respect for usage, however, we shall also use it to
denote "the" standard external representation, which varies mainly in
detail from one implementation to the next.  We rely on context to
disambiguate these usages.  We use "INT" to denote whatever internal
representation obtains in a particular implementation; this may vary
considerably between installations. 

    1.2  CGOL as an alternative external representation. 

	This document describes an alternative external
representation, CGOL, for those LISP S-expressions that encode
procedural information.  The representation is modeled on McCarthy's
M-expression notation; as such it has Smith and Enea's MLISP language
as a predecessor.  See section 8 for a discussion of similarities and
differences between MLISP and CGOL. 

	In an environment that supports both CGOL and LISP external
representation, we envisage facilities for mapping between
representations as diagrammed below:

		    READ	     CGOLREAD
	 ------   ========>  -----  <=========   ------
	| LISP |	    | INT |	        | CGOL |
	 ------  <========   -----   =========>  ------
		   PRIN1	     CGOLPRIN1

	This diagram of course generalizes to any set of external
representations.  The beauty of it is that the entire system described
in this manual is no more than the implementation of CGOLREAD and
CGOLPRINT.  Thus the MACLISP user wishing to experiment with CGOL can
immmediately begin by typing (CGOLREAD) to LISP and then typing in an
expression to see what it returns.  The results of such an exercise
are given at the start of section 2. 

    1.3 Usage

	If LISP's top-level, a cut-down version of which might be
is replaced by
the top level should now expect CGOL notation, and reply in like
notation.  If you prefer standard notation for output (the default if
you type (CGOL) as described below) then clearly you need

	In MACLISP, the function (CGOL) will set the variable READ to
CGOLREAD, achieving the above effect on MACLISP's top, break and edit
levels.  Alternatively, for users who want to use only CGOL notation,
LISP can be invoked from top level via ":L CGOL;" and the LISP so
loaded will be already set up to use CGOL notation.  Both (CGOLREAD)
and (CGOL) (but not CGOLPRINT) are autoloading.  Thus if at any time
while talking LISP you suddenly want or need to use CGOL notation, you
need merely type (CGOL) at LISP and LISP will then expect CGOL
expressions.  To revert to standard notation, type EXIT$ .  (Throughout
this document we shall use $ to denote both altmode and dollar which
may be used interchangeably in CGOL, except that when typing directly
to LISP as opposed to creating a file of CGOL programs altmode should
be used to ensure immediate response of the system, which treats only
altmode as a "force-feed" character.)  This implies that you may have
a mixture of CGOL and LISP programs in a single file, provided that
each block of CGOL code is preceded by (CGOL)$ (the $ is redundant but
useful if for any reason (CGOL) is taken to be in CGOL notation - it
is in fact legal CGOL notation provided the $ follows it) and is
followed by =EXIT$ (the = forces the exit to be executed even if read
but not eval'd by, say, the compiler). 

    1.4  Design Considerations. 

	The two principles that serve respectively as lower and upper
bounds on the choice of CGOL notation are:

(i)	The notation should fairly match what the non-LISP
community regards as a reasonable notation.  In particular, users of
ALGOL, FORTRAN, PL/I etc, should not experience difficulty in guessing
the meanings of those constructs common to those languages, e.g.
assignment, application, arithmetic and relational operations, and the
more familiar control constructs.  This requirement need not apply to
constructs peculiar to LISP, such as LIST, CONS, LAMBDA, EVAL, QUOTE
and so on. 

(ii)	The notation should restrict itself to being a notation for
LISP abstract objects, and not try to be a full-blown programming
language with its own useful set of constructs.  (This represents a
point of departure from the MLISP philosophy.)  Useful constructs
should be added to LISP via the same channels (modulo choice of
notation) that are used regularly by all LISP users to augment LISP,
e.g. (DEFUN ...) or its CGOL equivalent. 

	There is a tension between these two requirements that is at
present not appreciated by the bulk of the computer science community. 
That tension is brought about by the two very different techniques for
specifying the syntax of ALGOL-like languages and the semantics of
LISP programs.  The former is phrase oriented, the latter is
token oriented.  ("Phrase" and "token" may be replaced respectively by
"non-terminal" and "terminal", to use the formal language
terminology.)   A typical syntax specification uses BNF, i.e. a
context-free grammar, whereas LISP programs are specified in terms of
the meaning of atoms.  The rich non-terminal structure of, say, the
syntax of ALGOL is replaced by a trivial non-terminal structure for
LISP, namely all non-atomic programs are of the form (FUNCTION

	Not all ALGOL-like languages are specified by their phrase
structure.  From time to time attempts are made to use the powerful
macro facilities of assembly languages to implement "higher level"
languages.  Since macros are identified by single tokens, they
constitute a token oriented specification.  A limitation of the
approach is that the macro identifier must usually appear before its
arguments.  Also a macro interpreter that allows nested calls must be

	The CGOL notation uses a token oriented approach to fit
comfortably with constraint (ii).  Unlike macro based approaches, a
given token may have either zero or one argument preceding it, in
addition to any number following it (suitably delimited).  This gives
rise to the familiar association problem, which we deal with using
Floyd's notion of precedence functions.  Each argument position of a
token has an associated "binding power";  association is resolved in
favor of the higher binding power.  The binding power idea is due to
Floyd, who called the binding powers "operator functions"; the term we
use appears to have originated with CPL.  The parsing algorithm we use
differs from Floyd's in that ours is left-corner whereas Floyd's is
pure bottom-up.  The original version of the CGOL parser was
implemented at Stanford in July 1970; its use of binding powers was
copied soon after by Smith for the MLISP system.  We discuss the
details of the association problem in section 3. 


	The following examples of CGOL expressions, together with the
effect of doing (PRINT (CGOLREAD)) on them (i.e. their LISP
translations), are given below.  To aid the eye we shall use upper
case for LISP and lower case for CGOL.  Note that CGOLREAD demands an
ALTMODE or DOLLAR (not shown) after each CGOL expression.  ALTMODE is
better for TTY input since it causes immediate response (force-feed). 
In files, where force-feeding is irrelevant, DOLLAR is more convenient
to insert using TECO. 

(PLUS 1 1)

[1, '2+2', sin(.37*x+1)]
(LIST 1 '(PLUS 2 2) (SIN (PLUS (TIMES .37 X) 1)))

\x,y; 1/sqrt(x**2 + y**2)

toplevel :=  'print *; eval read'

car m  &  car m := cdr m

'father' of x := 'brother' of relative of y

father ofq x := brother ofq relative of y  % analog of SETQ %

a(i,j) := 3
(STORE (A I J) 3)

if numberp i and -j<i<j then |i| else print i

a.(b@c) = (a.b)@c

for i in a@b do if 7<i<13 then return "In range"
				   (RETURN '|In range|)))))
      (APPEND A B))

(((F X Y) U V W) I)

if j remainder 6 isin !'(1 5) then print j
			else badlist := j . badlist
	%note: - ! is a no-op unary function that expects its argument
	 in LISP notation.  Also note that comments may be inserted
	 between %%'s, as here;  altmodes and dollars must be
	 "slashified" in comments, which in CGOL means being preceded by ? . 
	 This avoids the screw of getting complete garbage for the
	 rest of the file on account of a missing percent, since
	 CGOL aborts the comment if it finds an unslashified altmode
	 or dollar in it. %

while (a;b) do c
	% a handy way to put the stopping condition b in the middle of
	  a loop body a;b %
(DO NIL ((NOT (PROG2 A B))) C)

define a "TO" b; if not a>b then a.((a+1) to b)


	A CGOL expression is a string of sub-expressions and tokens. 
For example the expression "if a=b then 0 else i+1" has six
constituents corresponding to the six items in the "construct"
	if a then b else c
In this example those constituents are:
 the token "if"
 the sub-expression "a=b"
 the token "then"
 the sub-expression "0"
 the token "else" and
 the sub-expression "i+1". 
In turn, the sub-expression "a=b" has three constituents:
 the sub-expression "a"
 the token "=" and
 the sub-expression "b". 
And the sub-expression "a" has one constituent, the token "a". 

	For those accustomed to BNF, the grammar of CGOL might look

  <expression> ::= if <expression> then <expression> [else <expression>]
  <expression> ::= <expression>=<expression>
  <expression> ::= (<expression>)
  <expression> ::= <expression>;<expression>

and so on.  Since <expression> will be the only non-terminal, the left
hand side of productions may be omitted without loss of information. 
Substituting variables for <expression> , we can then write CGOL rules

	if a then b [else c]
	a = b
	a ; b

and so on. 

	As they stand, such rules are ambiguous.  Does a=b;c mean
(a=b);c or a=(b;c) ?  The problem is that each of "=" and ";" could take
b as its argument.  We say that b associates with the one that takes
it as argument.  Thus if "print a+b" means "print(a+b)", "a" has
associated with "+" in preference to "print".  In CGOL, all such
disputes are resolved by binding powers, a sort of syntactic version
of atomic valences.  Thus if the right binding power (rbp) of "=" is
10 and the left binding power (lbp) of ";" is 1, then a=b;c is read as
(a=b);c because the right hand argument slot of "=" pulls harder on b
than does the left hand argument slot of ";" .  (Ties are broken by
associating to the left.)

	Left and right binding powers are completely independent. 
Each is relevant when the expression can have a sub-expression
at the left and right hand ends respectively.  Thus the left binding
power of "car" is irrelevant because "car a" does not have a
sub-expression at the left.  However, it has one at the right, so its
right binding power is relevant.  The opposite obtains for the suffix
operator "isatom", which has a left binding power but no right binding
power.  In an expression like "if a then b else c", the right binding
power applies to all three arguments even though only the last may
actually be fought over by another operator to its right. 

	In addition to the left-right association problem there is the
"dangling else" problem, named after an instance of the problem: does
"if a then if b then c else d" mean "if a then (if b then c else d)"
or "if a then (if b then c) else d" ?  This problem is just a variant
of the left-right association problem;  the argument "else d" could
associate with either the first or the second "if".  In CGOL, all such
disputes are resolved by associating to the closer operator.  (For
those who liked the atomic-valence analogy, imagine an inverse
(square?) law for distance holds.)

	The above two rules deal with most of the ambiguities that
might arise in the CGOL grammar given below. 

	The following table lists the explicitly defined CGOL
constructs together with their translation into standard notation. 
Except where otherwise noted, a,b,...,z denote CGOL expressions and
A,B,...,Z their corresponding standard forms.  The table has three
columns, the CGOL form, the left and right binding powers when
relevant (only given once when they are the same, or when only one is
relevant), and the translation. 

---------BRACKETING OPERATORS---------
(a)			0	A
f(a,b,...,z)		25 0	(F A B ... Z)
[a,b,...,z]		0	(LIST A B ... C)
a[b,c,...,z]		0	(MAPCAR (FUNCTION A) B C ... Z)
a{b}			0	(APPLY (FUNCTION A) B)
				       (CONS (FUNCTION A) B))

---------QUOTING OPERATORS-------
'a'			0	'A
"a"  (where a is a string)	'|a|
?a  (where a is a character)	/a
#a (where a is any CGOL token)	A
!A  (where A is a LISP S-expr)	A

\    a,b,..,p; q;r;..;z	0	(LAMBDA (A B ... P) Q R ... Z)
let a=x,b=y,..,c=z;p;q;..;s 0	((LAMBDA (A B ... C) P Q ... S) X Y ... Z)
let x,y,..,z={a}; b;c..;h 0	(APPLY '(LAMBDA (X Y .. Z) B C .. H) A)
(Think of {a} as "unpacking" a.  This is consistent with its use
above as f{a}.  "let x,y={a}, u=b, v,w={c};..." will do what you would
expect, translating into (APPLY '(LAMBDA (X Y U V W) ...) (APPEND A
(LIST B) C)) .)
prog a,b,..,p; q;r;..;z 0	(PROG   (A B ... P) Q R ... Z)
new  a,b,..,p; q;r;..;z	0	(PROG   (A B ... P) Q R ... (RETURN Z))
special a,b,...,z		(DECLARE (SPECIAL A B ... Z))

---------CONTROL OPERATORS--------
eval a			1	(EVAL A)
a;b;...;z		1 0	(PROGN A B ... Z)
a&b			1 0	(PROG2 NIL A B)
if a then b [else c]	2	(COND (A B) [(C)])
return a		1	(RETURN A)
while a do b		2	(DO NIL ((NOT A)) B)
for i in l, j in m do f	2	(MAPC (FUNCTION (LAMBDA (I J) F)) L M)
for i in a to b do f	2	(DO ((I A (ADD1 I))) ((GREATERP I B)) F)
iter [for i := a step b] 2	(DO ((I A B) (J C D)) (E G) F)
     [for j := c step d]
     [until e]
     [do f]
     [return g]
(The [...] in the above all indicate optional items.  Order is immaterial.)

--------STORAGE OPERATORS-------
a of b := c		25 1	(PUTPROP B C A)
a ofq b := c		25 1	(PUTPROP B C 'A)
a:=b    (a is atomic)	25 1	(SETQ A B)
x(a,b,...,z):=y		25 1	(STORE (X A B ... Z) Y)
a{b} := c		25 1	(STORE (APPLY (FUNCTION A) B) C)
  (useful if a is an array whose dimensions are not fixed at compile time)
car a := b		25 1	(RPLACA A B) (similarly for cdr)
plist a := b		25 1	(SETPLIST A B)
arg a := b		25 1	(SETARG A B)
ttyread := a		25 1	(SSTATUS TTYREAD A) (etc)
a of b			25 24	(GET B A)

---------LIST OPERATORS---------
a.b			14 13	(CONS A B)
a@b			14 13	(APPEND A B)

a=b			10	(EQUAL A B)
a ne b			10	(NOT (EQUAL A B))
a eq b			10	(EQ A B)
a<b<...<z		10	(LESSP A B ... Z)
a>b>...>z		10	(GREATERP A B ... Z)
a isin b		10	(MEMBER A B)

(fixed point:)
a=#b			10	(= A B)
a<#b<#...<#z		10	(< A B ... Z)
a>#b>#...>#z		10	(> A B ... Z)

(floating point:)
a=$b			10	(= A B)		(just in case...)
a<$b<$...<$z		10	(<$ A B ... Z)
a>$b>$...>$z		10	(>$ A B ... Z)

---------LOGICAL OPERATORS-------
not a			9	(NOT A)
a and b			8	(AND A B)
a or b			7	(OR A B)

|a|			0	(ABS A)
+a			20	A
a+b			20	(PLUS A B)
-a			20	(MINUS A)
a-b			20	(DIFFERENCE A B)
a*b			21	(TIMES A B)
a/b			21	(QUOTIENT A B)
a rem b			21	(REMAINDER A B)
a mod b			21	(MOD A B)	(courtesy of CGOL)
a**b			22	(EXPT A B)

(fixed point:)
a+#b			20	(+ A B)
a-#b			20	(- A B)
a*#b			21	(* A B)
a/#b			21	(// A B)

(floating point:)
a+$b			20	(+$ A B)
a-$b			20	(-$ A B)
a*$b			21	(*$ A B)
a/$b			21	(//$ A B)

---------BOOLEAN OPERATORS--------
:n: a		not	21	(BOOLE 12 0 a)
a :a: b		and	21	(BOOLE 1 a b)
a :v: b		or	20	(BOOLE 7 a b)
a :x: b		xor	20	(BOOLE 6 a b)
a :↑: b		shift	22	(LSH a b)

---------I/O OPERATORS-------
print a			2	(PRINT A)
princ a			2	(PRINC A)
write a			2	(PROG2 (TERPRI) (PRINC A))
uread a b ... z	(a-z are tokens) (UREAD A B ... Z)
uwrite a b ... z		(UWRITE A B ... Z)
ufile a b ... z			(UFILE A B ... Z)
load a b ... z (a-z are tokens) (FASLOAD A B ... Z)
newline				(TERPRI)

oct(a)	  (parens mandatory)	parses a with IBASE=8
=a			25	Translation is the value of the expression
				at parse time.  Useful in code that
				would not otherwise be evaluated, e.g.
				when compiling code. 

	In addition to the above, CGOL "knows" about all the unary
functions in LISP.  Thus although "car" does not appear in the above
table, CGOL knows that CAR is a LISP function with one argument.  CGOL
treats all such functions f as though they were defined as

f a			25	(F A)

	Note that f and a in the above must have some formatting
delimiter between them (space, tab or carriage return); thus car* is
not acceptable for car *, which translates as (CAR *).  In general
CGOL permits atoms to have both values (i.e. to serve as variables)
and functional properties.  Thus although "last [1,2,3]" will return
(3), you can set "last:=2" and find that the variable last now has
value 2.  The advantage of this is that you do not have to memorize
the entire vocabulary of LISP before choosing a variable name.  Make
sure you don't try to use any of the above-defined operators as
variables, or attempt to write them in the "f(x,y,z)" format; for
example, + and AND should not be used as variables, while +(x,y) and
-(a,b) will not parse because CGOL will not be able to distinguish
them from the legal +(x) and -(x) at the time it is reading the + or
-.  Actually, these are about the only trouble spots, and provided you
stick to the algebraic use of + and - you won't encounter this sort of
problem.  When in doubt you can always drop back into LISP by using
"!".  However, that should rarely be necessary - it is intended mainly
for non-procedural items such as lists for doing MEMBER and ASSOC in. 
If you can't recall the CGOL form of an expression (F A B ... Z) where
F does not itself have any special meaning to CGOL, you won't go wrong
by writing f(a,b,...,z).  Thus if you forget the form "1+1", you can
write "PLUS(1,1)" and it will translate correctly.  By the same token,
any LISP function not catered for in the above table can be written as
"f(a,b,...,z)", e.g. "assoc(a,b)".  Care has been taken in the
definition of CGOL to avoid using LISP functions as CGOL operators
with special syntactic significance as far as possible, to permit
PLUS(1,1) and the like.  However, some LISP functions remain in CGOL,
in particular +, -, * and /, along with PROG, EQ, NOT, AND and OR.  While these can to an extent be used as variables, this
is not recommended without preceding them with #, the syntax-removing
operator.  However, except for + and -, all these operators can be
used without # as the function f in f{a} or f[a].  When you define
functions of your own using DEFINE, you should always invoke the
function just as you defined it.  If you defined it as
 define "F" a "ON" b; a+b$ then you should not call it with f(2,3) but
only with f 2 on 3 unless you precede the f with #, as in #f(2,3). 
The only exception is when you define it with define "F"(a,b); a+b$
which attaches no syntactic significance to f, but has the effect of
(DEFUN F (A B) (PLUS A B)) .

	At first sight the binding powers may seem a lot to learn. 
However, they have been chosen on the basis of the data types their
operators take as arguments and return as results in order to minimize
the need for parenthesization.  If you want to use CGOL notation but
don't want to have anything to do with binding powers, simply
parenthesize every CGOL expression as though you were writing in
LISP.  However, if you omit all parentheses (apart from those needed
in constructs of the form f(a,b,...,z)) you will not often go wrong. 
Most often you will want parentheses for grouping in arithmetic
expressions when the default priority ranking (|| + - * / mod **)
gives the wrong grouping, and when procedural expressions occur as
arguments to non-procedural expressions, e.g. "if a then (b;c)",
"(print a) + b", "a:=(b:=read)+1" and the like. 

	Some operators have a right binding power one less than their
left binding power.  This is to make those operators right
associative.  For example, "a of b of c" is parsed as "a of (b of c)"
(since oftener than not that's what was intended), and "a@b@c" as
"a@(b@c)" (for efficiency).  An interesting pair of right associative
operators is ";" and "&".  These are duals of each other.  They both
evaluate their arguments in the same order, but differ in the value
they return:  a&b returns a, a;b returns b.  By making both of them
right associative and giving them the same priority, they interact in
an elegant way.  Suppose you want to evaluate a,b,...,z and to return
the value of k.  Then the expression a;b;...k&...;z has the desired
effect.  That is, follow every argument but the last with ";", except
for the one you want the value of, which if it is not the last should
be followed by "&".  A common use for "&" is in tidying up after
computing some value, e.g.  "x & x:=2" will set x to 2 and return the
old value, "a:=(b & b:=a)" will swap the contents of a and b without
using a temporary variable, and so on.  Note that all this is really a
feature of LISP rather than of CGOL;  however, the notation makes it
easier to see at a glance the intent of what would be relatively
difficult to follow in LISP. 


	The CGOL analog of DEFUN is "define".  In addition to allowing
the user to attach a lambda expression to some functional property of
an atom, it gives him some syntactic capabilities as well. 

	If you don't want to take advantage of CGOL's syntactic
extensibility, just say
define "F"(x,y); x**2 + y**2 $
or whatever it is you want to define.  This is equivalent to LISP's
In calling F, the arguments should always be parenthesized, e.g.
f(1,2,3) .  In this way CGOL can figure out that f is being applied to
(1,2,3) even though no syntax has been specified for F, because of the
way "(" is treated as an infix operator.  If F is supposed to be a
fexpr, lexpr or macro then the appropriate word should follow define,

define lexpr "F"(x); arg(1) $  %Note that x is parenthesized, unlike in LISP%

	If you want syntax for f as well, then the basic form is

define [fexpr|lexpr|macro] <pattern> [,bp [,bp]]; body

For example, the following could serve as a definition of "@":

define a "@" b, 14, 13; if a then car a . (cdr a @ b) else b $

	In this example, the pattern is  a "@" b , and gives the rule
for this operator, and the two numbers give the two binding powers. 
The body is what would normally appear as the body of a DEFUN. 

	At present the allowable forms of patterns are few in number. 
You may write a sequence of variables separated by one or more tokens
in string quotes (letters inside string quotes must be capitalized -
this is the one place where a distinction is drawn between upper and
lower case, in the sense that the reader maps all lower-case letters
not in string quotes to upper-case before thinking about what they
mean).  The variables stand for CGOL expressions and the strings for
tokens (recall that a CGOL expression is a sequence of sub-expressions
and tokens).  The sequence may start and end with either tokens or

	The first token in the pattern is called the operator, and the
remaining tokens are called delimiters.  The operator is said to have
been defined.  An operator may be defined twice only if it takes a
left argument in one case and no left argument in the other, this
being a criterion CGOL uses when deciding what a particular token in a
program means.  In the above table, the operators "(", "+" and "-" all
have such dual meanings.  The delimiters may appear in arbitrarily
many definitions, and arbitrarily often in each.  A delimiter's
binding power is set to 0; thus in general delimiters cannot also be
used as LEDs, though they can serve as NUDs (e.g. |, which is both the
NUD for ABS and a delimiter).  When you specify what delimiter is to
be used, you must stick to it; thus you cannot say
define "LOG" a "BASE" b; ...$
and invoke it later with "log 3, 2";  you must say "log 3 base 2". 

	The default binding power is 25 if none is specified, and
applies to both left and right binding powers.  If one binding power
is given it is the left and right binding powers.  If both are given,
they are respectively the left and right ones. 

	Like LISP, CGOL is a one-pass system.  This is so that a user
can type in a definition and have it take effect immediately.  This
conflicts with the requirement in any system offering sophisticated
syntax that it be possible to use the syntax of an operator before it
has been defined.  This requirement is nice in general, and essential
for mutually recursive function definitions.  To get around this
problem, you may define the syntax of an operator at any time without
giving its semantics, simply by omitting the body of the define
command.  This is not an elegant solution, and a later version of CGOL
may deal with this.  (A possible solution is to keep around pieces of
unparsable text until they become parsable, and then parse them.  Even
more dramatic is the solution of not parsing anything until it is to
be evaluated, i.e. dropping the unparsability condition from the
previous solution.)


	It is possible to add to or change the definitions of the
rules of CGOL (the ones in Section 2).  To see examples of such
definitions, look under the heading BASE COMPONENT in the file
AI:PRATT;CGOL > .  Given the above table, you should be able to get a
rough idea of how to write such definitions. 

	The meaning of

infix "+" 20 is "PLUS"

is as follows.  The infix operator "+" is being defined with left and
right binding power 20.  (Had I said "infixr" it would make "+" right
associative by making its right binding power 19, one less than its
left.)  The translation of "a+b" is then (PLUS A B) .  Since the
argument positions are "standard" for infix operators, the only
information you really need to supply is what the LISP function name
is, so you say "is "PLUS"" .  You don't have to use "is" - if you want
you can spell it out by saying ["PLUS", left, right] , which is a CGOL
program to build a list whose three elements are the atom "PLUS", the
left hand argument and the right hand argument.  Notice the use of
this technique when defining

infixr "&" 1 ["PROG2", nil, left, right]

We can't say 'is "PROG2"' because that would mean '["PROG2", left,


	(This treatment assumes an understanding of the basic
mechanism of the parser used for CGOL, which treats tokens as either
nuds, leds or delimiters.  A nud (NUll Denotation) is a
lambda-expression of no arguments which is applied to NIL as soon as a
token denoting it is encountered by the parser without a left hand
argument.  A led (LEft Denotation) is the same but with a left hand
argument.  A delimiter is a token the parser does not access either
denotation of (though it may access its left binding power, lbp)
because some other denotation caused it to be skipped over (using
ADVANCE, often called by CHECK).)

	The *fix operators take three arguments with no delimiters
between them (except for nilfix which takes two and infixd which takes

arg1 is the name of the defined operator.  It need not
be string-quoted, but I string-quote it anyway for visibility.  It is
accessed as the value of token, which in the case of
strings is the string itself without quotes.  Strings are the one
place in CGOL where lower-case is recognized, so if you do stringify
this arg, as I do, don't use lower-case.

arg2 is the precedence, which is used for both lbp and rbp, except that in
infixr the rbp is taken to be one less than arg1.  In infixd both lbp and
rbp are given.

arg3 is the denotation, which is run at the time the parser bumps into
the operator.  At that moment, token is bound to the token immediately
following that operator.  Calling advance at that point will rebind
token to the next token.  Advance returns the new value of token.  In
the case of LEDs (LEft Denotations, namely infix, infixr, infixd,
infixm, suffix), the variable LEFT is lambda-bound (on entry to the
denotation) to the translation of the immediately preceding
expression.  RIGHT is not a variable but a CGOL expression whose
translation is (PARSE r) where r is the right binding power as
determined by arg2.  Thus each mention of RIGHT represents an
invocation of the parser.  The parser assumes that the input pointer
is pointing to the first token of the expression to be parsed (that
token being the value of TOKEN).  The parser returns the translation,
leaving the pointer at the token following the just-translated

The IS operator is shorthand for ["FOO", left, right] (in the case of
infix) or its analogue for other *fixes, e.g. for nilfix it is
["FOO"], for prefix it is ["FOO",right], for suffix it is
["FOO",left], and for infixm it is a hairy beastie that results in 'a
foo b foo c foo d' translating as (FOO A B C D).

To some extent you can figure some of this out by reading the
definitions of the *fix operators.  Deffix takes 5 args:
the denotation type (NUD or LED, or XNUD or XLED if you are LEARNing
language X)
The appropriate IS operator
The name of the operator being defined
The left binding power
The right binding power

The formal parameters of DEFFIX are referenced from outside DEFFIX in
a couple of cases, which makes it hard to figure out what's going on.
The comments in the list of special variables should alleviate that

Incidentally, for development purposes, the LEARN-SPEAK-FORGET cluster
of functions makes definition of a new language quite painless in
CGOL.  To learn definitions in language X, say LEARN "X" and
everything defined from then on using *FIX or DEFINE will be invisible
to CGOL.  LEARN "" restores the standard effect of *FIX and DEFINE,
which is to specify CGOL denotations.  When you say SPEAK "X" the
definitions learnt while LEARN "X" was in force take effect,
over-riding any prior CGOL definitions of the defined tokens.  Both
LEARN and SPEAK work by rebinding variables.  Thus the user can
achieve their effect by doing his own binding.  If he uses lambda- or
prog-binding (CGOL makes lambda-binding more convenient than
prog-binding) then the effect of the binding will be undone on exit
from the expression doing the binding, whether the exit was normal or
abnormal, which has obvious advantages over the side-effect-oriented
LEARN and SPEAK functions.

	The conventions are:  LEARN "X" binds CNUD to XNUD, CLED to
XLED and CLBP to XLBP.  Standard CGOL takes X as the empty string.
SPEAK "X" binds NUDL, LEDL and LBPL to lists whose cdr is the previous
value before the SPEAK and whose car is XNUD, XLED or XLBP
respectively.  (Hence  SPEAKs may nest notations; thus if you say
SPEAK "X"; SPEAK "Y" then notation Y takes priority over notation X
which takes priority over standard CGOL notation.)  FORGET undoes the
effect of the last SPEAK; it is careful not to forget CGOL itself. 
If the user wishes to bind NUDL etc himself he does not have to nest
notations in the way SPEAK does; in particular, he can eliminate CGOL
notation altogether, say by binding NUDL to (XNUD), etc.  However,
care should be exercised here to avoid permanently cutting off LISP
from the user in this way, e.g. by ending up with a language not
powerful enough to return to CGOL or LISP.

	These operations have several applications.  One is for those
defining a whole new language.  Another is for when a small and
temporary notation change is required; for example, in writing an
ALGOL parser, the infix ":" needs a different denotation inside array
declarations than elsewhere; thus just before parsing its argument, an
array declaration may say SPEAK "ARRAY" which will activate a
definition of ":" made when LEARN "ARRAY" was in force.  At the end,
the array declaration should say FORGET.  (Actually, in this example,
it would be more appropriate to lambda-bind LEDL and LBPL to (ARRAYLED
LED) and (ARRAYLBP LBP) respectively, so that the old bindings will be
restored automatically whether exiting normally or abnormally from the
array definition.)

	Other uses of LEARN, SPEAK and FORGET may be found in the
files TC >, EG > and MSYN >, all on the directory PRATT; at MIT-AI.


	Compilation of a CGOL file is done by invoking NCOMPL in the
same way as for a file in standard notation.  That is, at DDT you say
<file name>

	The LISP compiler knows about the incantation (CGOL)$ and
evaluates it instead of outputting it.  The autoload property of CGOL
means that if CGOL is not already loaded into NCOMPL it will be.  When
=EXIT$ is encountered, the = forces CGOL to evaluate the EXIT
immediately, returning NCOMPL to LISP notation.  The upshot of all
this is that provided each block of CGOL code in your file is preceded
with (CGOL)$ and followed by =EXIT$ , NCOMPL will have no trouble
understanding CGOL notation, even when mixed in with standard
notation.  Since you need these commands in the file to interpret it
anyway, you should find that compiling a CGOL file requires no more
massaging of the file than if you had been writing in LISP notation
(e.g. declaring special variables, specifying types of variables). 


	All of the above deals with CGOLREAD.  CGOLPRINT is also
available, and supplies a good way to automatically convert standard
notation to CGOL notation.  Thus the LISP program
 (DO () (()) (CGOLPRINT (READ)))
will read a sequence of LISP expressions and print them (on whatever
output device is selected) in CGOL notation.  CGOLPRINT, unlike PRINT,
makes an effort to prettyprint the output.  It uses an extremely
efficient but simple-minded pretty-printing algorithm, avoiding the
overhead of the GRIND program. 


	The similarities between CGOL and Smith and Enea's MLISP are:

(i)	the use of ALGOL-like notation for LISP S-expressions

(ii)	the use of numeric operator precedence functions to resolve
association problems. 

(iii)	the ability to export LISP translations of MLISP/CGOL programs
to sites supporting LISP but not MLISP/CGOL. 

	The differences are:

(i)	MLISP is a sophisticated programming language offering many
facilities not appearing in LISP.  These facilities are only visible
to the speaker of MLISP, and vanish if he wants to use them while
speaking LISP.  (Due to the ubiquity of LISP's oblist, the user can
get at them from LISP, though they are undocumented and have names
starting with & to identify them as sytem names not for general
consumption.)  Assignment to S-expressions is a particularly complex
example.  In contrast, CGOL offers nothing but an alternative
notation for things already meant for consumption by LISP users. 
This enables CGOL to be very small, both with respect to its
implementation and its manual. 

(ii)	MLISP is a system that the user must call from the monitor,
whereas CGOL is a package that can be loaded into LISP when the
need arises.  Hence a non-CGOL user can read a CGOL file without
having to commit himself to a CGOL-oriented system when he loads LISP. 
In fact, when the I/O details are worked out as in Section 1.3 he may
never know that he was reading a CGOL file. 

(iii)	CGOL is considerably more extensible syntactically than MLISP I
and considerably more efficient than MLISP II,  which though
extensible uses backup to an excessive degree. 


	CGOL programs can be edited using the TECO editor just like
LISP programs.  For ITS users, a recent development in LISP and TECO
has allowed LISP to have a copy of TECO as a subjob.  The effect of
this is as though LISP had its own resident editor with all the power
of the TECO editor but with the added advantage that LISP can be more
selective about what it reads out of a file after the file has been
modified.  In particular, TECO users with the appropriate macros
(either the EMACS macros or for non-EMACS users the LISPT MACROS file
on the .TECO.; directory of any ITS machine) can send a selected
fragment of their file to LISP by saying m,nM.ZLISP$ where m and n
specify the start and end of the buffer fragment to be sent to LISP. 
CGOL users can take advantage of this scheme in exactly the same way
that LISP users can.  Provided LISP's toplevel is expecting CGOL
notation at the time the user calls TECO()$ (which invokes TECO if
LISPT FASL DSK LIBLSP has been loaded), when the user returns from
TECO with a string of CGOL expressions (as determined by the m,n
argument to M.Z) LISP will correctly interpret those expressions. 
(Pending repair of a bug, CGOL users must type a space when they
return to LISP, or LISP will do nothing.  This should be the only
noticeable difference between LISP and CGOL for LISPT users.)


	Errors are dealt with exactly as in LISP, with the exception
of syntactic errors, which CGOL tries to patch, reporting on the
problem and the patch when it does so.  Syntactic errors do not cause
a breakpoint to be entered, but errors that would do so in LISP do so
when using CGOL.  While $P is one way (besides ↑G) of exiting from a
break loop, the CGOL notation requires that the "$" be "querified",
the CGOL analog of slashifying.  Also you need $ at the end rather
than space.  Thus you would type


	The following programs were written by the author for a
variety of purposes.  Most of them deal with one or another
computational complexity problem. 

% Sample CGOL programs for matrix multiplication %

define a "MM" b;	% Mat multiply - square matrices %
array a; array b;
let d := car dim a;
array c(d,d);
	for i in 1 to d do
		for k in 1 to d do
			(let ac = 0;
			for j in 1 to d do
				ac := ac + a(i,j)*b(j,k);
			c(i,k) := ac);

define a "MMR" b;  % Mat multiply - rectangular matrices %
array a; array b;
let d1,d2 := {dim a}, d2b,d3 = {dim b};
if d2 ne d2b then "Mismatch" else
(array c(d1,d3);
	for i in 1 to d1 do
		for k in 1 to d3 do
			(let ac = 0;
			for j in 1 to d2 do
				ac := ac + a(i,j)*b(j,k);
			c(i,k) := ac);



% Maze-running program.  Works for any number of dimensions.
Instructions for use:
Get a CGOL by saying :L CGOL; to DDT.
uread maze > ai pratt <altmode><control>Q
init [3,3] <altmode>
try source <altmode>
survey <altmode>

Here is what the above commands mean.  The "uread" selects the file
containing the maze program, and the control Q causes CGOL to read the
selected file.  This has the effect of loading the maze program.  Then
"init [3,3]" initializes the maze to a 3 by 3 maze in which ALL paths
are legal.  "try source" will search for a path from source (defined
by init to be (0,0)) to destination (defined by init to be (2,2), the
opposite corner).  Finally, "survey" causes a sequence of positions to
be printed out, forming a path from source to destination. To vary
this, it is possible to do, say, init [4,2,5] <altmode> which will
initialize "maze" to a 4 by 2 by 5 array.  After doing "init" but
before doing "try source", "maze" can be changed.  To forbid a path
through, say, point (1,2), say to CGOL:
maze(1,2):="+" <altmode>
If you want to set up all the elements of maze rapidly to the maze 
(where + means forbidden and - means allowed), say
maze := !'(- + + - - + + - -) <altmode>
where the list of +'s and -'s is what you get by reading the maze
left to right and then down.
To see what maze looks like, say
listarray 'maze' <altmode>
This will print out a list of the contents of the maze array, row by
row but without delimiting one row from the next, in the same format
as was used above.  If you want to save a particular
setting of the maze array for future use, just say
y := listarray 'maze' <altmode>
To reset maze to that value, say
maze := y <altmode>  
You can also simply set y (or any other variable) to a maze by saying
y := !'(- + + - - + + - -) <altmode>
A particular maze
can be explored simply by saying
whynot <altmode>

'=array(maze,t,1)' $

define "TRY" here;  % the maze-searching algorithm %
here = destination
not(or{minusp[here]}) and and{lessp[here, dims]}	% outside boundary? %
 and maze{here} = "-"   % is this square new and legal? % and
(maze{here}:="*";	% mark square to avoid revisiting %
 iter for one := stepsource step cdr one do	% various directions %
	(for j in [plus[here,one], difference[here,one]] do % "up" or "down" %
	  if try j then (maze{here}:=j; return t); % success: mark maze %
	 if cadr one = 1 then return nil) % failure: return nil %) $

define "INIT" d;   % initializes maze as per dimensions d.  All moves legal. %
dims := d;
stepsource := (\x;0)[dims];	% source of increment vector 1,0,0,... %
car stepsource := 1; cdr last stepsource := stepsource;
source := (\x;0)[dims]; destination := sub1[dims] $

define "SURVEY";	% prints path from source to destination %
let x = source;
while x ne destination do print x := maze{x} $

define "WHYNOT";	% sample non-trivial maze %
init [4,4];
maze := !'(- + + -  - - - +  - + - +  - + - -);
try source;
survey $


% Program to solve a system of linear Diophantine equations.

To use, say
solve(A,b,n) altmode
where A is an mxn matrix, b is an m-vector, and n is an integer.  Here
m is the number of equations and n the number of variables.  The
result will be an nxp matrix N such that Nv is a solution in x
for Ax=b for any integer v with unit first element.  Morevover, all
solutions are of the form Nv.

  The basic idea is to use Gauss-Jordan elimination, but to use
Euclid's algorithm to guide the process.  Euc(a) takes an integer
vector a and returns a two-list (M g) where g is the gcd of the
elements of a and a"M is a vector with g in the first coordinate and
0's elsewhere.  (Here a" denotes the row vector a.)  Euc is called
once per equation, and the resulting matrices from each call are more
or less multiplied together to give the final answer.

Matrices are represented as lists of rows, rows in turn being
represented as lists of numbers.  %

define "EUC"(a);   % Euclid's algorithm %
if null a then [nil,0]		% should not be needed below %
else if null cdr a then [[[1]],car a]
let M,g = {euc(cdr a)}, h = car a;
M := (\i;0)[1 to length cdr a] . M;
let c1 = 1 . (\i;0)[1 to length cdr a], c2 = car[M];
let ng, nc1, nc2 = {euc1(h,g,c1,c2)};
[cons[nc1,cons[nc2,cdr[M]]],ng] $

define "EUC1"(a,b,u,v);
if b=0 then [a,u,v]
let d = a/:b;
euc1(b, a-b*d, v, difference[u,(\x;x*d)[v]]) $

define "SOLVE"(Av, Iv, n);
if null Av then (\x;0 . x)[Id(n)]
 let a = car Av, i = car Iv;
 let m,g = {euc(a)};
 if g = 0 then
   if i = 0 then solve(cdr Av, cdr Iv, n)
	    else throw nil
 else if not g|i then throw nil
 else (m := scale1(m, i/:g);
       matmul(m, (1 . (\x;0)[2 to n]) .
		 solve(cdr matmul(Av,cdr[m]),
		       n-1))) $

define "SCALE1"(m, x); % Scale up col 1 of m by factor x %
cons[(\y;y*x)[car[m]], cdr[m]] $

define "SCMUL"(a,b);  % multiply matrix a by scalar b %
(\q;(\p;p*b)[q])[a] $

define "MATMUL"(a,b); (\i;(\j;plus{times[i,j]})[list[{b}]])[a] $

define "ID"(n);  % construct the nxn identity matrix %
(\i;(\j;if i=j then 1 else 0)[1 to n])[1 to n] $



%FuperFonic Tranfport algorithm - a real treafure.  Does FFT
multiplication of integers.  Though this algorithm is no faster than
the Strassen-Schoenhage n log n log log n FFT integer multiplication,
it is considerably simpler. %

special n, ord, w, wi, g $

newtok "+:", "*:", "**:" $
infixm "+:" 20 is "SUM" $
infixm "*:" 21 is "BY" $
infixr "**:" 22 is "TOTHE" $

define lexpr "SUM"(argno);  plus{arg[1 to argno]} rem n $

define lexpr "BY"(argno);  times{arg[1 to argno]} rem n $

define "TOTHE"(a,b);
if b = 0 then 1
else if oddp b then a *: (a**:(b-1))
else (a*:a)**:(b/:2) $

define "TOPOLY"(m,b,len); if len ne 0 then m mod b . topoly(m/:b, b, len-1) $

define "FROMPOLY"(p,b); if p then car p + b*frompoly(cdr p, b) else 0 $

define "EVENS" x; x and car x . evens cddr x $	% even half of vector %
define "ODDS"  x; evens cdr x $			% odd  half of vector %

define "FFT"(a,pw);  % FFT of vector a;  pw is vector of powers of w %
if null cdr a then a
else let efft = fft(evens a, evens pw), offt = fft(odds a, evens pw);
     sum[efft@efft, #mult[pw, offt@offt]] $

define "IFFT"(a,pwi); (\x;g*:x)[fft(a,pwi)] $

define a "MULT" b, 21;  % n log n log log n integer multiplication %
let m := max(a, b);
if not bigp m then a*b
(let x = fix(log(35 * length cdr m)/log(2));
 let n = 2**2**(x-1)+1, ord = 2**x, w = 2;
 let wi = w**:(ord-1), g = (n-1)/:ord*:(n-1);
 let m = wi;
 let pw = for i in 1 to ord collect m:=m*:w;
 let pwi = car pw . reverse cdr pw,
     dig = ord/:x;
	       dig)) $


% This procedure tests the function defined as a bit string in array A
to see if it has the (PISH,TUSH) property, namely that given any
PISH variables, one can find at least TUSH functions of the remaining
variables by appropriately diddling the PISH variables.%

special mask, llv, nb, curr, token, left, pish, tush, n, nw $
% MASK is a 32-bit mask corresponding to a particular choice of variables
LLV	is a partition of the possible functions for a given choice of
	variables.  Initially discrete, it is progressively refined as
	evidence arises for distinguishing functions.  The partition is
	represented as a list of the non-singleton blocks, each of which
	is represented as a list of variables, each of which is
	represented as (ostensively) 1, 2, 4, 8, ...
NB	The number of blocks in LLV.%

infix "↑" 22 is "LSH" $

array(mp, fixnum, 5) $		% array of 32-bit masks %
=let ibase = 2 in cgolread()$	% interprets following expr in binary %
mp(0) := 10101010101010101010101010101010;
mp(1) := 11001100110011001100110011001100;
mp(2) := 11110000111100001111000011110000;
mp(3) := 11111111000000001111111100000000;
mp(4) := 11111111111111110000000000000000$
% Note convenience of not losing meaning of 2,3,4 in above %
% Hack: starting from right, read each column as a 5-bit number.
Recognize the sequence?  These vectors are the columns arising in the
method of truth tables for testing propositional tautologies. %

define "TEST"; enum(pish, n-1, nil, [0], (-1)↑(-4)) $

define "ENUM" (nv, ulim, vl, sl, mask);
if nv=0 then (llv := [sl]; nb := 1; prin1(ulim+1); chek(vl, 0, 1↑(n-5));
		if nb < tush then (write "No." ; tyo(7); err()))
else iter for i := ulim step i-1 until i = nv-2 do
		if i>4 then vl @ [1↑(i-5)] else vl,
		sl @ mapcar('\x;x+1↑i', sl),
		if i>4 then mask else mp(i) :A: mask) $

define "PARTITION"; llv := mapcan('makeq', llv) $

define "CHEK"(vl, ll, ul);
if vl then
  iter for i := ll step i+car vl + car vl while i<ul and nb < tush do
		chek(cdr vl, i, i + car vl)
else iter for i := ll step i+1 while i<ul and nb<tush do (curr:=i; partition) $

define "MAKEQ" (lv); scan(mapcar('cnvt', lv), lv) $

define "CNVT"(j); mask :A: a(curr+j↑-5)↑(j :A: 31) $

define "SCAN"(la, lv);  if alleq(la) then [lv] else sep(la, lv) $

define "ALLEQ"(la);  null(cdr la) or car la = cadr(la) and alleq(cdr la) $

define "SEP"(la, lv); cdr la and if not car la isin cdr la then
				(nb:=nb+1; sep(cdr la, cdr lv))
		else if alleq(cdr la) then [lv]
		else (nb := 1 + nb; 
			(car lv . select(car la, cdr la, cdr lv)) .
			sep(remove(car la, cdr la, cdr la),
			    remove(car la, cdr la, cdr lv))) $

define "SELECT"(a,la,lp);
la and if a= car la then car lp . select(a, cdr la, cdr lp)
else select(a, cdr la, cdr lp) $

define "REMOVE"(a, la, lp);
la and if a = car la then remove(a, cdr la, cdr lp)
	else car lp . remove(a, cdr la, cdr lp) $

%The following procedures allow one to build an array given a circuit%

define "LCOMB"(aa,bb,dd,ff);
for i in 0 to nw-1 do dd(i) := boole(ff, aa(i), bb) $

define "HCOMB"(aa, pp, dd, ff);
for i in 0 to nw-1 by 2*pp do
	(for j in i to i+pp-1 do
		dd(j) := boole(ff, aa(j), (-1)↑(-4));
	for j in i+pp to i+2*pp-1 do
		dd(j) := boole(ff, aa(j), 0)) $

define "BCOMB"(aa, bb, dd, ff);
for i in 0 to nw-1 do dd(i) := boole(ff, aa(i), bb(i)) $

define "RDCKT";   % read a circuit, calculate its function, store in a %
new dest,	% where output goes %
    sce,	% accumulator containing first gate-input %
    fn, 	% gate-type: 1=and, 7=or, 6=xor %
    argt;	% second gate-input - num means variable i,
				      letter means accum %
advance;	% CGOL's scanner: effect is "token:=read()" %
while token ne "END" do
(dest:=token; sce:=advance; fn:=advance; argt:=advance; advance;
if not argt isnum then bcomb(sce, argt, dest, fn)  % arg is accum %
else if argt < 5 then lcomb(sce, mp(argt), dest, fn) % arg is input %
else hcomb(sce, 1↑(argt-5), dest, fn)) $

%A small circuit with the 3,5 property%

pish:=3;tush:=5 $
nw:=2 $     % takes 2 words to handle 6 variables %
n:=6 $

array(a,fixnum,64) $  % Initialized to 0 %
array(b,fixnum,64) $

rdckt $
a a 7 0 	% dummy OR, to copy input 0 into accum a %
a a 1 1 	% ANDs a with input 1, leaving result in a %
b a 6 2 	% XORs a with input 2, leaving result in b %
a a 6 3 
b b 1 4 
a a 1 5 
a a 7 b 	% ORs a and b, leaving result in a %
a a 6 0 	% Finally, XORs a with input 0, result in a %


% This algorithm assigns types to those lambda calculus expressions
that have a well-defined type.  Thus
type '\x;\y;x'  should yield  (A -> (B -> A)) .
No promises are made about the behavior of the algorithm for
expressions without well-defined types, as in
type '\x;x(x)' .
This algorithm is believed to have a bug.  However, it works pretty
well in general.  %

#prin1 := "PRINC" $

define "GENTYPE";  ascii(typecnt:=typecnt+1) $

define "PRETTY" x;
if not car x then pretty cdr x 
else if not cdr x then princ(car x)
else (princ("("); pretty car x; princ("->"); pretty cadr(x); princ(")")) $

define "UNIFY"(ta,tb);
if null car ta then unify(cdr ta, tb)
else if null car tb then unify(ta, cdr tb)
else if cdr ta and cdr tb then ([unify(car ta, car tb), unify(cadr(ta), cadr(tb))])
else if cdr tb then unify(tb, ta)
else (rplaca(tb, nil); rplacd(tb, ta)) $

define "LUNIFY"(ta,tb);
if cdr ta then unify(car ta, tb)
else (rplaca(ta, tb); rplacd(ta, [[gentype]])) $

define "CHAIN" tx; if car tx then tx else chain cdr tx $

define "RTYPE" lx;
if lx isatom then eval lx 
else if car lx = "LAMBDA" then 
  eval [["LAMBDA", cadr(lx), ["XCONS", ["LIST", 'rtype caddr(lx)'],
chain caadr(lx)]],
	["QUOTE", [gentype]]]
else (new qq; lunify(qq := rtype car lx, rtype cadr(lx));
	     chain cadr(qq)) $

define "TYPE" lx;
typecnt:=64; newline; pretty rtype lx; " " $



%Galil and Seiferas' (G&S) linear-time palstar finder. 
Here a palstar is a string of palindromes of length at least two (to
avoid trivializing the problem).  Thus
test "ABBACCDEFED" would return ((A B B A) (C C) (D E F E D))
test "ABBACCDEFE" would return NIL because there is no way to parse
the input as a string of non-unit-length palindromes. %

special n $

% Following takes advantage of CGOL to get around LISP's insistence
that arrays start from 0 and index by 1.  In the following, A and M
start from -1 and R increments by 0.5 (the 0.5 is needed because R is
the centroid of a palindrome, and if the palindrome is of even length
the centroid will fall between two characters). All arrays can accept
real arguments, again not offered by LISP. %
prefix "A" 25 subst(right, "X", '#a(fix(x+1.1))') $
prefix "R" 25 subst(right, "X", '#r(fix(2*x+1.01))') $
prefix "L" 25 subst(right, "X", '#l(fix(x+.1))') $
prefix "U" 25 subst(right, "X", '#u(fix(x+.1))') $
prefix "V" 25 subst(right, "X", '#v(fix(x+.1))') $
prefix "M" 25 subst(right, "X", '#m(fix(x+1.1))') $
prefix "LINK" 25 subst(right, "X", '#link(fix(x+.1))') $

define "PALFIND" x, 1;	% Manacher's on-line palindrome finder - used by G&S %
x:=explodec x;
array(#a, t, (n:=length x)+2); fillarray("A", nil.x@[[nil]]);
array(#r, t, 1+2*length x);
n := float n;
r(-.5):= -.5; r 0 := 0.0;
let c=.5, lt=0.0, rt=1.0;
  while lt>-1 and rt<n and a lt = a rt do (lt:=lt-1; rt:=rt+1);
  if lt=rt and rt=n then return nil;
  r c := rt-c-1;
  let j1 = nil;
  iter for j := c+.5 step j+.5
	until ((j=rt and lt:=c)
	       or (j1:=c-(j-c); j1-r j1 = lt+1 and a rt = a(j-(rt-j))))
	return (c:=j; lt := j-(rt-j)) 
	do r j := min(r j1, rt-j-1)) $

% Galil and Seiferas' palstar parser %

define "SETL";		% sets prime palstar lengths at each position %
array(#l,t,fix n); fillarray("L", [1]);
let stack = nil;
for i in 0.0 to n-1 by .5 do
  (catch(for j in stack do
	(if j < i - r i then throw nil
	 else (l j := 2*(i-j+.5); stack := cdr stack)));
   if i = float fix i then stack := i . stack) $

define "SETUV";		% sets flags indicating presence of type 1, 2 palstars %
array(#u, t, fix n); array(#v,t, fix n);
for i in 0.0 to n-1 do
	(if l i > 1 then
	   let p = i + l i -1; if p<n and p - r p <= i then u i := t;
	   p := p+1; if p<n and p - r p <= i then v i := t) $

define "MARK";		% marks end of each palstar in some parse %
array(#m, t, 1 + fix n); m(-1) := t;
array(#link, t, fix n);
for i in 0.0 to n-1 do
  (if l i > 1 and m(i-1) then
    (	      let x  = i + l i - 1; if not m x then (m x := t; link x := fix i-1);
     if u i then (x := i+2*(l i)-2; if not m x then (m x := t; link x := fix i-1));
     if v i then (x := i+2*(l i)  ; if not m x then (m x := t; link x := fix i-1)))) $

define "PALSTAR" x, 10;	% if x is palstar, returns its parse, else nil %
palfind x; setl; setuv; mark;
if m(n-1) then
    let i=n-1, z=nil;
    while i>=0 do (z := implode((\x;a x)[1+link i to i]) . z; i := link i);
    z $

% Following is useful in coming up with random test inputs %
define "RANDSTR" n "ALPHABET" s, 19;  % random string, length n, alphabet size s %
implode((\x;ascii(65+random s))[1 to n]) $



%Salamin's elliptic integral algorithm for computation of pi.
To get n digits of pi, say PI n .  The answer will be an integer which
is pi times some power of ten. The last digit may be off by 1.  The
algorithm is not coded for efficiency - it serves only to exhibit the
principle of Salamin's algorithm, and to churn out a few paltry digits
of pi for those who don't trust tables.  As implemented below it is an
O(n**2) algorithm, and takes about a minute on the AI machine to get
400 decimal digits. %

%Auxiliary routine for integer square root %
define a "ISQ" b, 19;
new x; x := (b+a/b)/2; if |b-x| < 2 then x else a isq x $

% Integer square root of a - silly LISP's sqrt demands floating point! %
define "ISQRT" a; a isq 2**(\base;flatsize(a))(4) $

% Summation of 2**j * c[j+1] ; also computes agm %
define a "SIG" b, 19;
if |a-b| < 2 then (agm := a; 0) else ((a-b)/2)**2 + 2*((a+b)/2 sig isqrt(a*b)) $

% print pi to n significant digits %
define "PI" n, 1;
new sum, agm; sum := 10**n sig isqrt(10**(2*n)/2);
	1 + agm**2*10**(n-1)/(10**(2*n)/4-sum) $


%  Rabin's Composite Detector.  You can set k to whatever you want,
and the probability that PRIME will return T erroneously is 1/2**k.  It
will never return NIL erroneously. Running time is cubic in the length
of the input. %

special q,prod,tw,a,olda,qp,left,n,k,sw,w,hw,stopwatch $
?*nopoint t; stopwatch:=0; qp := 821; k := 20; w := 2**35; sw:=w-1; hw := w/2 $

alloc(!'(list (10000 12000 0.25) 
	 fixnum (10000 12000 0.90)
	 bignum (10000 12000 0.9))) $
gc?-overflow := '\x;nil' $

define "TIM"; -stopwatch + stopwatch := runtime()/1000000.0 $

define a "BY" b, 21;  a*b mod n $

define a "TOTHE" b, 22;   %  a**b mod n %
       if zerop b then 1
  else if  oddp b then a by (a tothe (b-1))
  else		       (a by a) tothe (b/2) $

define a "TOTHEL" b, 22;   % The list [a**b, a**(b/2), ..., a**(2x+1)] %
if a=1 or b=0 then [1]
else if oddp b then [a by (a tothe (b-1))]
else new x; x := a tothel (b/2); if car x = 1 then x else car x by car x . x $

define "BRND" x;    % Auxiliary routine for computing big random numbers%
if bigp x then random(sw) + w*brnd(x/hw) else random(x) $

define "BIGRANDOM" x, 25;  brnd(x) mod x $

define "CARMICHAEL" x ,12; if x then (car(x)-1 gcd n) ne 1 or Carmichael cdr x $

define "PRIME" n, 12;   % Returns T if n is prime %
if n<30 then n isin !'(2 3 5 7 11 13 17 19 23 29)
(n gcd 6469693230) = 1 and 
new kk,looksprime; kk:=0; looksprime:=t;
    while looksprime and kk<k do
	(new a; a := (bigrandom(n-2)+2) tothel (n-1);
	 if car a ne 1 or Carmichael cdr a then looksprime := nil;
    looksprime $

define "TESTWIN" n, 12;   % Returns T if n , n+2 both prime %
(\k; prime n and prime n+2 and prime n and prime n+2)(1)
and prime n and prime n+2 $

define "FINDTW" x,1;  % Finds some twin primes > prod of primes < x+1 %
new prod; prod := 1;
for i in 2 to x do if prime i then prod := prod*i;
new tw,q; tw:=prod+qp; q:=1; tim;
while t do
  (if testwin tw then  (write "Prod(" cat x cat ")*" cat q cat "+" cat qp;  
			if prime tw+6 then
				(princ " *";
				 if prime tw+8 then princ " *");
			write tim cat " secs");
   tw := tw+prod; q:=q+1) $

define "DOWN" n,1;   % Searches down from 2**n for primes %
new nn; nn:=2**n-1;
while not prime nn do nn:=nn-2;
write nn cat "	= 2**" cat n cat "-" cat 2**n-nn $



%Linear time Sieve of Eratosthenes program. %
% Due to Ross Gale and Vaughan Pratt. %
% The sieve is in the array sv, each word of which is 36 bits, although
only the rightmost 32 bits hold information.  If 2*i+1 is composite
then a 1 appears in position (i mod 32) of sv(i/32).  (Position 0 is the
least significant bit.)  %

% :A: is bitwise and, :V: is bitwise or, :↑: is leftshift %

define "PRIME" x,10;		% predicate that looks up sieve %
  x=2 or oddp x and sv(x:↑:-6):↑:(35-(x:↑:-1 :A: 31)) >= 0$

define "SIEVE" n,1;	% construct a sieve for primes < n %
array(sv,#fixnum,n/64+1);  % the sieve, initially all 0's %
sv(0):=1;		% 1 is composite by convention %
let s=[1];		% singleton set of numbers sieved so far %
for i in 3 to n/2 by 2 do   % enumerate those i %
 if prime i then	    % that are prime %
  (let ts=nil;		% temporary accumulator for new s %
   for j in s do	% j has been sieved, so j*i is composite for j~1 %
    (let k=j;		% k = j*i**z for some z %
     while k<n do	% and also k<n %
       (if k not isin [1,i,j] then
	  (let kq = k:↑:-6, kr = k:↑:-1 :A: 31;
	   sv(kq):=sv(kq) :V: (1 :↑: kr));	% sieve composite k %
	let m:=k*i;
	if m<n then ts:=k.ts;	% accumulate k if good chance of use %
	k:=m));			% multiply k by i %
   s:=ts)$		% ratify temporary accumulator %


(cgol) $
% <Ed: Comment added long after this program was written:  this program is
mostly rambling comment generated on-line while its inebriated author
grappled at the terminal with a problem raised some hours earlier by
Drew McDermott during a beer blast celebrating his successful thesis
defense.  Eventually, as I knew would happen, I had to write some code
to check that my answer was complete; the code appears at the end.>  %

% The following solves a problem raised by DVM shortly before I became
almost unable to type this.  A man thinks of two distinct numbers
between 2 and 100.  He tells the product to A and the sum to B, and
asks them to reconstruct the original numbers.  (i) A says he doesn't
know.  (ii) B then announces that he knew A didn't know.  (iii) A
retorts that now he knows.  (iv) B's comeback is that now he knows
too.  What was the bus-driver's name? <Ed: what were the two numbers?>%

% Here's the WWI ace practically unable to type this.  What is he
thinking? %

% Answer: what indeed %

% (Thinks).  What numbers could B have been given?  At least 5.
5⊃2,3, so not (ii).  6⊃2,4 so again not (ii).  7⊃2,5 or 3,4 , but
2,5 would give it away to A straight away.  Hence B seeing n=7
couldn't be sure A didn't know.  So we are looking for a number n no
binary partition of which is into two distinct primes.  8=3+5, 9=2+7,
10=3+7, 11=???, 12=5+7, 13=2+11, 14=3+11, 15=2+13, 16=3+13, 17=???...
In fact if n is odd and n-2 is not prime, we are clearly have such a
number. Conversely, if n is odd and n-2 is prime we have failed.  Now
consider just even n.  18=5+13, 20=3+17, 22=3+19, 24=5+19, 26=3+23,
28=5+23, 30=7+23, 32=3+29,...  We observe that 3-5-7 form a solid
block, the only way to beat which is to find three consecutive odd
non-primes.  Weelll,... 91 115 117 119 121 141 143 183 185 are all
that are in this category.  But 94=11+83, 118=11+107, 120=11+109,
122=13+109, 124=17+107, 144=13+131, 146=19+127, 186=13+173, 188=31+157
(a sort of local success for Goldbach's conjecture). So this rules out
the possibility of even n. Hence B must have seen odd n such that n-2
is not prime, or equivalently n such that n-2 is an odd composite.
Thus n is one of 11 17 23 27 29 35 37 41 47 53 ...

Now if A now knows the answer, it could have been (2,9), (3,8),
(4,7),... which we note have the property that their products are
each associated only with one n in the above list.  (E.g., (5,6) is
excluded because its product 30 is associated with not only 11=5+6 but
17=2+15.)  But now B knows the answer, which would be impossible if it
were any of (2,9), (3,8) or (4,7), since B's knowing the sum is 11
doesn't help here.  More generally, we are looking for an n with
exactly one binary partition whose product is not the product of the
binary partition of some other n (always n is 11 17 23 27 ... as
above.)  So the following should work:

for each n such that n-2 is odd composite do
  if there exist none or more than one partition of n such that no
  other factorizations of product of this partition have their sum-2
  odd composite then reject n.

Checking this for the first few values of n, we find:
n=11: eliminated by argument above about (2,9), (3,8) and (4,7).
n=17: 2*15=5*6, 3*14=2*21, 5*12=3*20, 6*11=2*33, 7*10=2*35, 8*9=3*24.
This leaves only (4,13) for n=17, so (4,13) has to be one answer.  We
now check that there are no further answers as follows.  %

fasload:=nil $

load rab$  % prime number package %

delim "BY" $

array(g,t,200); array(awful,fixnum,9901) $

define "SEARCH";
for n in 11 to 199 by 2 do if not prime(n-2) then
  (g(n) := nil;
   for j in 2 to n/2 do
     (jtnmj := j*(n-j); g(n) := jtnmj . g(n);
for n in 11 to 199 by 2 do if not prime(n-2) then
  (p := 0;
   for j in 2 to n/2 do if awful(j*(n-j))=1 then (p:=p+1; x:=j);
   if p=1 then (print n; princ x)) $

% Running this program yielded only n=17, x=4, so the pair of numbers
had to be (4,13) %



% Repetition finding algorithm; Pratt's variant of Weiner's algorithm %

% The following paragraph defines some slick syntax which allows this
code to be read in conjunction with my paper on this variant of
Weiner's algorithm, available from the author at the MIT AI Lab. %

"TAILS" of ":" := nil; newtok ":=" $   % so w:a is not confused with :a: %
infixr "." 26 [["GET", left, ["QUOTE", "ONE"]], right] $ % the w.a property %
infixr ":" 26 [["GET", left, ["QUOTE", "TWO"]], right] $ % the w:a property %
prefix "*" 26 [["GET", xx:=right; check ":";right, ["QUOTE", "ABOVE"]], xx] $
							 % the *a:w property %
literal suc, loc, len $  % Avoids having to write "SUC" etc %

define "GENERATE";   %creates a new node in the graph%
let x=ncons nil;
"ONE" of x := array(nil,t,6);  % the w.a property %
"TWO" of x := array(nil,t,6);  % the w:a property %
"ABOVE" of x := array(nil,t,6); % the *a:w property %
x $

define "CREATEVERTEX";    % what to do when a node to be visited isn't there %
wa := generate; len of wa := 1 + len of w; loc of wa := w.a;
w:a := wa;   % connect w to wa via w's :a link %
u := if w = eps then eps   % u is to be wa's longest proper suffix in the graph %
     else (tt := suc of w;
	   while null tt:a and tt ne eps do tt := suc of tt;
	   tt:a or eps);
c := a(loc of wa - len of u - 1); x := *c:u;
suc of wa := u;			% link wa to u %
*c:u := wa;			% link u to wa %
if x then (suc of x := wa;	% if x exists link it to wa %
	   d := a(loc of x - len of wa - 1); *d:wa := x);
if null x then wa.a(w.a) := w.a + 1
else for b in 0 to 5 do wa.b := x.b $

define "SCAN" x;  % Main routine.  X is a list of numbers in range [0,5] %
n := length x; array(a,t,1+n); fillarray("A", cons(0, x));  % input %
eps := generate; len of eps := 0; loc of eps := 1; suc of eps := eps;
w := eps; i := 1;				% initialization %
while i <= n do					% scan whole string %
  (a := a(i);					% a abbreviates a(i) %
   if null w.a then				% seen wa before? %
	(w.a := i+1;				% no, record its position %
	 if w ne eps then w := suc of w		% then push [ pointer %
		     else i := i+1)		% or push both [ and ] %
   else (if null w:a then createvertex;		% new node if only 2nd wa %
	 w := w:a; i := i+1);			% push ] pointer %
   print i; princ len of wi cat " " cat loc of w) $  % trace i,w %