perm filename CLVALI.MSG[COM,LSP]3 blob sn#813808 filedate 1986-03-29 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00002 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	Introduction
C00006 ENDMK
C⊗;

∂23-Sep-84  1625	RPG  	Introduction  
To:   cl-validation@SU-AI.ARPA   
Welcome to the Common Lisp Validation Subgroup.
In order to mail to this group, send to the address:

		CL-Validation@su-ai.arpa

Capitalization is not necessary, and if you are directly on the ARPANET,
you can nickname SU-AI.ARPA as SAIL. An archive of messages is kept on
SAIL in the file:

			   CLVALI.MSG[COM,LSP]

You can read this file or FTP it away without logging in to SAIL.

To communicate with the moderator, send to the address:

		CL-Validation-request@su-ai.arpa

Here is a list of the people who are currently on the mailing list:

Person			Affiliation	Net Address

Richard Greenblatt	LMI		"rg%oz"@mc
Scott Fahlman		CMU		fahlman@cmuc
Eric Schoen		Stanford	schoen@sumex
Gordon Novak		Univ. of Texas	novak@utexas-20
Kent Pitman		MIT		kmp@mc
Dick Gabriel		Stanford/Lucid	rpg@sail
David Wile		ISI		Wile@ISI-VAXA
Martin Griss		HP		griss.hplabs@csnet-relay (I hope)
Walter VanRoggen	DEC		wvanroggen@dec-marlboro
Richard Zippel		MIT		rz@mc
Dan Oldman		Data General	not established
Larry Stabile		Apollo		not established
Bob Kessler		Univ. of Utah	kessler@utah-20
Steve Krueger		TI		krueger.ti-csl@csnet-relay
Carl Hewitt		MIT		hewitt-validation@mc
Alan Snyder		HP		snyder.hplabs@csnet-relay
Jerry Barber		Gold Hill	jerryb@mc
Bob Kerns		Symbolics	rwk@mc
Don Allen		BBN		allen@bbnf
David Moon		Symbolics	moon@scrc-stonybrook
Glenn Burke		MIT		GSB@mc
Tom Bylander		Ohio State	bylander@rutgers
Richard Soley		MIT		Soley@mc
Dan Weinreb		Symbolics	DLW@scrc-stonybrook
Guy Steele		Tartan		steele@tl-20a
Jim Meehan		Cognitive Sys.	meehan@yale
Chris Riesbeck		Yale		riesbeck@yale

The first order of business is for each of us to ask people we know who may
be interested in this subgroup if they would like to be added to this list.

Next, we ought to consider who might wish to be the chairman of this subgroup.
Before this happens, I think we ought to wait until the list is more nearly
complete. For example, there are no representatives of Xerox, and I think we
agree that LOOPS should be studied before we make any decisions.

∂02-Oct-84  1318	RPG  	Chairman 
To:   cl-validation@SU-AI.ARPA   
Now that we've basically got most everyone who is interested on the mailing
list, let's pick a chairman. I suggest that people volunteer for chairman.

The duties are to keep the discussion going, to gather proposals and review
them, and to otherwise administer the needs of the mailing list. I will
retain the duties of maintaining the list itself and the archives, but
otherwise the chairman will be running the show. 

Any takers?
			-rpg-

∂05-Oct-84  2349	WHOLEY@CMU-CS-C.ARPA 	Chairman     
Received: from CMU-CS-C.ARPA by SU-AI.ARPA with TCP; 5 Oct 84  23:49:33 PDT
Received: ID <WHOLEY@CMU-CS-C.ARPA>; Sat 6 Oct 84 02:49:51-EDT
Date: Sat, 6 Oct 1984  02:49 EDT
Message-ID: <WHOLEY.12053193572.BABYL@CMU-CS-C.ARPA>
Sender: WHOLEY@CMU-CS-C.ARPA
From: Skef Wholey <Wholey@CMU-CS-C.ARPA>
To:   Cl-Validation@SU-AI.ARPA
CC:   Dick Gabriel <RPG@SU-AI.ARPA>
Subject: Chairman 

I'd be willing to chair this mailing list.

I've been very much involved in most aspects of the implementation of Spice
Lisp, from the microcode to the compiler and other parts of the system, like
the stream system, pretty printer, and Defstruct.  A goal of ours is that Spice
Lisp port easily, so most of the system is written in Common Lisp.

Since our code is now being incorporated into many implementations, it's
crucial that it correctly implement Common Lisp.  A problem with our code is
that some of it has existed since before the idea of Common Lisp, and we've
spent many man-months tracking the changes to the Common Lisp specification as
the language evolved.  I am sure we've got bugs because I'm sure we've missed
"little" changes between editions of the manual.

So, I'm interested first in developing code that will aid implementors in
discovering pieces of the manual they may have accidentally missed, and second
in verifying that implementation X is "true Common Lisp."  I expect that the
body of code used for the first purpose will evolve into a real validation
suite as implementors worry about smaller and smaller details.

I've written little validation suites for a few things, and interested parties
can grab those from <Wholey.Slisp> on CMU-CS-C.  Here's what I have right now:

	Valid-Var.Slisp		Checks to see that all variables and constants
				in the CLM are there, and satisfy simple tests
				about what their values should be.

	Valid-Char.Slisp	Exercises the functions in the Characters
				chapter of the CLM.

	Valid-Symbol.Slisp	Exercises the functions in the Symbols chapter
				of the CLM.

Some of the tests in the files may seem silly, but they've uncovered a few bugs
in both Spice Lisp and the Symbolics CLCP.

I think more programs that check things out a chapter (or section) at a time
would be quite valuable, and I'm willing to devote some time to coordinating
such programs into a coherent library.
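
By way of illustration only (a hypothetical fragment, not taken from the
actual <Wholey.Slisp> files), a chapter-at-a-time checker might contain
entries such as:

	;; Assumed reporting style; STANDARD-CHAR-P and CHAR-UPCASE are the
	;; functions under test.
	(unless (standard-char-p #\a)
	  (format t "~&Problem: #\a is not a standard character.~%"))
	(unless (char= (char-upcase #\a) #\A)
	  (format t "~&Problem: (char-upcase #\a) does not return #\A.~%"))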

--Skef

∂13-Oct-84  1451	RPG  	Chairman 
To:   cl-validation@SU-AI.ARPA   

Gary Brown of DEC, Ellen Waldrum of TI, and Skef Wholey of CMU
have volunteered to be chairman of the Validation subgroup. Perhaps
these three people could decide amongst themselves who should be
chairman and let me know by October 24.

			-rpg-

∂27-Oct-84  2159	RPG  	Hello folks   
To:   cl-validation@SU-AI.ARPA   

We now have a chairman of the charter:  Bob Kerns of Symbolics.  I think
he will make an excellent chairman.  For your information I am including
the current members of the mailing list.

I will now let Bob take over responsibility for the discussion.

Dave Matthews		HP		"hpfclp!validation%hplabs"@csnet-relay
Ken Sinclair 		LMI		"khs%mit-oz"@mit-mc
Gary Brown		DEC		Brown@dec-hudson
Ellen Waldrum		TI		WALDRUM.ti-csl@csnet-relay
Skef Wholey		CMU		Wholey@cmuc
John Foderaro		Berkeley	jkf@ucbmike.arpa
Cordell Green		Kestrel		Green@Kestrel
Richard Greenblatt	LMI		"rg%oz"@mc
Richard Fateman		Berkeley	fateman@berkeley
Scott Fahlman		CMU		fahlman@cmuc
Eric Schoen		Stanford	schoen@sumex
Gordon Novak		Univ. of Texas	novak@utexas-20
Kent Pitman		MIT		kmp@mc
Dick Gabriel		Stanford/Lucid	rpg@sail
David Wile		ISI		Wile@ISI-VAXA
Martin Griss		HP		griss.hplabs@csnet-relay (I hope)
Walter VanRoggen	DEC		wvanroggen@dec-marlboro
Richard Zippel		MIT		rz@mc
Dan Oldman		Data General	not established
Larry Stabile		Apollo		not established
Bob Kessler		Univ. of Utah	kessler@utah-20
Steve Krueger		TI		krueger.ti-csl@csnet-relay
Carl Hewitt		MIT		hewitt-Validation@mc
Alan Snyder		HP		snyder.hplabs@csnet-relay
Jerry Barber		Gold Hill	jerryb@mc
Bob Kerns		Symbolics	rwk@mc
Don Allen		BBN		allen@bbnf
David Moon		Symbolics	moon@scrc-stonybrook
Glenn Burke		MIT		GSB@mc
Tom Bylander		Ohio State	bylander@rutgers
Richard Soley		MIT		Soley@mc
Dan Weinreb		Symbolics	DLW@scrc-stonybrook
Guy Steele		Tartan		steele@tl-20a
Jim Meehan		Cognitive Sys.	meehan@yale
Chris Riesbeck		Yale		riesbeck@yale

∂27-Oct-84  2202	RPG  	Correction    
To:   cl-validation@SU-AI.ARPA   

The last message about Bob Kerns had a typo in it. He is chairman
of the validation subgroup, not the charter subgroup. Now you
know my secret about sending out these announcements!
			-rpg-

∂02-Nov-84  1141	brown@DEC-HUDSON 	First thoughts on validation    
Received: from DEC-HUDSON.ARPA by SU-AI.ARPA with TCP; 2 Nov 84  11:38:53 PST
Date: Fri, 02 Nov 84 14:34:24 EST
From: brown@DEC-HUDSON
Subject: First thoughts on validation
To: cl-validation@su-ai
Cc: brown@dec-hudson

I am Gary Brown, and I supervise the Lisp Development group at Digital.
I haven't seen any mail about validation yet, so this is to get things
started.

I think there are three areas we need to address:

 1) The philosophy of validation - What are we going to validate and
    what are we explicitly not going to check?

 2) The validation process - What kind of mechanism should be used to
    implement the validation suite, to maintain it, to update it and
    actually validate Common Lisp implementations?

 3) Creation of an initial validation suite - I believe we could disband
    after reporting on the first two areas, but it would be fun if we
    could also create a prototype validation suite.  Plus, we probably
    can't do a good job specifying the process if we haven't experimented.

Here are my initial thoughts about these three areas:

PHILOSOPHY
We need to clearly state what the validation process is meant to 
accomplish and what it is not intended to accomplish.  There are
aspects of a system of interest to users which we cannot validate.
For example, language validation should not be concerned with:
 - The performance/efficiency of the system under test.  There should
   be no timing tests built into the validation suite.
 - The robustness of the system.  How it responds to errors and the
   usefulness of its error messages should not be considerations in
   the design of tests.
 - Support tools such as debuggers and editors should not be tested
   or reported on.
In general, the validation process should report only on  whether or
not the implementation is a legal Common Lisp as defined by the
Common Lisp reference manual.  Any other information derived from
the testing process should not be made public.  The testing process
must not produce information which can be used by vendors as advertisements
for their implementations or to degrade other implementations.

We need to state how we will test language elements which are ill-defined
in the reference manual.  For example, if the manual states that it
is "an error" to do something, then we cannot write a test for that
situation.  However, if the manual states that an "error is signaled"
then we should verify that. 

There are several functions in the language whose action is implementation
dependent.  I don't see how we can write a test for INSPECT or for
the printed appearance when *PRINT-PRETTY* is on (however, we can
ensure that what is printed is still READable).

PROCESS
We need to describe a process  for language validation.  We could
have a very informal process where the test programs are publicly
available and  potential customers acquire and run the tests.  However, 
I think we need, at least initially, a more formal process.

A contract should be written (with ARPA money?) to some third party
software house to produce and maintain the validation programs, to
execute the tests, and to report the results.  I believe the Ada
validation process works something like this:
 - Every six months a "field test" version of the validation suite
   is produced (and the previous field test version is made the
   official version).  Interested parties can acquire the programs,
   run them, and comment back to SofTech.
 - When an implementation wants to validate, it tells some government
   agency, gets the current validation suite, runs it, and sends all
   the output back.
 - An appointment is then set up, and people from the validation agency
   come to the vendor and run all the tests themselves, again bundling
   up the output and taking it away.
 - Several weeks later, the success of the testing is announced.

This seems like a reasonable process to me.  We might want to modify
it by:
 - Having the same agency that produced the tests validate the results.
 - Getting rid of the on-site visit requirement; it's expensive.  I
   think the vendor needs to include a check for $10,000 when they
   request validation.  That might be hard for universities to
   justify.

Some other things I think need to be set up are:
 - A good channel from the test producers to the language definers 
   for quick clarifications and to improve the manual
 - Formal ways to complain about the contents of tests
 - Ways for new tests to be suggested.  Customers are sure to
   find bugs in validated systems, so it would be invaluable if
   they could report these as holes in the test system.

A FIRST CUT
To do a good job defining the validation process, I think we need to
try to produce a prototype test system.  At Digital we have already
expended considerable effort writing tests for VAX LISP and I assume that
everyone else implementing Common Lisp has done the same.  Currently, our
test software is considered proprietary information.  However, I believe
that we would be willing to make it public domain if the other vendors
were willing to do the same. 

If some kind of informal agreement can be made, we should try to specify
the form of the tests, have everyone convert their applicable tests
to this form and then exchange tests.  This will surely generate
a lot of information on how the test system should be put together.

-Gary Brown

∂04-Nov-84  0748	FAHLMAN@CMU-CS-C.ARPA 	Second thoughts on validation   
Received: from CMU-CS-C.ARPA by SU-AI.ARPA with TCP; 4 Nov 84  07:47:00 PST
Received: ID <FAHLMAN@CMU-CS-C.ARPA>; Sun 4 Nov 84 10:47:06-EST
Date: Sun, 4 Nov 1984  10:47 EST
Message-ID: <FAHLMAN.12060893556.BABYL@CMU-CS-C.ARPA>
Sender: FAHLMAN@CMU-CS-C.ARPA
From: "Scott E. Fahlman" <Fahlman@CMU-CS-C.ARPA>
To:   cl-validation@SU-AI.ARPA
Subject: Second thoughts on validation


I agree with all of Gary Brown's comments on the proper scope of
validation.  The only point that may cause difficulty is the business
about verifying that an error is signalled in all the places where this
is specified.  The problem there is that until the Error subgroup does
its thing, we have no portable way to define a Catch-All-Errors handler
so that the validation program can intercept such signals and proceed.
Maybe we had better define such a hook right away and require that any
implementation that wants to be validated has to support this, in
addition to whatever more elegant hierarchical system eventually gets
set up.  The lack of such a universal ERRSET mechanism is clearly a
design flaw in the language.  We kept putting this off until we could
figure out what the ultimate error handler would look like, and so far
we haven't done that.

As for the process, I think that the validation suite is naturally going
to be structured as a series of files, each of which contains a function
that will test some particular part of the language: a chapter's worth
or maybe just some piece of a chapter such as lambda-list functionality.
That way, people can write little chunks of validation without being
overwhelmed by the total task.  Each such file should have a single
entry point to a master function that runs everything else in the file.
These things should print out an informative message whenever they notice
an implementation error.  They can also print out some other commentary
at the implementor's discretion, but probably there should be a switch
that will muzzle anything other than hard errors.  Finally, there should
be some global switch that starts out as NIL and gets set to T whenever
some module finds a clear error.  If this is still NIL after every
module has done its testing, the implementation is believed to be
correct.  I was going to suggest a counter for this, but then we might
get some sales rep saying that Lisp X has 14 validation errors and our
Lisp only has 8.  That would be bad, since some errors are MUCH more
important than others.
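
A skeletal module along those lines might look like the following (a
sketch only; the variable and function names here are assumptions, not
anything the group has agreed on):

     ;; Hypothetical skeleton of one validation module.
     (defvar *validation-failed* nil
       "Set to T whenever a module detects a clear implementation error.")

     (defvar *muzzle-commentary* t
       "When T, print nothing except reports of hard errors.")

     (defun validate-lambda-lists ()
       "Master entry point for this module; runs every check in the file."
       (unless *muzzle-commentary*
         (format t "~&Checking lambda-list functionality...~%"))
       (unless (equal (funcall #'(lambda (a &optional (b 2)) (list a b)) 1)
                      '(1 2))
         (setq *validation-failed* t)
         (format t "~&Implementation error: &OPTIONAL default value not used.~%")))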

To get the ball rolling, we could begin collecting public-domain
validation modules in some place that is easily accessible by arpanet.
As these appear, we can informally test various implementations against
them to smoke out any inconsistencies or disagreements about the tests.
I would expect that when this starts, we'll suddenly find that we have a
lot of little questions to answer about the language itself, and we'll
have to do our best to resolve those questions quickly.  Once we have
reached a consensus that a test module is correct, we can add it to some
sort of "approved" list, but we should recognize that, initially at
least, the testing module is as likely to be incorrect as the
implementation.

As soon as possible, this process of maintaining and distributing the
validation suite (and filling in any holes that the user community does
not fill voluntarily) should fall to someone with a DARPA contract to do
this.  No formal testing should begin until this organization is in
place and until trademark protection has been obtained for "DARPA
Validated Common Lisp" or whatever we are going to call it.  But a lot
can be done informally in the meantime.

I don't see a lot of need for expensive site visits to do the
validating.  It certainly doesn't have to be a one-shot win-or-lose
process, but can be iterative until all the tests are passed by the same
system, or until the manufacturer decides that it has come as close as
it is going to for the time being.  Some trusted (by DARPA), neutral
outside observer needs to verify that the hardware/software system in
question does in fact run the test without any chicanery, but there are
all sorts of ways of setting that up with minimal bureaucratic hassle.
We should probably not be in the business of officially validating
Common Lisps on machines that are still under wraps and are not actually
for sale, but the manufacturers (or potential big customers) could
certainly run the tests for themselves on top-secret prototypes and be
ready for official validation as soon as the machine is released to the
public.

I'm not sure how to break the deadlock in which no manufacturer wants to
be the first to throw his proprietary validation software into the pot.
Maybe this won't be a problem, if one of the less bureaucratic companies
just decides to take the initiative here.  But if there is such a
deadlock, I suppose the way to proceed is first to get a list of what
each company proposes to offer, then to get agreement from each that it
will donate its code if the others do likewise, then to get some lawyer
(sigh!) to draw up an agreement that all this software will be placed in
the public domain on a certain date if all the other companies have
signed the agreement by that date.  It would be really nice to avoid
this process, however.  I see no advantage at all for a company to have
its own internal validation code, since until that code has been
publicly scrutinized, there is no guarantee that it would be viewed as
correct by anyone else or that it will match the ultimate standard.

-- Scott

∂07-Nov-84  0852	brown@DEC-HUDSON 	test format 
Received: from DEC-HUDSON.ARPA by SU-AI.ARPA with TCP; 7 Nov 84  08:43:57 PST
Date: Wed, 07 Nov 84 11:40:37 EST
From: brown@DEC-HUDSON
Subject: test format
To: cl-validation@su-ai

First, I would hope that submission of test software will not require
any lawyers.  I view this as a one-time thing, the only purpose of which
is to get some preliminary test software available to all implementations,
and to give this committee some real data on language validation.
The creation and maintenance of the real validation software should be
the business of the third party funded to do this.  I would hope that
they can use what we produce, but that should not be a requirement.

If we are going to generate some preliminary tests, we should develop
a standard format for the tests.   I have attached a condensed and
reorganized version of the "developers guide" for our test system.
Although I don't think our test system is particularly elegant, it
basically works.  There are a few things I might change someday:

  - The concept of test ATTRIBUTES is not particularly useful.  We
    have never run tests by their attributes but always run a whole
    file full of them.  

  - The expected result is not evaluated (under the assumption that
    if it were, most of the time you would end up quoting it).  That
    is sometimes cumbersome.

  - There is not a built-in way to check multiple value return.  You
    make the test-case do a multiple-value-list and look at the list.
    That is sometimes cumbersome and relatively easy to fix.

  - We haven't automated the analysis of the test results.

  - Our test system is designed to handle a lot of little tests and I
    think that it doesn't simplify writing complex tests.  I have
    never really thought about what kind of tools would be useful.

If we want to try to build some tests, I am willing to change our test
system to incorporate any good ideas and make it available.

-Gary



     1  A SAMPLE TEST DEFINITION

          Here is the test for GET.

     (def-lisp-test (get-test :attributes (symbols get)
                              :locals (clyde foo))
       "A test of get.  Uses the examples in the text."
       ((fboundp 'get) ==> T)
       ((special-form-p 'get) ==> NIL)
       ((macro-function 'get) ==> NIL)
       ((progn
           (setf (symbol-plist 'foo) '(bar t baz 3 hunoz "Huh?"))
           (get 'foo 'bar))
         ==> T)
       ((get 'foo 'baz) ==> 3)
       ((get 'foo 'hunoz) ==> "Huh?")
       ((prog1
           (get 'foo 'fiddle-sticks)
           (setf (symbol-plist 'foo) NIL))
         ==> NIL)
       ((get 'clyde 'species) ==> NIL)
       ((setf (get 'clyde 'species) 'elephant) ==> elephant)
       ((get 'clyde) <error>)
       ((prog1
           (get 'clyde 'species)
           (remprop 'clyde 'species))
         ==> elephant)
       ((get) <error>)
       ((get 2) <error>)
       ((get 4.0 'f) <error>))
     Notice that everything added to the property list is taken off  again,
     so  that  the  test's  second run will also work.  Notice also that it
     isn't wise to start by testing for

             ((get 'foo 'baz)  ==> NIL)

     as someone may have decided to give FOO the property  BAZ  already  in
     another test.



     2  DEFINING LISP TESTS

          Tests are defined with the DEF-LISP-TEST macro.

     DEF-LISP-TEST {name | (name &KEY :ATTRIBUTES :LOCALS)}           [macro]
                   [doc-string] test-cases









     3  ARGUMENTS TO DEF-LISP-TEST

     3.1  Name

          NAME is the name of the  test.   Please  use  the  convention  of
     calling  a  test FUNCTION-TEST, where FUNCTION is the name of (one of)
     the function(s) or variable(s) tested by that test.  The  symbol  name
     will  have  the  expanded test code as its function definition and the
     following properties:

           o  TEST-ATTRIBUTES - A list of all the attribute  symbols  which
              have this test on their TEST-LIST property.

           o  TEST-DEFINITION -  The  expanded  test  code.   Normally  the
              function  value  of  the  test is compiled; the value of this
              property is EVALed to run the test interpreted.

           o  TEST-LIST - The list of tests  with  NAME  as  an  attribute.
              This list will contain at least NAME.




     3.2  Attributes

          The value of :ATTRIBUTES is a list of  "test  attributes".   NAME
     will  be  added to this list.  Each symbol on this list will have NAME
     added to the list which is the value of its TEST-LIST property.



     3.3  Locals

          Local variables can be specified  and  bound  within  a  test  by
     specifying the :LOCALS keyword followed by a list of the form used in a
     let var-list.  For example, specifying the list (a b c)  causes  a,  b
     and c each to be bound to NIL during the run of the test; the list ((a
     1) (b 2) (c 3)) causes a to be bound to 1, b to 2, and c to  3  during
     the test.



     3.4  Documentation String

          DOC-STRING is a normal documentation string of documentation type
     TESTS.   To  see  the documentation string of a function FOO-TEST, use
     (DOCUMENTATION 'FOO-TEST 'TESTS).   The  documentation  string  should
     include  the  names of all the functions and variables to be tested in
     that test.  Mention if there is anything missing from the  test,  e.g.
     tests of the text's examples.






     3.5  Test Cases

          TEST-CASES (the remainder of the body) is a series of test cases.
     Each  test  case  is  a  list of a number of elements as follows.  The
     order specified here must hold.



     3.5.1  Test Body -

          A form to be executed as the test body.  If it  returns  multiple
     values, only the first will be used.



     3.5.2  Failure Option -

          The symbol <FAILURE> can be used to indicate that the  test  case
     is  known  to  cause  an  irrecoverable  error  (e.g.  it goes into an
     infinite loop).  When the test case is run, the code is not  executed,
     but  a  message  is  printed  to  remind you to fix the problem.  This
     should be followed by normal result options.  Omission of this  option
     allows the test case to be run normally.



     3.5.3  Result Options -



     3.5.3.1  Comparison Function And Expected Result -

          The Test Body will be compared with the Expected Result using the
     function EQUAL if you use
             ==> expected-result
     or with the function you specify if you use
             =F=> function expected-result
     There MUST be white-space after ==> and =F=>, as they are  treated  as
     symbols.   Notice  that neither function nor expected-result should be
     quoted.  "Function" must be defined; an explicit lambda form is legal.
     "Expected-Result"  is the result you expect in evaluating "test-body".
     It is not evaluated.  The comparison function will be called  in  this
     format:
             (function test-body 'expected-value)



     3.5.3.2  Errors -

          <ERROR> - The test is expected to signal  an  error.   This  will
     normally  be  used  with  tests which are expected to generate errors.
     This is an alternative  to  the  comparison  functions  listed  above.
     There should not be anything after the symbol <ERROR>.  It checks that



     an error is signaled when the test case is run interpreted,  and  that
     an  error  is  signaled  either  during the compilation of the case or
     while the case is being evaluated when the test is run compiled.



     3.5.3.3  Throws -

          =T=> - throw-tag result - The test is expected to  throw  to  the
     specified  tag  and  return  something  EQUAL to the specified result.
     This clause is only required for a small number of tests.  There  must
     be  a  space  after  =T=>,  as  it is treated as a symbol.  This is an
     alternative to the functions given above.  This does not work compiled
     at the moment, due to a compiler bug.



     4  RUNNING LISP TESTS

          The function RUN-TESTS can be called with no arguments to run all
     the  tests,  with  a  symbol which is a test name to run an individual
     test, or with a list of symbols, each of which is an attribute, to run
     all  tests  which have that attribute.  Remember that the test name is
     always added to the attribute list automatically.

          The special variable *SUCCESS-REPORTS* controls whether  anything
     will be printed for successful test runs.  The default value is NIL.

          The special variable *START-REPORTS* controls whether  a  message
     containing  the  test  name  will be printed at the start of each test
     execution.  The default value is NIL.  If *SUCCESS-REPORTS* is T, this
     variable is treated as T also.

          The special variable *RUN-COMPILED-TESTS*  controls  whether  the
     "compiled"  versions  of the specified tests will be run.  The default
     value is T.

          The special variable *RUN-INTERPRETED-TESTS* controls whether the
     "interpreted"  versions  of  the  specified  tests  will  be run.  The
     default value is T.

          The special  variable  *INTERACTIVE*  controls  whether  you  are
     prompted  after  unexpected errors for whether you would like to enter
     debug.   It  uses  yes-or-no-p.   To  continue  running  tests   after
     entering  debug  after  one  of  these  prompts,  type  CONTINUE.  If
     *INTERACTIVE* is set to T, the test system  will  do  this  prompting.
     The default value is NIL.
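
          By way of illustration only (this example is not part of the
     original guide), a typical interaction with these entry points might
     look like:

             (setf *success-reports* t)
             (run-tests 'get-test)      ; run the single test named GET-TEST
             (run-tests '(symbols))     ; run all tests with the SYMBOLS attribute
             (run-tests)                ; run every test in the system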



     5  GUIDE LINES FOR WRITING TEST CASES

          1.  The first several test cases in each test should be tests for



     the  existence  and correct type of each of the functions/variables to
     be    tested    in    that    test.     A    variable,     such     as
     *DEFAULT-PATHNAME-DEFAULTS*, should have tests like these:

             ((boundp '*default-pathname-defaults*) ==> T)
             ((pathnamep *default-pathname-defaults*) ==> T)


          A function, such as OPEN, should have these tests:

             ((fboundp 'open) ==> T)
             ((macro-function 'open) ==> NIL)
             ((special-form-p 'open) ==> NIL)


          A macro, such as WITH-OPEN-FILE, should have these tests:

             ((fboundp 'with-open-file) ==> T)
             ((not (null (macro-function 'with-open-file))) ==> T)

     Note that, as MACRO-FUNCTION returns the function definition (if it is
     a  macro)  or  NIL  (if  it  isn't  a  macro),  we  use NOT of NULL of
     MACRO-FUNCTION here.  Note also that a macro may  also  be  a  special
     form,  so  SPECIAL-FORM-P  is not used:  we don't care what the result
     is.

          A special form, such as SETQ, should have these tests:

             ((fboundp 'setq) ==> T)
             ((not (null (special-form-p 'setq))) ==> T)

     Again, note that SPECIAL-FORM-P returns the function definition (if it
     is  a  special  form)  or  NIL (if it isn't), so we use NOT of NULL of
     SPECIAL-FORM-P here.  Note also that we don't care  if  special  forms
     are also macros, so MACRO-FUNCTION is not used.



          2.  The next tests  should  be  simple  tests  of  each  of  your
     functions.   If  you  start  right  in  with complicated tests, it can
     become difficult to unravel simple bugs.  If possible, create one-line
     tests which only call one of the functions to be tested.

          E.g.  for +:

             ((+ 2 10) ==> 12)




          3.  Test each of the examples given in the Common Lisp Manual.





          4.  Then test more complicated cases.  Be sure to test both  with
     and  without each of the optional arguments and keyword arguments.  Be
     sure to test what the manual SAYS, not what you know that we do.



          5.  Then test for obvious cases which  should  signal  an  error.
     Obvious  things  to test are that it signals an error if there are too
     few or too many arguments, or if the argument is of  the  wrong  type.
     E.g.  for +

             ((+ 2 'a) <ERROR>)




     6  HINTS

          Don't try to be  clever.   What  we  need  first  is  a  test  of
     everything.   If  we decide that we need "smarter" tests later, we can
     go back and embellish.  Right now we need to have a  test  that  shows
     whether the functions and variables we are supposed to have are there,
     and that tells whether  at  first  glance  the  function  is  behaving
     properly.  Even with simple tests this test system will be huge.

          Don't write long test cases if you can help it.  Think about  the
     kind  of error messages you might get and how easy it will be to debug
     them.

          Remember that, although the test system guarantees that the  test
     cases  within  one  test are run in the order defined, no guarantee is
     made that your tests will be run  in  the  order  in  which  they  are
     loaded.   Do  not  write  tests which depend on other tests having run
     before them.

          It is now possible to check for cases which should signal errors;
     please do.

          I have found it easiest to compose and  then  debug  tests  which
     have no more than 20 cases.  Once a test works I often add a number of
     cases, however, and I do have some  with  over  100  cases.   However,
     sometimes  tests  with as few as 10 cases can be difficult to unravel,
     if, for example, the test won't compile properly.  Therefore, if there
     is  a  group  of related functions which require many tests each, I am
     more likely to have a separate test for each function.  If testing one
     function is made easier by also testing another (e.g.
     define-logical-name, translate-logical-name and  delete-logical-name),
     it  can  be advantageous to test them together.  It is not a good idea
     to make the test cases or returned values very large, however.   Also,
     when many functions are tested in the same test, it is likely that the
     tests can get complicated to debug and/or that some aspect of  one  of
     the  functions  tested  could be forgotten.  Therefore, I would prefer
     that you NOT write, say, four or five tests, each of which is supposed



     to  test  all  of  the  functions  in one part of the manual.  I would
     prefer that a function have a test which is dedicated to it  (even  if
     it  is  shared with one or two other functions).  This means that some
     functions will be used not just in tests of themselves,  but  also  in
     tests of related functions; but that is ok.

          Remember that each test will be run twice by the test system.  So
     if your test changes something, change it back.



     7  EXAMPLES

     7.1  Comparison Function

          If you use the "( code =F=> comparison-function result )" format,
     the result is now determined by doing (comparison-function code (quote
     result)).

             (2 =F=> < 4)   <=>   (< 2 4)
             (2 =F=> > 4)   <=>   (> 2 4)

     Notice that the new comparison function you introduce is unquoted.

          You may also use an explicit lambda form.  For example,

             (2 =F=> (lambda (x y) (< x y)) 4)   <=>  (< 2 4)




     7.2  Expected Result

          Remember  that  the  returned  value  for  a  test  case  is  not
     evaluated;  so  "==>  elephant" means is it EQUAL to (quote elephant),
     not to the value of elephant.

          Consequently, this is in error:

             ((mapcar #'1+ (list 0 1 2 3)) ==> (list 1 2 3 4))

     and this is correct:

             ((mapcar #'1+ (list 0 1 2 3)) ==> (1 2 3 4))


                          *Tests Return Single Values*

             A test returns exactly one value; a test of a function
             which returns multiple values must be written as:

                     (MULTIPLE-VALUE-LIST form)




                             *Testing Side Effects*

             A test of a side effecting function must  verify  that
             the  function  both  returns  the  correct  value  and
             correctly causes the side effect.  The following  form
             is an example of a body that does this:

                 ((LET (FOO) (LIST (SETF FOO '(A B C)) FOO))
                     ==> ((A B C) (A B C)))





     7.3  Throw Tags

          The throw tag is also not evaluated.

          You must have either "==> <result>" or "=F=>  comparison-function
     <result>" or "=T=> throw-tag <result>" or "<ERROR>" in each test case.
     Remember that you may no longer use <-T- or <-S-.  For  example,  this
     would be correct:

             ((catch 'samson
                  (throw 'delilah 'scissors))
               =T=> delilah scissors)

     This test case would cause an unexpected error:

             ((catch 'samson
                     (throw 'delilah 'scissors))
               ==> scissors)




     7.4  Expected Failures

          Any test case can have the <FAILURE> option inserted to  indicate
     that  the  code  should not be run.  For example, these test cases are
     innocuous:
             ((dotimes (count 15 7)
                  (setf count (1- count)))
               <failure> ==> 7)

             ((dotimes (count 15 7)
                  (setf count (1- count)))
               <failure> =F=> <= 7)

             ((throw 'samson (dotimes (count 15 7)
                                 (setf count (1- count))))
               <failure> =T=> samson 7)




             ((car (dotimes (count 15 7)
                       (setf count (1- count))))
               <failure> <error>)
     Obviously, you are not expected to introduce infinite loops  into  the
     test cases deliberately.



     7.5  Sample Error And Success Reports

          A test with cases which all succeed will run with  no  output  if
     *SUCCESS-REPORTS*  is  NIL;  if  it is set to T, output will look like
     this:
     ************************************************************************ 
     Starting: GET-TEST 
     A test of get.  Uses the examples in the text.

     TESTS:GET-TEST succeeded in compiled cases
      1 2 3 4 5 6 7 8 9 10 11 12 13 14

     TESTS:GET-TEST succeeded in interpreted cases
      1 2 3 4 5 6 7 8 9 10 11 12 13 14


          If a test case evaluates properly but returns the wrong value, an
     error   report   will   be   made   irrespective  of  the  setting  of
     *SUCCESS-REPORTS*.  The  reports  include  the  test  case  code,  the
     expected  result, the comparison function used, and the actual result.
     For example, if you run this test:

             (def-lisp-test (+-test :attributes (numbers +))
               ((+) ==> 0)
               ((+ 2 3) ==> 4)
               ((+ -4 -5) =F=> >= 0))

     The second and third cases are wrong, so there  will  be  bug  reports
     like this:
     ←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←
     TESTS:+-TEST 
     Error in compiled case 2.

     Expected: (+ 2 3)

     to be EQUAL to: 4

     Received: 5
     -----------------------------------------------------------------------

     ←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←
     TESTS:+-TEST
     Error in compiled case 3.

     Expected: (+ -4 -5)



     to be >= to: 0

     Received: -9
     ------------------------------------------------------------------------

          Unexpected errors cause a report which includes  the  code  which
     caused  the  error,  the expected result, the error condition, and the
     error message from the error system.  As with other errors, these bugs
     are  reported  regardless  of  the  setting of *SUCCESS-REPORTS*.  For
     example:

             (def-lisp-test (=-test :attributes (numbers =))
               ((fboundp '=) ==> T)
               ((macro-function '=) ==> NIL)
               ((special-form-p '=) ==> NIL))

     The following report is given if MACRO-FUNCTION is undefined:

     ←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←←← 
     TESTS:=-TEST compiled case 2 caused an unexpected 
      correctable error in function *EVAL. 

     Expected: (MACRO-FUNCTION '=) 

     to be EQUAL to: NIL 

      The error message is: 
     Undefined function: MACRO-FUNCTION.

     -----------------------------------------------------------------------




     8  RUNNING INDIVIDUAL TEST CASES

          The interpreted version of a test case can be  run  individually.
     Remember that if any variables are used which are modified in previous
     test cases, the results will not be "correct"; for example, any  local
     variables bound for the test with the :LOCALS keyword are not bound if
     a test case is run with this function.  The format is
        (RUN-TEST-CASE test-name test-case)
     Test-name is a symbol; test-case is an integer.
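
          For instance (an illustration, not from the original guide), the
     first case of the GET-TEST shown earlier could be run by itself with:

        (RUN-TEST-CASE 'GET-TEST 1)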



     9  PRINTING TEST CASES

          There are some new functions:

        (PPRINT-TEST-DEFINITION name)
        (PPRINT-TEST-CASE name case-number)
        (PPRINT-ENTIRE-TEST-CASE name case-number)



        (PPRINT-EXPECTED-RESULT name case-number)


          In each case, name is a  symbol.   In  the  latter  three  cases,
     case-number is a positive integer.

          PPRINT-TEST-DEFINITION pretty prints the expanded test code for a
     test.

          PPRINT-TEST-CASE pretty prints the test code for the  body  of  a
     test case; i.e.  the s-expression on the left of the arrow.

          PPRINT-ENTIRE-TEST-CASE pretty prints the  entire  expanded  test
     code   for   the  case  in  question,  i.e.   rather  more  than  does
     PPRINT-TEST-CASE and rather less than PPRINT-TEST.

          PPRINT-EXPECTED-RESULT pretty prints the expected result for  the
     test case specified.  This cannot be done for a case which is expected
     to signal an error, as in that case there is no comparison of expected
     and actual result.

∂09-Nov-84  0246	RWK@SCRC-STONY-BROOK.ARPA 	Hello   
Received: from SCRC-STONY-BROOK.ARPA by SU-AI.ARPA with TCP; 9 Nov 84  02:46:18 PST
Received: from SCRC-HUDSON by SCRC-STONY-BROOK via CHAOS with CHAOS-MAIL id 123755; Thu 8-Nov-84 21:32:33-EST
Date: Thu, 8 Nov 84 21:33 EST
From: "Robert W. Kerns" <RWK@SCRC-STONY-BROOK.ARPA>
Subject: Hello
To: cl-validation@SU-AI.ARPA
Message-ID: <841108213326.0.RWK@HUDSON.SCRC.Symbolics.COM>

Hello.  Welcome to the Common Lisp Validation committee.  Let me
introduce myself, in general terms, first.

I am currently the manager of Lisp System Software at Symbolics,
giving me responsibility for overseeing our Common Lisp effort,
among other things.  Before I became a manager, I was a developer
at Symbolics.  In the past I've worked on Macsyma, MacLisp and NIL
at MIT, and I've worked on object-oriented systems on these systems.

At Symbolics, we are currently preparing our initial Common Lisp
offering for release.  Symbolics has been a strong supporter of Common
Lisp in its formative years, and I strongly believe that needs to
continue.  Why do I mention this?  Because I think one form of support
is to contribute our validation tests as we collect and organize them.

I urge other companies to do likewise.  I believe we all have
far more to gain than to lose.  I believe there will be far more
validation code available in the aggregate than any one company
will have available by itself.  In addition, validation tests from
other places have the advantage of bringing a fresh perspective
to your testing.  It is all too easy to test for the things you
know you made work, and far too difficult to test for the more
obscure cases.

As chairman, I see my job as twofold:

1)  Facilitate communication, cooperation, and decisions.
2)  Facilitate the implementation of decisions of the group.

Here's an agenda I've put together of things I think we
need to discuss.  What items am I missing?  This is nothing
more than my own personal agenda to start people thinking.

First, the development issues:

1)  Identify what tests are available.  So far, I know of
the contribution by Skef Wholey.  I imagine there will be
others forthcoming once people get a chance to get them
organized.  (Myself included).

2)  Identify a central location to keep the files.  We
need someone on the Arpanet to volunteer some space for
files of tests, written proposals, etc.  Symbolics is
not on the main Arpanet currently, so we aren't a good
choice.  Volunteers?

    Is there anyone who cannot get to files stored on
the Arpanet?  If so, please contact me, and I'll arrange
to get files to you via some other medium.

3)  We need to consider the review process for proposed
tests.  How do we get tests reviewed by other contributors?
We can do it by FTPing the files to the central repository
and broadcasting a request to the list to evaluate them.
Would people prefer some less public form of initial evaluation?

4)  Test implementation tools.  We have one message from Gary Brown
describing his tool.  I have a tool written using flavors that I
hope to de-flavorize and propose.  I think we would do well to standardize
in this area as much as possible.

5)  Testing techniques.  Again, Gary Brown has made a number of excellent
suggestions here.  I'm sure we'll all be developing experience that we
can share.

6)  What areas do we need more tests on?

And there are a number of political, procedural, and policy issues that
need to be resolved.

7)  Trademark/copyright issues.  At Monterey, DARPA volunteered to
investigate trademarking and copyrighting the validation suite.
RPG: have you heard anything on this?

8)  How do we handle disagreements about the language?  This was
discussed at the Monterey meeting, and I believe the answer is that, if
we can't work it out, we ask the Common Lisp mailing list, and
especially the Gang of Five, for a clarification.  At any rate,
I don't believe it is in our charter to resolve language issues.
I expect we will IDENTIFY a lot of issues, however.

I don't think the rest of these need to be decided any time soon.
We can discuss them now, or we can wait.

9)  How does a company (or University) get a Common Lisp implementation
validated, and what does it mean?  We can discuss this now, but I
don't think we have to decide it until we produce our first validation
suite.

10) How do we distribute the validation suites?  I hope we can do most
of this via the network.  I am willing to handle distributing it to
people off the network until it gets too expensive in time or tapes.
We will need a longer-term solution to this, however.

11) Longer term maintenance of the test suites.  I think having a
commercial entity maintain it doesn't make sense until we get the
language into a more static situation.  I don't think there is
even agreement that this is the way it should work, for that
matter, but we have plenty of time to discuss this, and the situation
will be changing in the meantime.

So keep those cards and letters coming, folks!

∂12-Nov-84  1128	brown@DEC-HUDSON 	validation process    
Received: from DEC-HUDSON.ARPA by SU-AI.ARPA with TCP; 12 Nov 84  11:25:11 PST
Date: Mon, 12 Nov 84 14:26:14 EST
From: brown@DEC-HUDSON
Subject: validation process
To: cl-validation@su-ai

I am happy to see that another vendor (Symbolics) is interested in sharing
tests.  I too believe we all have much to gain by this kind of cooperation.

Since it seems that we will be creating and running tests, I would like
to expand a bit on an issue I raised previously - the ethics of validation.
A lot of information, either explicit or intuitive, concerning the quality
of the various implementations will surely be passed around on this mailing
list.  I believe that this information must be treated confidentially.  I
know of two recent instances when perceived bugs in our implementation of
Common Lisp were brought up in sales situations.  I cannot actively
participate in these discussions unless we all intend to keep this
information private.

I disagree with the last point of Bob's "Hello" mail - the long term maintenance
of the test suite (however, I agree that we have time to work this out).
I believe that our recommendation should be that ARPA immediately fund a
third party to create/maintain/administer language validation.

One big reason is to guarantee impartiality and to protect ourselves.
If Common Lisp validation becomes a requirement for software on RFPs,
big bucks might be at stake and we need to guarantee that the process is
impartial and, I think, we want a lot of distance between ourselves and
the validation process.  I don't want to get sued by XYZ inc. because their
implementation didn't pass and this caused them to lose a contract and go
out of business.

Of course, if ARPA isn't willing to fund this, then we Common Lispers will
have to do something ourselves.  It would be useful if we could get
some preliminary indication from ARPA about their willingness to fund
this type of effort.

∂12-Nov-84  1237	FAHLMAN@CMU-CS-C.ARPA 	validation process    
Received: from CMU-CS-C.ARPA by SU-AI.ARPA with TCP; 12 Nov 84  12:36:09 PST
Received: ID <FAHLMAN@CMU-CS-C.ARPA>; Mon 12 Nov 84 15:35:13-EST
Date: Mon, 12 Nov 1984  15:35 EST
Message-ID: <FAHLMAN.12063043155.BABYL@CMU-CS-C.ARPA>
Sender: FAHLMAN@CMU-CS-C.ARPA
From: "Scott E. Fahlman" <Fahlman@CMU-CS-C.ARPA>
To:   brown@DEC-HUDSON.ARPA
Cc:   cl-validation@SU-AI.ARPA
Subject: validation process
In-reply-to: Msg of 12 Nov 1984  14:26-EST from brown at DEC-HUDSON


I don't see how confidentiality of validation results can be maintained
when the validation suites are publicly available (as they must be).
If DEC has 100 copies of its current Common Lisp release out in
customer-land, and if the validation programs are generally available to
users and manufacturers alike, how can anyone reasonably expect that
users will not find out that this release fails test number 37?  I think
that any other manufacturer had better be without sin before casting the
first stone in a sales presentation, but certainly there will be some
discussion of which implementations are fairly close and which are not.
As with benchmarks, it will take some education before the public can
properly interpret the results of such tests, and not treat the lack of
some :FROM-END option as a sin of equal magnitude to the lack of a
package system.

The only alternative that I can see is to keep the validation suite
confidential in some way, available only to manufacturers who promise to
run it on their own systems only.  I would oppose that, even if it means
that some manufacturers would refrain from contributing any tests that
their own systems would find embarrassing.  It seems to me that making
the validation tests widely available is the only way to make them
widely useful as a standardization tool and as something that can be
pointed at when a contract wants to specify Common Lisp.  Of course, it
would be possible to make beta-test users agree not to release any
validation results, just as they are not supposed to release benchmarks.

I agree with Gary that we probably DO want some organization to be the
official maintainer of the validation stuff, and that this must occur
BEFORE validation starts being written into RFP's and the like.  We
would have no problem with keeping the validation stuff online here at
CMU during the preliminary development phase, but as soon as the lawyers
show up, we quit.

-- Scott

∂12-Nov-84  1947	fateman%ucbdali@Berkeley 	Re:  validation process 
Received: from UCB-VAX.ARPA by SU-AI.ARPA with TCP; 12 Nov 84  19:47:22 PST
Received: from ucbdali.ARPA by UCB-VAX.ARPA (4.24/4.39)
	id AA10218; Mon, 12 Nov 84 19:49:39 pst
Received: by ucbdali.ARPA (4.24/4.39)
	id AA13777; Mon, 12 Nov 84 19:43:29 pst
Date: Mon, 12 Nov 84 19:43:29 pst
From: fateman%ucbdali@Berkeley (Richard Fateman)
Message-Id: <8411130343.AA13777@ucbdali.ARPA>
To: brown@DEC-HUDSON, cl-validation@su-ai
Subject: Re:  validation process

I think that confidentiality of information on this mailing list is
unattainable, regardless of its desirability.

∂13-Nov-84  0434	brown@DEC-HUDSON 	Confidentiality loses  
Received: from DEC-HUDSON.ARPA by SU-AI.ARPA with TCP; 13 Nov 84  04:34:11 PST
Date: Tue, 13 Nov 84 07:35:21 EST
From: brown@DEC-HUDSON
Subject: Confidentiality loses
To: fahlman@cmu-cs-c
Cc: cl-validation@su-ai

I guess you are right.  I can't expect the results of public domain tests
or the communications on this mailing list to be treated confidentially.
So, I retract the issue.  I'll make sure that my own comments are not "sensitive".
-Gary

∂18-Dec-85  1338	PACRAIG@USC-ISIB.ARPA 	Assistance please?    
Received: from USC-ISIB.ARPA by SU-AI.ARPA with TCP; 18 Dec 85  13:36:21 PST
Date: 18 Dec 1985 11:17-PST
Sender: PACRAIG@USC-ISIB.ARPA
Subject: Assistance please?
From:  Patti Craig <PACraig@USC-ISIB.ARPA>
To: CL-VALIDATION@SU-AI.ARPA
Message-ID: <[USC-ISIB.ARPA]18-Dec-85 11:17:56.PACRAIG>

Hi,

Need some information relative to the CL-VALIDATION@SU-AI
mailing list.  Would the maintainer of same please contact
me.

Thanks,

Patti Craig
USC-Information Sciences Institute

∂12-Mar-86  2357	cfry%OZ.AI.MIT.EDU@MC.LCS.MIT.EDU 	Validation proposal 
Received: from MC.LCS.MIT.EDU by SU-AI.ARPA with TCP; 12 Mar 86  23:56:26 PST
Received: from MOSCOW-CENTRE.AI.MIT.EDU by OZ.AI.MIT.EDU via Chaosnet; 13 Mar 86 02:55-EST
Date: Thu, 13 Mar 86 02:54 EST
From: Christopher Fry <cfry@OZ.AI.MIT.EDU>
Subject: Validation proposal
To: berman@ISI-VAXA.ARPA, cl-validation@SU-AI.ARPA
Message-ID: <860313025420.4.CFRY@MOSCOW-CENTRE.AI.MIT.EDU>

We need to have a standard format for validation tests.
To do this, I suggest we hash out a design spec
before we get serious about assigning chapters to implementors.
I've constructed a system which integrates diagnostics and
hacker's documentation. I use it and it saves me time.
Based on that, here's my proposal for a design spec.

GOAL [in priority order]
   To verify that a given implementation is or is not correct CL.
   To aid the implementor in finding out the discrepancies between
      his implementation and the agreed upon standard.
   To supplement CLtL by making the standard more precise.
   To provide examples for future CLtLs, or at least a format
      for machine-readable examples, which will make it easier to
      verify that the examples are, in fact, correct.
   ..... the items below are of auxiliary importance
   To facilitate internal documentation [documentation
      used primarily by implementors while developing]
   To give CL programmers a suggested format for diagnostics and
      internal documentation. [I argue that every programmer of
      a medium to large program could benefit from such a facility].

RELATION of validation code to CL
   It should be part of yellow pages, not CL.

IMPLEMENTATION: DESIRABLE CHARACTERISTICS
   small amount of code
   uses a small, simple subset of CL so that:
        1. implementors can use it early in the development cycle
        2. It will depend on little and thus be more reliable.
            [we want to test specific functions in a controlled way,
             not the code that implements the validation software.]
    We could, for example, avoid using: 
          macros, 
          complex lambda-lists,
          sequences, 
          # reader-macros, 
          non-fixnum numbers

FEATURES & USER INTERFACE:
   simple, uniform, lisp syntax

   permit an easy means to test:
     - all of CL
     - all of the functions defined in a file. 
     - all of the tests for a particular function
     - individual calls to functions.

   Allow a mechanism for designating certain calls as
      "examples" which illustrate the functionality of the
      function in question. Each such example should have
        -the call
        -the expected result [potentially an error]
        -an optional explanation string, e.g.,
           "This call errored because the 2nd arg was not a number."

----------
Here's an example of diagnostics for a function:

(test:test 'foo
  '((test:example (= (foo 2 3) 5)  "foo returns the sum of its args.")
     ;the above is a typical call and may be used in a manual along
     ;with the documentation string of the fn
    (not (= (foo 4 5) -2))
     ;a diagnostic not worthy of being made an example of. There will
     ;generally be several to 10's of such calls.
    (test:expected-error (foo 7) "requires 2 arguments")
       ;if the expression is evaled, it should cause an error
    (test:bug (foo 3 'bar) "fails to check that the 2nd arg is a number")
      ;does not perform as it should. Such entries are a convenient place
      ;for a programmer to remind himself that the FN isn't fully debugged yet.
    (test:bug-that-crashes (foo "trash") "I've GOT to check the first arg with numberp!")
  ))

TEST is a function which sequentially processes the elements of the 
list which is its 2nd arg. If an entry is a list whose car is:
   test:example      evaluate the cadr. If the result is non-nil,
                     do nothing; else print a bug report.
   test:expected-error  evaluate the cadr. If it does not produce an error,
                      then print a bug report.
   test:bug          evaluate the cadr; it should return NIL or error.
                      If it returns NIL or errors, print a "known bug" report;
                      otherwise print a "bug fixed!" message.
                      [The programmer should then edit the entry so it is no
                      longer wrapped in a test:bug statement.]
   test:bug-that-crashes  Don't eval the cadr. Just print the
                      "known bug that crashes" bug report.
  There are several other possibilities in this area, like:
  test:crash-example  don't eval the cadr, but use it in documentation.
  
  Any entry without a known car just gets evaled; if it returns nil or errors,
    print a bug report. The programmer can then fix the bug, or wrap a
   test:bug around the call to acknowledge the bug. This helps separate the
   "I've seen this bug before" cases from the "this is a new bug" cases.

With an editor that permits evaluation of expressions [Emacs and sons],
it's easy to eval single calls or the whole test.
When evaluating the whole test, a summary of what went wrong can be
printed at the end of the sequence, like "2 bugs found".

I find it convenient to place calls to TEST right below the definition
of the function that I'm testing. My source code files are about
half tests and half code. I have set up my test function such that
it checks to see if it is being called as a result of being loaded
from a file. If so, it does nothing. Our compiler is set up to
ignore calls to TEST, so they don't get into compiled files.
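
That load-time check might be something like the following sketch;
*LOAD-PATHNAME* is the later ANSI variable, used here only as a stand-in for
whatever implementation-specific hook actually does this job.

;; Illustrative only: make TEST a no-op when its call is being processed
;; by LOAD.  *LOAD-PATHNAME* is NIL except during a LOAD.
(defun called-during-load-p ()
  (not (null *load-pathname*)))

;; TEST would then begin with something like:
;;   (when (called-during-load-p) (return-from test nil))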

I have a function called TEST-FILE which reads each form in the file.
If the form is a list whose car is TEST, the form is evaled, else the
form is ignored.
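
Assuming the TEST package sketched above, TEST-FILE could be as simple as:

(defun test-file (pathname)
  "Read PATHNAME form by form, evaluating only the (TEST:TEST ...) forms."
  (with-open-file (stream pathname :direction :input)
    (loop for form = (read stream nil stream)
          until (eq form stream)
          ;; Symbols are read in the current *PACKAGE*; a real version
          ;; would also honor any IN-PACKAGE forms in the file.
          when (and (consp form) (eq (car form) 'test:test))
            do (eval form))))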

Some programmers prefer to keep tests in a separate file from the
source code that they are writing. This is just fine in my implementation,
except that a list of the source code files can't be used in
testing a whole system unless there's a simple mapping between
source file name and test file name.
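
One possible mapping, purely for illustration (the UTILS.LISP / UTILS-TESTS.LISP
naming convention is an assumption, not part of this proposal):

;; Hypothetical convention: the tests for UTILS.LISP live in UTILS-TESTS.LISP.
(defun test-file-for (source-pathname)
  "Return the test-file pathname conventionally paired with SOURCE-PATHNAME."
  (make-pathname :name (concatenate 'string
                                    (pathname-name source-pathname)
                                    "-tests")
                 :defaults (pathname source-pathname)))

;; (test-file-for #p"utils.lisp")  =>  #P"utils-tests.lisp"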

It's easy to see how a function could read through a file and pull
out the examples [among other things].

Since the first arg to the TEST fn is mainly used to tell the user what
test is being performed, it could be a string explaining in more
detail the category of the calls below, e.g. "prerequisites-for-sequences".

Notice that to write the TEST function itself, you need not have:
macros, &optional, &rest, or &key working, features that minimal lisps
often lack.

Obviously this proposal could use creativity of many sorts.
Our actual spec, though, should define only the file format, not
add fancy features. Such features can vary from implementation to
implementation, which will aid the evolution of automatic diagnostics and
documentation software.
But to permit enough hooks in the file format, we need insight into the potential
breadth of such a mechanism. Thus, new goals might also be a valuable
addition to this proposal.

FRY

∂13-Mar-86  1015	berman@isi-vaxa.ARPA 	Re: Validation proposal
Received: from ISI-VAXA.ARPA by SU-AI.ARPA with TCP; 13 Mar 86  10:12:38 PST
Received: by isi-vaxa.ARPA (4.12/4.7)
	id AA03979; Thu, 13 Mar 86 10:12:11 pst
From: berman@isi-vaxa.ARPA (Richard Berman)
Message-Id: <8603131812.AA03979@isi-vaxa.ARPA>
Date: 13 Mar 1986 1012-PST (Thursday)
To: Christopher Fry <cfry@OZ.AI.MIT.EDU>
Cc: berman@ISI-VAXA.ARPA, cl-validation@SU-AI.ARPA
Subject: Re: Validation proposal
In-Reply-To: Your message of Thu, 13 Mar 86 02:54 EST.
             <860313025420.4.CFRY@MOSCOW-CENTRE.AI.MIT.EDU>


Christopher,

Thanks for the suggestion.  Unfortunately there are already many thousands of
lines of validation code written amongst a variety of sources.  ISI is
supposed to first gather these and then figure out which areas are covered,
and in what depth.  

A single validation suite will eventually be constructed with the existing
tests as a starting point.  Therefore, we will probably not seriously consider
a standard until we have examined this extant code.  I'll keep CL-VALIDATION
informed of the sort of things we discover, and at some point I will ask for
proposals, if indeed I don't put one together myself.

Once we know what areas are already covered, we will assign the remaining
areas to the various willing victims (er, volunteers) to complete, and it is
this part of the suite which will be created with a standard in place.

Etc.,

RB

∂13-Mar-86  1028	berman@isi-vaxa.ARPA 	Re: Validation proposal
Received: from ISI-VAXA.ARPA by SU-AI.ARPA with TCP; 13 Mar 86  10:28:21 PST
Received: by isi-vaxa.ARPA (4.12/4.7)
	id AA04181; Thu, 13 Mar 86 10:27:56 pst
From: berman@isi-vaxa.ARPA (Richard Berman)
Message-Id: <8603131827.AA04181@isi-vaxa.ARPA>
Date: 13 Mar 1986 1027-PST (Thursday)
To: Christopher Fry <cfry@MIT-OZ%MIT-MC.ARPA>
Cc: berman@ISI-VAXA.ARPA, cl-validation@SU-AI.ARPA
Subject: Re: Validation proposal
In-Reply-To: Your message of Thu, 13 Mar 86 02:54 EST.
             <860313025420.4.CFRY@MOSCOW-CENTRE.AI.MIT.EDU>


Christopher,

Thanks for the suggestion.  Unfortunately there are already many thousands of
lines of validation code written amongst a variety of sources.  ISI is
supposed to first gather these and then figure out which areas are covered,
and in what depth.  

A single validation suite will eventually be constructed with the existing
tests as a starting point.  Therefore, we will probably not seriously consider
a standard until we have examined this extant code.  I'll keep CL-VALIDATION
informed of the sort of things we discover, and at some point I will ask for
proposals, if indeed I don't put one together myself.

Once we know what areas are already covered, we will assign the remaining
areas to the various willing victims (er, volunteers) to complete, and it is
this part of the suite which will be created with a standard in place.

Etc.,

RB


P.S.
I had to change your address (see header) because for some reason our mail
handler threw up on the one given with your message.


∂17-Mar-86  0946	berman@isi-vaxa.ARPA 	Re: Validation proposal
Received: from ISI-VAXA.ARPA by SU-AI.ARPA with TCP; 17 Mar 86  09:46:27 PST
Received: by isi-vaxa.ARPA (4.12/4.7)
	id AA11654; Mon, 17 Mar 86 09:46:19 pst
From: berman@isi-vaxa.ARPA (Richard Berman)
Message-Id: <8603171746.AA11654@isi-vaxa.ARPA>
Date: 17 Mar 1986 0946-PST (Monday)
To: cfry%oz@MIT-MC.ARPA
Cc: cl-Validation@su-ai.arpa
Subject: Re: Validation proposal
In-Reply-To: Your message of Mon, 17 Mar 86 04:30 EST.
             <860317043024.5.CFRY@DUANE.AI.MIT.EDU>


Thanks, and I look forward to seeing your tests.  And yes, I'm sure that
interested parties will get to review the test system before it's in place.

RB



------- End of Forwarded Message

∂19-Mar-86  1320	berman@isi-vaxa.ARPA 	Re: Validation Contributors 
Received: from ISI-VAXA.ARPA by SU-AI.ARPA with TCP; 19 Mar 86  13:20:08 PST
Received: by isi-vaxa.ARPA (4.12/4.7)
	id AA08917; Wed, 19 Mar 86 13:19:50 pst
From: berman@isi-vaxa.ARPA (Richard Berman)
Message-Id: <8603192119.AA08917@isi-vaxa.ARPA>
Date: 19 Mar 1986 1319-PST (Wednesday)
To: Reidy.pasa@Xerox.COM
Cc: Reidy.pasa@Xerox.COM, berman@isi-vaxa.ARPA, CL-Validation@su-ai.ARPA
Subject: Re: Validation Contributors
In-Reply-To: Your message of 19 Mar 86 11:29 PST.
             <860319-112930-3073@Xerox>


As a matter of fact, in the end it WILL be organized parallel to the book.
For now I'm just gathering the (often extensive) validation suites that have
been produced at various sites.  These will need to be evaluated before
assigning tasks to people who want to write some code for this.  By that time
we will also have a standard format for these tests so that this new code will
fit in with the test manager.

Send messages to CL-VALIDATION@SU-AI.ARPA rather than the general CL list when
discussing this, unless it is of broader interest, of course.

Thanks.

RB

∂27-Mar-86  1332	berman@isi-vaxa.ARPA 	Validation Distribution Policy   
Received: from ISI-VAXA.ARPA by SU-AI.ARPA with TCP; 27 Mar 86  13:32:16 PST
Received: by isi-vaxa.ARPA (4.12/4.7)
	id AA22595; Thu, 27 Mar 86 13:32:06 pst
From: berman@isi-vaxa.ARPA (Richard Berman)
Message-Id: <8603272132.AA22595@isi-vaxa.ARPA>
Date: 27 Mar 1986 1332-PST (Thursday)
To: CL-Validation@su-ai.arpa
Subject: Validation Distribution Policy



------- Forwarded Message

Return-Path: <OLDMAN@USC-ISI.ARPA>
Received: from USC-ISI.ARPA by isi-vaxa.ARPA (4.12/4.7)
	id AA13746; Wed, 26 Mar 86 13:35:26 pst
Date: 26 Mar 1986 16:24-EST
Sender: OLDMAN@USC-ISI.ARPA
Subject: Validation in CL
From: OLDMAN@USC-ISI.ARPA
To: berman@ISI-VAXA.ARPA
Message-Id: <[USC-ISI.ARPA]26-Mar-86 16:24:40.OLDMAN>

Yes, we have tests and a manager.  I have started the wheels
moving on getting an OK from management for us to donate them.
Is there a policy statement available on how they will be used or
distributed? ...

-- Dan Oldman

------- End of Forwarded Message

I don't recall any exact final statement of the type of access.  I remember
there was some debate on whether it should be paid for by non-contributors,
but was there any conclusion?

RB

∂29-Mar-86  0819	FAHLMAN@C.CS.CMU.EDU 	Validation Distribution Policy   
Received: from C.CS.CMU.EDU by SU-AI.ARPA with TCP; 29 Mar 86  08:19:13 PST
Received: ID <FAHLMAN@C.CS.CMU.EDU>; Sat 29 Mar 86 11:19:51-EST
Date: Sat, 29 Mar 1986  11:19 EST
Message-ID: <FAHLMAN.12194592953.BABYL@C.CS.CMU.EDU>
Sender: FAHLMAN@C.CS.CMU.EDU
From: "Scott E. Fahlman" <Fahlman@C.CS.CMU.EDU>
To:   berman@isi-vaxa.ARPA (Richard Berman)
Cc:   CL-Validation@SU-AI.ARPA
Subject: Validation Distribution Policy
In-reply-to: Msg of 27 Mar 1986  16:32-EST from berman at isi-vaxa.ARPA (Richard Berman)


    I don't recall any exact final statement of the type of access.  I remember
    there was some debate on whether it should be paid for by non-contributors,
    but was there any conclusion?

I believe that the idea of using free access to the validation code as
an incentive to get companies to contribute was discussed at the
Boston meeting, but finally abandoned as cumbersome, punitive, and
unnecessary.  Most of the companies there agreed to contribute
whatever validation code they had, and/or some labor to fill any holes
in the validation suite, with the understanding that the code would be
pulled into a reasonably coherent form at ISI and then would be made
freely available to all members of the community.  This release would
not occur until a number of companies had contributed something
significant, and then the entire collection up to that point would be
made available at once.

I believe that Dick Gabriel was the first to say that his company would
participate under such a plan, and that he had a bunch of conditions
that had to be met.  If there are any not captured by the above
statement, maybe he can remind us of them.

-- Scott