Program Verification: The Very Idea

Collected by Brad Cox in support of the paradoxical goals of the Middle of Nowhere project.

Discussion on usenet in response to H. Fetzer, Program Verification: The Very Idea, Comm. ACM, Sept. 1988.




The first viewpoint is the intangibility imperative, which is most common in computer science circles. Its advocates view software as an solitary, mental, abstract activity akin to mathematics. For example, Fetzer attributes the following expression of this viewpoint to C.A.R. Hoare:

I hold the opinion that the construction of computer programs is a mathematical activity like the solution of differential equations, that programs can be derived from their specifications through mathematical insight, calculation, and proof, using algebraic laws as simple and elegant as those of elementary arithmetic.

From hsi!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!usenet.ins.cwru.edu!mephisto!mcnc!duke!romeo!crm Fri Feb  2 13:34:16 EST 1990
Article 913 of comp.software-eng:
Path: stpstn!hsi!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!usenet.ins.cwru.edu!mephisto!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: CACM and Fetzer
Message-ID: <17081@duke.cs.duke.edu>
Date: 25 Jan 90 18:07:15 GMT
References: <7598@hubcap.clemson.edu> <16520013@hpisod2.HP.COM> <16865@joshua.athertn.Atherton.COM> <16964@duke.cs.duke.edu> <1990Jan22.152717.7142@mentor.com> <90831@linus.UUCP> <736@cs.nps.navy.mil> <1990Jan25.010244.10958@agate.berkeley.edu>
Sender: news@duke.cs.duke.edu
Reply-To: crm@romeo.UUCP (Charlie Martin)
Organization: Duke University CS Dept.; Durham, NC
Lines: 47

In article <1990Jan25.010244.10958@agate.berkeley.edu> bks@alfa.berkeley.edu (Brad Sherman) writes:
>
>I just don't understand this trashing of Fetzer or the cricisms of CACM.
>
>With regard to the former. While some of you may be involved in
>projects with verified code; most of us are forced to use UNIX, VMS,
>MSDOS, CMS, ORACLE, USENET, TCP/IP and other *useful* systems which
>are most definitely *not* verified. The compilers are not verified. The
>hardware is not verified. And we are writing software for use
>by human beings whose behavior must be included in the model of the
>system we might attempt to verify. Now, this is not to say that the real
>world of "software engineering" would not be improved by "proven"
>programs, nor even to say that systems *cannot* be proven correct. But
>there is a difference between mathematical models of computation and
>practical implementations of computing systems (just as there is a
>difference between theoretical and applied mathematics). Whether
>Fetzer's arguments are correct or not, the questions raised are
>important for the practicing programmer.
>

That simply isn't true. Fetzer's argument is precisely that program proofs are flawed on the face, that no proof can ever give insight or certainty to the behavior of a real program on a real machine, because of the distinction between the mathematical model and the physical world.

But one can argue on the same basis that euclidian geometry has no use and gives no insight, because the same distinction between mathematical model and physical world exists; worse, Euclid's axioms for plane geometry can be demonstrated false in the real world. Thus it is useless and morally wrong for surveyors to learn euclidiam plane geometry, because the real world doesn't fit the model.

Furthermore, the questions raised -- while they are real and significant questions, and must be considered, and while there clearly are people who needed the reminder -- have been explored in the verification literature. They form the basis of at least three research efforts I know about (the CL Inc Trusted Stack, the Odyssey hardware verficication efforts, and the UK verified hardware/software stuff). Fetzer's paper did not explore that existing research, and further used very old publications exclusively. Fetzer came up with a real issue but made no contribution other than filling a number of pages, because this literature was not reviewed. It was an extremely poor showing on the part of CACM; my personal opinion is that the editors involved should resign.

Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!samsung!zaphod.mps.ohio-state.edu!mips!apple!arc!steve Fri Feb  2 13:36:48 EST 1990
Article 916 of comp.software-eng:
Path: stpstn!hsi!uunet!samsung!zaphod.mps.ohio-state.edu!mips!apple!arc!steve
>From: steve@arc.UUCP (Steve Savitzky)
Newsgroups: comp.software-eng
Subject: Re: Have *YOU* ever written a program which COULDN'T be proven correct?
Message-ID: <789@arc.UUCP>
Date: 26 Jan 90 17:55:36 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <17080@duke.cs.duke.edu>
Sender: news@arc.UUCP
Organization: Advansoft Research Corp, Santa Clara, CA
Lines: 32

In article <17080@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes:
>...
>In real programs, the whole halting problem (and the stronger
>equivalence problem) do not apply, because all real realizations of
>program are finite state machines; both of these are decidable for
>finite state machines.
>
>If someone wants to claim they are NOT finite state machines, I want to
>see your machine with infinite memory registers.

Actually, all that is needed is for a machine to have *infinitely
expandable* memory, and mine has that -- I can always go out and buy
another disk.  (One of my professors once described a Turing Machine
as having a tape that ran out the door and into a paper factory.)

In any case you can probably show that in order to solve the halting
problem for FSM's with < N states, you need a FSM with > N states.
So the problem is still effectively unsolvable on any given set of
computers -- i.e. no computer in the set can solve the halting problem
for all computers in the set.

IMHO the *real* problem with verification is that it gives you a false
sense of security because (1) you don't know whether what you
initially specified is what you really wanted, and (2) you don't know
what real requirements you left out.  (E.g. reasonable bounds on time,
memory, and accuracy.)

--
\ Steve Savitzky      \ ADVANsoft Research Corp \ REAL hackers use an AXE!
 \ steve@arc.UUCP      \ 4301 Great America Pkwy \ #include
  \ arc!steve@apple.COM \ Santa Clara, CA 95954   \ 408-727-3357
   \__________________________________________________________________________




From hsi!uunet!wuarchive!zaphod.mps.ohio-state.edu!samsung!cs.utexas.edu!rutgers!mcnc!duke!romeo!crm Fri Feb  2 13:39:25 EST 1990
Article 930 of comp.software-eng:
Path: stpstn!hsi!uunet!wuarchive!zaphod.mps.ohio-state.edu!samsung!cs.utexas.edu!rutgers!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: Have *YOU* ever written a program which COULDN'T be proven correct?
Message-ID: <17143@duke.cs.duke.edu>
Date: 29 Jan 90 20:37:00 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <17080@duke.cs.duke.edu> <789@arc.UUCP>
Sender: news@duke.cs.duke.edu
Reply-To: crm@romeo.UUCP (Charlie Martin)
Organization: Duke University CS Dept.; Durham, NC
Lines: 43

In article <789@arc.UUCP> steve@arc.UUCP (Steve Savitzky) writes:
>In article <17080@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes:
>>...
>>If someone wants to claim they are NOT finite state machines, I want to
>>see your machine with infinite memory registers.
>
>Actually, all that is needed is for a machine to have *infinitely
>expandable* memory, and mine has that -- I can always go out and buy
>another disk.  (One of my professors once described a Turing Machine
>as having a tape that ran out the door and into a paper factory.)

You will always always always run into a limit: the number of slots for
disk controllers, the number of disks per controller, the addressing
capability of the bus, something.

>
>In any case you can probably show that in order to solve the halting
>problem for FSM's with < N states, you need a FSM with > N states.

Given an FSM, you can immediately derive the set of all input strings which lead to acceptance, and all other strings which do not lead to acceptance. "Halting" per se isn't a problem in a deterministic machine; it always halts when the tape runs out. And equivalence, which is "harder," can be proven by reducing two machines to canonical (minimized) form, and comparing.

>
>IMHO the *real* problem with verification is that it gives you a false
>sense of security because (1) you don't know whether what you
>initially specified is what you really wanted, and (2) you don't know
>what real requirements you left out. (E.g. reasonable bounds on time,
>memory, and accuracy.)
>

This is a semantic problem; you're using the word "specification" in a different sense than I am. In my terminology you are saying that you don't know if what you stated in the requirements really meets your needs, and you're right. However, Current methods give no assurance that the results you get from programming really correspond to what you thought you asked for. The point of verification is to try to get as close as possible to having the suitability of the realized program be dependent on the suitability of the requirements stated. Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!aplcen!uakari.primate.wisc.edu!zaphod.mps.ohio-state.edu!rpi!batcomputer!cornell!oravax!ian Fri Feb  2 14:14:51 EST 1990
Article 937 of comp.software-eng:
Path: stpstn!hsi!uunet!aplcen!uakari.primate.wisc.edu!zaphod.mps.ohio-state.edu!rpi!batcomputer!cornell!oravax!ian
>From: ian@oravax.UUCP (Ian Sutherland)
Newsgroups: comp.software-eng
Subject: Re: Have *YOU* ever written a program which COULDN'T be proven correct?
Message-ID: <1297@oravax.UUCP>
Date: 31 Jan 90 05:41:17 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <17080@duke.cs.duke.edu> <789@arc.UUCP> <17143@duke.cs.duke.edu> <793@arc.UUCP>
Reply-To: ian@oravax.odyssey.UUCP (Ian Sutherland)
Organization: Odyssey Research Associates, Ithaca, New York
Lines: 38

In article <793@arc.UUCP> steve@arc.UUCP (Steve Savitzky) writes:
>the point is
>that a computer can be treated as a Turing machine FOR ALL PRACTICAL
>PURPOSES, (Oops; I forgot -- verification isn't practical! :-),
>because you can't put a finite upper bound on its number of states.

This may be true in some theoretical sense, but it's not true in a practical sense. In most computers I've worked with, objects like integers and reals are represented as strings of bits whose lengths have fixed upper bounds. You could implement integers and reals in such a way that adding more storage devices would extend the integers and reals you could represent, but such representations have drawbacks like inefficiency. If a program makes any use of representations of, say, integers which cannot be entended, then any integer variable implemented as such a representation cannot correctly be treated as varying through an arbitrarily large set.

Another relevant comment on this subject is that if the program in question is implemented in something like a satellite or the space shuttle, you can't just add more storage any time you need it. Also, if a program must perform in real time, there will probably be many situations in which the program can't just pause while an operator goes out for more memory. Such programs will have to be able to deal with running into limits on the size of their state space, and verification of such programs will have to take it into account to be realistic.

>In any case, I don't think that any practical proof methods operate by
>treating programs as FSM's.

I managed a project to build a verification system for C programs which use floating point arithmetic. This system makes the explicit assumption that the state space of the program is finite. I like to think our proof methods are getting to the practical stage :-)
--
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur




From hsi!uunet!snorkelwacker!spdcc!merk!alliant!linus!chance!munck Fri Feb  2 14:21:42 EST 1990
Article 904 of comp.software-eng:
Path: stpstn!hsi!uunet!snorkelwacker!spdcc!merk!alliant!linus!chance!munck
>From: munck@chance.uucp (Robert Munck)
Newsgroups: comp.software-eng
Subject: "Correct" IS UNDEFINED!
Message-ID: <91033@linus.UUCP>
Date: 25 Jan 90 14:58:01 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM> 
Sender: news@linus.UUCP
Reply-To: munck@chance.UUCP (Robert Munck)
Organization: MITRE-McLean Software Engineering Laboratory
Lines: 28
Keywords: proof, correctness, theory, balderdash


No one has said this yet, though James Adcock and Joshua Levy have
alluded to it:

   ***************************************************************
   **** For a large class of programs, "correct" is UNDEFINED. ***
   ***************************************************************

In general, the big, real-world programs that we spend big bucks to create, things like telephone systems, air traffic control, SDI, 1-2-3, and "unix" don't support the abstraction of "correct." Their specifications are partly on paper, but mostly exist as concepts of greater or lesser exactness in the minds of a large number of people. Obviously, these concepts are not identical from one person to another.

When a program functions in ways that most of those people decide agrees with their concepts, it is said to be "working;" when it functions in way that most decide disagrees, it is said to be "wrong." (When there is no majority opinion either way, it is "unfinished.")

I'm afraid that belief that we will someday be able to prove our largest, most important programs correct has become one of my tests of native intelligence, along with belief in astrology and support of the flag-burning amendment.

                 -- Bob , linus!munck.UUCP
                 -- MS Z676, MITRE Corporation, McLean, VA 22120
                 -- 703/883-6688




From hsi!uunet!tut.cis.ohio-state.edu!snorkelwacker!spdcc!merk!alliant!linus!chance!munck Fri Feb  2 14:24:03 EST 1990
Article 919 of comp.software-eng:
Path: stpstn!hsi!uunet!tut.cis.ohio-state.edu!snorkelwacker!spdcc!merk!alliant!linus!chance!munck
>From: munck@chance.uucp (Robert Munck)
Newsgroups: comp.software-eng
Subject: Re: "Correct" IS UNDEFINED!
Message-ID: <92182@linus.UUCP>
Date: 26 Jan 90 20:51:51 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <91033@linus.UUCP> 
Sender: news@linus.UUCP
Reply-To: munck@chance.UUCP (Robert Munck)
Organization: MITRE-McLean Software Engineering Laboratory
Lines: 47

In article  geoff@tom.harvard.edu (Geoff Clemm) writes:
> I realize that it involves more mental effort to decide exactly what you
> want your program to do, and then to make it do that, than it does to hack
> something together and say "the system does some combination of whatever
> vague idea each programmer had in mind while they were hacking".
>
> I propose that the effort be made ... your users would appreciate it.

Sigh. Ever get the feeling you were trying to have a discussion across the Grand Canyon? No wonder recent graduates often spend their first few years in the "real world" in a state of shock. It's a second birth trauma.

My point, of course, is that "exactly what you want your program to do" is not decidable. No one person can grasp the totality of a system of the size we're trying to create. The interactions of parts are beyond our ability to understand. There are an immense number of logic contradictions, misunderstandings, and blank areas.

Remember, no two people can be assumed to take the same meaning from a single word like "safety." How, then, can we possibly prove the meeting of a requirement that uses that word? Human limitations aside, how could we ever program a computer to consider all possible functional interaction of a system and decide if the result is "safe?"

Moreover, the world is not static (even at universities). Anecdote: through a series of accidents, I once found myself in charge of a gang of COBOL programmers maintaining a payroll system for a national company, a relatively small system of about 0.5 MLoC. A great deal of their time was spent changing the code to reflect changes in state and local tax withholding. With 50 states and a great many towns and cities churning out new rules, there was always a long queue of changes waiting to be made, some for rules that had already taken effect. It's not much of an exaggeration to say that the changes were continuous. What is a "correct" program in such circumstances? ATC, telephone, C3I, and other real-world software products have very similar situations; think of the poor bastards trying to keep running a telephone company's billing system these days!

Mr. Clemm (Ah Clemm's brother?) suggests that we "make the effort." I hope he realizes that we'll have to enlist him, his children, his grandchildren, and every other human being for a century or so to do it. Heck, we wouldn't even be able to get everyone currently working on one of these systems to whistle "Yankee Doodle" together.
                 -- Bob , linus!munck.UUCP
                 -- MS Z676, MITRE Corporation, McLean, VA 22120
                 -- 703/883-6688




From hsi!uunet!cs.utexas.edu!rutgers!mcnc!duke!romeo!crm Fri Feb  2 14:26:26 EST 1990
Article 929 of comp.software-eng:
Path: stpstn!hsi!uunet!cs.utexas.edu!rutgers!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: "Correct" IS UNDEFINED!
Message-ID: <17142@duke.cs.duke.edu>
Date: 29 Jan 90 20:24:36 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <91033@linus.UUCP>
Sender: news@duke.cs.duke.edu
Reply-To: crm@romeo.UUCP (Charlie Martin)
Organization: Duke University CS Dept.; Durham, NC
Lines: 41
Keywords: proof, correctness, theory, balderdash
Posted: Mon Jan 29 15:24:36 1990

In article <91033@linus.UUCP> munck@chance.UUCP (Robert Munck) writes:
>
>   ***************************************************************
>   **** For a large class of programs, "correct" is UNDEFINED. ***
>   ***************************************************************
>
>In general, the big, real-world programs that we spend big bucks to
>create, things like telephone systems, air traffic control, SDI, 1-2-3,
>and "unix" don't support the abstraction of "correct." Their
>specifications are partly on paper, but mostly exist as concepts of
>greater or lesser exactness in the minds of a large number of people.

Oh, crap. I'm sorry, but this is simply the single most assinine argument that I've seen proposed. Never in twenty years of software engineering have I seen a system where the question "does it work?" was undefined. Sometimes it's POORLY defined, sometimes the definition turns out to have contradictions, sometimes it turns out to be vacuous, but all of them have some statement of the expected behavior.

Telephone systems, contrary to your assertion, have extremely stringent definitions of what is "correct", stated in everything from CCITT interface specifications to performance statements (e.g., 99.95 % correct call completion at standard load.)

Program proof does in fact require more care in specification: you must now state the conditions under which the program is considered "correct" is a more formal way. The specification must be derived from requirements, and making sure that the requirements are adequate is a hard problem. However, the only sense of correctness that makes sense in terms of program correctness is one which states that the program produced meets the specification.

>I'm afraid that belief that we will someday be able to prove our
>largest, most important programs correct has become one of my tests of
>native intelligence, along with belief in astrology and support of the
>flag-burning amendment.

I only include this paragraph as an apology in advance for the somewhat fiery tone of my reply. Clearly, I don't agree.

Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!snorkelwacker!apple!oliveb!pyramid!athertn!joshua Fri Feb  2 14:27:39 EST 1990
Article 936 of comp.software-eng:
Path: stpstn!hsi!uunet!snorkelwacker!apple!oliveb!pyramid!athertn!joshua
>From: joshua@athertn.Atherton.COM (Flame Bait)
Newsgroups: comp.software-eng
Subject: Re: "Correct" IS UNDEFINED!
Message-ID: <17504@joshua.athertn.Atherton.COM>
Date: 30 Jan 90 23:33:56 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <91033@linus.UUCP> <17142@duke.cs.duke.edu>
Reply-To: joshua@Atherton.COM (Flame Bait)
Organization: Atherton Technology, Sunnyvale, CA
Lines: 45
Keywords: proof, correctness, theory, balderdash

> = crm@cs.duke.edu (Charlie Martin) writes:
# = munck@chance.UUCP (Robert Munck) writes:

# In general, the big, real-world programs that we spend big bucks to
# create, ... don't support the abstraction of "correct."  Their
# specifications are partly on paper, but mostly exist as concepts of
# greater or lesser exactness in the minds of a large number of people.

> Oh, crap.  I'm sorry, but this is simply the single most assinine
> argument that I've seen proposed.  Never in twenty years of software
> engineering have I seen a system where the question "does it work?" was
> undefined.  Sometimes it's POORLY defined, sometimes the definition
> turns out to have contradictions, sometimes it turns out to be vacuous,
> but all of them have some statement of the expected behavior.

The second part of you paragraph suggests that the first part of it is just wishful thinking.  I have seen definitions of software behavior which are poor, vacuous, contradictions, and incorrect.  What I have not seen is a definition which was reasonably correct and which could be used as the input to a software verification system, even a human one.

# I'm afraid that belief that we will someday be able to prove our
# largest, most important programs correct has become one of my tests of
# native intelligence, along with belief in astrology and support of the
# flag-burning amendment.

I would consider it a better measure of optimism and lack of real world experiance.

#ifdef PERSONAL_ATTACK

If you go back through the "programs can be proven" thread, I think you will find that 80-90% of the people who post in defence of proving programs correct post from academic sites. You will also see that 80-90% of the people who attack this idea post from commercial sites. I do not think this is pure chance.

#endif

Joshua Levy
--------                Quote: "Vaya con dios, scumbucket."  -- WISEGUY
Addresses:
joshua@atherton.com
{decwrl|sun|hpda}!athertn!joshua    work:(408)734-9822    home:(415)968-3718




From hsi!uunet!cs.utexas.edu!rutgers!rochester!rit!mjl Fri Feb  2 14:28:43 EST 1990
Article 939 of comp.software-eng:
Path: stpstn!hsi!uunet!cs.utexas.edu!rutgers!rochester!rit!mjl
>From: mjl@cs.rit.edu
Newsgroups: comp.software-eng
Subject: Re: "Correct" IS UNDEFINED!
Message-ID: <1575@cs.rit.edu>
Date: 31 Jan 90 14:29:15 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>  <91033@linus.UUCP> <17142@duke.cs.duke.edu> <17504@joshua.athertn.Atherton.COM>
Sender: news@cs.rit.edu
Reply-To: mjl@prague.UUCP (Michael Lutz)
Organization: Rochester Institute of Technology, Rochester, NY
Lines: 48

In article <17504@joshua.athertn.Atherton.COM> joshua@Atherton.COM (Flame Bait) writes:

>#ifdef PERSONAL_ATTACK
>
>If you go back through the "programs can be proven" thread, I think you
>will find that 80-90% of the people who post in defence of proving
>programs correct post from academic sites.  You will also see that 80-90%
>of the people who attack this idea post from commercial sites.
>I do not think this is pure chance.
>
>#endif

Two responses, one personal, one general:

PERSONAL
	I actively consult with a local firm where we develop software
	embedded in high-speed output devices (hint: I'm in Rochester,
	NY and the firm's name begins with "K").  I could post
	my messages from there -- I suppose that would automatically make
	them more relevant, right?  I will note that not everyone on
	the team I work for buys into what I'm saying, but they *are*
	consistently amazed by the low defect rates in my code.  While
	I consider myself a pretty good programmer anyhow, I attribute
	most of my success to the methodical, rational, and rigorous
	approach I've adopted to designing and developing product software.
	It just works!

GENERAL
	Replace "programs can be proven" and similar phrases with
	"systems software can be written in a high level language",
	and you'll have a capsule summary of industrial wisdom
	circa 1970[*].  Substitute "structured programming" and "use of
	limited control structures" and you'll have a capsule summary of
	industrial wisdom circa 1978.  This is not to say that academics
	don't make mistakes, but harping on the supposed dichotomy
	between "academia" and the "real world" is always the last refuge
	of those who are comfortable with the way things are, and don't
	want to expend the (admittedly demanding) intellectual effort
	to advance their own knowledge and the state of the
	practice.  Sort of "compu- potatoes"?

Mike Lutz

[*] Though in fact Burroughs (now Unisys) had been writing all of their
    software in a variant of ALGOL since the early 60's.
Mike Lutz	Rochester Institute of Technology, Rochester NY
UUCP:		{rutgers,cornell}!rochester!rit!mjl
INTERNET:	mjl@csrit.edu




From hsi!uunet!aplcen!samsung!rex!uflorida!mephisto!mcnc!duke!romeo!crm Fri Feb  2 14:34:03 EST 1990
Article 911 of comp.software-eng:
Path: stpstn!hsi!uunet!aplcen!samsung!rex!uflorida!mephisto!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: references for how to prove (was Re: Reasons why you don't prove your programs are correct)
Message-ID: <17079@duke.cs.duke.edu>
Date: 25 Jan 90 17:44:33 GMT
References: <7598@hubcap.clemson.edu> <9630012@hpirs.HP.COM>
Sender: news@duke.cs.duke.edu
Reply-To: crm@romeo.UUCP (Charlie Martin)
Organization: Duke University CS Dept.; Durham, NC
Lines: 111
Posted: Thu Jan 25 12:44:33 1990

In article <9630012@hpirs.HP.COM> runyan@hpirs.HP.COM (Mark Runyan) writes:
>>/ crm@romeo.cs.duke.edu (Charlie Martin) / 12:11 pm  Jan 20, 1990 /
>>I can't answer the question "why I don't prove my programs correct"
>>because I do.  All the time.  As a matter of course.  Almost without
>>exception.
>
>I've been trying to follow this string, remembering those days in college
>when the instructor indicated that program proof of correctness was
>possible, but intractable.  Now someone claims to do it.  Apparently,
>often.

I just gave a talk on myths of verification, and this was one of the ones I mentioned. It was true in a sense, a long time ago, when Floyd-style proofs were the only ones that could be done. It is no longer true now, in the sense that proof of program texts or notations that are semanitcally reasonably well behaved can be done. It is still true in the sense that there are classes of programs and programming languages which are so badly behaved that it's hard to prove anything about them. Ada and C are both in this class. Interprocess communication tends to be difficult, asynchrony is hard (but the technology is catching up, see e.g. some of Leslie Lamport's work.)
The basic trick comes down to this:

(1) describe your specification as formally as you can. Things like Z specifications are very handy, but a little mathhematical maturity and some thought can lead to a document in mathematical prose which encodes most of the hard parts in a crisp way. (And recognize that there will be requirements which are not amenable to formality: "the system will be built in a well-structed fashion"; "the human interface will be easy to use and consistent." You have to pay lip service to them, because saying "this is crap" will get you talked about at annual reviews, but you should generally try to do a good job and challenge them to show you didn't meet it if pressed.)

(2) describe your program first in a notation that is well-behaved, and for which the semantics are simple. I use Dijkstra's notation ("the programming calculus") because it is the first one I used, I'm comfortable with it, and I find the reasoning rules intuitive. But use the _Science of Programming_ [Gries] to get a start with it, not Dijkstra's book.

(3) complex programs, espeically programs that depend on the system in which they are embedded (e.g., window programs), will have things that you can't prove, but must assume. This need not stop you. You must first try to understand those things formally; second, state them formally in the semantic formalism you are using. I would, for example, try to state the weakest-precondition specification for a window routine using informal arguments. Third, when you write your program, make sure that you TEST, yes test, the program with particular attention to the things you couldn't prove.

As an aside, this does NOT mean that there is no "proof" because the proof depends on components that are unproveable; it does not. What you are doing is introducing new axioms into your theory; these axioms are statements about the semantics of the operations which you can't prove. As in any usual deductive system, either these statements are theorems or they are not. If they are theorems (i.e., the unproven chunks do behave as specified) then you're in fat city. If not, you will have a defect there, so long as the defect in the underlying stuff is exercised by what you write. Conveniently, most of these things are also things which are extremely well tested, so you have good evidence to believe that it will work. This is very much like the way mathematics has been done in history: Euclid's axioms are formal statements of things that were observed to be true in many cases. The formal system derives results that always correspond to reality on Earth so long as the scale is limited. But when you get outside of the range that Euclid could observe, or the limits of classical Greek measuring instruments, you observe the axioms do not correspond to truth in the large.

As another aside, there is a sort of competing body of work on the IBM Software Clean Room. I don't have references right at hand, but they should be easily found. They use a notation that I consider exceedingly ugly, but they get good results in their hoard-of-programmers environment. I don't personally think their methods are completely workable for smaller groups, or groups with less formal structure, which also describes the kind of group I like to work in.

(4) Code the program in any convenient language. I use C or C++ almost exclusively. This coding is translation from the design notation to an executable notation; you should argue for each translation that the semantics of the executable match what you describe in the design. This is easier than it sounds, but you have to be careful about places where the language is wierd. One that I've run into recently is that there are cases where a[i] is NOT the same as *(a+i), in all compilers I can test with.

>o- Do you use a program or tool to do it? Not regularly, although I have done so. It is much harder using tools to do the proofs, in general.

>o- How complex a program are we talking about? I've done programs up to about 10K source lines of C. The IBM Clean Room people have done several bigger ones with many programmers working together.

>o- If a tool is used, how would I go about acquiring it? If a tool isn't used, what literature (books, articles, or pamphlets) is available to the "engineer in the field"?

As I said above, I like Science of Programming by Gries. I have a book called _Program Derivation_ by Geoff Dromey (Addison Wesley 1989) which looks pretty good, but I haven't had time to read it closely. I hope someday to write The Ultimate Book On Proof In Software Engineering, but I wouldn't hold my breath if I were you. Therea re a number of other books that other may recommed, which I like more or less well.


Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!lll-winken!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!m.cs.uiuc.edu!marick Fri Feb  2 14:34:31 EST 1990
Article 934 of comp.software-eng:
Path: stpstn!hsi!uunet!lll-winken!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!m.cs.uiuc.edu!marick
>From: marick@m.cs.uiuc.edu
Newsgroups: comp.software-eng
Subject: Re: references for how to prove (wa
Message-ID: <39400059@m.cs.uiuc.edu>
Date: 31 Jan 90 01:39:09 GMT
References: <1488@gould.doc.ic.ac.uk>
Lines: 16
Nf-ID: #R:gould.doc.ic.ac.uk:1488:m.cs.uiuc.edu:39400059:000:522
Nf-From: m.cs.uiuc.edu!marick    Jan 30 12:29:00 1990


You can start with:

Certifying the Reliability of Software. P. Allen Currit, Michael
Dyer, and Harlan D. Mills.  Transactions on Software Engineering.
V. SE-12, No 1, January 1986.

Cleanroom Software Development, An Empirical Evaluation.  Richard W.
Selby, Victor R. Basili, and F. Terry Baker.  Transactions on Software
Engineering, V. SE-13, No. 9, September 1987.

Does anyone have any more recent papers that cover new ground?

Brian Marick
Motorola & University of Illinois
marick@cs.uiuc.edu, uunet!uiucuxc!marick




From hsi!uunet!aplcen!haven!grebyn!schultz Fri Feb  2 14:36:38 EST 1990
Article 931 of comp.software-eng:
Path: stpstn!hsi!uunet!aplcen!haven!grebyn!schultz
>From: schultz@grebyn.com (Ronald)
Newsgroups: comp.object,comp.software-eng
Subject: Information Systems Architecture
Message-ID: <19319@grebyn.com>
Date: 30 Jan 90 17:52:56 GMT
Followup-To: poster
Organization: Grebyn Timesharing, Vienna, VA
Lines: 167
Keywords: Zachman, OO Modeling


In the November, 1988 volume of the IBM Systems Journal, John Zachman of the IBM Los Angeles Scientific Center proposed an Information Systems Architectural Framework he found useful in communicating to a company what was involved in the design, implementation, and maintenance of information systems.

"The subject of Information Systems Architecture is currently receiving considerable attention. The increased design scopes and levels of complexity of information systems implementation necessitate the use of some logical construct (or architecture) for defining and controlling the interfaces and the integration of all of the system's components. ...

On the assumption that an understanding of information systems architecture is important to the development of a disciplined approach, the question that naturally arises is "What, in fact, is Information Systems Architecture ?"

Zachman's framework was based on his research into the paradigms associated with the engineering of complex products such as buildings and aircraft. The framework was presented as a matrix with columns representing three paradigms:

1. Material/ Structure - Thing / Relationship / Thing 2. Functional / Transform - Input / Process / Output 3. Geographic / Flow - Site / Link / Site A rough approximation of the information systems framework is presented below. This is based on Zachman's addressing organizations with background in COBOL, RDBMS, MIS kinds of applications.
---------------------------------------------------------------


                           Data              Process           Network

Objectives/Scope            D1                 P1                N1

Model of the                D2                 P2                N2
Business

Model of the                D3                 P3                N3
Information
System

Technology                  D4                 P4                N4
Design

Technology                  D5                 P5                N5
Definition

Technology                  D6                 P6                N6
Implementation
---------------------------------------------------------------


The Data Column looks at an information system from a bill of
material, or component perspective.

Objects populating Cell D1 would be things of interest to the
enterprise, such as competitors, products, services, employess,
etc.

D2 would consist of a basic ER model of the business.

D3 would involve fully attributing the ER model, and normalizing
it.  Developing it into a "Data Model".

D4 would involve taking the ER model and designing databases to
support it.

D5 would involve the language specific definition of database
tables to implement the above generated models.

And D6 would be the actual system data, stored on media.

---------------------------------------------------------------

The Process column views and information system from an
   INPUT -->  PROCESS --> OUTPUT view.

Cell P1 would consist of processes of interest to the enterprise,
such as the building of an aircraft, shipping to a customer of
product, or the development of new retail services.

P2 could be a set of Function Flow Diagrams detailing the flow
of information between business processes.

P3 is often seen as Data Flow Diagrams.

P4 as Structure Charts, and

P5 as the actual program.

The Network column views the information system from a
   NODE ---- LINK ---- NODE perspective.

I can detail these columns at a later time, if there is any
interest.

The point of this is that the Zachman framework has had a
significant impact on IBM and its dealings with customers.  At
one point there was even discussion as to organizing IBM along
the lines of the Zachman framework.  IBM has requested that
vendors supplying tools for its AD/Cycle Repository product
directly address the Zachman framework in some as yet
undetermined fashion.

As Zachman is quick to point out, the framework provides an easy
means of classifying many IS products, tools, and techniques.
A brief table of some classifications are:

Tool / Technique / Product       Appropriate Cells

ER Modeling                      D2, D3
Dataflow Diagramming             P2, P3
Structured Analysis              P3, P4
Structured Design                P3, P4
Data Normalization               D3
Code Generation                  P5
COBOL                            P4, P5
Business Systems Planning        D1, P1, N1
Critical Success Factors         D1
Information Engineering          D1, D2, P1, P2

The above is a rough cut and is arbitrary at best. No flames please.

In summary, IS professionals have found the Zachman Framework extremely powerful in aiding communications between residents of different cells within the framework. For example, a COBOL programmer views an information system probably from cell P4 or P5. He sees the system most often as a structure chart. His view of a system is characterized by modules, paragraphs, sections, and I/O calls. A CEO views an information system as somethings that assists him in dealing with objects in D1 and D2, such as profits, customers, and competitors, and the corporate organization and structure. A Data Administrator views the system as a data model. Each cell represents a different
"architectural representation" or set of representations as to what the system is. By making these representations explicit, much can be done to overcome, or at least address, the communication difficulties of translating concept to executing system.

Finally, my question. I have been asked repeatedly by attendees at IBM conferences as to how OO techniques impact on the Zachman framework. Is the Zachman framework even appropriate for an OO environment ? What would the cells of the framework be populated with in an OO approach ? What other kinds of frameworks for information systems are available ? Has anyone done any enterprise modeling based on this, or any other framework, in an object-oriented fashion ?

I will post a summary to the net if there is sufficient interest in this topic.


Ron Schultz

schultz@grebyn.com
614 841-4103
Network Solutions, Inc.
7450 Horizon Drive
Columbus, Ohio  43235




From gaboon!swlabs!uunet!munnari.oz.au!uhccux!ames!think!zaphod.mps.ohio-state.edu!usc!cs.utexas.edu!rutgers!umn-d-ub!phil Sun Mar  4 20:19:34 EST 1990
Article 1017 of comp.software-eng:
Path: stpstn!gaboon!swlabs!uunet!munnari.oz.au!uhccux!ames!think!zaphod.mps.ohio-state.edu!usc!cs.utexas.edu!rutgers!umn-d-ub!phil
>From: phil@umn-d-ub.D.UMN.EDU (Philosophy Dept)
Newsgroups: comp.software-eng
Subject: Re: Reasons why you don't prove your programs are correct
Message-ID: <3236@umn-d-ub.D.UMN.EDU>
Date: 24 Feb 90 23:27:09 GMT
Organization: U of Minnesota-Duluth, Information Services
Lines: 85
Keywords: verification, testing, validation, Charles Martin

Charles Martin's intermperate tirades concerning my position are not even coherent. He belittles what he calls "Fetzer's Dilemma" as irrelevant and insignificant, while simultaneously attacking what he calls "Hoare's Folly" as relevant and significant. Since Fetzer's Dilemma itself was a critique of Hoare's Folly, Martin's position is not even consistent. Apparently, if he criticizes a position, then it is important, but not if someone else does.

My arguments concern whether computer science should be viewed as a branch of pure mathematics or as a branch of applied mathematics. The dilemma that I posed arises from the realization that purely deductive reasoning cannot pro- vide content about preformance for any physical machine, so that if we want content about performance of a physical machine, then purely deductive reason- ing is not enough. I do not deny that correctness proofs can be pursued as a branch of applied mathematics, but their limitations--in principle and in prac- tice--must be clearly defined.

As examples of Martin's selective distortion of the real issues at stake here, observe that, in his message of 20 January 1990 at 15:11:42, he maintains,
"As a practical matter, the issues Fetzer raised in his diatribe passing false- ly as a technical article can be largely ignored, both from the standpoint of engineering and mathematically." Consider, however, what this person would have you ignore (some of the most important consequences of my position):

"If one were to assume the execution of a program could be anticipated with the mathematical precision that is characteristic of demonstrative domains, then one might more readily succumb to the temptation to conclude that de- cisions can be made with complete confidence in the (possibly unpredictable) performance of a complex causal system" (CACM 31/9 (1988), p. 1062). Perhaps this is a matter he wants to ignore, but the rest of us do so at our peril.

In his message of 25 January 1990 at 18:07:15, he states that, "Fetzer's argu- ment is precisely that program proofs are flawed on the face, that no proof can ever give insight or certainty to the behavior of a real program on a real machine, because of the distinction between the mathematical model and the real world." Since part of what he says here is correct (the last part), it gives a specious plausibility to the rest (the first part). The point, as I have now explained in several places, is that, "since program verification cannot guar- antee the performance of any program, it should not be pursued in the false belief that it can--which, indeed, might be entertained in turn as the 'ill-in- formed, irresponsible, and dangerous' dogma that my paper was intended to ex- pose" (CACM 32/3 (1989), p. 288). There is ample evidence that Martin has grasped neither the theoretical nor the practical issues involved here.

In his message of 20 January 1990 at 15:11:42, Martin also maintains that, "these issues had been examined at some length in the formal methods com- munity. Had Dr Fetzer been conversant with the literature, he would have known it; had the reviewers been appropriately chosen, they would have called him on it." He repeatedly implies that I am ignorant, that the referees were incom- petent, and that the editors were irresponsible. Surely others besides Martin and his friends are capable of making such decisions. Indeed, Peter Denning's reply to complaints of this kind stands on its own without requiring endorse- ment from me (CACM 32/3 (1989), pp. 289-290).

I have already explained my position on most of these matters (for example, CACM 32/3 (1989), pp. 288-289). For the benefit of those who cannot read or who do not understand what words mean, let me emphasize a few points that are relevant here.

(1) The position under consideration is Hoare's position, as my article made entirely clear. Hoare is an important figure within the formal methods community. Not only is his position not trivial, it cannot be defended as justifiable or true. Yet it has been highly influential.

(2) Not only does Martin not deny that Hoare has such a position, he even refers to it as "Hoare's Folly", characteristically substituting the use of labels or names for rational argument. Martin even agrees with me that Hoare's position is not acceptable and that program testing is required.

(3) Surely in attacking some specific position, especially one that is well-defined, an author is not obli- gated to consider every possible alternative. Martin seems to be offended that neither I nor the referees nor the editors consulted him about these matters, as though he were some respository of truth about formal methods.

Ler me close with a few personal observations. I have now received letters and email from around the world expressing diverse points of view. Some of these letters and messages have agreed with my position, while others have disagreed. With the possible exception of the "Gang of Ten", however, none of them has de- nied my right to examine these issues or the magazine's prerogatrive to publish it, apart from Charles Martin. His arrogant attitude and misleading contentions are not merely self-aggrandizing and obnoxious. They create the impression that there is a powerful clique within computer science that is unwilling to acknowl- edge the force of rational persuasion.

>From Martin's distorted remarks, I very much doubt that he has bothered to read the real debate over these issues in CACM (March, April, June, July, and Aug- ust) and Notices of the American Mathematical Society (September and December). There are real issues here that the community of the world cannot afford to mis-understand. For reasons such as these, I hope that other members of the Depart-ment of Computer Science at Duke University and of the profession as a whole will not be misled by infantile protestations masquerading as serious argument.

James H. Fetzer




From hsi!uunet!husc6!rutgers!mcnc!duke!romeo!crm Sun Mar  4 20:23:25 EST 1990
Article 1020 of comp.software-eng:
Path: stpstn!hsi!uunet!husc6!rutgers!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: Reasons why you don't prove your programs are correct
Message-ID: <17792@duke.cs.duke.edu>
Date: 25 Feb 90 18:05:54 GMT
References: <3236@umn-d-ub.D.UMN.EDU>
Sender: news@duke.cs.duke.edu
Lines: 342
Keywords: verification, testing, validation, Charles Martin
Summary: what was useful was not original; what was original was not useful;
	 refereeing should keep this from happening in flagship journals.
Posted: Sun Feb 25 13:05:54 1990




Okay, fine.  Let's talk about these things some more.  As long as we're
at it, I hope we can avoid the kind of ad hominem arguments that have
characterized Dr Fetzer's discussion so far.

    Charles Martin's intemperate tirades concerning my position are
    not even coherent.  He belittles what he calls "Fetzer's Dilemma"
    as irrelevant and insignificant, while simultaneously attacking
    what he calls "Hoare's Folly" as relevant and significant.  Since
    Fetzer's Dilemma itself was a critique of Hoare's Folly, Martin's
    position is not even consistent.  Apparently, if he criticizes a
    position, then it is important, but not if someone else does.

I *beg* your pardon.  I believe if you actually read my notes, you
would have found that I referred to "Fetzer's Dilemma" specifically as
being a point that you had made.  I believe, as you do, that expecting
absolute certainty of program verification is foolish; that is why one
of the "myths of verification" I identified in my postings was
"Hoare's Folly", the folly of believing that one can expect a proof of
properties of a program *text* to exactly correspond to the properties
of a realized program.  Furthermore, I made the specific point in my
posting that you had identified this in your article, and that your
article led in part to my understanding of the folly described.  I
believe my exact phrasing was "...but he has a point."

In other words, within those limits, I *agreed* with you.  If what I
said is inconsistent, it can only be because your original points
were also inconsistent.

But, as I say, I only agree with you within limits.  Consider the very
title of the article "Program verification: the very idea".  The common
idiomatic meaning of the phrasing "the very idea" is understood as the
very idea so-and-so is foolish, meaningless, fallacious, nonsensical.

("Marry my daughter?  Hmph. The very idea.")


    My arguments concern whether computer science should be viewed as
    a branch of pure mathematics or as a branch of applied
    mathematics.

An excellent distinction that I believe was first stated in a letter
responding to your first article.  As an *engineer* by orientation,
rather than a pure scientist or mathematician, I am most concerned with
the applied aspects although I find the pure math aspects attractive and
aesthetically pleasing.

    The dilemma that I posed arises from the realization that purely
    deductive reasoning cannot provide content about preformance for
    any physical machine, so that if we want content about performance
    of a physical machine, then purely deductive reasoning is not
    enough.  I do not deny that correctness proofs can be pursued as a
    branch of applied mathematics, but their limitations--in principle
    and in practice--must be clearly defined.

    As examples of Martin's selective distortion of the real issues at
    stake here, observe that, in his message of 20 January 1990 at
    15:11:42, he maintains, "As a practical matter, the issues Fetzer
    raised in his diatribe passing falsely as a technical article can
    be largely ignored, both from the standpoint of engineering and
    mathematically."  Consider, however, what this person would have
    you ignore (some of the most important consequences of my
    position): "If one were to assume the execution of a program could
    be anticipated with the mathematical precision that is
    characteristic of demonstrative domains, then one might more
    readily succumb to the temptation to conclude that decisions can
    be made with complete confidence in the (possibly unpredictable)
    performance of a complex causal system" (CACM 31/9 (1988), p.
    1062).  Perhaps this is a matter he wants to ignore, but the rest
    of us do so at our peril.

Please consider carefully the phrasing that you quoted: "As a practical
matter, the issues... can be *largely* ignored...."

What I stated, apparently fairly clearly, was that in *many* areas of
programming, in many problem domains, and in many contexts, the issues
of correspondence between the mathematical model and the real-world
physical realization can be ignored.  If we took a poll, I feel certain
that most fluent English speakers would agree that the phrasing I used
makes a careful effort NOT to claim absolutely that it can ALWAYS be
ignored.

Consider for example accounting programs: they are not life critical,
they are not performance critical, they do not have real-time demands
other than sufficient throughput.  They usually have no asynchrony.
These programs are programs in which we are justified in assuming that
our mathematical model is "close enough" to the real world.  The hazard
of being incorrect is small, the cost of complete testing is extremely
high, and the likelyhood that an error caused by an unpredictable event
(say a cosmic-ray alpha particle) will cause damage is extremely slight.

In life-critical systems, the situation is different.  But
life-critical systems are being built today, WITHOUT a mathematical
basis, or worse, using mathematical methods based on assumptions that
we KNOW experimentally are flawed (cf.  n-version programming, and
Avizenis' claims for it.)  I doubt that either Dr Fetzer or I believe
that this situation is good, or that applying mathematical methods has
NO effect on correctness.  (Certainly there is real evidence from
experiments that it does; if someone wants to believe otherwise in the
face of experimental evidence, I see it as a religious problem.)

    In his message of 25 January 1990 at 18:07:15, he states that,
    "Fetzer's argument is precisely that program proofs are flawed on
    the face, that no proof can ever give insight or certainty to the
    behavior of a real program on a real machine, because of the
    distinction between the mathematical model and the real world."
    Since part of what he says here is correct (the last part), it
    gives a specious plausibility to the rest (the first part).  The
    point, as I have now explained in several places, is that, "since
    program verification cannot guarantee the performance of any
    program, it should not be pursued in the false belief that it
    can--which, indeed, might be entertained in turn as the 'ill-in-
    formed, irresponsible, and dangerous' dogma that my paper was
    intended to expose" (CACM 32/3 (1989), p. 288).  There is ample
    evidence that Martin has grasped neither the theoretical nor the
    practical issues involved here.

I'd say, rather, that there is ample evidence that Fetzer has neither
read nor understood what I said.  As I pointed out above, I agreed, and
stated rather pithily, that mathematical reasoning alone was
insufficient, and that believing it was sufficient is folly.  Hoare's
folly.

However, MY point, and the point made by many others who have responded
to Dr Fetzer, is that program verification can provide a sort of
*relative* guarantee: the likelyhood of defects in programs constructed
with proofs is very much smaller.  Under the assumption that a trusted
stack has been constructed, the number of defects observed during
execution should approach zero, limited only by age or manufacturing
failures.

Furthermore, this relative guarantee is the ONLY kind of guarantee that
ANY engineering technique (or, for that matter, any scientific
exploration in the Logik der Forschung sense) can provide.  All
physics and all engineering is based on the implicit or explicit
assumption that the mathematical model, the abstraction, that is
manipulated is sufficiently close to the behavior of the real world.
Engineers in other fields, such as civil engineers, regularly build
life-critical systems like bridges based on this assumption; the "formal
methods" of civil engineering consistently assume that the behavior of
the system is sufficiently well modelled by the behavior of the
mathematical models.

My best estimate -- I'm currently hoping to experimentally verify this
-- is that defects caused by errors in translation between
specifications and eventual executable code account for about 99.9
percent of all defects.  I make the distinction here -- not an uncommon
one -- between *defects* as places where the code does not correctly
realize what was specified, and specification errors, where what was
built turns out not to meet the user's needs or desires, but does match
the specification.

In other words, only about one defect in a thousand occurs because the
compiler was wrong, the run-time system was buggy, the power glitched,
or the processor had an intermittent failure.  Does Dr Fetzer claim that
verification gives NO guarantee that these sorts of translation errors
have been eliminated?  If so, I think he's on very shaky ground,
logically; program proofs of this sort are precisely mathematical
theorems about string-to-string transformations under a formal system.

My experimental evidence is that simply performing hand proofs of the
design lowers the defect rate to the point where it is not observable on
the scale to which I've been able to apply the methods, building real
systems programs.  Further, my experimentation suggests that hand-proven
programs serve as good firewalls which trap errors in unproven codes.  I
hope to experimentally verify this on bigger problems; there is already
substantial experimental validation of similar techniques in the IBM
cleanroom experiments.  But the experimental evidence is already pretty
clear: program verification has a substantial effect in removing defects
from codes.

    In his message of 20 January 1990 at 15:11:42, Martin also
    maintains that, "these issues had been examined at some length in
    the formal methods community. Had Dr Fetzer been conversant with
    the literature, he would have known it; had the reviewers been
    appropriately chosen, they would have called him on it."  He
    repeatedly implies that I am ignorant, that the referees were
    incompetent, and that the editors were irresponsible.  Surely
    others besides Martin and his friends are capable of making such
    decisions.  Indeed, Peter Denning's reply to complaints of this
    kind stands on its own without requiring endorsement from me (CACM
    32/3 (1989), pp. 289-290).

No, I didn't imply that you were ignorant, I stated it pretty baldly,
and will say it again.  By your own statements, your article was based
on arguing against the foolish assertion of Hoare that testing was
completely unnecessary.  However, this assertion is NOT mainstream in
the verification literature.  In fact, I don't know of anyone who really
believes it in the verification community.  But while I don't know of
anyone in the verification community who really believes it, I am aware
of a number of different projects specifically BASED on recognition that
program verification is only useful to the extent that the next layer
down is also proven.  I've noted these before: consider the CLI trusted
stack, the analogous work at Odyssey, or Warren Hunt's verified
microprocessor work.

These projects would not be funded, active, or even interesting areas
of research IF it were not recognized that the issues of correspondence
between mathematical models and their physical realizations is a
significant problem.  But your article doesn't refer to this research,
or the areas in the literature where these problems are discussed.  In
fact, your article refers to no publications in the specific
verification literature that I recall.  (If I am wrong, remind me of the
reference.)

To have failed to reference or consider	a major body of literature on
the very topic of your paper is a failure in scholarship that wouldn't
be tolerated in "Introduction to Composition" classes at Duke; I see no
reason to feel kindly about it in a "technical" publication.  However,
this isn't your field; if I wrote a paper which re-invented something
that Schopenhauer or Kant said, I'd be (marginally) proud of myself.

If I wrote a philosophy article which simply rehashed some old result
in, say, metaphysics, and it were published in a flagship refereed
journal of philosophy, I imagine you would feel it a scandal and a
shame.

The reason that journals are refereed is to catch this kind of error.
But your article wasn't caught in the refereeing process.  The article
was apparently assigned to referees who were themselves not conversant
with the appropriate literature.  Your article was then published.  I
think this is a scandal and a shame.

    I have already explained my position on most of these matters (for
    example, CACM 32/3 (1989), pp. 288-289).  For the benefit of those
    who cannot read or who do not understand what words mean, let me
    emphasize a few points that are relevant here.

    (1) The position under consideration is Hoare's position, as my
    article made entirely clear.

I wouldn't say it was entirely clear, since this was a topic of some
discussion for some time afterwards.  Further, you did not reference
papers in which Hoare says "testing not required" most clearly; you
did reference papers in which Hoare makes explicit his simplifying
assumption that we can ignore hardware errors.  It at least gave the
impression you had not even completely read the papers you cited.

    Hoare is an important figure within the formal methods community.
    Not only is his position not trivial, it cannot be defended as
    justifiable or true.  Yet it has been highly influential.

I doubt that even Hoare really believes "Hoare's Folly" -- careful
reading of his publications will certainly reveal plenty of times when
he adds a caveat to the effect that all these arguments assume the
hardware correctly executes the program.

In any case, your article is not titled "Hoare's assertion about
testing: the very idea".  It refers to the entire field of program
verification, and assumes that Hoare speaks for that community in all
things.  You claim that Hoare has been very influential, and you are
correct; however, in this area, the field had already advanced to the
point that we recognized Hoare was also WRONG, at least on this topic.
Unfortunately, you apparently did not pursue your research far enough
to recognize that the issue you raised was one that was already common
knowledge within the community.

    (2) Not only does Martin not deny that Hoare has
    such a position, he even refers to it as "Hoare's Folly",
    characteristically substituting the use of labels or names for
    rational argument.

Sure, sure, and when I refer to Russell's Paradox, that's also
substituting labels or names for rational argument.  I stated a point,
then labelled it "Hoare's Folly" because (a) Hoare said it, and (b) it's
foolish.

Really, this is the worst sort of "argument by intimidation" and no
intellectually responsible philosopher would use it consciously.  I
assume therefore it was a slip.

    Martin even agrees with me that Hoare's position is not acceptable
    and that program testing is required.

In other words, I agreed with you: what are you bitching about?  I
didn't agree with you sufficiently strongly?  Or that I dared to note
that you hadn't done your homework?

    (3) Surely in attacking some specific position, especially one
    that is well-defined, an author is not obligated to consider
    every possible alternative.  Martin seems to be offended that
    neither I nor the referees nor the editors consulted him about
    these matters, as though he were some repository of truth about
    formal methods.

Well, in a word, yes, that is pretty much what offended me.  Neither
you, nor the editors, nor the referees consulted me OR ANYONE ELSE WHO
WAS CONVERSANT WITH THE FIELD.  I'm not one of the major voices in the
area; many of the major voices in the area responded along the same
lines as I have.  (One of the major researchers in the field, John
Rushby, is an editor of the journal, and clearly the one to whom the
article should have been assigned.) It was a poor job of scholarship
and a poor job of refereeing that served the ACM very badly.

    Let me close with a few personal observations.  I have now
    received letters and email from around the world expressing
    diverse points of view.  Some of these letters and messages have
    agreed with my position, while others have disagreed.  With the
    possible exception of the "Gang of Ten", however, none of them has
    denied my right to examine these issues or the magazine's
    perogative to publish it, apart from Charles Martin.

For someone who inserts a sneer about those who read or understand what
words mean, you're playing pretty fast and loose here, bucko.

I did not claim that you had no right to examine these issues, only that
your examination recreated a well-known problem and claimed it as your
own creative effort.  It may have been original from your point of view,
but it did not raise an unknown issue, nor did it clarify points that
were not already well known.  You did not sufficiently explore the
existing literature; this is a sophomoric failure.

Nor did I claim that the magazine could not exercise the prerogative to
publish it.  (Nor, actually, did the "Gang of Ten".)  What they claimed,
and what I claim, is that for such a massive failure of scholarship to
have been promulgated in what is supposed to be -- what has in the past
been -- a major scholarly publication which claims to be the flagship
publication of the ACM was reprehensible and censurable.

You, Dr Denning, and CACM, can publish anything you like: I am not
obliged to respect poor work just because it is your right to publish
it.

    His arrogant attitude and misleading contentions are not merely
    self-aggrandizing and obnoxious.  They create the impression that
    there is a powerful clique within computer science that is unwilling
    to acknowledge the force of rational persuasion.

I'm sorry, I must have missed a point here; was it "his arrogant
attitude" or "self-aggrandizing and obnoxious" that was the "rational
persuasion"?

I find I can't respond to your personal remarks adequately without
resorting to the sort of counter-insult that I gave up years ago.
"You're one." "You're another!"  "So's your mama!" doesn't seem to me
to characterize the search for truth -- or the love of knowledge, which
is what philosophy is supposed to mean.


Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!seismo!ukma!tut.cis.ohio-state.edu!mailrus!iuvax!rutgers!umn-d-ub!phil Sun Mar  4 20:39:12 EST 1990
Article 1021 of comp.software-eng:
Path: stpstn!hsi!uunet!seismo!ukma!tut.cis.ohio-state.edu!mailrus!iuvax!rutgers!umn-d-ub!phil
>From: phil@umn-d-ub.D.UMN.EDU (Philosophy Dept)
Newsgroups: comp.software-eng
Subject: What's Right and Wrong About Charles Martin's Position
Message-ID: <3237@umn-d-ub.D.UMN.EDU>
Date: 25 Feb 90 22:38:16 GMT
Organization: U of Minnesota-Duluth, Information Services
Lines: 90
Keywords: correctness, validation, testing, program verification

After reviewing your USENET messages of the last month or so, I want to state,
at least in part, why I find your public behavior to be so obnoxious.  In the
process, perhaps you will be able to appreciate why I am reacting so strongly.

(1) In your article 1392, for example, you responded to Robert Munch with der-
ison and abuse. "Oh, crap", you say, belittling his argument.  "This is the
most assinine (sic) argument that I've seen proposed."  Messages like this one
lead me to suggest that you really should not call people names that you cannot
even spell.  It creates the impression that you are not merely abusive but also
asinine.  And I know you would not want to be accused of relying on ad hominems.

(2) In your article 1338, for another, you refer to my publication in CACM as a
"diatribe".  I wonder if you have consulted a dictionary lately as to the mean-
ing of that word.  According to Webster's New World Dictionary (1988), "diatribe" means a bitter, abusive criticism or denunciation.  This is a better descrip-
tion of your message than it is of my article.  It would be rather difficult for
you to find even one passage of my article that supports you.  Go back and take
a look.  And you are doing all this over a world-wide net!

(3) In your article 1427, for a third, you describe my position as that of a
"straw man".  As I have previously observed, you implicitly contradict yourself
in doing so, since you combine your diatribe against me with an assault on Pro-
fessor Hoare (who is a far more decent human being than some of his critics).
But your bitter denunciation of my CACM article only makes sense if Hoare is
right!  Otherwise, I am right in criticizing him and you are wrong in criticiz-
ing me.  Why is this something that you are unable to understand?

(4) Your problem, I believe, is confusing my critique of a specific position on
program verification with an attack upon the program verification community.  I
was attacking Hoare's position; I was not attacking the community itself.  Look
at the first sentence, the first paragraph, and the first few pages I wrote. You
seem to be unable to distinguish between the specific position that was under
consideration and the heterogeneous views of a diverse community.  Surely I am
entitled to critique Hoare's position without being responsible for also assess-ing whatever 1001 other voices may have said in other places.

(5) I just know how proud you would have been during the debate at ACM CSC 90
when I quoted some of your heroes from Computational Logic.  Yes, I cited J.
Strother Moore himself, for example, when he remarked, "In a verified system
one can make the following startling claim:  if the gates behave as formally
modeled, then the system behaves as specified" (Journal of Automated Reason-
ing 5/4 (1989), p. 409).  As I observed, this should come as welcome news to
the sports car manufacturers, the cruise missile commanders, and the space
craft managers. (It is a startling claim, especially considering its source!)

(6) I also quoted another luminary from the field whom you must also admire,
William R. Bevier, when he observed, "The key to our approach to systems ver-
ification is the use of formally-defined abstract machines" (Journal of Auto-
mated Reasoning 5/4 (1989), p. 427).  These machines, he explains, are explic-
itly defined as functions in Boyer-Moore logic, which you endorse.  But the
problem remains of establishing that our abstract models are adequate to the
physical world that they are intended to represent.  This leads directly to
Fetzer's Dilemma by a route that is too obvious to need review.

(7) I think your problem is one of public relations.  You are evidently upset
because I criticized a position to which I referred as "program verification".
You admit that the position I have criticized deserved to be criticized, yet
you insist that I am attacking a "straw man".  This leads me to ask you to go
back and reread the Viewpoint by Dobson and Randell (CACM 32/4 (1989), pp.
420-422).  They believe that your field suffers from a gap between its public
image and its private reality.  They describe the situation that you are in.

(8) The questions at stake here, by the way, are not technical points of pro-
gram verification.  They are problems in the theory of knowledge.  It would be
far more appropriate, I believe, if you were to simply admit that what I have
said is correct about the positon that I have attacked, but that you--and many
others--represent other positions.  Surely the opinions of other commentators,
including Dobson and Randell, suggest that these other views, which you are now
so eager to advance, have not been much in evidence until this debate arose.

(9) A crucial point.  Yet another interesting equivocation enters the picture
at this juncture.  The team of Ardis and Gries, who were presumably defending
formal methods in software engineering at ACM CSC 90, explicitly DISAVOWED the
thesis that, "Formal methods of program verification should be the principal
methodology of software engineering in each of its phases for assessing the
reliability of software systems."  If you agree with them and with me in deny-
ing that this proposition is true, perhaps we are not that far apart, after all.

(10) When you express your views on these matters in the future, however, it
might be very helpful to carefully distinguish between "formal proofs", "proof
sketches", and mere "deductive reasoning".  No one would deny that deductive
reasoning is indispensable to program construction, but that is hardly an im-
portant position.  I know that you think that the vast literature to which you
constantly refer makes these things completely clear.  I do not believe there
exists any single unified and coherent position which your "community" shares.

Since I have consistently maintained that program verifiction can exist as a
branch of applied mathematics (CACM 31/9 (1988), pp. 1059-1060; CACM 32/8 (19-
89), pp. 920-921; and Notices of the AMS 36/10 (1989), p. 1353), it is very
hard to see why you are carrying on such an abusive tirade. So quit bitching.

James H. Fetzer




From hsi!uunet!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!uflorida!mephisto!mcnc!duke!romeo!crm Sun Mar  4 20:40:51 EST 1990
Article 1034 of comp.software-eng:
Path: stpstn!hsi!uunet!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!uflorida!mephisto!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: What's Right and Wrong About Charles Martin's Position
Message-ID: <17811@duke.cs.duke.edu>
Date: 26 Feb 90 17:07:12 GMT
References: <3237@umn-d-ub.D.UMN.EDU>
Sender: news@duke.cs.duke.edu
Reply-To: crm@romeo.UUCP (Charlie Martin)
Organization: Duke University CS Dept.; Durham, NC
Lines: 74
Keywords: correctness, validation, testing, program verification


In response to at least one question, the reason I continue arguing with
Dr Fetzer is  (1) I've got more dedication to the old business about the
pursuit of truth than I do sense, and (2) I really do think the issues
are important and significant.

Now, I'd like to summarize my position briefly (believe it or not):

(1) I recognize the distinction Dr Fetzer makes between deductive
    knowledge and scientific or inductive knowledge.  He was right about
    the distinction, as I've said all along.  However, it was not a new
    idea in the community or a new distinction.

(2) Hoare has said in print that once a program is proven tests are no
    longer needed.  (See e.g. pg 319 of Hoare and Jones _Essays in
    Computing Science_, in the paper "Programming is an Engineering
    Profession".)  I believe this is folly.

(3) However, I do not believe that Dr Fetzer's paper and its publication
    represented good or responsible scholarship, for the following
    reasons:

    (a) It presented no new information.  The issue he drew was one that
    is well-known in the literature.  It makes no mention of other
    considerations of the same problem is widely-accessible
    publications.  (examples follow.)

    (b) After several re-readings, it still appears to me to state that
    program verification per se is fatally flawed, not just Hoare's
    position with reference to testing.  My previous posting notes
    several reasons why I believe it gives that impression.

I promised examples.  Here are two, drawn from the books I can reach
with my left hand from my desk.

>From "Mechanical proofs about computer programs", Donald I Good, in
Hoare and Shepherdson _Mathematical Logic and Programming Languages_, pp
70-71. (Prentice-Hall, 1985)

   "However, there are several reasons why a program that has been
    proved within the Gypsy environment may not behave exactly as
    expected.  First, the formal specifications may not describe exactly
    the expected behavior of the program.  Secondly, the formal
    specification may not describe all of the aspects of program
    behavior.  Thirdly, invalid lemmas may have been assumed without
    proof. Finally, either the verification environment, the compiler,
    the Gypsy run-time support or the hardware might malfunction.

    ....These sources of error are cited not to belittle the potential
    of a scientific basis for software engineering but to make clear
    that the formal, mathematical approach offers no absolutes."

>From _Formal Methods of Program Verification and Specification_, Berg et
al., pg 36.  (Prentice-Hall 1982)

   "Another limitation arises from the necessarily abstract view we take
    of program semantics and the corresponding distance that exists
    between a formally verified program and the physical manifestation
    of its behavior. .... The functioning of the hardware is a physial
    process, subject to the laws of nature.  Ultimately, then, the
    confidence we have in the correctness of a program depends upon our
    belief that the proper physical actions will result.  As a result,
    there is no "bottom line" at which we can show that executing a
    program is guaranteed to produce the desired effect.  The best we
    can do is show that programs are correct with respect to their
    execution on some abstract machine, which we model with a deductive
    system.  The _truth_, as opposed to the validity, of the theorems in
    the deductive system depends upon the proper implementation of the
    program semantics--that is, of the abstract machine--by hardware and
    software."

I believe these two quotes show that Dr Fetzer's position was well-
understood and widely recognized long before his paper was published.
Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!aplcen!uakari.primate.wisc.edu!uflorida!mephisto!mcnc!duke!romeo!crm Sun Mar  4 20:42:26 EST 1990
Article 1035 of comp.software-eng:
Path: stpstn!hsi!uunet!aplcen!uakari.primate.wisc.edu!uflorida!mephisto!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: What's Right and Wrong About Charles Martin's Position
Message-ID: <17813@duke.cs.duke.edu>
Date: 26 Feb 90 17:40:06 GMT
References: <3237@umn-d-ub.D.UMN.EDU>
Sender: news@duke.cs.duke.edu
Lines: 85
Keywords: correctness, validation, testing, program verification
Posted: Mon Feb 26 12:40:06 1990



Quick repsonses:

(1) It's commonly considered good form either to completely quote, or to
indicate elision by ellipsis.  In this case, I beleive what I wrote was
"Oh, crap.  I'm sorry, but this is the most assinine...."  I apologize
abjectly for sullying the discussion with the word crap, even though I
apologized in line.

W.r.t. "assinine".  Oh my goodness, I made a spelling mistake.

(2) "Diatribe: a bitter and abusive criticism or denunciation;
invective."  I calls'em as I sees'em.  No apology.

(3) "Straw man: one set up as an opponent to be easily defeated or
refuted."  As I've shown, the position you refuted is one widely known,
understood, and widely discussed in the literature to be false.  You
then proceeded to refute it.  This is a straw man.  No apology.

My criticism has always been not that you are wrong, but that your
point is as widely known as the fact that ice melts.  Taken at the
epistomological level you apparently intend, the same argument calls
into question all applied math that is used to predict physical law, or
else can be summarized as "our predictions are only as good as our
understanding."

(4) You may be right.  If so, I'm not alone in this.  You could have
made the fact that you intended only to disagree with Hoare's position
much clearer, as evidenced by the fact that *many* people have
misunderstood you the same way.

(5) I don't think I follow your point.  I have no doubt that J said
that; I agree with it, and it appears to agree with you substantial
point.  But when I took intro philosophy courses years ago, I learned
something they called the "principle of substitution": that one could
evaluate a questionable argument by recasting it using other words from
the approproate grammatical classes.  For example, recasting in terms of
a sports-car manufacturer:

   "In a mathematically analyzed engine, one can make the following
    startling claim: if the components behave as modelled, then the
    engine will behave as predicted."

Do you disagree with this statement?  It strikes me as tautological.

(6) If you haven't grasped by now that I am completely aware of the
    distinction between formal system and physical realization, nothing
    further I can say will convince you.  I notice you seem to have
    given up the position that the name "Fetzer's Dilemma" was
    derogatory.

(7) It is the fact that verification has public relations problems; I
    don't think you have been very helpful with them.  But many people
    believe in astrology; it isn't astronomy's major purpose to dissuade
    them.

(8) As my previous posting demonstrated, they have been widely known; so
    widely known that I doubt many people thought there was any point in
    rehashing them.  It is just unfortunate that you were not widely
    enough read in the literature to realize this.

(9) You know philosophy better than to actually think that what you
    state is an equivocation.  It is a denial of a thesis that was badly
    stated.

    "Formal methods of program verification should be the principal
    methodology of software engineering in each of its phases for
    assessing the reliability of software systems."

    You should have heard the moaning in the local office of CL inc when
    this thesis was published.  Who came up with the phrasing?

    In any case, I do agree that this is an unsupportable thesis.  Had
    it been stated "Formal methods should be the principle
    methodology...." I would be closer to agreeing with it, as I would
    expect that included Markov methods and queueing theory.

(10) I doubt that in as new and diverse a field as verification, you
    could get agreement on when to have lunch, much less on technical
    details.  But I suspect that the point "corrrectness of the final
    system depends on how well our model of the realization models
    physical reality" comes as close as any.  As familiarity with the
    literature would have revealed.
Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!aplcen!uakari.primate.wisc.edu!zaphod.mps.ohio-state.edu!usc!cs.utexas.edu!rutgers!umn-d-ub!phil Sun Mar  4 20:44:00 EST 1990
Article 1028 of comp.software-eng:
Path: stpstn!hsi!uunet!aplcen!uakari.primate.wisc.edu!zaphod.mps.ohio-state.edu!usc!cs.utexas.edu!rutgers!umn-d-ub!phil
>From: phil@umn-d-ub.D.UMN.EDU (Philosophy Dept)
Newsgroups: comp.software-eng
Subject: Some Issues Lying Beneath a Recent Exchange of Views
Message-ID: <3241@umn-d-ub.D.UMN.EDU>
Date: 26 Feb 90 17:24:13 GMT
Organization: U of Minnesota-Duluth, Information Services
Lines: 75
Keywords: verification, correctness, validation, Charles Martin

hal and others have observed that my exchange with Charles Martin has
become been something less than a model of academic discourse.  I want
to make an attempt to salvage something worthwhile from this "debate"
that might possibly shed some light on the real issues that lie below
the surface here.  In his article 2961, for example, Martin acknowl-
edges the importance of the distinction between pure and applied math-
ematics.  He also endorses the difference between absolute and relative
forms of support (proofs, demonstrations, verifications, what have you).
Whether or not he identifies the sources of these distictions relative
to this debate is less important than that these distinctions be ack-
nowledged as relevant--even fundamental--to the questions considered.

Much of the distress in the program verification community seems to me
to have arisen from a failure to adequately distinguish between vari-
ous different positions that are possible here.  I freely admit that I
was less concerned with drawing distinctions between various positions
at the time than I was to assess those that concerned me.  Let me there-
fore attempt to make a modest contribution toward that end by specify-
ing several different positions that no doubt deserve to be separated,
where some concern logic, others methodology, and others verification:

Positions of LOGIC:

(T1) Formal proofs of program correctness can provide an absolute, con-
clusive guarantee of program performance;

(T2) Formal proofs of program correctness can provide only relative, in-
conclusive evidence concerning program performance.

Positions of METHODOLOGY:

(T3) Formal proofs of program correctness should be the exclusive method-
ology for assessing software reliability;

(T4) Formal proofs of program correcness should be the principal method-
ology for assessing software reliability;

(T5) Formal proofs of program correctness should be one among various
methodologies for assessing software reliability.

Positions on VERIFICATION:

(T6) Program verifications always require formal proofs of correctness;

(T7) Program verifications always require proof sketches of correctness;

(T8) Program verifications always require the use of deductive reasoning.

My original paper was devoted to the distinction between (T1) and (T2) as
matters of logic.  I contend that (T1) is false but that (T2) is true.  In
relation to questions of methodology, I also maintain that (T3) is false,
and at the ACM CSC 90 debate, I argued further that (T4) is also false.  I
have no doubt, however, that (T5) is true.  With respect to verificaiton in
the broad sense, it is my view that (T6) is clearly false and that (T8) is
clearly true, but that the truth-value of (T7) is subject to debate.  Much
might hinge upon whether these proof sketches had to be written down, could
be merely thought-through, or whatever.  In the former case, (T7) becomes
close to (T6), in the latter case, closer to (T8).  However these matters
may finally be decided, these seem to be the principal distinctions which
need to be drawn to get at the issues that lie beneath the surface here.

Let me conclude with one further gesture of agreement with Charles Martin.
He acknowledges that there are important differences between accounting
procedures, for example, and life-critical situations.  On this we could
not more strongly agree.  The greater the seriousness of the consequences
that would ensue from making a mistake, the greater our obligation to in-
sure that it is not made.  This suggests that the role for prototyping and
testing increases dramatically as life-critical tasks come into play.  We
can afford to run a system that has only been formally assessed when the
consequences of mistakes are relatively minor (Nintendo games, for example),
but not when they are serious (radiation therapy, for example).  So it be-
comes a matter of how confident we need to be that a system will perform
as it is intended to perform.  We cannot know without testing the system.

James H. Fetzer




From hsi!uunet!cs.utexas.edu!rutgers!mcnc!duke!romeo!crm Sun Mar  4 20:44:58 EST 1990
Article 1036 of comp.software-eng:
Path: stpstn!hsi!uunet!cs.utexas.edu!rutgers!mcnc!duke!romeo!crm
>From: crm@romeo.cs.duke.edu (Charlie Martin)
Newsgroups: comp.software-eng
Subject: Re: Some Issues Lying Beneath a Recent Exchange of Views
Message-ID: <17828@duke.cs.duke.edu>
Date: 27 Feb 90 01:51:04 GMT
References: <3241@umn-d-ub.D.UMN.EDU>
Sender: news@duke.cs.duke.edu
Reply-To: crm@romeo.UUCP (Charlie Martin)
Organization: Duke University CS Dept.; Durham, NC
Lines: 96
Keywords: verification, correctness, validation, Charles Martin
Summary: Convergence at last!
Posted: Mon Feb 26 20:51:04 1990

I just want to say that this summary Dr Fetzer has posted seems a really
good summary of the substance of the debate; I find I can disagree with
effectively none of it.

In article <3241@umn-d-ub.D.UMN.EDU> phil@umn-d-ub.D.UMN.EDU (Philosophy Dept) writes:
..Positions of LOGIC:
..
..(T1) Formal proofs of program correctness can provide an absolute, con-
..clusive guarantee of program performance;
..
..(T2) Formal proofs of program correctness can provide only relative, in-
..conclusive evidence concerning program performance.
..
..Positions of METHODOLOGY:
..
..(T3) Formal proofs of program correctness should be the exclusive method-
..ology for assessing software reliability;
..
..(T4) Formal proofs of program correcness should be the principal method-
..ology for assessing software reliability;
..
..(T5) Formal proofs of program correctness should be one among various
..methodologies for assessing software reliability.
..
..Positions on VERIFICATION:
..
..(T6) Program verifications always require formal proofs of correctness;
..
..(T7) Program verifications always require proof sketches of correctness;
..
..(T8) Program verifications always require the use of deductive reasoning.

..My original paper was devoted to the distinction between (T1) and (T2) as
..matters of logic.  I contend that (T1) is false but that (T2) is true.

Agreed.

..In
..relation to questions of methodology, I also maintain that (T3) is false,
..and at the ACM CSC 90 debate, I argued further that (T4) is also false.  I
..have no doubt, however, that (T5) is true.

Also agreed.
(I'd still like to know, someday, who came up with the thesis statement
you all started with.  I'd have hated to be Ardis and Gries.)

..With respect to verificaiton in
..the broad sense, it is my view that (T6) is clearly false and that (T8) is
..clearly true, but that the truth-value of (T7) is subject to debate.  Much
..might hinge upon whether these proof sketches had to be written down, could
..be merely thought-through, or whatever.

Absolutely.  My use of hand-proof of programs is somewhere in the middle
of the T6..T8 spectrum, attempting to present a "literate program" that
is comparable with good mathematical prose in level of detail and depth
of argument.  One of the unfortunate things with most mechanical theorem
provers is that the level of proof needed is so much more complex
(deeper, less abstract) that it's kind of hard to follow the proofs at
all.  As I've said, I think this technique seems very promising, but more
experimentation is needed.  (Funding agencies please note!)

(One problem is the question of what "formal proof" means.  If I were
feeling disputatious (:-)) I'd argue that a hand proof checked by
several others *is* a formal proof.  But let's not start that debate
'till next week, okay?)

..
..Let me conclude with one further gesture of agreement with Charles Martin.
..He acknowledges that there are important differences between accounting
..procedures, for example, and life-critical situations.  On this we could
..not more strongly agree.  The greater the seriousness of the consequences
..that would ensue from making a mistake, the greater our obligation to in-
..sure that it is not made.  This suggests that the role for prototyping and
..testing increases dramatically as life-critical tasks come into play.  We
..can afford to run a system that has only been formally assessed when the
..consequences of mistakes are relatively minor (Nintendo games, for example),
..but not when they are serious (radiation therapy, for example).  So it be-
..comes a matter of how confident we need to be that a system will perform
..as it is intended to perform.  We cannot know without testing the system.

I'm with you here, in spades.  Especially since the thing that formal
methods usually give little insight into is the APPROPRIATENESS of the
specifications; does it REALLY do what we want?

For that reason, I question whether only formal assessment is EVER
appropriate.  The Nintendo game analogy is an apt one: while we can
formally get quite good assurance that our program realizes the program
for a game that was formally specified (formally specifying a game
program like trek might make an interesting exercise, at that... hmmm)

anyway, while we can get good assurance of the correspondence between
the formally-specified game and the realization, this formal assurance
tells us little about the "goodness" of the human-machine interface;
however, the user's satisfaction with the game relies heavily on this
very "goodness" as perceived.
Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)




From hsi!uunet!dino!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!m.cs.uiuc.edu!p.cs.uiuc.edu!johnson Sun Mar  4 20:45:39 EST 1990
Article 1058 of comp.software-eng:
Path: stpstn!hsi!uunet!dino!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!m.cs.uiuc.edu!p.cs.uiuc.edu!johnson
>From: johnson@p.cs.uiuc.edu
Newsgroups: comp.software-eng
Subject: Re: Some Issues Lying Beneath a Recent
Message-ID: <102100003@p.cs.uiuc.edu>
Date: 28 Feb 90 17:31:37 GMT
References: <3241@umn-d-ub.D.UMN.EDU>
Lines: 74
Nf-ID: #R:umn-d-ub.D.UMN.EDU:3241:p.cs.uiuc.edu:102100003:000:3310
Nf-From: p.cs.uiuc.edu!johnson    Feb 27 14:22:00 1990


>(T1) Formal proofs of program correctness can provide an absolute, con-
>clusive guarantee of program performance;

>(T2) Formal proofs of program correctness can provide only relative, in-
>conclusive evidence concerning program performance.

The problems with these statements is that T1 is not ~T2.
NOTHING provides an absolute, conclusive guarentee of program
performance.  Certainly testing doesn't.  However, words like
"inconclusive" are loaded with hidden meaning, and I don't
want to waste my time doing experiments if I know that they
will be inconclusive.  I want to conclude something!  As a matter
of fact, program verification lets you conclude that your program
does not have dumb coding errors.  Thus, I don't buy either T1 or T2.

>(T3) Formal proofs of program correctness should be the exclusive method-
>ology for assessing software reliability;

>(T4) Formal proofs of program correcness should be the principal method-
>ology for assessing software reliability;

>(T5) Formal proofs of program correctness should be one among various
>methodologies for assessing software reliability.

I don't agree with T3, T4, or T5, either.  Program verification has
nothing to do with assessing software reliability.  Program verification
make your software more reliable, though, but it doesn't help you measure
the reliability, and "assess" means measure to me.


"So it becomes a matter of how confident we need to be that a system will
perform as it is intended to perform.  We cannot know without testing the
system."

Djikstra: Testing never proves the absence of faults,
          it only shows their presence.

I usually disagree with Djikstra, but this quote of his seems obviously
true.  However, I will counter with a quote of my own

Johnson: If it hasn't been tested then it doesn't work.

Specifications need testing more than anything else.  Prototyping
is a way of testing specifications.  Program verification doesn't
help specifications very much.  Instead, it helps ensure that
the code that you produce matches the specification.  Thus, testing
and verification play complementary roles.

However, there are cases where program verification is much better
than testing at showing that a system behaves as expected.  For
example, a couple of years ago there were some people who were
killed because an X-ray machine emitted much more power than it
should have.  The problem was in supposedly in the user interface;
the operator missed a decimal point or something and specified
too large a number.  People claimed that the user interface should
have caught this.  Maybe, but if I had been building the system and
I had been told that NEVER, under ANY circumstances, should the
machine emit more than X units of power, then I would have found the
one subroutine that turned on power, and written it as

EmitPower(power) {
  if power < X then
      do lots of low level stuff
}

Of course, you would have to test that the low level stuff really
implemented the high level abstraction.  However, after that you
would KNOW that the program would never ask for the fatal power.
Naturally, if the < operation on the computer breaks, then a
patient might still be zapped, but that is a lot less likely than
operator mistake.

Ralph Johnson -- University of Illinois at Urbana-Champaign




From hsi!uunet!jarthur!uci-ics!nancy Sun Mar  4 20:46:31 EST 1990
Article 1060 of comp.software-eng:
Path: stpstn!hsi!uunet!jarthur!uci-ics!nancy
>From: nancy@ics.uci.edu (Nancy Leveson)
Newsgroups: comp.software-eng
Subject: Re: Some Issues Lying Beneath a Recent
Message-ID: <25EC18CB.7845@paris.ics.uci.edu>
Date: 28 Feb 90 18:30:35 GMT
References: <3241@umn-d-ub.D.UMN.EDU> <102100003@p.cs.uiuc.edu>
Reply-To: nancy@ics.uci.edu (Nancy Leveson)
Organization: UC Irvine Department of ICS
Lines: 36

I have been an expert witness in two law suits involving the Therac
25 accident alluded to below.  Unfortunately, there is a lot of
misinformation about this accident that is circulating.  I have figured
out how to get around the proprietary information aspects and am preparing
a detailed paper that should clear up the misconceptions.  Until then,
let me just clear up a couple of things to avoid further circulation of
misinformation:

In article <102100003@p.cs.uiuc.edu> johnson@p.cs.uiuc.edu writes:
>
>For example, a couple of years ago there were some people who were
>killed because an X-ray machine emitted much more power than it
>should have.

The correct radiation was released, but it was released without the
proper filter being in place to reduce the amount of radiation that
reached the patient.

>The problem was in supposedly in the user interface;

This is wrong (along with everything in the article that follows this
statement).  The problem (actually two problems found so far) was in
the code.  I think this wrong information originally appeared in an
article about this incident that appeared in Datamation, and it has been
repeated widely.  Although the user interface was far from desirable, it
did not cause the accident -- poor programming did.

My paper describing the incident details will be available in a month or
so, and I will try to get it published quickly -- I have not decided where
to send it yet.  In the meantime, please be careful when repeating rumors
about the Therac accident.  So far, everything I have seen written about it
has been completely wrong.

nancy
--
Nancy Leveson