Softpanorama (Slightly Skeptical)Open Source Educational Society

Nikolai Bezroukov. Portraits of Open Source Pioneers

For readers with high sensitivity to grammar errors, access to this page is explicitly prohibited :-)



Chapter 2: Donald Knuth: Leonhard Euler of Computer Science


version 0.53 beta


 

"Computer programming is an art form, 
like the creation of poetry or music"

Donald E. Knuth[1974]

 




Let me make it clear from the beginning: I really admire this man. Donald E. Knuth is Professor Emeritus of The Art of Computer Programming at Stanford University, the author of the multi-volume work-in-progress The Art of Computer Programming and the five volumes of Computers & Typesetting, a member of the American Academy of Arts and Sciences, the National Academy of Sciences, and the National Academy of Engineering, and so on and so forth. But unlike many other bearers of a similar number of prestigious titles and published books, he is a unique, lonely star among computer science researchers. I would say he is a man of a stature similar to that of Leonhard Euler in mathematics. Such men are not born every century...

Despite what some free software zealots believe, free software is built on what has come before it, and Professor Donald Knuth is one of the largest contributors to this pool of knowledge. Free and open programs are only as good as the algorithms they use. Dismissing his contributions to the free software movement as irrelevant, as "professional software freedom fighter" Richard Stallman once did (see Slashdot/2nd Annual Free Software Foundation Awards), was both stupid and disrespectful (even if we forget the fact that TeX is one of the most important free software programs ever written).

It's amazing that despite the information explosion in computer science he is still trying to finish his monumental The Art of Computer Programming (TAoCP), a single-author encyclopedia of algorithms and computer science. That really makes him "The Last of the Mohicans" -- the last Renaissance man in computer science. He has managed to make contributions to a very wide range of computer science topics.

But still nothing can compare with his encyclopedic The Art of Computer Programming (TAoCP). I am convinced that every person who suspects that he/she has programming ability (and BTW, an early start is a good indication of ability in any field, including programming) should buy or steal the first volume of TAoCP. For people with substantial talent it may even become a sacred book, whose depth will unwind only very slowly with years of careful study and thinking about the issues it addresses. The earlier the better, but it's never too late to become acquainted with this magnificent book, one of very few books that have preserved their significance more than thirty years after the initial publication.

Many people do not realize that one of the first major open source projects was neither GNU nor Linux. It was the typesetting program TeX, along with the markup language of the same name that Knuth invented. Knuth developed the first version of TeX in 1977-1978 in order to avoid problems with the typesetting of the second edition of his TAoCP volumes. The program proved popular and he produced a second version (in 1982), which is the basis of what we use today. The accompanying manual was published as The TeXbook (Addison-Wesley, 1984, ISBN 0-201-13447-0, paperback ISBN 0-201-13448-9), and the complete source text of the program itself was later published in the Computers & Typesetting series.


Introduction

Richard M. Stallman, Linus Torvalds, and Donald E. Knuth
engage in a discussion on whose impact 
on the computerized world was the greatest.
Stallman: "God told me I have programmed
 the best editor in the world!"
Torvalds: "Well, God told *me* that I have programmed
 the best operating system in the world!"
Knuth: "Wait, wait - I never said that."

From rec.humor.funny. 
submitted by ermel@gmx.de (Erik Meltzer)

 

Even among computer pioneers at Stanford, Donald Knuth was considered an almost legendary figure. The following old anecdote from Alan Kay, one of the principal designers of the Smalltalk language, illustrates the point:

When I was at Stanford with the AI project [in the late 1960s] one of the things we used to do every Thanksgiving is have a computer programming contest with people on research projects in the Bay area. The prize I think was a turkey.

[John] McCarthy used to make up the problems. The one year that Knuth entered this, he won both the fastest time getting the program running and he also won the fastest execution of the algorithm. He did it on the worst system with remote batch called the Wilbur system. And he basically beat the shit out of everyone.

And they asked him, "How could you possibly do this?" And he answered, "When I learned to program, you were lucky if you got five minutes with the machine a day. If you wanted to get the program going, it just had to be written right. So people just learned to program like it was carving stone. You sort of have to sidle up to it. That's how I learned to program."

Although many people know about Donald Knuth's tremendous contribution to the systematization of computer science, to the theory of compiling, and to the theory of algorithms, few know that, among other things, Knuth suggested the name "Backus-Naur Form", wrote one of the most compact Algol compilers at the age of 22, and published the first volume of The Art of Computer Programming at the age of 30.

He has received many awards, including the 1974 Turing Award and the 1979 National Medal of Science. He holds honorary doctorates from more than 15 universities worldwide, including the prestigious Russian St. Petersburg University -- the university where Leonhard Euler, the most prolific mathematician of the 18th century and maybe even of all time, spent his most productive years (he succeeded Bernoulli as Professor of Mathematics of St. Petersburg University in 1733). Euler's 866 books and articles represent about one third of the entire body of research on mathematics, theoretical physics, and engineering mechanics published between 1726 and 1800. Among other things, he modernized mathematical notation, including the introduction of the now standard symbols π and e.

Actually the analogy between Donald Knuth and Leonhard Euler goes deeper than Lutheran religion, tremendous productivity, and an interest in things quite remote from the main field. Leonhard Euler was interested in cartography (historians believe that this contributed to his early blindness) and spent a lot of time and effort on non-mathematical studies; Donald Knuth likewise spent a huge amount of time and effort outside his major field and authored TeX and Metafont (written in WEB for Pascal) -- the major open source typesetting system that became the de facto standard in mathematics and computer science. He also published 3:16 Bible Texts Illuminated. An interesting but little-known fact is that in 1962 Knuth computed 1,271 digits of Euler's constant using Euler-Maclaurin summation.

Like Euler, who believed in the aesthetic value of mathematical theories, Knuth believes that preparing programs for a computer can be an aesthetic experience, much like composing poetry or music. As for music, he probably knows what he is talking about: he plays a custom-made pipe organ (this sixteen-rank organ was designed and built for his home by Abbott and Sieker of Los Angeles, California, as their "Opus 67". It has 812 pipes, separated into three divisions).

Biographic Notes

"I decry the current tendency to seek patents on algorithms. 
There are better ways to earn a living than to prevent 
other people from making use 
of one's contributions to computer science."
Donald E. Knuth, TAoCP vol 3

Donald Knuth was born on January 10, 1938, in Milwaukee, Wisconsin. Here is how this early period of Donald Knuth's life is described in OUT OF THEIR MINDS: The Lives and Discoveries of 15 Great Computer Scientists:

His father, the first college graduate in the Knuth family, started as a grade school teacher, and later taught bookkeeping in a Lutheran high school. He also played the church organ on Sundays. Donald inherited his father's appreciation of music and education, particularly patterns in language.

I was mostly interested in what the teachers were best at. We had very good training in the diagramming of sentences. A bunch of us would have fun after class figuring out the parts of speech in sentences of poetry.

As editor of the school newspaper, Knuth invented crossword puzzles. He remembers enjoying the search for patterns in words.

He began winning awards early on. When Knuth was in eighth grade, a candy manufacturer sponsored a contest to see who could compose the most words out of the letters in the phrase, ``Ziegler's Giant Bar.'' Knuth decided to give it a try and came in first.

I found approximately 4,500 words without using the apostrophe. With the apostrophe, I could have found many more. The judges had only about 2,500 on their master list.

He won a television set (a pricey item in those days) and enough Ziegler candy bars for the entire school. In high school, Knuth won honorable mention in the Westinghouse Science Talent Search with an unusual proposal: the potrzebie system of weights and measures. With the care that would mark his later career, Knuth defined his basic units precisely: the potrzebie to be the thickness of issue 26 of MAD Magazine; a MAD to be 48 things; and a whatmeworry, the basic unit of power. In June of 1957, MAD Magazine itself bought the piece for $25, the first publication of Donald's prolific career. But music, not writing or science, took most of his time during high school.

I thought when I went to college I would be a music major. I played saxophone, but then the tuba player got into an accident and I became a tuba player. I arranged a piece for band that combined all kinds of themes off TV shows --- Dragnet, Howdy Doody Time, and Brylcreem. I knew nothing about copyright law.

His plans to become a musician changed when Case Institute (later Case Western Reserve) offered him a physics scholarship.

The system channelled anybody with an aptitude for science into physics. It was post World War II and there was a lot of excitement in the field.

In high school, Knuth had found mathematics uninspiring. But at Case, Paul Guenther, who taught freshman calculus, persuaded him to switch from physics to math. Guenther became Knuth's mentor in the process.

I had never met a mathematician before. He had a good sense of humor, but no matter what you said to him, he was unimpressed.

In 1956, Knuth had his first encounter with a computer, an IBM 650 --- a pre-FORTRAN machine. He stayed up all night reading the manual and taught himself basic programming.

The manuals we got from IBM would show examples of programs and I knew I could do a heck of a lot better than that. So I thought I might have some talent.

In 1956 he graduated from Milwaukee Lutheran High School and enrolled at the Case Institute of Technology in Cleveland, Ohio, as a physics major. According to www.digitalcentury.com encyclopedia article about him:

Although Knuth accomplished the highest grade point average in the history of his high school, he and his advisors had doubts as to his ability to succeed at college math. Knuth reports having something of an inferiority complex during his high school and early college years, a problem to which he attributes his over-achiever status. As a college freshman he would spend hours working extra problems in math out of fear that he would fail, only to find after a few months that his abilities far exceeded any of his colleagues.

It looks like at some point he even considered a musical career. Here is a relevant quote from the www.digitalcentury.com encyclopedia:

Knuth’s plan to become a music major changed when he was offered a physics scholarship at Case Institute of Technology (now Case Western Reserve). His turn toward math came during his sophomore year when a particularly difficult professor assigned a special problem, offering an immediate "A" in the class to any student who could solve it. Although like most of the other students, Knuth considered the problem unsolvable, he made an attempt at it one day when he found himself with some free time, having missed the bus that was taking the marching band to a performance. By what he states was "a stroke of luck," he solved the problem in short order, earned his "A," and skipped class for the rest of the semester. Although Knuth reports that he felt guilty about skipping class, he was obviously able to overcome the lost instruction because the following year he earned an "A" in abstract mathematics and was given the job of grading papers for the very course he had failed to attend.

It's probably a lucky coincidence that Knuth entered college the year the IBM 650 became available and managed to start working at the Computing Center. That "career move" actually determined his future. The IBM 650 was not just one of the most important early computers; it was the ancestor of personal computers. Prior to the IBM 650, computers were so expensive that most programmers never had physical access to the computer room. It was a faceless and frustrating "batch environment": you handed in a punched card deck with your program and data, and got it back the next day (or the same day if you were lucky). I doubt that this would have satisfied Donald Knuth, and he probably would have navigated his way in some other direction. But on the IBM 650, users were actually given a block of time and could run their programs, see what went wrong, fix it, and try again. It was really fascinating... Because of its cost, the machine was kept busy 24 hours a day and time slots were tiny (often 15 minutes). Moreover, the allocated time might be as late as 11:00 pm or even 1:00 am. Often a user needed to share his block of time with another programmer: when one user's program stopped, or went into a loop (which could be guessed from the way the lights flickered), the user needed to record the console lights, get off, think about the bug, and let his partner use the machine.

For many universities it was to be their first computer, and its attractiveness was considerably enhanced by the availability of a 60% educational discount, conditional on the institution teaching certain computer-related courses. Today this 1,966 lb machine, which consumed almost 30 kW of electricity, would look more like a primitive calculator than a real computer: it had a memory of just 10,000 decimal digits of storage (1,000 words of 10 digits each). A word could be regarded either as ten decimal digits and a sign or as an instruction. There is a Java IBM 650 simulator that you can try. Here is a high-level description of the IBM 650 instruction set:

An IBM 650 machine code instruction was of the form: xx yyyy zzzz
where xx was the op code, yyyy was the operand address, and zzzz the address of the next instruction. Thus each instruction contained a jump, to allow for the possibility of "optimization." If you program a drum machine with the instructions stored sequentially, you have to wait at least one drum revolution to read the next instruction. By calculating the expected execution time for each instruction, you can place the next instruction at the correct rotation angle around the drum so that it will come up under a read head when the current instruction is done. Since instructions could execute in as little as 0.3 ms (for an add), versus a drum revolution time of about 4.8 ms, careful optimization could increase execution speed by a factor of 5 or more. This is similar to the way in which disk drives use an "interleave factor," interleaving logically adjacent sectors to improve read/write performance.

Instruction words consisted of a two-digit function code, a four-digit operand address, and the four-digit address of the next instruction. The address of the next instruction was important in a drum memory environment. Since the drum was constantly rotating, it would move some distance while each instruction was being executed. So, to minimize the delay between instructions, it would be best to have the next instruction positioned on the drum at the place where the read-write head was when execution of the current instruction was completed. As a result, instructions which followed each other in program logic would be scattered around the drum, not physically next to each other. The manuals for machines gave instruction timings, so that programmers could try to reduce rotational delays. This approach was called minimum latency programming. It was complicated by the need to fetch operands, so that the programmer had to keep in mind the locations of data and of the next instruction. To aid 650 programmers, IBM published a memory chart. It had 200 rows and 10 columns, with each cell representing a word of memory. As you wrote your program, you would place each instruction and data word in an optimal location and then mark that memory cell off as used on the chart.
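The placement arithmetic described above can be sketched in a few lines of Python. This is a toy model, not an exact simulation of the 650: the 50-word band size and 0.096 ms word time are derived from the ~4.8 ms revolution time quoted above, and the function names are invented for illustration.

```python
# Toy model of IBM 650 "minimum latency programming": place the next
# instruction at the drum address that rotates under the read head
# just as the current instruction finishes executing.

WORDS_PER_BAND = 50        # words passing the read head per revolution
WORD_TIME_MS = 0.096       # time for one word to pass the head
REV_TIME_MS = WORDS_PER_BAND * WORD_TIME_MS   # about 4.8 ms

def best_next_address(current_addr, exec_time_ms):
    """Band-relative address for the next instruction so that it arrives
    under the read head exactly when the current one completes."""
    words_elapsed = int(-(-exec_time_ms // WORD_TIME_MS))  # ceiling division
    return (current_addr + words_elapsed) % WORDS_PER_BAND

# An add takes about 0.3 ms, i.e. roughly 4 word times, so after an
# instruction at address 10 the best home for its successor is 14.
# Storing it sequentially at address 11 would instead force a wait of
# almost a full revolution, since the head has already passed it.
print(best_next_address(10, 0.3))
```

IBM's published memory chart served exactly this purpose: the programmer computed such offsets by hand and ticked off each drum cell as it was assigned.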

Today it's really amazing that anything useful could be done with such a primitive and slow machine (the data below were taken from Columbia University's IBM 650 page; more detailed information can be found in Herb Kugel's DDJ paper on the IBM 650):

The 650 could add or subtract in 1.63 ms, multiply in 12.96 ms, and divide in 16.90 ms. The memory was a rotating magnetic drum with a 2,000-word (10 digits and sign) capacity and a random access time of 2.496 ms. For an additional $1,500/month you could add a magnetic core memory of 60 words with an access time of 0.096 ms. ...

BTW, Donald Knuth dedicated the first volume of his classic book "to the Type 650 computer once installed at Case Institute of Technology, in remembrance of many pleasant evenings". Due to its role as the ancestor of the personal computer, the IBM 650 was an important part of early computer folklore. For example, here is an early warning attached to the computer in the IBM Watson lab:

Achtung! Alles Lookenspeepers !

Das computermachine ist nicht fur gefingerpoken und mittengrabben.
Ist easy schnappen der springenwerk, blowenfusen, und poppencorken mit spitzensparken.
Ist nicht fur gewerken bei das dumpkopfen.
Das rubbernecken sichtseeren keepen hans in das pockets muss...:
Relaxen und watch das blinkenlichten.

To better understand the atmosphere of those years and the level of hardware and software development at the time, I will reproduce here a part of a computer timeline covering the period from 1955 to 1960, the college years of Donald Knuth:

1954 650 medium-sized computer introduced by IBM
1955 Sperry-Rand formed by merger of Remington Rand and Sperry Corp
1955 IBM branch in Wellington, NZ starts operations
1955 Wang Laboratories incorporated
1955 Shockley Semiconductor founded in Palo Alto, California, USA
1955 ПП [PP, "programming program"] by A. P. Ershov for BESM computer; also L. N. Korolev, L.D. Panova, V.D. Poderiugin & V.M. Kurochkin, USSR
1955 SHARE users group
1955 BACAIS compiler by Mandalay Grems & R.E. Porter, Boeing Airplane Company, Seattle, USA for IBM 701 [Boeing Airplane Company Algebraic Interpretive Computing System]
1955 IT discussed by Alan Perlis, Mark Koschman, Sylvia Orgel, Joseph W. Smith & Joanne Chipps for Datatron computer, Purdue University Computing Laboratory [Internal Translator]
1955 Kompiler 2 by A. Kenton Elsworth, Robert Kuhn, Leona Schloss & Kenneth Tiede, University of California Radiation Laboratory, Livermore
1956 IPL language by A Newell, D Shaw, F Simon (Information Programming Language)
1956 ADES declarative language by E. K. Blum, US Naval Ordnance Laboratory [Automatic Digital Encoding System]
1956 APT language by D.T. Ross (Automatic Programmed Tool) - industrial
1956 B-0 compiler by Grace Hopper's programming staff, UNIVAC [later called Procedure Translator, & FLOW-MATIC]
1956 MAILÜFTERL computer by H. Zemanek & team, University of Technology, Austria [MAILUEFTERL = Viennese spring-time breeze - ie, not a Whirlwind]
1956 Burroughs 205
1956 MADCAP language by Los Alamos Scientific Laboratory (Mark B. Wells?)
1956 IT adapted & run on IBM 650 by Perlis & Smith, with Harold van Zoeren, Carnegie Institute of Technology
1956 Physics Nobel Prize won by Bardeen, Brattain & Shockley for transistor
1956 MATH-MATIC compiler by Charles Katz et al, UNIVAC
1956 Bull S.A. announces very-large, very-fast Gamma 60 for 1960 to compete against IBM
1956 T.J. Watson Jr becomes president of IBM
1956 'Artificial Intelligence' term coined by John McCarthy, MIT
1956 Bizmac shipped by RCA
1956 US Govt antitrust suit settled. IBM has to sell not just lease
1957 Datamatic 1000 Honeywell & Raytheon
1957 Digital Equipment Corp (DEC) founded by Ken Olsen, Maynard Massachusetts
1957 Fairchild Semiconductor founded
1957 Philco 2000 by Philco Corporation - first commercially available transistorised computer.
1957 MAC compiler for Whirlwind by Laning, Philip C. Hankins & Charles P. Werner
1957 Control Data Corporation est. by William Norris & Sperry-Rand engineers
1957 Datamation first published
1958 SAGE direction centre operational, McGuire Air Force Base, New Jersey
1958 Atlas by R.M. Kilburn, University of Manchester - first virtual memory
1958 Integrated Circuit (IC) by Jack Kilby, Texas Instruments
1958 NEC-1101, NEC-1102 NEC Japan's First computer (Nippon Electric Company)
1958 Commodore Business Machines founded by Jack Tramiel
1958 FORTRAN II - 1st language accepted worldwide.
1958 CRT as output device by Frank Rosenblatt on Perceptron Mark I
1958 CDC 1604 by Seymour Cray, Control Data Corp, 1st transistorised supercomputer
1958 Planar process of making transistors by Jean Hoerni
1958 ALGOL Zurich. Originally called IAL
1958 LISP language by John McCarthy et al, MIT (LISt Processing) - for artificial intelligence
1959 Packaged program by Computer Science Corporation
1959 Monolithic idea for ICs by Robert Noyce, Fairchild Semiconductor
1959 ACM-GAMM report by John Backus
1959 Planar IC by Robert Noyce - for mass manufacturing of ICs
1959 1620 & 1790 - IBM's 2nd generation computers
1959 1401 introduced by IBM
1959 COBOL language by committee of mainframe makers (COmmon Business-Oriented Language)
1959 Teleprinter link between Auckland & Sydney

In 1960 the Case faculty took the unprecedented step of awarding Donald Knuth a Master's degree together with his B.S. At that time he was 22 years old. During the summer of 1960 he worked for Burroughs, where he wrote an Algol compiler -- one of the smallest at the time. For that compiler he was paid $5,500 -- not bad money for a student in 1960, but much less than the common price for such a project. The flowcharts of the compiler were published in Communications of the ACM (Donald E. Knuth: RUNCIBLE -- Algebraic Translation on a Limited Computer. Volume 2, Number 11, November 1959, pp. 18-21).

He began his doctoral studies in the fall of 1960 at the California Institute of Technology. In 1962 Addison-Wesley suggested that he write a book on compiler construction. The offer was accepted and led to the now famous The Art of Computer Programming. Caltech awarded Knuth a doctorate in mathematics in June 1963, and he then joined the faculty as an assistant professor. He taught at Caltech from 1962 until 1968, when he joined the faculty at Stanford University, becoming professor emeritus in 1993. Throughout this period he continued to be involved with software development, serving as a consultant to Burroughs Corporation from 1960 to 1968 and as editor of Programming Languages for ACM publications from 1964 to 1967.

Unexpected side effect: Breakthrough in empirical study of programs 

It is interesting that, like many top mathematicians, Donald Knuth made important discoveries in areas that lie outside his primary specialization, the analysis of computer algorithms. In 1971 he published his groundbreaking paper "An Empirical Study of FORTRAN Programs" (Software---Practice and Experience, vol. 1, pages 105-133, 1971). The paper was an empirical study of executing FORTRAN programs selected randomly from semi-protected files stored on Stanford University's computer disks. In this paper he laid the foundation of the empirical analysis of computer languages by providing convincing evidence of the critical influence of the level of optimization of "inner loops" on performance, and of the fact that programs appear to exhibit a property termed locality of reference. He also provided a powerful argument against orthogonal languages by observing that only a small, rather primitive subset of a language is used in 90% of all statements. For example, most arithmetic expressions on the right-hand side of assignment statements are simple increments/decrements of the form a+1 or a-1. That means that the difference in expressiveness between high-level languages and assembler is often overstated. Moreover, he discovered the amazing fact that among assignment statements, the right-hand sides of most have no operations at all (i.e., have the form a=b); of those that do, most have exactly one operation (most commonly a+1 or a-1), and only a tiny percentage have two or more. Knuth was also the first to suggest profile-based optimization, and today it is part of many research systems as well as production compilers. This classic paper also describes how run-time statistics gathered from the execution of a program can be used to suggest optimizations.
He statically and dynamically analyzed a large number of FORTRAN programs and measured the speedup that could be obtained by hand-optimizing them. He found that on average a program could be improved by as much as a factor of 4. Among other things, he concluded that programmers had poor intuition about which parts of their programs were the most time-consuming, and that execution profiles would significantly help programmers improve the performance of their programs: the best way to improve a program's performance is to concentrate on the parts of the program that, according to the obtained profile, are eating the lion's share of the machine resources. Generally, most programs behave according to the Pareto law, with 10% of the code responsible for 90% of the execution time. The paper formulated the classic programming maxim "premature optimization is the root of all evil." Among other things, the paper may have inspired the C increment operators (i++, a += b) and the development of RISC computers.
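The kind of static tally described above is easy to illustrate. The sketch below is not Knuth's actual methodology -- it is a simplified classifier over a tiny invented sample -- but it shows the shape of the result he reported: most right-hand sides have zero or one operator.

```python
import re

# Classify assignment statements by the number of arithmetic operators
# on the right-hand side, in the spirit of Knuth's 1971 FORTRAN study.
# (Toy example: a real analysis would need a proper FORTRAN parser.)

ASSIGN = re.compile(r'^\s*\w+\s*=\s*(.+)$')
OPERATOR = re.compile(r'[-+*/]')

def classify(statements):
    """Return a histogram: operator count on the RHS -> frequency."""
    histogram = {}
    for stmt in statements:
        m = ASSIGN.match(stmt)
        if not m:
            continue
        ops = len(OPERATOR.findall(m.group(1)))
        histogram[ops] = histogram.get(ops, 0) + 1
    return histogram

sample = [
    "I = I + 1",      # the classic increment
    "A = B",          # plain copy: no operators at all
    "X = Y - 1",      # simple decrement
    "S = S + A * H",  # a comparatively rare multi-operator RHS
]
print(classify(sample))   # {1: 2, 0: 1, 2: 1}
```

Run over a real corpus, a tally like this is exactly the evidence behind the observation that a=b, a+1, and a-1 dominate assignment statements.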

Knuth held a position as professor at Caltech until 1968. After that he became a professor at Stanford University and for some time, I think, was chair of the computer science department there.

In 1974 the ACM presented Donald Knuth with the Turing Award, formally the most prestigious award for computer scientists. Some consider it the highest honor in computer science, similar to a Nobel Prize for physicists, but I respectfully disagree. Here is the ACM A.M. Turing Award citation:

For his major contributions to the analysis of algorithms and the design of programming languages, and in particular for his contributions to the "art of computer programming" through his well-known books in a continuous series by this title.

Unfortunately his very interesting Turing lecture (Donald E. Knuth: Computer Programming as an Art. CACM 17(12): 667-673, 1974) is not available online. Those who are interested can find it, along with another of his groundbreaking papers, Structured Programming with goto Statements [1974], in the book Literate Programming. BTW, Structured Programming with goto Statements is available online [PDF].

From 1968 till 1993 he was a professor of computer science at Stanford University. During this period he made substantial contributions to the theory of algorithms, and his research papers have been instrumental in establishing several important branches of computer science, including but not limited to the analysis of algorithms, LR(k) and LL(k) parsing, attribute grammars, the empirical study of programming languages, and literate programming. His best-known research in mathematics is represented by the Knuth-Bendix algorithm for word problems (axiomatic reasoning), the Schensted-Knuth correspondence between matrices and tableaux, and an analysis of the "big bang" that occurs in the evolution of random graphs. In general, his work has always been directed towards the search for a proper balance between theory and practice.

In 1977 Knuth abruptly shifted his career from writing the next volume of The Art of Computer Programming to writing a complex and very successful typography system: TeX and METAFONT. It took him ten long years to finish the TeX system for document preparation and the MF (METAFONT) system for alphabet design. Noteworthy byproducts of those activities were the WEB and CWEB languages for structured documentation, and the accompanying methodology of Literate Programming. One can think of the literate programming approach as a hypertext-like merge of program code and documentation, produced with the explicit idea of explaining the program to a competent programmer. This approach proved very successful in writing TeX, but the new paradigm of software development never found widespread use. As he explained in his interview with Computer Literacy (now Fatbrain.com):

You think of yourself as writing for a human being, explaining to a human being what a computer should do, instead of thinking of yourself as talking to the computer telling it what to do. You get your act together better when you're explaining it to another person. This approach helps even for a program that you're going to throw away after an hour. CWEB is a tool that I recommend using even if you're writing a program only for yourself, for your eyes only.
...

You can create the parts in whatever order is psychologically best for you. Sometimes you can create them from the bottom up. Bottom-up means that you know somehow that you probably need a subroutine that will do something, so you write it now while you're ready, while you're psyched for it. With this bottom- up programming, your pencil gets more powerful every page, because on page nine you've developed more tools that you can use on page ten... your pencil is stronger.

With top-down programming you start at the beginning and say "I'm going to do this first and then this, and then this"... but then you have to spell out what those are--- you can wind up gasping for breath a hundred pages later when you finally figure out how you're actually going to do those things!

Top-down programming tends to look very nice for the first few pages and then it becomes a little hard to keep the threads going. Bottom-up programming also tends to look nice for a while, your pencil is more powerful, but that means you can also do more tricky stuff. If you mix the two in a good psychological way, then it works, even at the end.

I did this with TeX, a very large program: 500+ pages of code in the book . Throughout that entire program, all those lines of code, there was always one thing that had to be the next thing I did. I didn't really have much choice; each step was based on what I'd done so far. No methodology would teach me how to write a piece of software like that, if I followed it rigorously. But if I imagined myself explaining the program to a good competent programmer, all that this long program was, then there was just this one natural way to do it. The order in which the code appears in the book is the order in which I wrote it.
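The mechanics behind what Knuth describes above -- writing named chunks in whatever order suits the explanation and letting a tool expand them into compilable order -- can be sketched in miniature. The `<<...>>` chunk syntax below is invented for illustration and is not actual WEB/CWEB syntax; real WEB also "weaves" typeset documentation, which this sketch omits entirely.

```python
# Minimal sketch of the "tangle" half of literate programming: chunks
# are stored by name in expository order, then expanded recursively
# into the order the machine needs.

chunks = {
    "main": [
        "def gcd(a, b):",
        "    <<reduce until done>>",
        "    return a",
    ],
    "reduce until done": [
        "while b:",
        "    a, b = b, a % b",
    ],
}

def tangle(name, indent=""):
    """Expand chunk `name`, substituting <<references>> recursively
    and preserving the indentation at each reference site."""
    lines = []
    for line in chunks[name]:
        stripped = line.strip()
        if stripped.startswith("<<") and stripped.endswith(">>"):
            pad = indent + line[: len(line) - len(line.lstrip())]
            lines.extend(tangle(stripped[2:-2], pad))
        else:
            lines.append(indent + line)
    return lines

program = "\n".join(tangle("main"))
exec(program)            # the tangled source is an ordinary program
print(gcd(48, 18))       # 6
```

The point of the exercise is the one Knuth makes: the author writes "reduce until done" where the explanation calls for it, and the tool, not the author, worries about assembling the pieces in machine order.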

TeX is now used to produce most of the world's scientific literature in physics and mathematics. This was one of the first large-scale successful open source projects, and for this contribution Knuth is considered a forefather of the free/open source programming movement. As he mentioned in his Advogato interview [Knuth1990]:

I saw that the whole business of typesetting was being held back by proprietary interests, and I didn't need any claim to fame. I had already been successful with my books and so I didn't have to stake it all on anything. So it didn't matter to me whether or not I got anything financial out of it.
...

There were people who saw that there was a need for such software, but each one thought that they were going to lock everyone into their system. And pretty much there would be no progress. They wouldn't explain to people what they were doing. They would have people using their thing; they couldn't switch to another, and they couldn't get another person to do the typesetting for them. The fonts would be only available for one, and so on.

But I was thinking about FORTRAN actually, the situation in programming in the '50s, when IBM didn't make FORTRAN an IBM-only thing. So it became a lingua franca. It was implemented on all different machines. And I figured this was such a new subject that whatever I came up with probably wouldn't be the best possible solution. It would be more like FORTRAN, which was the first fairly good solution [chuckle]. But it would be better if it was available to everybody than if there were all kinds of things that people were keeping only on one machine.

So that was part of the thinking. But partly that if I hadn't already been successful with my books, and this was my big thing, I probably would not have said, "well, let's give it away." But since I was doing it really for the love of it and I didn't have a stake in it where I needed it, I was much more concerned with the idea that it should be usable by everybody. It's partly also that I come out of traditional mathematics where we prove things, but we don't charge people for using what we prove.

In 1979, at age forty-one, he was awarded the National Medal of Science by President Jimmy Carter for the first three volumes of the "Art of Computer Programming".

He became a Fellow of the British Computer Society in 1980 and an Honorary Member of the IEEE in 1982.

In spring 1986 he published the five-volume Computers & Typesetting.

On January 1, 1990, Donald Knuth stopped using email because of the level of noise. From this point of view he can be called one of the first registered victims of spam ;-). Later he explained his frustration with junk e-mail in the following way:

I spent fifteen years using electronic mail on the ARPANET and the Internet. Then, in January 1990, I stopped, because it was taking up too much of my time to sift through garbage. I don't have an email address. People trying to write me unsolicited email messages get a polite note saying "Professor Knuth has discontinued reading electronic mail; you can write to him at such and such an address."

Email is wonderful for some people, absolutely necessary for their job, and they can do their work better. I like to say that for people whose role is to be on top of things, electronic mail is great. But my role is to be on the bottom of things. I look at ideas and think about them carefully and try to write them up... I move slowly through things that people have done and try to organize the material. But I don't know what is happening this month.

So now I don't read electronic mail, but I do use it occasionally.

In 1993 he officially retired from Stanford University and concentrated completely on writing his books, a unique attempt to cover all major areas of computer science. He continues to live on the Stanford University campus with his wife Jill; they have two children, John and Jennifer. Unofficially, he retired in 1990, on the same day he gave up email. He explained his move in the following way:

...I had looked ahead and seen that it would take twenty years of work, full-time. If I continued doing everything else that I was doing, it was going to be forty or fifty years of work. I was just not getting anywhere, I was getting further and further behind. So I said, "Enough." Naturally, I hate to give up many of these other things that I like doing very much. But there are some things I didn't hate giving up, like writing proposals. I'm very happy to give up those!

... as a professor, in order to have decent equipment for my grad students, or to have visitors for active research programs, to publish reports, etc., I needed to find sponsors. It's a lot of work begging for money. The System Development Foundation said they'd give me a million dollars so that I could finish TeX and get back to The Art of Computer Programming.

...still [it] took many, many years to finish TeX. I decided that the only way I would be able to finish The Art of Computer Programming is by going into full-time writing, and being a hermit, and telling people "No." It was hard to adjust the first couple of years. Now I feel real efficient, and the writing is going well. A nice steady state.

Still, since his retirement Knuth has given seven to eight lectures annually under the title "Computer Musings." In essence these are half lectures, half reports about his work on current problems connected with writing The Art of Computer Programming:

I give lectures at Stanford every month or so, when I'm in town, called "Computer Musings" . I plan to keep this up for twenty years, to give a talk on whatever I find interesting that month, on neat ideas I've picked up... I bring up problems that I can't solve, so that somebody will do it for me. Now, if I can't solve a problem in two hours, I've got to give it up and tell somebody else to work on it; otherwise, I'll get behind again. As I write the book, I've got to move from topic to topic, and my attention span is maybe three weeks on any particular topic.

Some of them are available on the Internet.

In 1996 he was awarded the Kyoto Prize, Japan's equivalent of the Nobel Prize and the country's highest private award for lifetime achievement in the arts and sciences. The Kyoto Prize is awarded each year in three categories: advanced technology, basic sciences, and creative arts. Knuth won in advanced technology; the prize is approximately $460,000, along with a certificate and a gold medal. "This is just a dream and I'll have to wake up to see who really won the prize," Knuth said. He and his wife decided to donate the money to charity.

He is one of the few people to have an award named after him while still alive -- The Knuth Prize:

The Donald E. Knuth prize for outstanding contributions to the foundations of computer science is awarded every 1.5 years by the ACM Special Interest Group on Algorithms and Computing Theory (SIGACT) and the IEEE Technical Committee on the Mathematical Foundations of Computing. The Prize includes a $5000 award and a $1000 travel stipend (for travel to the award ceremony) paid by ACM SIGACT and IEEE TCMFC. The Prize is awarded for major research accomplishments and contributions to the foundations of computer science over an extended period of time. The first Knuth Prize was presented at the 1996 ACM Symposium on Theory of Computing (STOC). Future prizes will be awarded alternately at the ACM STOC and the IEEE Conference on Foundations of Computer Science (FOCS).

The Prize is named in honor and recognition of the extraordinary accomplishments of Prof. Donald Knuth, Emeritus at Stanford University. Prof. Knuth is best known for his ongoing multivolume series, The Art of Computer Programming, which played a critical role in establishing and defining Computer Science as a rigorous, intellectual discipline. Prof. Knuth has also made fundamental contributions to the subfields of analysis of algorithms, compilers, string matching, term rewriting systems, literate programming, and typography. His TeX and MF systems are widely accepted as standards for electronic typesetting. Prof. Knuth's work is distinguished by its integration of theoretical analyses and practical real-world concerns. In his work, theory and practice are not separate components of Computer Science, but rather he shows them to be inexorably linked branches of the same whole.

The Knuth Prize winner is selected by a Prize Committee consisting of 6 individuals selected by the SIGACT and TCMFC Chairs.

In selecting the Knuth Prize winner, the Committee will place particular attention on a sustained record of high-impact, seminal contributions to the foundations of computer science. The selection can also be partly based on educational accomplishments and contributions such as fundamental textbooks and high-quality students. The award is not based on service work for the community, although service might be included in the citation for a winner if it is appropriate.

Nominations for the prize will be considered by the Committee, but are not required in order to receive the Prize. All nominations should be sent to the Chair of the Prize Committee.


Funnily enough, he was nominated for the 2nd Annual Free Software Foundation Awards and lost to Miguel de Icaza. After reading about this event on Slashdot, I began to suspect that Richard Stallman and his pet, the Free Software Foundation, had completely lost contact with reality and become more like a variation of a second-rate leftist party or, if you wish, an obscure cult :-(. As one Slashdot reader pointed out:

Why not Knuth?
by the_tsi (wNiOlSlPiAeM@perigee.net) on Tuesday December 14, @07:01PM EST (#21)
(User Info)

I recognize that Gnome is an important step for linux (err.. unices in general) to move towards the desktop, but it certainly isn't ``core functionality'' that I need my computers to do. I mean, there are parallel technologies which allow similar things.

TeX on the other hand, has been around for a long time and is used non-stop in the lab where I work. Without it, the reports we dump out would probably take forever to make. I can't imagine using Word (or any word processor for that matter) to create documents that change as much in revision as ours do. TeX is a much more earth-shattering development than a spiffy new interface to X.

I think the FSF awards copped out and picked based on ``current and trendy'' instead of deserving of an award. Of course, if there is a monetary award involved (since there's no article, I can't tell, but I imagine there is), then I can see the politics behind it. Gnome sure needs cash more than any TeX-related project.

Congrats to the nominees and the winner.

-Chris

Currently Donald Knuth is still writing Volume 4 of The Art of Computer Programming. It is expected that the volume will be finished in 2007, but let's keep our fingers crossed.

He estimates that it might take another twenty years to write the remaining five volumes. Let's hope that he will manage to finish them all. Actually, the gap between the third and the fourth volumes is such that he could get into the Guinness Book of Records as well ;-)


TAoCP and its Influence on Computer Science

Fools ignore complexity. Pragmatists suffer it.
Some can avoid it. Geniuses remove it.

Perlis's Programming Proverb #58,
SIGPLAN Notices, Sept. 1982

There was a lot of "political correctness" about how to program in those days...
All through history, people have taken ideas and misunderstood the limitations of them.

Donald E. Knuth[1996]
 

If you're reading these lines, it's likely you don't need to be told how great those books are. But just in case, I would like to remind you that The Art of Computer Programming is one of the unique works in the history of science, comparable perhaps to Landau and Lifshitz's Course of Theoretical Physics.

The history behind the books goes back to 1962, when Addison-Wesley suggested that Donald Knuth write a book on compiler construction. Knuth accepted the offer and at first believed that he was indeed writing a book on compilers:

I thought I was writing only one book. But if I hadn't done that, I suppose I still would have been doing a lot of writing. Somehow it seems that all the way through, I've enjoyed trying to explain things. When I was in high school, I was editor of the student paper; in college I edited a magazine. I've always been playing around with words.

But after drafting some chapters he changed the content in such a way that it now looks like an encyclopedia of programming rather than a book on compilers. According to Knuth, The Art of Computer Programming became his main life's work, the intention of which is "to organize and summarize what is known about the vast subject of computer methods and to give it firm mathematical and historical foundations." As one can expect, such a book is much more difficult to write.

In June 1965 he completed a first draft of twelve chapters; it was 3,000 hand-written pages long. In October, after he sent the draft of the first chapter to Addison-Wesley, they proposed that the book be published as seven separate volumes.

Knuth worked around the clock writing the book, and that prompted an ulcer attack in the summer of 1967. As he noted later in his interview with Innovations:

There was no reliable guide to the literature in 1962. I was the only person I knew who had read most of the journals yet had not discovered very many things myself; and I liked to write. Thus I thought I could give a more balanced and unbiased account than the people who had made the most important discoveries. Of course, after I got going, I discovered a few things of my own, so by now I'm as biased as anybody. But you asked about what was the inspiration in 1962. And the answer is: There was a huge need for a book like The Art of Computer Programming, but everybody who was capable of writing it was unfortunately likely to give a terribly slanted account!

When the first volume was published, it had a tremendous effect on the systems programming community and all further research. It also contained a now-famous offer to the readers:

The first finder of any error in my books receives $2.56; significant suggestions are also worth $0.32 each. At present the books are fairly new, so your chances of finding a previously unreported error today are much better than they will be a month from now. Indeed, if you are really a careful reader, you may be able to recoup more than the cost of the books this way.

Please send your comments either by email to TAoCP@cs.stanford.edu or by old-fashioned mail to

Donald E. Knuth
Computer Science Department
Gates Building 4B
Stanford University
Stanford, CA 94305-9045 USA.

Reading the first volume was not an easy task. It requires a lot of effort, including attempts to reproduce the programs that are discussed. That was very difficult in 1968, when the book was published, so this volume was far ahead of its time even in this respect. Only with the advent of personal computers could the book be studied while actually running each program on a computer. But reproducing and debugging the programs and working on the exercises is very important: that is the required price of understanding; give less, and the book might leave you frustrated. I would like to stress that the exercises are a very important part of the book. As Knuth noted:

THE EXERCISES in this set of books have been designed for self-study as well as classroom study. It is difficult, if not impossible, for anyone to learn a subject purely by reading about it, without applying the information to specific problems and thereby being encouraged to think about what has been read. Furthermore, we all learn best the things that we have discovered for ourselves. Therefore the exercises form a major part of this work; a definite attempt has been made to keep them as informative as possible and to select problems that are enjoyable as well as instructive.

In many books, easy exercises are found mixed randomly among extremely difficult ones. This is sometimes unfortunate because readers like to know in advance how long a problem ought to take---otherwise they may just skip over all the problems. A classic example of such a situation is the book Dynamic Programming by Richard Bellman; this is an important, pioneering work in which a group of problems is collected together at the end of some chapters under the heading "Exercises and Research Problems," with extremely trivial questions appearing in the midst of deep, unsolved problems. It is rumored that someone once asked Dr. Bellman how to tell the exercises apart from the research problems, and he replied, "If you can solve it, it is an exercise; otherwise it's a research problem."

Good arguments can be made for including both research problems and very easy exercises in a book of this kind; therefore, to save the reader from the possible dilemma of determining which are which, rating numbers have been provided to indicate the level of difficulty. These numbers have the following general significance:

Rating Interpretation

  • 00 An extremely easy exercise that can be answered immediately if the material of the text has been understood; such an exercise can almost always be worked "in your head."
  • 10 A simple problem that makes you think over the material just read, but is by no means difficult. You should be able to do this in one minute at most; pencil and paper may be useful in obtaining the solution.
  • 20 An average problem that tests basic understanding of the text material, but you may need about fifteen or twenty minutes to answer it completely.
  • 30 A problem of moderate difficulty and/or complexity; this one may involve more than two hours' work to solve satisfactorily, even more if the TV is on.
  • 40 Quite a difficult or lengthy problem that would be suitable for a term project in classroom situations. A student should be able to solve the problem in a reasonable amount of time, but the solution is not trivial.
  • 50 A research problem that has not yet been solved satisfactorily, as far as the author knew at the time of writing, although many people have tried. If you have found an answer to such a problem, you ought to write it up for publication; furthermore, the author of this book would appreciate hearing about the solution as soon as possible (provided that it is correct).

By interpolation in this "logarithmic" scale, the significance of other rating numbers becomes clear. For example, a rating of 17 would indicate an exercise that is a bit simpler than average. Problems with a rating of 50 that are subsequently solved by some reader may appear with a 45 rating in later editions of the book, and in the errata posted on the Internet (see page iv).

The remainder of the rating number divided by 5 indicates the amount of detailed work required. Thus, an exercise rated 24 may take longer to solve than an exercise that is rated 25, but the latter will require more creativity.

The author has tried earnestly to assign accurate rating numbers, but it is difficult for the person who makes up a problem to know just how formidable it will be for someone else to find a solution; and everyone has more aptitude for certain types of problems than for others. It is hoped that the rating numbers represent a good guess as to the level of difficulty, but they should be taken as general guidelines, not as absolute indicators.

Actually, each chapter of the first volume was like a lifetime achievement of a serious computer scientist. And the level was instantly noticed in both academia and industry. In a 1995 interview, Microsoft CEO Bill Gates recommended that anyone who believes themselves to be "a really good programmer" should read the first volume, being sure to solve all the problems. Noting that his own reading took several months and incredible discipline, Gates told readers: "send me a resume if you can read the whole thing."

The first three volumes proved to be an instant classic and a bestseller for Addison-Wesley. The book was translated into all major languages, including Russian and Chinese.

Donald Knuth essentially started the systematic, encyclopedic study of algorithms (he actually coined the term "analysis of algorithms" in the mid-sixties) -- a field of computer science whose overall goal is to understand the properties of algorithms and to provide formal measures of their complexity. Properties of random strings, permutations, trees, and graphs are essential components in the analysis of algorithms and at the same time are building blocks of system programs; that's why they were introduced in the first volume. As he remarked:

The original work I do in The Art of Computer Programming is to take the methods of two different authors and analyze method A from the standpoint of author B, and method B from the standpoint of author A. They have only given their sides of it, so I try to fill in ....

The first three volumes were recently published in a third edition, using TeX as the typesetting system.

All in all, TAoCP is probably the most famous of all computer science books; it is often called the bible of computer science. Some important details are left as exercises to the reader. Here is an interesting quote about one person's experience with B-trees:

So I knuckled under and I did some study. I tracked down Algorithms + Data Structures = Programs. No use to me. I read Donald Knuth's Art of Computer Programming Volume 3, and I got a lot of good information out of that. I started writing my own indexing scheme using Btrees. The hardest part was deleting an entry from the Btree. Knuth left this as an exercise to the reader. I found another book that gave me some hints, and I sat down and developed it. I tested that unit so thoroughly. I ended up with automated tests, with standard data that would trigger edge conditions, error conditions, everything. I learnt a lot from that process. And it worked, and it was fast, and it was mine, and I had full source code. I was all set.
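For readers curious what the reviewer was up against, here is a minimal B-tree sketch in Python (a hypothetical illustration, not Knuth's MIX code from Volume 3). It implements search and insertion with node splitting; deletion, the part Knuth left as an exercise, would additionally require borrowing keys from or merging sibling nodes, and is omitted here as well:

```python
import bisect

class BTreeNode:
    def __init__(self, leaf=True):
        self.keys = []       # sorted keys stored in this node
        self.children = []   # child pointers (empty for leaves)
        self.leaf = leaf

class BTree:
    """B-tree of minimum degree t: every node holds at most 2t-1 keys."""

    def __init__(self, t=2):
        self.t = t
        self.root = BTreeNode()

    def search(self, key, node=None):
        if node is None:
            node = self.root
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True
        return False if node.leaf else self.search(key, node.children[i])

    def insert(self, key):
        if len(self.root.keys) == 2 * self.t - 1:  # full root: tree grows taller
            old_root = self.root
            self.root = BTreeNode(leaf=False)
            self.root.children.append(old_root)
            self._split_child(self.root, 0)
        self._insert_nonfull(self.root, key)

    def _split_child(self, parent, i):
        t, child = self.t, parent.children[i]
        right = BTreeNode(leaf=child.leaf)
        parent.keys.insert(i, child.keys[t - 1])   # median key moves up
        parent.children.insert(i + 1, right)
        right.keys, child.keys = child.keys[t:], child.keys[:t - 1]
        if not child.leaf:
            right.children, child.children = child.children[t:], child.children[:t]

    def _insert_nonfull(self, node, key):
        if node.leaf:
            bisect.insort(node.keys, key)          # leaves absorb new keys directly
        else:
            i = len(node.keys)                     # find the child to descend into
            while i > 0 and key < node.keys[i - 1]:
                i -= 1
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)         # split full children on the way down
                if key > node.keys[i]:
                    i += 1
            self._insert_nonfull(node.children[i], key)
```

Because splits only happen on the way down, every leaf stays at the same depth, which is what makes B-trees so attractive for disk-based indexes.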

Like the Bible, the TAoCP books are good to have around even if you don't plan to read them :-). They also seem to have the requisite ability to change a person's life ;-). Here is one proof of such deep influence, from an amazon.com review:

Brilliant & Amazing. Unequaled achievement in this field.
I was a high-school student when I accidentally found a copy of the first volume. It changed my whole life: I decided to become a computer scientist at the end of the first chapter. And today, having accomplished this, I still haven't finished the second volume, and it has been a long time already. Nevertheless, I couldn't resist buying the third volume. I just hope to live long enough to get to the end of the fifth and last volume of this collection. Thank you, Donald Knuth, for this brilliant and inspiring work.

As Knuth himself says, it is impossible for any one person to keep up with all the research in computer science, but these 3 volumes do a remarkably good job of distilling the most important results and explaining them with mathematical rigor.

The most important is Volume 1. It contains just two chapters, but each looks like a book of its own. Ch. 1, Basic Concepts, covers mathematical foundations and describes MIX, a hypothetical machine (now available in software emulations). Ch. 2, Information Structures, contains a definitive tutorial and reference on lists, trees, memory allocation, and garbage collection.

Volume 2 also contains two chapters: Ch. 3, Random Numbers: how to produce sequences of "random" numbers and test their statistical properties; and Ch. 4, Arithmetic: algorithms for integer and floating-point arithmetic.
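The workhorse generator analyzed at length in Ch. 3 is the linear congruential method, x(n+1) = (a*x(n) + c) mod m. A minimal sketch in Python follows; the multiplier and increment are 64-bit constants often attributed to Knuth's MMIX, but treat the specific values as an assumption chosen for illustration:

```python
# Linear congruential generator: x(n+1) = (A * x(n) + C) mod M.
# The constants below are 64-bit LCG parameters commonly attributed
# to Knuth's MMIX; any full-period choice would illustrate the idea.
A = 6364136223846793005
C = 1442695040888963407
M = 1 << 64  # modulus: 2**64

def lcg(seed):
    """Yield an endless stream of pseudorandom 64-bit integers."""
    x = seed % M
    while True:
        x = (A * x + C) % M
        yield x
```

The stream is fully determined by the seed, which is exactly why Ch. 3 spends most of its pages on statistical tests (chi-square, the spectral test) that separate good parameter choices from bad ones.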

Volume 3 contains Ch. 5, Sorting: both in memory and on disks or tapes; and Ch. 6, Searching: sequential, binary, and hashing.
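As a small taste of Ch. 6, here is the classic binary search over a sorted array, sketched in Python rather than Knuth's MIX; the chapter analyzes precisely how many comparisons such a search needs:

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # probe the middle element
        if items[mid] == target:
            return mid
        elif items[mid] < target:  # target can only be in the right half
            lo = mid + 1
        else:                      # target can only be in the left half
            hi = mid - 1
    return -1
```

Each probe halves the remaining range, so a sorted list of n items needs at most about log2(n) + 1 comparisons, the kind of exact counting Knuth carries out in full.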

Despite the complexity of the topics and the (sometimes) use of advanced mathematical concepts, Donald Knuth's style makes both the algorithms and, more importantly, the ideas behind them relatively easy to grasp. If all you care about is getting a program to run, you will be better off buying another book; but if you really want to understand how and why software works, there's no other book like this.

It's amazing that, despite the information explosion since 1968, Donald Knuth did not give up his idea, being "The Last of the Mohicans" -- the last Renaissance man in computer science. As he explained, he is still able to keep up with the tremendous amount of information on each topic by concentrating on one thing at a time:

...No matter what field of computer science you're in, everyone is finding it hard to keep up. Every field gets narrower and narrower, since nobody can cover all the territory anymore. Everybody can choose two small parts of computer science and learn those two parts; if one person knows parts A and B, another knows B and C, and another knows C and D, the field remains reasonably well connected, even as it expands.

...I'm not as broad as you might think--- I only work on one thing at a time. I guess I'm a quick study; I can become an instant expert in something. I've been collecting stuff for thirty years so that I can read the literature on each topic in "batch mode"--- not swapping lots of different topics in and out. I can absorb a subject locally and get good at it for a little while... but then don't ask me to do the thing I was doing a few months ago! Also, I have lots of people helping me correct my mistakes.

When Volume 4 is published, it will definitely deserve an entry in the Guinness Book of Records for the longest interval between the first editions of consecutive volumes (more than 20 years ;-)

The second love: typography

(to be written)

Recognition and Awards

Knuth Prize

The next Knuth Prize will be awarded at STOC 2002 in Montreal, Canada. The call for nominations for the 2002 Knuth Prize is now available. (The full description of the prize and the selection process is quoted earlier in this chapter.)


Last modified: January 03, 2003


Copyright © 1996-2001 by Nikolai Bezroukov.

Standard disclaimer: The statements, views and opinions presented in this e-book are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with.

This document is placed under the copyright of the Open Content License(OPL). A local copy of the license is also available.

Please read, understand, acknowledge, and abide by this license before copying, translating, quoting, or distributing this document.

The mission of the Softpanorama Open Source University is to provide educational materials to schools and universities, to serve as a source for the independent study of OSS in both Linux and Windows environments, and in this way to extend the frontiers of the use of open source software. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time.

Click here to submit your comments!