
Hollow Square talk: “A Brief History of NCAR’s CDC 6600 Computer”

Presenter: Paul Rotar


Introducer not known
Transcribed by Cyns Nelson

Style Note: Speakers who are known are identified by their full names, positioned at the start of
remarks. Audience sounds and verbal outputs that are not words are placed in parentheses;
peripheral and editorial notes, and questions about transcript accuracy, are placed in brackets;
words spoken with emphasis appear in italics.

[??]: In the absence of Walt Roberts, it's my great pleasure to introduce the speaker today, for
this Hollow Square. I think most of you know Paul Rotar, and those of you who don't, know that
he is a member of our computer facility. He was born in Omaha, Nebraska; he went to Regis
College in Denver; he worked for a while at the Naval Ordnance Lab at Silver Spring, and then
at Martin Marietta, in Waterton [spoken in questioning tone], Colorado? He worked there for
four years, doing computer work, before he came here. And I think the feature of Paul's activity
at NCAR, that makes him particularly qualified to talk about the subject that he's chosen, is that
he was very instrumental—personally instrumental—in getting this CDC 6600, which we have,
to work for us. He actually went to the CDC company's Chippewa Falls laboratory and worked
with them to try and get this thing to work—in the early days, when people weren't sure whether
the thing would work or not. So, he speaks from personal knowledge on the subject: "A Brief
History of NCAR's CDC 6600 Computer." Paul Rotar.

[00:02:08]
Paul Rotar: It's different here when there's people, and when there's not. [Sounds of microphone
shifting, blocks out voice of narrator.] A number of restrictions on this mic: you can't have it
against your shirt, because then they can't hear it when they play it back, and so forth. (Light
laughter from audience.) Now properly adorned, I guess I can start. The title of this thing was,
sort of, a review of the problems we had. And the place to begin is to take a look at the machine
we had before we were involved with the 6600, namely the Control Data 3600. I have a little
block diagram up here, in the best art I can do. This board was smudged before I started, and they
offered to wash it, and I said, "Gee, it took me fifteen minutes to get it up there; you can't wash
it." So, it stays. (Light laughter.)

Now, for those that are not computer users, computers are driven internally through the use of
programs. And a computer is an absolute slave; there is not a bit of democracy in it. A central
processor in a computer takes words out of a memory and interprets those words—they're
numbers, to the machine—and follows them exactly. It never deviates. Whatever it takes out of this
memory, the central processor then executes. And a computer memory can be looked at as
though it were a long sheet of paper, with perhaps 30,000 lines on it. And on each line you could
write one word; and each one of these lines has an address. And the central processor, then, takes
instructions from the memory, generally sequentially—unless it's told to do differently—by
instruction that, when found [?] in the memory, does one thing at a time in a serial fashion—at
least, the 3600 did, in serial fashion.
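
[Editor's note: the fetch-and-execute cycle described above can be sketched in a few lines of C. This is a minimal illustration, not from the talk; the toy three-instruction machine and its opcodes are invented.]

    /* Toy fetch-decode-execute loop illustrating the serial model
       described in the talk: memory is a numbered list of words, and
       the processor takes one instruction at a time unless told to jump. */
    #include <stdio.h>

    int main(void) {
        /* Each memory "line" holds one word; its index is its address.
           Instructions are packed here as opcode*1000 + operand. */
        int memory[8] = {
            1 * 1000 + 42,  /* address 0: PRINT 42            */
            2 * 1000 + 3,   /* address 1: JUMP to address 3   */
            1 * 1000 + 99,  /* address 2: skipped by the jump */
            0               /* address 3: HALT                */
        };

        int pc = 0;                  /* the "line" we are on */
        for (;;) {
            int word = memory[pc++]; /* fetch, then advance serially */
            int op = word / 1000, arg = word % 1000;
            if (op == 0) break;                    /* HALT  */
            else if (op == 1) printf("%d\n", arg); /* PRINT */
            else if (op == 2) pc = arg;            /* JUMP: deviate only when told */
        }
        return 0;
    }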

[00:04:03]

We had a console on there, with the usual number of lights that everybody is familiar with, from
grin-and-bear-it days. You could change the things that were in this memory, from this console.
You could also send instructions into the machine, and it would go ahead and execute, manually,
instructions put into a bunch of lights on the console. This was [?] very convenient, because you
had a handle on the machine; since the machine was gonna do what was in the memory, you
could then control what it did from the console, quite effectively. Further, we had a little helper
computer, sitting over here, and these things are tape drives. And our 3600—all the data that
went in, all the program flow, came off of tapes, into the machine; the returns were written on
tapes; these tapes were finally printed up here, on this printer. So, the card reader would go
through this 160 process, as we called it; write a reel of tape. We had a manual switch, here; an
operator could switch four reels of tape, two at a time, between the 160 and the 3600 processors.
So we ran jobs in sequence, as they were turned in; and the output came out in sequence and was
printed.

Now this machine, because of the sequential nature of it, didn't allow you very good turnaround.
By the time you got a deck of cards to tape, walked over to the switch, threw the switch, and got it to
read through and process, and off on the printer, you spent a fair amount of time. And, in August
of '63, CDC announced a machine that was quite a bit different in configuration from a 3600. It
was the 6600, and I have sort of a block diagram, here, of the 6600. And they came here, and
they talked to us, and IBM talked to us. And the offshoot was: by the end of June, 1964, we had
signed up to buy this machine on a rental basis. And its central processor, now, is different from
the central processors of the 3600, because it could execute instructions from memory in parallel.
It could take data words, and instead of doing an add, followed by a multiply, followed by a
divide, it could do the add and the multiply and the divide essentially simultaneously.

[00:06:26]
Further, instead of having one small computer out here, to control the printing and punching, and
the reading of input data, we had ten little computers, effectively, inside the 6600; it really
doesn't have ten small computers, it has one, but they look to the software as though they are ten.
And each one of these little computers has a memory associated with it. And if a person has
enough software, or programs, in each one of these computer's memories, it can then control the
overall picture, because a central processor in this machine can talk only to the central memory.
Data gets into the central memory; it's got to come either from the central processor or from a
peripheral processor. But in no way can this thing talk to the outside world, except by putting
data into here. It is then interrogated by these peripheral processors, in some way. And, the other
nice feature about it was: we had two channels on this machine. CDC sells these channels at a
premium, and controllers—I've left off a lot of controllers on these diagrams. But as you add
controllers and channels, you add price. The nice part about the 6600 was: they gave you a dozen
channels to start with. And, there were controllers out here, too, that were extra. But, we had ten
small computers, and a dozen channels. And the configuration initially included a disk file, and
we had the usual card reader, the printer, and the punch.

[00:08:00]
One of the differences between these two machines, that was really stark: this console couldn't
really tell that central processor very much, because it was isolated. It had to go through a
peripheral processor; from the peripheral processor's memory into the central memory; and then
the central processor, if it wanted to, might look at that data and do something. But the whole
thing was very exciting (a few laughs), because this central processor, here, could execute—as I
say—more than one instruction at a time, and was much faster. And the rent was about $9,000 a
month more for this thing than it was for that. This was about $67,000 a month, when we got rid
of it; and this was $76,000 without the drums sitting in here. We purchased this microfilm unit
before, so that, I think, it was about a $100,000 item. You spend money in a computer facility
like water, so you don't worry about anything but the computer's rent, when you make the budget
(light laughter). You can bury enough money in the computer's rent to take care of all the rest.
[Some laughter from audience, perhaps in response to a member of audience.] Aw, come on,
Jim.

Now, the central processor on any computer executes general types of instructions. And to
illustrate the speed-up that would be available between the 6600 over the 3600, I have a little
chart here. One of the kinds of instructions is branching; and branching occurs in the program
when you take instruction words out of the memory, one at a time, and you want to test a
quantity, and you want to change the flow down to a different part of this—as I call it—a sheet
of paper with many lines on it, and you want to jump to an instruction far down. So the 3600, to
do a branch—and I got these out of the manuals, and I didn't take all the branches—about 1.12 to
2.38 microseconds. So it could do between, possibly, half a million to a million branches in a
second. And the 6600 took nine-tenths to one-and-a-half microseconds. So there wasn't much
speed-up there.
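
[Editor's note: the rates follow directly from the quoted times: 1 second / 2.38 microseconds is about 420,000 branches per second, and 1 second / 1.12 microseconds is about 890,000, which is roughly the "half a million to a million" figure.]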

[00:09:56]
Now, the Boolean unit—in computer logic, in systems work, there's a great use for a Boolean
unit. This thing has a special kind of arithmetic; it adds with no carries, and things of this sort, and
masking [?] data out. But it would execute a Boolean instruction in about 2.07 microseconds, and
this would do it in three-tenths. So we had about a seven-to-one speed-up of the Boolean unit.

The shifter—if you take data in a register, and nobody is ever satisfied with where it is, so you
try to shove it either right, or to the left, and reposition it, and then operate on it with a Boolean
unit, take some of the characters out of it. So, the 3600 could shift in one-and-a-quarter
microseconds, and the 66 in three-tenths of a microsecond. So, we had a speed-up of about
four-to-one available. These two are very important kinds of instructions, as far as the overall control
system that runs that machine is concerned. Then we have integer add or subtract; well, the 3600
can do it in two-and-seven-hundredths, and the 6600 was three-tenths of a microsecond, so we
had a speed-up of about seven to one. A computer does two kinds of arithmetic, normally. One is with a
decimal point—and we call it "floating point"—and it adds numbers in scientific notation. This
takes the burden of keeping track of where the decimal point is off the programmer; it also caused a lot of
trouble. (Laughter.)

[00:11:17]
Looking now [?] to floating add or subtract: now, the main programs that run NCAR, the
majority of the instruction types are floating add-subtract, floating multiply, and floating divide.
We had at least speed-up factors of six, six, and four. Now, there are two multipliers on that
machine, and you can issue—on a 6600, you can issue multiplies one-tenth of a microsecond
apart. You can start another one. So that, you almost have a factor of twelve-to-one on the
multiplying speed. The divider took 13 microseconds on the 36, and 2.9 on the 66, so we had a
speed-up of four to one. Now, incrementing is the kind of arithmetic that is used when the
memory is addressed. And generally you do what they call "increment instructions," when you
try to find a data word up here in this central processor's memory. You'll kick an incrementor by
some number—it's essentially an adder with a short length. And we had a speed-up of about
three to one on the incrementor.

And then, the last two kinds, I stuck up here in case anybody's ever seen what they call the
Gibson mix: the load and the store. So if you load a word from the memory into a register on one
machine, or store it from the operational register back in the memory, you can do one in two, and
the other in eight-tenths. So, there was about a two to one factor there. However, the
incrementors will cause words to come from the memory, or go back to the memory, two at a
time. So we really did have four to one.

[00:12:44]
Now this was a very, very exciting picture we had here—an increase in cost of only $9,000 a month.
So we thought we were going to get a lot of things out of that. And one of the big bugbears, of
course, is that this machine has to have software; by itself it does nothing. And a key element in
the software, if you want to achieve speeds like this—or greater—is the compiler. Now, a
compiler takes what the programmer writes and translates it into these numbers that go into
memory that control a machine. For example, if we try a little of my bad Fortran—I'm essentially
a machine-language coder, and I write poor Fortran—but this statement says to replace X by the
sum of Y and Z. And that is typed on a card, just that way. And before the central processor
could make head or tail out of that, a translation program—essentially software—has got to
take and interpret this card and make machine-language instructions out of it [?]—in other words,
numbers—and put them in memory so that the machine can understand what to do with that.
the quality of this compiler is really essential, because if you're gonna try to drive this machine at
its real rate, you've got to overlap these functions; you've got to find floating adds that you can
do while the floating multiply is going on. Preferably you want the divides to come off first—
they take the longest, so you'd like to start the divider going. And you'd like to get the add-
subtract unit going; you'd like to get the multiplier going. And this will give you better speed-up
than the raw machine speeds indicate. And, in 1964, when we were listening to all the
propaganda about it, LRL came out with numbers that sounded like: this machine will run some
of our codes, if written by hand, twenty-five to one over the 3600. We thought, "Wow."
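
[Editor's note: the point about overlapping the functional units can be made concrete with a small calculation. Below is a minimal sketch in C, using assumed 6600-flavored latencies (divide 29 minor cycles, multiply 10, add 4; illustrative figures, not exact hardware data), of why a compiler wants to start the divide first.]

    /* Compare strictly serial execution with overlapped issue to
       three independent functional units (assumed latencies). */
    #include <stdio.h>

    int main(void) {
        int divide = 29, multiply = 10, add = 4; /* minor cycles, assumed */

        /* One unit at a time, 3600-style: latencies simply add up. */
        int serial = divide + multiply + add;

        /* Overlapped: issue divide at cycle 0, multiply at cycle 1,
           add at cycle 2 (one issue per cycle); all three units run
           at once, so the work ends when the slowest unit finishes. */
        int t_div = 0 + divide;
        int t_mul = 1 + multiply;
        int t_add = 2 + add;
        int overlapped = t_div;
        if (t_mul > overlapped) overlapped = t_mul;
        if (t_add > overlapped) overlapped = t_add;

        printf("serial: %d cycles, overlapped: %d cycles\n",
               serial, overlapped); /* 43 versus 29 */
        return 0;
    }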

[00:14:38]
And then we looked at this number, and I said, "Yeah, Glenn"—at that time, Dr. Lewis was the
manager of the Computing Center—"but these numbers, here, make this kind of mysterious,
except in special cases." And he said, "Well, you have to have a number for a justification to
NSF." He says, "How about 10 to one?" And I said, "Yeah! How about it?" We put it down on a
piece of paper. (Laughter.) We're still looking for it! Now the machine will, actually, if you code
by hand, you can get speed-ups of 10 to one on it. Or 25 to one. But this is really optimistic
thinking. Because, without a good compiler to translate these Fortran statements, you don't have
anything. So now I'll spend a good deal of the rest of this discussing the history of where we got
the compiler from.

[00:15:30]
Right after we signed up for the machine, we decided to go out to Los Angeles, where CDC's
official software group was working on what they called SIPROS, and this was an overall control
monitor and Fortran compiler system for the 6600. At that time, a Dr. Robert Fagan headed up a
group of about 18 people who were writing software. And he had some very good, convincing
charts. And when he was going to finish all of this—see, it was now about the 19th of August. And
we were supposed to be able to make tests shortly after the 1st of November. Schumacher [?] is
sitting here, he's giving me this jaundiced look—keep those dates straight. (Light laughter.) He's
the only one that was there; the rest of these guys are either on vacation or no longer around. So,
Bob Fagan put this graph up on the board. And, to prepare for this talk, I decided: well, I'll go
back through the literature and see what the graph looked like. One of the more optimistic things
was: Fortran, by the 1st of September, you had something that looked like this [sound of chalk
on board], with months on it. Here's August, September. This is a real manager type; boy, this
guy had this down. (Light laughter.) This was 65 percent, here. It was going to be 65-percent
completed.

Now, bitter experience has taught us that a 6600 compiler has 30,000 lines of code in it,
individual instructions. So he had 10,000 to go, and he had to do that by—of course, I didn't
know this at this time, I didn't bother to analyze it. He was smoking a pipe, and looking
confident. And he'd never missed. (Laughter.) He really hadn't missed, in his life. And that's
pretty encouraging in a software guy; someone who's made his software commitments is pretty
unusual. So by this date up here, he's gonna have this compiler done. Well, as far as I can tell, it
will take him quite a while to do that, because to do 10,000 lines—the average machine-language
coder is reputed by old studies—and I quote these numbers only because I've heard them a lot,
and probably some others have—to do 30 lines of machine-language code a day, and check
it out. So, if you take 10,000 lines, you get 300 days. (Light laughing.) But he had six people! On
the project itself. It wasn't quite that bad. It was obvious he couldn't hit November. And, in fact,
he didn't.

[00:17:58]
We went back in the meeting, kind of sober—didn't know, really, how to take this. Because, he
had a group of people that had gone to Chippewa, about this time. Chippewa Falls is where the
computer was being made. And they had brought up some of this software, and tested some of it
on a simulation. On a low-level machine like the 160, or the 1604—excuse me, Gene, it's a good
product, but wasn't up to running this. They simulated some of this software—dare [?] check it—
and then they were going to go to Chippewa to do some more checkout, and some more
simulations. About six weeks were spent by this group, up there. And they went up, and they
came back. But Seymour Cray, the guy that runs that place, he's pretty iron-fisted. I'm sure no
more than 40 people have ever been in that building at once. A whole machine was built, you
know, with a staff of 30 people—31, maybe, in there, including janitors. So it was pretty
trimmed down. And these six individuals didn't get on the machine long enough to do any good,
so they had to come back, and CDC shipped that first computer, then, to LRL.

And then they hurried up and got on what they called Serial Two—I may have the serial
numbers mixed up, slightly, but it's roughly that order. And by this time, some time in this month
of 1964, they shipped the machine to the Los Angeles office, which would set this SIPROS
project at ease, because now they've actually got something to work with. You're pretty
hamstrung, to sit at your desk and write this stuff, and have people come in and bug ya, and give
them estimates. It's easy to get into this syndrome; I probably would have done the same thing.
You have to give someone something.

[00:19:38]
So I spent the winter trying to decide how to entertain ourselves, because we decided not to use
Chippewa—I mean, to go around Chippewa—but SIPROS, because that compiler matched the 3600
compiler specifications. In other words, the way you punched your cards, X = Y plus Z would be
readily interpreted by the SIPROS compiler. And there was another alternative a person could
use: there was a Chippewa compiler, and a Chippewa operating system that had been written
up there. And this was reputed to work, but it wasn't to our specifications, so we weren't too
interested in it. And during that winter—a little sideline—we got a hold of a tape that listed off
the Chippewa operating system, and the compiler. And I printed it one afternoon; there were
about 400 pages of monitor [?], and it was written in octal [?] (some laughter). So, I threw that
listing away—it was a historic souvenir—but that thing says that this guy had written it not in an
assembly language at all, one that a machine would assemble into instructions for him. Instead he put
down the numbers themselves—the ones that were gonna be in the machine's memory and
interpreted by the processor. He'd write: core location three, and he put the number. And this
went for 400 pages. And that's quite a feat. So we were notably impressed by that part of it,
anyhow. (Laughter.)

[00:21:02]
And we wrote some simulators, too. Fagan [?] started something good here. We put some
simulators on; we took the Chippewa compiler, that was on the back end of this tape, after the
monitor, and assembled it—we had an assembler on the 3600 for 6600 language. We read it
into the simulation, and did some test compilations. Even on a simulator, that compiler was
awfully fast; it was impressively speedy. At the time, we didn't know that it wasn't doing too
much testing for the different conditions that would get you good code. However, it was not
intended to be this; it was intended to be something to check out benchmarks, and to enable
Control Data to test the machine. So we didn't have any complaints.

We waited, then, until about March of '65. At this time, the rumors about how the hardware was
performing were coming in from all over—how the SIPROS project was progressing. They came
from Europe; there was an outfit over there called CERN, that was currently in trouble with their
computer. We decided to have a little showdown. So, we said, "Look, get us some time on that
Los Angeles machine; we want to go out and compile some programs, and try running them."
We took ten decks with us; one of them was a general circulation model code, as it was in those
days. It was much smaller than it is now. And one of them was a benchmark we called
"Marathon." You have Marathon oil, and it can run on a dozen or so machines, so we knew
about how fast it was, and it burst every kind of computer in existence.

[00:22:29]
We went out here, to Los Angeles, and we were pretty bitterly disappointed. We managed to
compile two programs on the Chippewa operating system, and as far as I can remember, none
compiled on the SIPROS compiler, at all. And out of that, we executed one. And we didn't find
much out about the hardware, that way—you couldn't tell if it was running or not—but we knew
the software wasn't running. We were pretty sure of that little fact. Then, as time went on, the
summer wore on. About the 1st of June, we got involved because the letters now were flying hot
and fast from CERN about the machine's performance; and we got some from New York
University. And at these installations, they had taken this memory, and they had taken 10 percent
of the modules out, roughly, in both installations, and replaced them. Now, that's a pretty easy
job: you pull one out and you plug another one in. The trick is to find them. So that if they were
concerned—if they replaced 18 of those modules—they had a tough time. The machine at CERN
was taking eight hours a night, from midnight to eight o'clock in the morning, just to do PM [preventive maintenance?],
to keep it running, in a non-production environment.

They, essentially, had systems people like myself in there, checking out a little bit of this, or a
little bit of that. But no one was trying to run a hardcore problem, yet. And they said, under these
circumstances, the machine was 94-percent up. (Laughter.) And this was probably the low
ebb—or, very close to the low ebb—in our relation with CDC, because things just weren't well.
And Harrington [?] came out—he was the local office manager—around the 1st of July. Our
machine was to come September, now, '65. And it was now about July 1st. We had a little
meeting; we said, "Look, we've got a whole flock of correspondence on this computer. It's not
running well, and the software's not up. We've been there ourselves." Of course, that was—you
know, three months makes a big difference in the life of software. But, we weren't sure how
much it had progressed. "We'd like to run a pre-acceptance test, up in Chippewa. Before you ship
the machine, we'd like to put 50 programs on the machine, and run them for about a week, over
and over and over. We want the same answers coming out of those programs." (Laughter.)

[00:24:42]
They agreed to this, and we agreed to some things, too. That we'd go up and help them change
the compiler—or, write some subroutines for the compiler—to try to bring the 3600 operating
system, its compiler, and the 6600 compiler closer together, so we didn't have to change
everybody's deck. We had some pretty severe constraints, here. The 3600 had—an eight-
character variable name was allowed. Well, there are a lot of decks with that in there. You think
you could truncate an eight-character name to six, easily, but that's really not the case, unless
you're the program's author.

We went up and got our first look at what was going to be our computer. They were running
some tests on it. And the notable thing about it was: the GCM, or General Circulation Model
program, would [?] run. On almost all the others we had, it ran—after a fashion. You run the
deck in, and you got a set of answers. You run it again, you get another set. (Laughter.) This
gave us quite a bit of concern, because they wanted Dave Kitts and myself, and Bob Working—
and Gene Schumacher was there, too—and they wanted us to sit down, at least initially, and try
to figure out where the machine had failed to properly execute the instructions coming out of that
memory. Well, we tried some of this on the 3600, and it's a very unsuccessful way to spend your
time; you can spend about two weeks isolating one failure, and then they'll chase it to another
part of the machine. Because, in the back of the room that the machine was in, they'd have the
lights out, and had a guy with a scope and a pair of wire cutters. And he was timing circuits,
where they'd failed, and trying to get some of these wires adjusted so that the timing of the
circuits was right, based on previously diagnosed errors—of which there weren't [?] too many. I
for one, never bothered, I thought: Oh, I'll just go do something else and let them figure out how
to bring the machine up.

[00:26:40]
We found that the other problem was—besides the central processor not running properly—was
the disk file. If you did manage to get any answers off, you'd get a "disk parity" error. Now this
disk file that sits up here, all the answers are written over here. You could kind of sense it was
coming; you'd hear the printer stop, and you knew that somebody was on the machine, besides
you, and his output printed partway, and yours isn't going to go out at all, because the software,
at that time, was rigged to stop. Of course, Control Data was very concerned with the disk errors.
They'd occur anywhere from two to three minutes apart—maybe half an hour. We worked for,
oh, I think for about four weeks, with little or no improvement. But there was one guy up there
that could really make that disk stand up, and that was Seymour Cray—the designer of the
machine—could walk in there and tighten a screw on one of the logic modules, on the base of
the disk, or just look at it, perhaps. I don't know what he did. (Laughter.) And the problem just
went away for him. He was a very lucky man. (Laughter.)

We could see that the pre-acceptance, now, was going to get dragged out for quite a while.
Because, number one, the software had to come up; the disk problems had to be solved, and the
hardware problems. We decided, then, that it was time to retire back here to NCAR and see if we
couldn't, based on observations, try to write ourselves a software system—not a compiler, but a
control monitor that would take better advantage of the machine, while they're up there trying to
get the machine running. You know, we weren't really being too helpful; we had a pretty strict
pre-acceptance test, and I was just fairly much willing to let things be, if it wasn't gonna run my
_____ headache [?]. But Bob Working, one of the people who worked for me, didn't cooperate
very well. He spent night and day helping them find bugs. Until, by the end of December, '65,
the machine was in fact ready for shipping. And the Chippewa Operating System had been
modified so that most of the programs that we wanted to run, ran, and many of the specifications
that it was supposed to have, had been implemented.

[00:28:44]
Well, back at NCAR, Dave Kitts and myself, and some others, had worked up a control system
so that we could put our own programs in these memories and drive this machine ourselves. And
one of the problems of the Chippewa compiler, that was not solved at the time, was that it
compiled directly to core, in an indefinite field length. In other words, if this memory's got a
certain size, then you, as a user, had to know what that size was, with the compiler and your
object code underneath it, that the compiler generated. You had to be smart enough to predict
how long that was going to be. And if you couldn't, well the obvious thing was: you found out
how big you could make the limit on the control card, and in it would go. Well, you would never
get two people in that memory at once; and one of the things that the 6600 can do really well is
improve the throughput, over the 3600. But you have to achieve a buffering scheme, somehow.
Buffering was attempted on the 3600, and previous machines. And it's an idea whereby: the
central processor computes while data is being moved from a tape, or some such device, in and
out of the memory, overlapped with the computing. It was a very difficult thing for a
programmer to achieve. So, in general, no one bothered.

We found, on the 6600—and it was purely an accident; I didn't realize, before we went up there,
to Chippewa, that this was going to be an offshoot of this—that if you got two programs
occupying space in central memory, when one of them was hung up doing input-output, the other
one could execute. And this effectively buffered; and it really worked! But it wouldn't work
unless that compiler worked in a fixed-field length. Because nobody is going to bother to figure
out how long his program is. They're all going to sit down and say, "Well, last time I tried, I got a
diagnostic, and I waited half-an-hour here at the counter. Next time I'm going to try it, it's going
to work." So you take out the core [?], and that's a pretty human way to operate this thing.
(Laughter.)
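
[Editor's note: the buffering idea, computing on one block of data while the next block is in transit, can be sketched today with two buffers and a helper thread. This is a minimal illustration in C; the helper thread and the fake read_device() are invented stand-ins, since on the 6600 the overlap came from the peripheral processors and from keeping two user programs in core.]

    /* Double buffering: fill one buffer while computing on the other. */
    #include <pthread.h>
    #include <stdio.h>

    #define WORDS 4096

    static int buf[2][WORDS];

    /* Stand-in for a tape or disk transfer: fills one buffer. */
    static void *read_device(void *arg) {
        int *b = arg;
        for (int i = 0; i < WORDS; i++) b[i] = i; /* pretend transfer */
        return NULL;
    }

    int main(void) {
        pthread_t io;
        long long sum = 0;

        read_device(buf[0]); /* prime the first buffer */
        for (int pass = 0; pass < 8; pass++) {
            int cur = pass % 2, nxt = 1 - cur;
            /* Start the next transfer... */
            pthread_create(&io, NULL, read_device, buf[nxt]);
            /* ...and compute on the current buffer meanwhile. */
            for (int i = 0; i < WORDS; i++) sum += buf[cur][i];
            pthread_join(&io, NULL); /* wait before swapping buffers */
        }
        printf("checksum: %lld\n", sum);
        return 0;
    }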

[00:30:48]
So we set about modifying the Chippewa compiler, so that it would in fact run in a fixed-field
length. The other thing was, we could build a loader, so that it would attempt to load the program.
See, a compiler generates these machine-language instructions on what we call a binary deck.
And these decks are merged, from various sources, into the memory. If there wasn't enough
memory to make it on the first pass, the loader figured out how big the program had to be, and
then it would try again when memory was freed up.

Now even at this, when we got the machine up, we hadn't achieved anywhere near the speeds
that we were hoping to, over here, at this time, on this first compiler. The programs ran from
about one-and-a-half to three times—maybe three-and-a-half, sometimes—the speed of the 3600.
And I was unfortunately not able to find any of the old numbers; you know, you clean your desk
out and throw things away. But this says that the compiler, then, was generating, probably, too
many instructions. So the game is: if you can improve the compiler, you'll improve your machine
use considerably, because a long-running program is going to sit on a machine for 200 hours.
And we all know of a certain group who was accused of using vast amounts of computer time,
here. (Light laughter.) If you can improve the thing 10 or 20 percent, in execution speeds, why,
you've bought a lot of hours in a month.

[00:32:17]
And over the years that have gone by since 1966, Garner McCrossen managed—and, just one
man. And that's a pretty good effort, for one man to work on a compiler like that. He managed to
clean that compiler up, and it turns in reasonably credible performances, today. We started, about
a year-and-a-half ago, to write one of our own compilers. And sometimes it runs better than his,
and sometimes it doesn't. So he's done a darn good job. But, still, the ten to one—six, seven, so
forth to one—have not been achieved on this particular piece of gear. And it probably won't be,
until someone learns a lot more about compiler writing than they know now, how to achieve it.

The big problem with getting this speed comes out of the Fortran language. It allows you—if
you were just to write X = Y + Z, that's nifty. That's pretty easy to address. But Fortran allows
you to make a complicated pointer up [?]. See, there would be, normally, one add operation
involved here, to do this. You'd load up Y, and load up Z, and add them together; store the result
in X. They came up, in Fortran, with a thing they call a subscript. Now, the game is different
than it was just a second ago [noises of chalk on a board]. That subscript involves, probably, 10
times the calculation this add's got in it. And if you can't come up with a good scheme for
computing your subscripts ahead of most of the use of them, you're in trouble. And this first
compiler that Garner McCrossen wrote for this machine, in fact, recomputed the subscripts as it
hit them. Well, it generated too much code. And he's got that sort of thing, now, out of the
important places in the compiler, and has improved it quite a bit. But still he spent a lot of time
on this sort of thing, inside of a code. And it's very difficult, doing that kind of thing, to get these
three things running together, based on the structure of the machine.
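
[Editor's note: the subscript cost is easy to see in code. Below is a minimal sketch in C of the Fortran statement X(I,J) = Y(I,J) + Z(I,J). The first loop recomputes the linearized subscript at every reference, roughly what the early compiler generated; the second hoists it out, roughly what the improved compiler does. The array names and sizes are invented.]

    /* The hidden cost of a Fortran subscript, and hoisting it out. */
    #include <stdio.h>

    #define N 100

    static double X[N*N], Y[N*N], Z[N*N];

    int main(void) {
        /* Naive: the subscript (j-1)*N + (i-1) is recomputed at every
           reference, three times per statement. */
        for (int j = 1; j <= N; j++)
            for (int i = 1; i <= N; i++)
                X[(j-1)*N + (i-1)] = Y[(j-1)*N + (i-1)] + Z[(j-1)*N + (i-1)];

        /* Hoisted: compute the subscript once per element and reuse it;
           a good compiler reduces it further to a running increment. */
        for (int j = 1; j <= N; j++) {
            int col = (j - 1) * N;
            for (int i = 1; i <= N; i++) {
                int k = col + (i - 1);
                X[k] = Y[k] + Z[k];
            }
        }
        printf("X[0] = %g\n", X[0]);
        return 0;
    }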

[00:34:32]
And that's really where the thing is today. We've got good throughput capabilities; a person can
lay a deck on the desk, down there, and it will probably—if it doesn't run too long—come back
in a reasonably short time. _____ machine does beautifully. But still, it has kind of an elusive
nature, as it were. Try to get that central processor executing instructions; it can issue an
instruction every tenth of a microsecond, to one of these functional units, of both classes. Now,
the functional units can't really execute at that rate. But you can safely predict that the majority
of times, most of them aren't doing very much, even as it stands now. Yeah, I guess that's about
all I have to say, in this particular talk. (Applause from audience.)

[??]: [Faint voice] We have some time for some questions. Maybe somebody else would like to
add some of their experiences. Jim?

[00:35:36]
[??]: Paul, can you [microphone is rubbing and bumping] _____ programmers, how to write
Fortran _____ [voice completely obscured by microphone noises] _____.

Paul Rotar: Well, you can do this to a certain extent, Jim. But one of the things about Fortran is:
today's good idea, and tomorrow's idea on how to optimize the Fortran code, are probably at
odds with each other. Outside of some rules that we know are probably going to stay a long time,
there really isn't too much you can do. Some of the things, for example, are inside a "do" loop.
This is strictly Fortran talk. The do variable should be used only in a subscript, and to control the
do. You shouldn't store it, you shouldn't add it to something—this kind of thing. Equivalencing
[?] causes problems, so there are a few things that we know about. But, it depends on the tack
you take on the optimizer. I try to generate the code so that, if you really slant a particular
program that isn't going to be run too many times, and that you'd like to carry around, maybe, from this
installation to another, to one compiler, you'll probably make a mistake. Now, if the program has
a long running time—in terms of 50, a hundred, 200 hours, over many months—then it pays to
write it so that the compiler can do the best possible job on it. It's worth some time in that.
Otherwise, I don't think it is.
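
[Editor's note: the do-loop rule can be illustrated with a short sketch, written here in C rather than Fortran, with invented variables. The first loop uses its index only for loop control and subscripting; the second stores the index and feeds it into other arithmetic, which is what the rule warns against.]

    /* The do-variable rule: keep the loop index to control and subscripts. */
    #include <stdio.h>

    #define N 8

    int main(void) {
        double a[N], b[N] = {1,2,3,4,5,6,7,8}, c[N] = {8,7,6,5,4,3,2,1};
        int last = 0;

        /* Good: i only controls the loop and subscripts the arrays,
           so the compiler can keep it in a register. */
        for (int i = 0; i < N; i++)
            a[i] = b[i] + c[i];

        /* Against the rule: i is stored and used in arithmetic, which
           on compilers of that era forced extra code to keep its
           memory copy current. */
        for (int i = 0; i < N; i++) {
            last = i;                /* storing the do variable */
            a[i] = b[i] + (double)i; /* adding it to something  */
        }
        printf("a[%d] = %g\n", last, a[last]);
        return 0;
    }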

[Voice, too faint to discern, asking a question.]

[00:37:17]
Paul Rotar: The ILLIAC IV is a parallel processor that was conceived by Dr. Dan Slotnick,
and he works up at the University of Illinois. And it's an array processor, where he has—well, to
start with, he is gonna build processing elements, he calls them, eight on a side. So he'll have 64
processing elements in one quadrant. This machine, then, can do PDE code, and some matrix
things, quite quickly. And I'd suggest that the language will be machine language. (Light
laughter.) They're trying to define a compiler for that machine, now, and they've only hit the
trivia [?]. And the IO problems with the ILLIAC are fairly stark, too. If the data stays in memory,
everything is fine. And it's got a big memory. But as the problem flows out, the ILLIAC may
not run very fast—limited strictly, I mean, by the data flow rate.

But, I think he's got so much to do on that. That's going to be a historic project, if he succeeds. It
really will. It's immense. Even the environmental problems, on the machine—like, what is it, a
200-horsepower fan to blow air through it? (Laughter.) This computer, here, it sits down there; it's
just a small, T-shaped box, and it's got the central processor, memory—memory for the
peripheral processor—and 10 of these little things in one, little, T-shaped box. The rest of that
stuff that's sitting on the floor, down there, are controllers and these devices. But when you get to
something that was, what, 32 feet long, eight feet high, and about eight feet wide, four bays
tall—just the environmental problems are going to be a real engineering headache. And the
software problems—hmmm. (Light laughter.)

[00:39:12]
We've come up with techniques, for example, that we know about, to optimize the Fortran code
on this. And they work reasonably well; I think we'll, in the next year, improve what we're doing
quite a bit, down there. But this is essentially still a sequential-type processor, and no one really
knows very much, I think, about that array machine. It'll run the GCM, but it will be very hard to
use, and might be pretty inflexible. Of course, that's just my opinion, today.

[??]: [Extremely faint] Where do you see big computers heading? I gather that some of the
recent _____ putting the servers [?] _____ [unintelligible] continuing that kind of cycle.

Paul Rotar: We're heading for more of that kind of cycle! This machine, when it came out,
Aksel Wiin-Nielsen was here, and he told the Control Data people that—and everyone else—they
were really looking for a machine that was 10,000 times the speed of this, if I recall the number,
when I first came to NCAR. Well, to get 10,000 times the speed, if that's really what you're
looking for—and I think they need it here—we're going to have to accept a lot of bad knocks.
You take an early machine—you know, this 6000 runs beautifully now. When we were on one
of the first machines—I should put this thing in perspective, with where we were. We were so
early, that during the software hassle, literally nothing but the prototype worked, and not very
well. And the only people that got to that were those that wrote the Chippewa software; the ones
on the SIPROS project never saw it. And I think we'll see more of this kind of thing. If you try
to make a big breakthrough in speed, you've got to pay some price in effort, and a few setbacks.

[Voice too faint to discern.]

[00:41:08]
Paul Rotar: Well, these numbers, multiplied by five to six, might actually hope to affect—if
your deck runs now, at a certain speed, you should be able to cut that to about a fifth, or a
sixth, and get the running time—if it's not IO bound, on the 7600. That's what we hope. And, I
think it will make it. The fellow that's designing that, now, he's got the software—we're not
fighting the software hurdle, anymore, we're just fighting the hardware. And he's got a very fast
memory, here. He sped up the memory by about four to one. And he sped the function units up
so that each one of the things that does one of those functions, on that side of the board,
runs anywhere from four, five, six, or seven to one over the 66. It should make it.

That would get us [?] quite a powerful computer. NSF, I guess, this week, was at least listening
to us, back there, in a year of tight money. Any more?

[??]: [Voice of introducer] Any other questions, or comments?

[??]: [Extremely faint] Is there any attempt, anywhere, to devise a computer that makes [?] a
software problem easier [?] so you can start putting it the other way around? [?].

[00:42:26]
Paul Rotar: Um. Well, some of the earlier machines—for example, this machine, the 3600, was
quite easy to design software for. Because it had powerful instructions that would internally do
macro loops in machine language. You'd write one instruction, and it could perform a whole
loop right within that. And IBM has built machines of this type. And it works out that: if you had
a reasonably good lead time, and no in-company politics going for you—so that you started the
software effort about the time the hardware was designed—there probably isn't much advantage in
this: building a machine specifically oriented toward compiling. With this, you can have a
machine with 200 or 300 instructions in the central processor.

These machines—this machine had many instructions and had all kinds of modifiers to do very
many things. This thing has a basic set of about 64, and has a few modifiers that allow more than
that. But, they're right down to earth. And if you have enough time to do your planning, you
should be able to turn out, with a fixed instruction set, quite good software. IBM, back at their
plant, they told us that they were no longer considering any changes to the, say, the 360 design's
instruction set. Because, as far as they could tell, once the compiler got done, it only used
a limited number of them, anyhow. The generated code out of one of those compilers usually
doesn't run the gamut of all the machine's instructions. And all they do is get in the way; they're
executed a relatively low number of times, so their reliability is down on that kind of an
instruction.

[00:44:07]
Burroughs [?] is making a machine where you can microprogram your own instructions; if you want
to make a floating add work in a certain way, you can put into it your own floating-add package.
That's another approach to this, a design-your-own kind of thing, from very fundamental
operations. And that's the 8500—at least, what they described to us as the 8500. We look at that
as: they came out here, and it was going to be competition for the 76, and by the time you got
enough instructions in there to do a floating add or multiply, you would take about three or four
microseconds of machine time. It's dead before it starts.

[??]: [Introducer] Seeing no other questions, for the moment, Paul, I want to thank you, again,
for making this very complicated business partly clear [?]. [Audio ends.]

[00:45:03] [End of recording.]

