1 Fundamentals of Computer Design

And now for something completely different.
Monty Python's Flying Circus

1.1 Introduction 1
1.2 The Task of a Computer Designer 3
1.3 Technology and Computer Usage Trends 6
1.4 Cost and Trends in Cost 8
1.5 Measuring and Reporting Performance 18
1.6 Quantitative Principles of Computer Design 29
1.7 Putting It All Together: The Concept of Memory Hierarchy 39
1.8 Fallacies and Pitfalls 44
1.9 Concluding Remarks 51
1.10 Historical Perspective and References 53
Exercises 60
1.1 | Introduction
Computer technology has made incredible progress in the past half century. In 1945, there were no stored-program computers. Today, a few thousand dollars will purchase a personal computer that has more performance, more main memory, and more disk storage than a computer bought in 1965 for $1 million. This rapid rate of improvement has come both from advances in the technology used to build computers and from innovation in computer design. While technological improvements have been fairly steady, progress arising from better computer architectures has been much less consistent. During the first 25 years of electronic computers, both forces made a major contribution; but beginning in about 1970, computer designers became largely dependent upon integrated circuit technology. During the 1970s, performance continued to improve at about 25% to 30% per year for the mainframes and minicomputers that dominated the industry. The late 1970s saw the emergence of the microprocessor. The ability of the microprocessor to ride the improvements in integrated circuit technology more closely than the less integrated mainframes and minicomputers led to a higher rate of improvement: roughly 35% growth per year in performance.
This growth rate, combined with the cost advantages of a mass-produced microprocessor, led to an increasing fraction of the computer business being based on microprocessors. In addition, two significant changes in the computer marketplace made it easier than ever before to be commercially successful with a new architecture. First, the virtual elimination of assembly language programming reduced the need for object-code compatibility. Second, the creation of standardized, vendor-independent operating systems, such as UNIX, lowered the cost and risk of bringing out a new architecture. These changes made it possible to successfully develop a new set of architectures, called RISC architectures, in the early 1980s. Since the RISC-based microprocessors reached the market in the mid-1980s, these machines have grown in performance at an annual rate of over 50%. Figure 1.1 shows this difference in performance growth rates.
The effect of this dramatic growth rate has been twofold. First, it has significantly enhanced the capability available to computer users. As a simple example, consider the highest-performance workstation announced in 1993, an IBM Power-2 machine. Compared with a CRAY Y-MP supercomputer introduced in 1988 (probably the fastest machine in the world at that point), the workstation offers comparable performance on many floating-point programs (the performance for the SPEC floating-point benchmarks is similar) and better performance on integer programs for a price that is less than one-tenth of the supercomputer!

Second, this dramatic rate of improvement has led to the dominance of microprocessor-based computers across the entire range of computer design. Workstations and PCs have emerged as major products in the computer industry. Minicomputers, which were traditionally made from off-the-shelf logic or from gate arrays, have been replaced by servers made using microprocessors. Mainframes are slowly being replaced with multiprocessors consisting of small numbers of off-the-shelf microprocessors. Even high-end supercomputers are being built with collections of microprocessors.
Freedom from compatibility with old designs and the use of microprocessor technology led to a renaissance in computer design, which emphasized both architectural innovation and efficient use of technology improvements. This renaissance is responsible for the higher performance growth shown in Figure 1.1, a rate that is unprecedented in the computer industry. This rate of growth has compounded so that by 1995, the difference between the highest-performance microprocessors and what would have been obtained by relying solely on technology is more than a factor of five. This text is about the architectural ideas and accompanying compiler improvements that have made this incredible growth rate possible. At the center of this dramatic revolution has been the development of a quantitative approach to computer design and analysis that uses empirical observations of programs, experimentation, and simulation as its tools. It is this style and approach to computer design that is reflected in this text.

Sustaining the recent improvements in cost and performance will require continuing innovations in computer design, and the authors believe such innovations will be founded on this quantitative approach to computer design. Hence, this book has been written not only to document this design style, but also to stimulate you to contribute to this progress.
1.2 | The Task of a Computer Designer
The task the computer designer faces is a complex one: Determine what attributes are important for a new machine, then design a machine to maximize performance while staying within cost constraints. This task has many aspects, including instruction set design, functional organization, logic design, and implementation. The implementation may encompass integrated circuit design, packaging, power, and cooling. Optimizing the design requires familiarity with a very wide range of technologies, from compilers and operating systems to logic design and packaging.
In the past, the term computer architecture often referred only to instruction set design. Other aspects of computer design were called implementation, often insinuating that implementation is uninteresting or less challenging. The authors believe this view is not only incorrect, but is even responsible for mistakes in the design of new instruction sets. The architect's or designer's job is much more than instruction set design, and the technical hurdles in the other aspects of the project are certainly as challenging as those encountered in doing instruction set design. This is particularly true at the present, when the differences among instruction sets are small (see the Appendix).
In this book the term instruction set architecture refers to the actual programmer-visible instruction set. The instruction set architecture serves as the boundary between the software and hardware, and that topic is the focus of Chapter 2. The implementation of a machine has two components: organization and hardware. The term organization includes the high-level aspects of a computer's design, such as the memory system, the bus structure, and the internal CPU (central processing unit, where arithmetic, logic, branching, and data transfer are implemented) design. For example, two machines with the same instruction set architecture but different organizations are the SPARCstation-2 and SPARCstation-20. Hardware is used to refer to the specifics of a machine. This would include the detailed logic design and the packaging technology of the machine. Often a line of machines contains machines with identical instruction set architectures and nearly identical organizations, but they differ in the detailed hardware implementation. For example, two versions of the Silicon Graphics Indy differ in clock rate and in detailed cache structure. In this book the word architecture is intended to cover all three aspects of computer design: instruction set architecture, organization, and hardware.
Computer architects must design a computer to meet functional requirements as well as price and performance goals. Often, they also have to determine what the functional requirements are, and this can be a major task. The requirements may be specific features, inspired by the market. Application software often drives the choice of certain functional requirements by determining how the machine will be used. If a large body of software exists for a certain instruction set architecture, the architect may decide that a new machine should implement an existing instruction set. The presence of a large market for a particular class of applications might encourage the designers to incorporate requirements that would make the machine competitive in that market. Figure 1.2 summarizes some requirements that need to be considered in designing a new machine. Many of these requirements and features will be examined in depth in later chapters.
Once a set of functional requirements has been established, the architect must try to optimize the design. Which design choices are optimal depends, of course, on the choice of metrics. The most common metrics involve cost and performance.
Functional requirements               Typical features required or supported
Application area                      Target of computer
  General purpose                     Balanced performance for a range of tasks (Ch 2,3,4,5)
  Scientific                          High-performance floating point (App A,B)
  Commercial                          Support for COBOL (decimal arithmetic); support for databases and transaction processing (Ch 2,7)
Level of software compatibility       Determines amount of existing software for machine
  At programming language             Most flexible for designer; need new compiler (Ch 2,8)
  Object code or binary compatible    Instruction set architecture is completely defined; little flexibility, but no investment needed in software or porting programs
Operating system requirements         Necessary features to support chosen OS (Ch 5,7)
  Size of address space               Very important feature (Ch 5); may limit applications
  Memory management                   Required for modern OS; may be paged or segmented (Ch 5)
  Protection                          Different OS and application needs: page vs. segment protection (Ch 5)
Standards                             Certain standards may be required by marketplace
  Floating point                      Format and arithmetic: IEEE, DEC, IBM (App A)
  I/O bus                             For I/O devices: VME, SCSI, Fibre Channel (Ch 7)
  Operating systems                   UNIX, DOS, or vendor proprietary
  Networks                            Support required for different networks: Ethernet, ATM (Ch 6)
  Programming languages               Languages (ANSI C, Fortran 77, ANSI COBOL) affect instruction set (Ch 2)

FIGURE 1.2 Summary of some of the most important functional requirements an architect faces. The left-hand column describes the class of requirement, while the right-hand column gives examples of specific features that might be needed. The right-hand column also contains references to chapters and appendices that deal with the specific issues.
Given some application domain, the architect can try to quantify the performance of the machine by a set of programs that are chosen to represent that application domain. Other measurable requirements may be important in some markets; reliability and fault tolerance are often crucial in transaction processing environments. Throughout this text we will focus on optimizing machine cost/performance.
In choosing between two designs, one factor that an architect must consider is design complexity. Complex designs take longer to complete, prolonging time to market. This means a design that takes longer will need to have higher performance to be competitive. The architect must be constantly aware of the impact of his design choices on the design time for both hardware and software.
In addition to performance, cost is the other key parameter in optimizing cost/performance. In addition to cost, designers must be aware of important trends in both the implementation technology and the use of computers. Such trends not only impact future cost, but also determine the longevity of an architecture. The next two sections discuss technology and cost trends.
1.3 | Technology and Computer Usage Trends
If an instruction set architecture is to be successful, it must be designed to survive changes in hardware technology, software technology, and application characteristics. The designer must be especially aware of trends in computer usage and in computer technology. After all, a successful new instruction set architecture may last decades; the core of the IBM mainframe has been in use since 1964. An architect must plan for technology changes that can increase the lifetime of a successful machine.
Trends in Computer Usage
The design of a computer is fundamentally affected both by how it will be used and by the characteristics of the underlying implementation technology. Changes in usage or in implementation technology affect the computer design in different ways, from motivating changes in the instruction set to shifting the payoff from important techniques such as pipelining or caching.
Trends in software technology and how programs will use the machine have a long-term impact on the instruction set architecture. One of the most important software trends is the increasing amount of memory used by programs and their data. The amount of memory needed by the average program has grown by a factor of 1.5 to 2 per year! This translates to a consumption of address bits at a rate of approximately 1/2 bit to 1 bit per year. This rapid rate of growth is driven both by the needs of programs as well as by the improvements in DRAM technology that continually improve the cost per bit. Underestimating address-space growth is often the major reason why an instruction set architecture must be abandoned. (For further discussion, see Chapter 5 on memory hierarchy.)
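To see where the 1/2-bit-to-1-bit figure comes from, note that a program whose memory use grows by a factor g each year consumes about log2(g) additional address bits per year:

\[
\log_2 1.5 \approx 0.58, \qquad \log_2 2 = 1
\]

so annual growth of 1.5x to 2x translates to roughly half an address bit to one full address bit per year.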
Another important software trend in the past 20 years has been the replacement of assembly language by high-level languages. This trend has resulted in a larger role for compilers, forcing compiler writers and architects to work together closely to build a competitive machine. Compilers have become the primary interface between user and machine.

In addition to this interface role, compiler technology has steadily improved, taking on newer functions and increasing the efficiency with which a program can be run on a machine. This improvement in compiler technology has included traditional optimizations, which we discuss in Chapter 2, as well as transformations aimed at improving pipeline behavior (Chapters 3 and 4) and memory system behavior (Chapter 5). How to balance the responsibility for efficient execution in modern processors between the compiler and the hardware continues to be one of the hottest architecture debates of the 1990s. Improvements in compiler technology played a major role in making vector machines (Appendix B) successful. The development of compiler technology for parallel machines is likely to have a large impact in the future.
Trends in Implementation Technology
To plan for the evolution of a machine, the designer must be especially aware of rapidly occurring changes in implementation technology. Three implementation technologies, which change at a dramatic pace, are critical to modern implementations:
■ Integrated circuit logic technology: Transistor density increases by about 50% per year, quadrupling in just over three years. Increases in die size are less predictable, ranging from 10% to 25% per year. The combined effect is a growth rate in transistor count on a chip of between 60% and 80% per year. Device speed increases nearly as fast; however, metal technology used for wiring does not improve, causing cycle times to improve at a slower rate. We discuss this further in the next section.
■ Semiconductor DRAM: Density increases by just under 60% per year, quadrupling in three years. Cycle time has improved very slowly, decreasing by about one-third in 10 years. Bandwidth per chip increases as the latency decreases. In addition, changes to the DRAM interface have also improved the bandwidth; these are discussed in Chapter 5. In the past, DRAM (dynamic random-access memory) technology has improved faster than logic technology. This difference has occurred because of reductions in the number of transistors per DRAM cell and the creation of specialized technology for DRAMs. As the improvement from these sources diminishes, the density growth in logic technology and memory technology should become comparable.
■ Magnetic disk technology: Recently, disk density has been improving by about 50% per year, almost quadrupling in three years. Prior to 1990, density increased by about 25% per year, doubling in three years. It appears that disk technology will continue the faster density growth rate for some time to come. Access time has improved by one-third in 10 years. This technology is central to Chapter 6.
These rapidly changing technologies impact the design of a microprocessor that may, with speed and technology enhancements, have a lifetime of five or more years. Even within the span of a single product cycle (two years of design and two to three years of production), key technologies, such as DRAM, change sufficiently that the designer must plan for these changes. Indeed, designers often design for the next technology, knowing that when a product begins shipping in volume that next technology may be the most cost-effective or may have performance advantages. Traditionally, cost has decreased very closely to the rate at which density increases.
These technology changes are not continuous but often occur in discrete steps. For example, DRAM sizes are always increased by factors of four because of the basic design structure. Thus, rather than doubling every 18 months, DRAM technology quadruples every three years. This stepwise change in technology leads to thresholds that can enable an implementation technique that was previously impossible. For example, when MOS technology reached the point where it could put between 25,000 and 50,000 transistors on a single chip in the early 1980s, it became possible to build a 32-bit microprocessor on a single chip. By eliminating chip crossings within the processor, a dramatic increase in cost/performance was possible. This design was simply infeasible until the technology reached a certain point. Such technology thresholds are not rare and have a significant impact on a wide variety of design decisions.
1.4 | Cost and Trends in Cost
Although there are computer designs where costs tend to be ignored (specifically supercomputers), cost-sensitive designs are of growing importance. Indeed, in the past 15 years, the use of technology improvements to achieve lower cost, as well as increased performance, has been a major theme in the computer industry. Textbooks often ignore the cost half of cost/performance because costs change, thereby dating books, and because the issues are complex. Yet an understanding of cost and its factors is essential for designers to be able to make intelligent decisions about whether or not a new feature should be included in designs where cost is an issue. (Imagine architects designing skyscrapers without any information on costs of steel beams and concrete!) This section focuses on cost, specifically on the components of cost and the major trends. The Exercises and Examples use specific cost data that will change over time, though the basic determinants of cost are less time sensitive.

Entire books are written about costing, pricing strategies, and the impact of volume. This section can only introduce you to these topics by discussing some of the major factors that influence the cost of a computer design and how these factors are changing over time.
The Impact of Time, Volume, Commodization, and Packaging
The cost of a manufactured computer component decreases over time even without major improvements in the basic implementation technology. The underlying principle that drives costs down is the learning curve: manufacturing costs decrease over time. The learning curve itself is best measured by change in yield, the percentage of manufactured devices that survives the testing procedure. Whether it is a chip, a board, or a system, designs that have twice the yield will have basically half the cost. Understanding how the learning curve will improve yield is key to projecting costs over the life of the product. As an example of the learning curve in action, the cost per megabyte of DRAM drops over the long term by 40% per year. A more dramatic version of the same information is shown in Figure 1.3, where the cost of a new DRAM chip is depicted over its lifetime. Between the start of a project and the shipping of a product, say two years, the cost of a new DRAM drops by a factor of between five and 10 in constant dollars. Since not all component costs change at the same rate, designs based on projected costs result in different cost/performance trade-offs than those using current costs. The caption of Figure 1.3 discusses some of the long-term trends in DRAM cost.
FIGURE 1.3 Prices of four generations of DRAMs over time in 1977 dollars, showing the learning curve at work. A 1977 dollar is worth about $2 in 1995; most of this inflation occurred in the period 1977-82, during which the value changed to $1.81. The cost of a megabyte of memory has dropped incredibly during this period, from over $5000 in 1977 to just over $6 in 1995 (in 1977 dollars). Each generation drops in constant dollar price by a factor of 8 to 10 over its lifetime. The increasing cost of fabrication equipment for each new generation has led to slow but steady increases in both the starting price of a technology and the eventual, lowest price. Periods when demand exceeded supply, such as 1987-88 and 1992-93, have led to temporary higher pricing, which shows up as a slowing in the rate of price decrease.
Volume is a second key factor in determining cost. Increasing volumes affect cost in several ways. First, they decrease the time needed to get down the learning curve, which is partly proportional to the number of systems (or chips) manufactured. Second, volume decreases cost, since it increases purchasing and manufacturing efficiency. As a rule of thumb, some designers have estimated that cost decreases about 10% for each doubling of volume. Also, volume decreases the amount of development cost that must be amortized by each machine, thus allowing cost and selling price to be closer. We will return to the other factors influencing selling price shortly.
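The two rules of thumb above compound multiplicatively. The short sketch below (our illustration, not from the text) applies the 40%-per-year learning-curve decline quoted for DRAM and the roughly 10% cost reduction per doubling of volume:

    from math import log2

    def learning_curve_cost(initial_cost, years, annual_drop=0.40):
        # Cost after compounding an annual percentage decline
        return initial_cost * (1 - annual_drop) ** years

    def volume_adjusted_cost(unit_cost, volume_growth, drop_per_doubling=0.10):
        # Rule of thumb: cost falls about 10% for each doubling of volume
        return unit_cost * (1 - drop_per_doubling) ** log2(volume_growth)

    print(learning_curve_cost(100.0, 2))   # 36.0: nearly a factor of 3 in two years
    print(volume_adjusted_cost(100.0, 8))  # 72.9: three doublings of volume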
Commodities are products that are sold by multiple vendors in large volumes and are essentially identical. Virtually all the products sold on the shelves of grocery stores are commodities, as are standard DRAMs, small disks, monitors, and keyboards. In the past 10 years, much of the low end of the computer business has become a commodity business focused on building IBM-compatible PCs. There are a variety of vendors that ship virtually identical products and are highly competitive. Of course, this competition decreases the gap between cost and selling price, but it also decreases cost. This occurs because a commodity market has both volume and a clear product definition. This allows multiple suppliers to compete in building components for the commodity product. As a result, the overall product cost is lower because of the competition among the suppliers of the components and the volume efficiencies the suppliers can achieve.
Cost of an Integrated Circuit
Why woulda compu acditecue took have a section on inegrated cit
ces? In an incensingly conpolive compuler maritpace were stn
Reds, DRAMK, and octane becoming asic pation ofan s-
tems cost, neg uit cosa coming sel poten of Ue cost Ua
‘arc fecen machine pel inte hghevolumecostscsiveprton
the market. Thus computer designers must understand the costs of chips to under-
stand te cos of cent compute, We flow hore tke US. acount a>
proach he costs of chips
‘While he cos of iterated crits kine dropped exponential, de basic
rocede of ican menuacure unanswered ard
hopped into des Ua ae packaged Gee Figs and 1.5). Ths he costo
‘Packaged integrated Circuit is:
\[
\text{Cost of integrated circuit} = \frac{\text{Cost of die} + \text{Cost of testing die} + \text{Cost of packaging and final test}}{\text{Final test yield}}
\]
In this section, we focus on the cost of dies, summarizing the key issues in testing and packaging at the end. A longer discussion of the testing costs and packaging costs appears in the Exercises.
FIGURE 1.4 Photograph of an 8-inch wafer containing Intel Pentium microprocessors. The die size is 80.7 mm² and the total number of dies is 63. (Courtesy Intel.)

FIGURE 1.5 Photograph of an 8-inch wafer containing PowerPC 601 microprocessors. The die size is 122 mm². The number of dies on the wafer is 200 after subtracting the test dies (the odd-looking dies that are scattered around). (Courtesy IBM.)
To learn how to predict the number of good chips per wafer requires first learning how many dies fit on a wafer and then learning how to predict the percentage of those that will work. From there it is simple to predict cost:

\[
\text{Cost of die} = \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}
\]

The most interesting feature of this first term of the chip cost equation is its sensitivity to die size, shown below.

The number of dies per wafer is basically the area of the wafer divided by the area of the die. It can be more accurately estimated by

\[
\text{Dies per wafer} = \frac{\pi \times (\text{Wafer diameter}/2)^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}}
\]

The first term is the ratio of wafer area (\(\pi r^2\)) to die area. The second compensates for the "square peg in a round hole" problem: rectangular dies near the periphery of round wafers. Dividing the circumference (\(\pi d\)) by the diagonal of a square die gives approximately the number of dies along the edge. For example, a wafer 20 cm (about 8 inches) in diameter produces \(3.14 \times 100 - (3.14 \times 20/1.41) = 269\) 1-cm² dies.
EXAMPLE   Find the number of dies per 20-cm wafer for a die that is 1.5 cm on a side.

ANSWER   The total die area is 2.25 cm². Thus

\[
\text{Dies per wafer} = \frac{\pi \times (20/2)^2}{2.25} - \frac{\pi \times 20}{\sqrt{2 \times 2.25}} = \frac{314}{2.25} - \frac{62.8}{2.12} = 110
\]
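As a cross-check of this arithmetic, here is a minimal Python sketch of the dies-per-wafer estimate (the function name is our own):

    from math import pi, sqrt

    def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
        # Wafer area over die area, minus an edge-loss term:
        # circumference divided by the diagonal of a square die
        wafer_area = pi * (wafer_diameter_cm / 2) ** 2
        edge_loss = pi * wafer_diameter_cm / sqrt(2 * die_area_cm2)
        return int(wafer_area / die_area_cm2 - edge_loss)

    print(dies_per_wafer(20, 1.0))   # 269, the 1-cm² dies computed earlier
    print(dies_per_wafer(20, 2.25))  # 110, matching the example above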
But this only gives the maximum number of dies per wafer. The critical question is, What is the fraction or percentage of good dies on a wafer, or the die yield? A simple empirical model of integrated circuit yield, which assumes that defects are randomly distributed over the wafer and that yield is inversely proportional to the complexity of the fabrication process, leads to the following:

\[
\text{Die yield} = \text{Wafer yield} \times \left(1 + \frac{\text{Defects per unit area} \times \text{Die area}}{\alpha}\right)^{-\alpha}
\]

where wafer yield accounts for wafers that are completely bad and so need not be tested. For simplicity, we'll just assume the wafer yield is 100%. Defects per unit area is a measure of the random manufacturing defects that occur. In 1995, these values typically range between 0.6 and 1.2 per square centimeter, depending on the maturity of the process (recall the learning curve, mentioned earlier). Lastly, α is a parameter that corresponds roughly to the number of masking levels, a measure of manufacturing complexity, critical to die yield. For today's multilevel metal CMOS processes, a good estimate is α = 3.0.
EXAMPLE   Find the die yield for dies that are 1 cm on a side and 1.5 cm on a side, assuming a defect density of 0.8 per cm².

ANSWER   The total die areas are 1 cm² and 2.25 cm². For the smaller die, the yield is

\[
\text{Die yield} = \left(1 + \frac{0.8 \times 1}{3.0}\right)^{-3} = 0.49
\]

For the larger die, it is

\[
\text{Die yield} = \left(1 + \frac{0.8 \times 2.25}{3.0}\right)^{-3} = 0.24
\]
The bottom line is the number of good dies per wafer, which comes from multiplying dies per wafer by die yield. The examples above predict 132 good 1-cm² dies from the 20-cm wafer and 26 good 2.25-cm² dies. Most high-end microprocessors fall between these two sizes, with some being as large as 2.75 cm² in 1995. Low-end processors are sometimes as small as 0.8 cm², while processors used for embedded control (in printers, automobiles, etc.) are often just 0.5 cm². (Figure 1.22 on page 63 in the Exercises shows the die size and technology for several current microprocessors.) Occasionally dies become pad limited: the amount of die area is determined by the perimeter rather than the logic in the interior. This may lead to a higher yield, since defects in empty silicon are less serious!
Processing a 20-cm-diameter wafer in a leading-edge technology with three metal layers costs between $3000 and $4000 in 1995. Assuming a processed wafer cost of $3500, the cost of the 1-cm² die is around $27, while the cost per die of the 2.25-cm² die is about $140, or slightly over 5 times the cost for a die that is 2.25 times larger.
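Chaining the pieces together, the sketch below (ours, using the text's 1995 parameters) computes cost per die from wafer cost, dies per wafer, and die yield; small differences from the hand-rounded figures above are expected:

    from math import pi, sqrt

    def die_yield(defect_density, die_area, alpha=3.0, wafer_yield=1.0):
        # Empirical yield model from the text
        return wafer_yield * (1 + defect_density * die_area / alpha) ** -alpha

    def die_cost(wafer_cost, wafer_diameter, die_area, defect_density):
        # Cost of die = cost of wafer / (dies per wafer x die yield)
        dies = (pi * (wafer_diameter / 2) ** 2 / die_area
                - pi * wafer_diameter / sqrt(2 * die_area))
        return wafer_cost / (int(dies) * die_yield(defect_density, die_area))

    print(die_cost(3500, 20, 1.0, 0.8))   # ~26.4, close to the "around $27" above
    print(die_cost(3500, 20, 2.25, 0.8))  # ~130, in the neighborhood of "about $140"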
What should a computer designer remember about chip costs? The manufacturing process dictates the wafer cost, wafer yield, and defects per unit area, so the sole control of the designer is die area. Since α is typically 3 for the advanced processes in use today, die costs are proportional to the fourth (or higher) power of the die area:

\[
\text{Cost of die} = f(\text{Die area}^4)
\]

The computer designer affects die size, and hence cost, both by what functions are included on or excluded from the die and by the number of I/O pins.
Before we have a part that is ready for use in a computer, the part must be tested (to separate the good dies from the bad), packaged, and tested again after packaging. These steps all add costs. These processes and their contribution to cost are discussed and evaluated in Exercise 1.8.
Distribution of Cost in a System: An Example
To put the cost of silicon in perspective, Figure 1.6 shows the approximate cost breakdown for a color desktop machine in the late 1990s. While costs for units like DRAMs will surely drop over time from those in Figure 1.6, costs for units whose prices have already been cut, like displays and cabinets, will change very little. Furthermore, we can expect that future machines will have larger memories and disks, meaning that prices drop more slowly than the technology improvement.

The processor subsystem accounts for only 6% of the overall cost. Although in a mid-range or high-end design this number would be larger, the overall breakdown across major subsystems is likely to be similar.
System            Subsystem                 Fraction of total
Cabinet           Sheet metal, plastic      1%
                  Power supply, fans        2%
                  Cables, nuts, bolts       1%
                  Shipping box, manuals     0%
                  Subtotal                  4%
Processor board   Processor                 6%
                  DRAM (64 MB)              36%
                  Video system              14%
                  I/O system                3%
                  Printed circuit board     1%
                  Subtotal                  60%
I/O devices       Keyboard and mouse        1%
                  Monitor                   22%
                  Hard disk (1 GB)          7%
                  DAT drive                 6%
                  Subtotal                  36%

FIGURE 1.6 Estimated distribution of costs of the components in a low-end, late 1990s color desktop workstation assuming 100,000 units. Notice that the largest single item is memory! Costs for a high-end PC would be similar, except that the amount of memory might be 16-32 MB rather than 64 MB. This chart is based on data from Andy Bechtolsheim of Sun Microsystems, Inc. Touma [1993] discusses workstation costs and pricing.
Cost Versus Price: Why They Differ and By How Much
Costs of components may confine a designer's desires, but they are still far from representing what the customer must pay. But why should a computer architecture book contain pricing information? Cost goes through a number of changes before it becomes price, and the computer designer should understand how a design decision will affect the potential selling price. For example, changing cost by $1000 may change price by $3000 to $4000. Without understanding the relationship of cost to price the computer designer may not understand the impact on price of adding, deleting, or replacing components. The relationship between price and volume can increase the impact of changes in cost, especially at the low end of the market. Typically, fewer computers are sold as the price increases. Furthermore, as volume decreases, costs rise, leading to further increases in price. Thus, small changes in cost can have a larger than obvious impact. The relationship between cost and price is a complex one with entire books written on the subject. The purpose of this section is to give you a simple introduction to what factors determine price and typical ranges for these factors.
The categories that make up price can be shown either as a tax on cost or as a percentage of the price. We will look at the information both ways. These differences between price and cost also depend on where in the computer marketplace a company is selling. To show these differences, Figures 1.7 and 1.8 on page 16 show how the difference between cost of materials and list price is decomposed, with the price increasing from left to right as we add each type of overhead.

Direct costs refer to the costs directly related to making a product. These include labor costs, purchasing components, scrap, and warranty costs. The overhead categories that make up the rest of the price, such as research and development (R&D), are reported in companies' annual reports and tabulated in national magazines, so this percentage is unlikely to change over time.
The information above suggests that a company uniformly applies fixed-overhead percentages to turn cost into price, and this is true for many companies. But another point of view is that R&D should be considered an investment. Thus an investment of 4% to 12% of income means that every $1 spent on R&D should lead to $8 to $25 in sales. This alternative point of view then suggests a different gross margin for each product depending on the number sold and the size of the investment.
Large, expensive machines generally cost more to develop: a machine costing 10 times as much to manufacture may cost many times as much to develop. Since large, expensive machines generally do not sell as well as small ones, the gross margin must be greater on the big machines for the company to maintain a profitable return on its investment. This investment model places large machines in double jeopardy (because there are fewer sold and they require larger R&D costs) and gives one explanation for a higher ratio of price to cost versus smaller machines.
The issue of cost and cost/performance is a complex one. There is no single target for computer designers. At one extreme, high-performance design spares no cost in achieving its goal. Supercomputers have traditionally fit into this category. At the other extreme is low-cost design, where performance is sacrificed to achieve lowest cost. Computers like the IBM PC clones belong here. Between these extremes is cost/performance design, where the designer balances cost versus performance. Most of the workstation manufacturers operate in this region. In the past 10 years, as computers have downsized, both low-cost design and cost/performance design have become increasingly important. Even the supercomputer manufacturers have found that cost plays an increasing role. This section has introduced some of the most important factors in determining cost; the next section deals with performance.
1.5 | Measuring and Reporting Performance
When we say one computer is faster than another, what do we mean? The computer user may say a computer is faster when a program runs in less time, while the computer center manager may say a computer is faster when it completes more jobs in an hour. The computer user is interested in reducing response time (the time between the start and the completion of an event), also referred to as execution time. The manager of a large data processing center may be interested in increasing throughput, the total amount of work done in a given time.
In comparing design alternatives, we often want to relate the performance of two different machines, say X and Y. The phrase "X is faster than Y" is used here to mean that the response time or execution time is lower on X than on Y for the given task. In particular, "X is n times faster than Y" will mean

\[
n = \frac{\text{Execution time}_Y}{\text{Execution time}_X}
\]

Since execution time is the reciprocal of performance, the following relationship holds:

\[
n = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = \frac{1/\text{Performance}_Y}{1/\text{Performance}_X} = \frac{\text{Performance}_X}{\text{Performance}_Y}
\]
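In code, the definition is just a ratio of times. A tiny sketch with hypothetical execution times:

    def times_faster(exec_time_x, exec_time_y):
        # n such that "X is n times faster than Y" (times in seconds)
        return exec_time_y / exec_time_x

    # Hypothetical: a task takes 12 s on X and 18 s on Y,
    # so X is 1.5 times faster and Performance_X / Performance_Y = 1.5
    print(times_faster(12.0, 18.0))  # 1.5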
The phrase "the throughput of X is 1.3 times higher than Y" signifies here that the number of tasks completed per unit time on machine X is 1.3 times the number completed on Y.
Because performance and execution time are reciprocals, increasing performance decreases execution time. To help avoid confusion between the terms increasing and decreasing, we usually say "improve performance" or "improve execution time" when we mean increase performance and decrease execution time.
Whether we are interested in throughput or response time, the key measurement is time: The computer that performs the same amount of work in the least time is the fastest. The difference is whether we measure one task (response time) or many tasks (throughput). Unfortunately, time is not always the metric quoted in comparing the performance of computers. A number of popular measures have been adopted in the quest for an easily understood, universal measure of computer performance, with the result that a few innocent terms have been shanghaied from their well-defined environment and forced into service for which they were never intended. The authors' position is that the only consistent and reliable measure of performance is the execution time of real programs, and that all proposed alternatives to time as the metric or to real programs as the items measured have eventually led to misleading claims or even mistakes in computer design. The dangers of a few popular alternatives are shown in Fallacies and Pitfalls, section 1.8.
Measuring Performance
Even execution time can be defined in different ways depending on what we count. The most straightforward definition of time is called wall-clock time, response time, or elapsed time, which is the latency to complete a task, including disk accesses, memory accesses, input/output activities, and operating system overhead: everything. With multiprogramming the CPU works on another program while waiting for I/O and may not necessarily minimize the elapsed time of one program. Hence we need a term to take this activity into account. CPU time recognizes this distinction and means the time the CPU is computing, not including the time waiting for I/O or running other programs. (Clearly the response time seen by the user is the elapsed time of the program, not the CPU time.) CPU time can be further divided into the CPU time spent in the program, called user CPU time, and the CPU time spent in the operating system performing tasks requested by the program, called system CPU time.
These distinctions are reflected in the UNIX time command, which returns four measurements when applied to an executing program:

90.7u 12.9s 2:39 65%

User CPU time is 90.7 seconds, system CPU time is 12.9 seconds, elapsed time is 2 minutes and 39 seconds (159 seconds), and the percentage of elapsed time that is CPU time is (90.7 + 12.9)/159 or 65%. More than a third of the elapsed time in this example was spent waiting for I/O or running other programs or both. Many measurements ignore system CPU time because of the inaccuracy of operating systems' self-measurement (the above inaccurate measurement came from UNIX) and the inequity of including system CPU time when comparing performance between machines with differing system codes. On the other hand, system code on some machines is user code on others, and no program runs without some operating system running on the hardware, so a case can be made for using the sum of user CPU time and system CPU time.
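The percentage reported by time is simple to reproduce; a minimal check of the arithmetic above:

    user_cpu, system_cpu = 90.7, 12.9   # seconds, from the time output above
    elapsed = 2 * 60 + 39               # 2:39 of elapsed time = 159 seconds

    cpu_fraction = (user_cpu + system_cpu) / elapsed
    print(f"{cpu_fraction:.0%}")  # 65%; the rest went to I/O waits or other programs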
In the present discussion, a distinction is maintained between performance based on elapsed time and that based on CPU time. The term system performance is used to refer to elapsed time on an unloaded system, while CPU performance refers to user CPU time on an unloaded system. We will concentrate on CPU performance in this chapter.
Choosing Programs to Evaluate Performance
Dhrystone does not use floating point. Typical programs don't.
Rick Richardson, Clarification of Dhrystone (1988)

This program is the result of extensive research to determine the instruction mix of a typical Fortran program. The results of this program on different machines should give a good indication of which machine performs better under a typical load of Fortran programs. The statements are purposely arranged to defeat optimizations by the compiler.
H. J. Curnow and B. A. Wichmann [1976], Comments in the Whetstone Benchmark
A computer user who runs the same programs day in and day out would be the perfect candidate to evaluate a new computer. To evaluate a new system the user would simply compare the execution time of her workload: the mixture of programs and operating system commands that users run on a machine. Few are in this happy situation, however. Most must rely on other methods to evaluate machines, and often other evaluators, hoping that these methods will predict performance for their usage of the new machine. There are four levels of programs used in such circumstances, listed below in decreasing order of accuracy of prediction.
1. Real programs: While the buyer may not know what fraction of time is spent on these programs, she knows that some users will run them to solve real problems. Examples are compilers for C, text-processing software like TeX, and CAD tools like Spice. Real programs have input, output, and options that a user can select when running the program.

2. Kernels: Several attempts have been made to extract small, key pieces from real programs and use them to evaluate performance. Livermore Loops and Linpack are the best known examples. Unlike real programs, no user would run kernel programs, for they exist solely to evaluate performance. Kernels are best used to isolate performance of individual features of a machine to explain the reasons for differences in performance of real programs.