The Limits of Semiconductor Technology and Oncoming Challenges in Computer Microarchitectures and Architectures
Abstract: In the last three decades the world of computers, and especially that of microprocessors, has advanced at exponential rates in both productivity and performance. The integrated circuit industry has followed a steady path of constantly shrinking device geometries and of the increased functionality that larger chips provide. The technology that enabled this exponential growth is a combination of advancements in process technology, microarchitecture, architecture, and design and development tools. Together, these performance and functionality improvements have resulted in a history of new technology generations every two to three years, commonly referred to as "Moore's Law". Each new generation has approximately doubled logic circuit density and increased performance by about 40%. This paper gives an overview of some of the microarchitectural techniques that are typical of contemporary high-performance microprocessors. The techniques are classified into those that increase the concurrency of instruction processing while maintaining the appearance of sequential processing (pipelining, superscalar execution, out-of-order execution, etc.), and those that exploit program behavior (memory hierarchies, branch predictors, trace caches, etc.). In addition, the paper discusses microarchitectural techniques likely to be used in the near future, such as microarchitectures with multiple sequencers and thread-level speculation, and microarchitectural techniques intended to minimize power consumption.
Keywords: Embedded systems, computer microarchitectures, semiconductor technology.
1 Introduction
During the past 40 years the semiconductor VLSI IC industry has distinguished itself both by the rapid pace of performance improvements in its products and by a
steady path of constantly shrinking device geometries and increasing chip size.
Technology scaling has been the primary driver behind improving the performance characteristics of ICs. The speed and integration density of ICs have improved dramatically. Exploitation of the billion-transistor capacity of a single VLSI IC requires new system paradigms and significant improvements in design productivity. The structural complexity and functional diversity of such ICs are the challenges facing design teams. Structural complexity can be increased by adopting more productive design methods and by putting more resources into design work. The functional diversity of information technology products will increase as well. Next-generation products will be based on computers, but the full exploitation of silicon capacity will require drastic improvements in design productivity and system architecture [1].
Together, these performance and functionality improvements are generally identified with a history of new technology generations and with the growth of the microprocessor, frequently described as "Moore's Law". Moore's Law states that each new generation approximately doubles logic circuit density and increases performance by 40% while quadrupling memory capacity [2]. According to International Technology Roadmap for Semiconductors (ITRS) projections, the number of transistors per chip and the local clock frequencies of high-performance microprocessors will continue to grow exponentially over the next 10 years as well. The 2003 ITRS predicts that by 2014 the microprocessor gate length will be 35 nm, the supply voltage will drop to 0.4 V, and the clock frequency will rise to almost 30 GHz. Fig. 1 presents some of these predictions. As a consequence, experts expect that in the next 10 years the transistor count of microprocessors will increase to 1 billion, providing about 100 000 MIPS [3].
The aim of this paper is to present, in more detail, the trends and challenges in semiconductor technology and in the microarchitecture and architecture of contemporary microprocessors.
The pace of IC technology over the past forty years has been well characterized by Moore's Law. It was noted in 1965 by Gordon Moore, research director of Fairchild Semiconductor, that the integration density of commercial integrated circuits had doubled approximately every year [4]. From the chronology in Table 1, we see that the first microchip was invented in 1959; its complexity was one transistor. In 1964 complexity had grown to 32 transistors, and in 1965 a chip in the Fairchild R&D lab reached 64 transistors. Moore predicted that chip complexity would double every year, based on the data for 1959, 1964, and 1965 [5].
In 1975 the prediction was revised to suggest a new, slower rate of growth: a doubling of the IC transistor count every two years. This trend of exponential growth of IC complexity is commonly referred to as Moore's Law I. (Some people state Moore's Law as predicting a doubling every 18 months.)
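These growth rates are easy to compare quantitatively. As a back-of-the-envelope illustration (the formula below is ours, not taken from [4] or [5]), if the transistor count doubles every T years, then

    N(t) = N_0 \cdot 2^{\,t/T},

so over one decade the original one-year doubling gives a factor of 2^{10} \approx 1000, an 18-month doubling gives 2^{10/1.5} \approx 100, and the revised two-year doubling gives 2^{5} = 32.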
As a result, since the beginning of commercial production of ICs in the early 1960s, circuit complexity has risen from a few transistors to hundreds of millions of transistors per chip.
Table 3. Semiconductor Industry Association Roadmap (SIAR) prediction summary for high-end processors
Specification / year          1997     1999     2001     2003     2006     2009     2012
Feature size (micron)         0.25     0.18     0.15     0.13     0.1      0.07     0.05
Supply voltage (V)            1.8-2.5  1.5-1.8  1.2-1.5  1.2-1.5  0.9-1.2  0.6-0.9  0.5-0.6
Transistors/chip (millions)   11       21       40       76       200      520      1400
DRAM bits/chip (mega)         167      1070     1700     4290     17200    68700    275000
Die size (mm^2)               300      340      385      430      520      620      750
Global clock freq. (MHz)      750      1200     1400     1600     2000     2500     3000
Local clock freq. (MHz)       750      1250     1500     2100     3500     6000     10000
Maximum power/chip (W)        70       90       110      130      160      170      175
These exponential growth rates have been the major driving force of the computer revolution during the past period.
One of the key drivers behind the industry's ability to double transistor counts every 18 to 24 months is the continuous reduction in linewidths (see Fig. 3). Shrinking linewidths not only enables more components to fit onto an IC (typically 2x per linewidth generation) but also lowers costs (typically by 30% per linewidth generation).
Shrinking linewidths have slowed the growth of die size to about 1.14x per year, versus 1.38x to 1.58x per year for transistor counts, and since the mid-nineties accelerating linewidth shrinks have halted and even reversed the growth of die size.
Of the resulting complexity increase per technology generation, one factor of 2 comes from shrinking linewidths, another factor of 2 comes from an increase in chip area, and a final factor of 2 comes from device and circuit cleverness.
In 1996 Intel augmented Moore's Law (the number of transistors on a processor doubles approximately every 18 months) with Moore's Law II.
Moore's Law II says that as the sophistication of chips increases, the cost of fabrication rises exponentially (Fig. 6).
For example, in 1986 Intel manufactured the 386, which counted 250 000 transistors, in fabs costing $200 million; by 1996, the fabs producing the Pentium processor, which counted 6 million transistors, cost over a billion dollars.
As CMOS technology scales, supply voltages are driven steadily towards voltage reduction, and this plays a very important role in power saving [13].
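The leverage that voltage reduction provides can be seen from the standard first-order expression for dynamic (switching) power in CMOS, quoted here as general background rather than from the cited references:

    P_{dyn} = \alpha \, C_L \, V_{DD}^{2} \, f,

where \alpha is the switching activity factor, C_L the switched load capacitance, V_{DD} the supply voltage, and f the clock frequency. Because the dependence on V_{DD} is quadratic, lowering the supply from 1.2 V to 0.6 V alone cuts switching power by a factor of four.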
Present-day general-purpose microprocessor designers are faced with the daunting task of reducing power dissipation, since power dissipation is quickly becoming a bottleneck for future technologies.
For all integrated circuits used in battery-powered portable devices, power consumption is the main issue. Furthermore, power consumption is also a major issue for high-performance integrated circuits because of heat dissipation. Consequently, power consumption is a dramatic problem for all integrated circuits designed today [10, 13].
Advances in CMOS technology, however, are driving the operating voltage of integrated circuits increasingly lower. The forecast of operating voltages in CMOS technology is shown in Fig. 9. From these general trends, it is clear that circuits will need to operate at 0.5 V and even below within the next ten years [15].
In order to evaluate how past technologies have met the performance goal, product frequencies have been plotted over time, as shown in Fig. 10 [3]. Assuming a technology generation spans two to three years, the data show that microprocessor frequency has doubled every generation. Several factors account for this. Consider the data plotted on the right-hand y-axis in Fig. 10. The average number of gate delays in a clock period is decreasing, both because the new microarchitectures use pipeline stages containing fewer levels of static gates and because advanced circuit techniques reduce the critical path delays even further. This is the main reason that the frequency doubles with every technology generation. One may suspect that this frequency increase comes at the expense of overdesigned or oversized transistors; Fig. 11 shows how transistor size actually scales across different process technologies.
According to the previous discussion, we can conclude that the twofold frequency improvement in each technology generation is primarily due to two factors [11] (a rough estimate of their combined effect is given below): a) the reduced number of gate delays in a clock period, which makes the pipeline stages shorter, and b) advanced circuit design techniques that reduce the average gate delay by more than 30% per generation.
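A rough estimate (our own illustrative numbers, assuming the roughly 30% gate-delay reduction quoted above and a comparable reduction in the number of gate delays per clock period) shows how the two factors combine to roughly double the frequency:

    \frac{f_{new}}{f_{old}} = \frac{n_{old}\, t_{old}}{n_{new}\, t_{new}} \approx \frac{1}{0.7 \times 0.7} \approx 2,

where n is the number of gate delays in a clock period and t is the average gate delay.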
Shrinking geometries, lower supply voltages, and higher frequencies have a negative impact on reliability. Together, they increase the number of occurrences of intermittent and transient faults [16, 17, 18].
Faults experienced by semiconductor devices fall into three main categories:
permanent, intermittent, and transient [19, 20].
Permanent faults reflect irreversible physical changes. The improvement of
semiconductor design and manufacturing techniques has significantly decreased
the rate of occurrence of permanent faults. Figure 12 shows the evolution of permanent-fault rates for CMOS microprocessors and static and dynamic memories over the past decade. The semiconductor industry is widely adopting copper interconnects. This trend has a positive impact on the rate of occurrence of permanent faults, as copper provides a higher electromigration threshold than aluminium does [16, 12, 17, 18].
Intermittent faults occur because of unstable or marginal hardware; they can be activated by environmental changes, such as higher or lower temperature or voltage. Intermittent faults often precede the occurrence of permanent faults.
Transient faults occur because of temporary environmental conditions. Sev-
eral phenomena induce transient faults: neutron and alpha particles; power supply
and interconnect noise; electromagnetic interference; and electrostatic discharge.
Higher VLSI integration and lower supply voltages have contributed to higher occurrence rates for particle-induced transients, also known as soft errors. Fig. 13 plots measured neutron- and alpha-induced soft error rates (SERs) for CMOS SRAMs as a function of memory capacity [16].
Fig. 12. The evolution of permanent-fault rates for CMOS microprocessors and SRAM and DRAM memories over the past decade
Fault avoidance and fault tolerance are the main approaches used to increase the reliability of VLSI circuits [16, 17, 20]. Fault avoidance relies on improved materials, manufacturing processes, and circuit design. For instance, lower-alpha-emission interconnect and packaging materials contribute to low SERs. Silicon-on-insulator is a commonly used process solution for lowering circuit sensitivity to particle-induced transients [16].
Fault tolerance is implementable at the circuit or system level. It relies on concurrent error detection (CED), error recovery, error-correcting codes (ECC), and space or time redundancy. Designers have successfully built both hardware and software implementations [17, 18].
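As a concrete, if simplified, illustration of space redundancy, the C sketch below implements a triple modular redundancy (TMR) majority voter in software. It is our own minimal example, not a circuit or algorithm taken from [17, 18]; real designs vote in hardware, but the masking principle is the same.

#include <stdio.h>
#include <inttypes.h>

/* Bitwise majority vote over three redundant copies of a value.
 * Each output bit follows the two copies that agree, so a transient
 * fault corrupting any single copy is masked. */
static uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

int main(void)
{
    uint32_t good   = 0xCAFEBABEu;
    uint32_t upset  = good ^ (1u << 7);   /* single-bit soft error in one copy */
    uint32_t result = tmr_vote(good, upset, good);

    printf("voted value: 0x%08" PRIX32 " (fault %s)\n",
           result, result == good ? "masked" : "NOT masked");
    return 0;
}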
Intermittent and transient faults are expected to represent the main source of errors experienced by VLSI circuits. In general, the semiconductor industry is approaching a new stage in the development and manufacturing of VLSI circuits. Failure avoidance, based on design and process technologies, will not fully control intermittent and transient faults. Fault-tolerant solutions, presently employed in custom-designed systems, will become widely used in off-the-shelf ICs tomorrow, i.e. in mainstream commercial applications [21]. Designers will have to embed these solutions into VLSI circuits, especially microprocessors, in order to provide better fault and error handling and to avoid silent data corruption [16, 22].
As an example of transient errors, we will consider the influence of changes in the supply voltage, referred to as power supply noise. Power supply noise adversely affects circuit operation through the following mechanisms: a) signal uncertainty; b) on-chip clock jitter; c) noise margin degradation; and d) degradation of gate oxide reliability. For correct circuit operation, the supply levels have to be maintained within a certain range near the nominal voltage levels. This range is called the power noise margin. The primary objective in the design of the power distribution system is to supply sufficient current to each transistor on an integrated circuit while ensuring that the power noise does not exceed the target noise margins. As an illustration, the evolution of the average current of the high-performance Intel family of microprocessors is given in Fig. 14 [3, 4].
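To first order (a standard model, consistent with the treatment in power-distribution texts such as [4], though the notation here is ours), the supply noise seen by an on-chip gate is the sum of a resistive and an inductive drop across the distribution network:

    \Delta V \approx I R + L \frac{di}{dt},

where I is the current drawn, R the effective resistance of the power grid and package, and L the loop inductance. The design task is to keep \Delta V inside the power noise margin even as both the average current (Fig. 14) and the transient current slope di/dt (Fig. 15) continue to grow.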
Fig. 14. Evolution of the average current in high-performance Intel family microprocessors
Fig. 15. Evolution of transient current over time
Fig. 16. The National Technology Roadmap for Semiconductors: (a) total transistors per chip, (b) on-chip local clock
As can be seen from Fig. 17.a, the compiled program uses the instruction set architecture (ISA) to tell the microprocessor what it (the program) needs to have done, and the microprocessor uses the ISA to know what must be carried out on behalf of the program. The ISA is implemented by a set of hardware structures collectively referred to as the microprocessor's microarchitecture. If we take our levels of transformation and include the algorithm and the language in the microprocessor, the microprocessor then becomes the thing that uses device technology to solve the problem (see Fig. 17.b) [25].
In order to execute a program in parallel, three tasks must be carried out: a) the dependences between operations must be determined; b) the operations that are independent of any operation that has not yet completed must be determined; and c) these independent operations must be scheduled for execution at a particular time on a particular functional unit. Fig. 19 shows the breakdown of these three tasks between the compiler and the runtime hardware for the three classes of architecture.
Current superscalars can execute four or more instructions per cycle. In prac-
tice, however, they achieve only one or two, because current applications have low
ILP.
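The effect of low ILP is visible even in ordinary code. In the hypothetical C fragments below (our own illustration, not taken from the cited sources), the first loop forms a single serial dependence chain, so even a four-issue machine retires roughly one useful addition per cycle, while the second exposes four independent chains that wide issue hardware can overlap (at the cost of a slightly different floating-point rounding order).

/* Low ILP: every addition depends on the previous one. */
double sum_serial(const double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];              /* dependence chain: s -> s -> s -> ... */
    return s;
}

/* Higher ILP: four independent accumulators can execute in parallel. */
double sum_unrolled(const double *x, int n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    int i;
    for (i = 0; i + 3 < n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}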
Fig. 19. Division of responsibilities between the compiler and the hard-
ware for the three classes of architecture
A similar effect could be achieved by pipelining the functional units and the instruction issue hardware n times, in this way speeding up the clock rate by a factor of n while issuing only one instruction per cycle. This strategy is termed superpipelining [27, 26].
Superscalar and VLIW machines represent two different approaches to the
same ultimate goal, which is achieving high performance via instruction-level par-
allel processing. The two approaches have evolved through different historical
paths and from different perspectives. It has been suggested that these two ap-
proaches are quite synergistic and there is strong motivation for pursuing poten-
tial integration of the two approaches [26]. Let us now briefly point out the main features of VLIW and superscalar processors. VLIW processors rely on the
compiler to schedule instructions for parallel execution by placing multiple opera-
tions in a single long instruction word. All of the operations in a VLIW instruction
are executed in the same cycle, allowing the compiler to control which instruction
to execute in any given cycle. VLIW processors can be relatively simple, allow-
ing them to be implemented at high clock speeds, but they are generally unable to
maintain compatibility between generations because any change to the processor
implementation requires programs to be recompiled if they are to execute correctly
[28]. Superscalar processors, on the other hand, contain hardware that examines a sequential program to locate instructions that can be executed in parallel. This allows them to maintain compatibility between generations and to achieve speedups on programs that were compiled for sequential processors, but the hardware can examine only a limited window of instructions when selecting those to execute in parallel, which can limit performance [28].
VLIW architectures have the following properties: a) there is one central con-
trol logic unit issuing a single long instruction per cycle; b) each long instruction
consists of many tightly coupled independent operations; c) each operation requires
a small statically predictable number of cycles to execute; d) operations can be
pipelined.
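To make properties a) and b) more concrete, the C sketch below models a long instruction word as a fixed set of operation slots that the compiler fills with independent operations, padding unused slots with NOPs. The opcodes, the four-slot format, and the register numbers are purely illustrative and do not correspond to any real VLIW ISA.

#include <stdio.h>
#include <stdint.h>

/* One operation slot inside the long instruction word. */
typedef enum { OP_NOP, OP_ADD, OP_MUL, OP_LOAD } opcode_t;

typedef struct {
    opcode_t op;
    uint8_t  dst, src1, src2;            /* register numbers */
} operation_t;

/* A hypothetical 4-slot VLIW word: all four operations issue in the
 * same cycle, and the compiler guarantees they are independent. */
typedef struct {
    operation_t slot[4];
} vliw_word_t;

int main(void)
{
    /* Compiler-scheduled bundle for:  r1 = r2 + r3;  r4 = load r5;  r6 = r7 * r8; */
    vliw_word_t bundle = { .slot = {
        { OP_ADD,  1, 2, 3 },
        { OP_LOAD, 4, 5, 0 },
        { OP_MUL,  6, 7, 8 },
        { OP_NOP,  0, 0, 0 },            /* no fourth independent operation found */
    } };

    for (int i = 0; i < 4; i++)
        printf("slot %d: opcode %d -> r%d\n", i, bundle.slot[i].op, bundle.slot[i].dst);
    return 0;
}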
Figure 20 illustrates how instruction processing is conceptually carried out in a modern, high-performance processor that exploits ILP [26, 29]. Instructions are fetched, decoded, and renamed in program order. In any given cycle, we may actually fetch or decode multiple instructions; many current processors fetch up to four instructions simultaneously. Branch prediction is used to predict the path through the code, so that fetching can run ahead of execution. After decode, an instruction is allowed to execute once its input data become available, provided that sufficient execution resources are available. Once an instruction executes, it is allowed to complete. In the last step, instructions commit in program order.
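The following toy C simulation (entirely our own construction, with invented instructions, latencies, and a two-wide issue limit) traces this flow for a five-instruction program: instructions are considered in program order, each one starts executing as soon as its inputs are ready, and results are committed strictly in program order. Running it shows, for example, the second load beginning execution before the earlier, dependent add.

#include <stdio.h>

#define N 5
#define ISSUE_WIDTH 2

typedef struct {
    const char *text;
    int src[2];        /* indices of producing instructions, -1 if none */
    int latency;       /* execution latency in cycles (made-up numbers)  */
    int done_cycle;    /* cycle in which the result is available, 0 = not issued */
} instr_t;

int main(void)
{
    instr_t prog[N] = {
        { "I0: load r1 <- [a]",    {-1, -1}, 3, 0 },
        { "I1: add  r2 <- r1+4",   { 0, -1}, 1, 0 },
        { "I2: load r3 <- [b]",    {-1, -1}, 3, 0 },
        { "I3: mul  r4 <- r3*r3",  { 2, -1}, 2, 0 },
        { "I4: add  r5 <- r2+r4",  { 1,  3}, 1, 0 },
    };

    int committed = 0;
    for (int cycle = 1; committed < N && cycle < 100; cycle++) {
        /* Issue: up to ISSUE_WIDTH not-yet-issued instructions whose sources
         * are available (results are forwarded in their completion cycle). */
        int issued = 0;
        for (int i = 0; i < N && issued < ISSUE_WIDTH; i++) {
            if (prog[i].done_cycle != 0)
                continue;
            int ready = 1;
            for (int s = 0; s < 2; s++) {
                int p = prog[i].src[s];
                if (p >= 0 && (prog[p].done_cycle == 0 || prog[p].done_cycle > cycle))
                    ready = 0;
            }
            if (ready) {
                prog[i].done_cycle = cycle + prog[i].latency;
                printf("cycle %2d: issue  %s\n", cycle, prog[i].text);
                issued++;
            }
        }
        /* Commit: retire finished instructions in program order only. */
        while (committed < N &&
               prog[committed].done_cycle != 0 &&
               prog[committed].done_cycle <= cycle) {
            printf("cycle %2d: commit %s\n", cycle, prog[committed].text);
            committed++;
        }
    }
    return 0;
}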
A promising approach is to organize the processor as a collection of processing elements (PEs), each executing a separate thread or flow of control. By designing the processor as a collection of PEs, (a) the number of global wires is reduced, and (b) very little communication occurs through global wires. Thus, much of the communication occurring in the multi-PE processor is local in nature and occurs through short wires. The commonly used
model for control flow among threads is the parallel threads model. The fork in-
struction specifies the creation of new threads and their starting addresses, while
the join instruction serves as a synchronizing point and collects the threads. The
thread sequencing model is illustrated in Fig. 21 [30].
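In software, the same model corresponds to the familiar fork/join idiom. The C sketch below uses POSIX threads purely as an analogy for the hardware fork and join instructions described above (the thread bodies and the thread count are our own invention).

#include <pthread.h>
#include <stdio.h>

/* Each "thread of control" runs an independent piece of work. */
static void *worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld: doing its share of the work\n", id);
    return NULL;
}

int main(void)
{
    pthread_t t[4];

    /* "fork": create new threads and give them their starting points. */
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);

    /* "join": the synchronizing point that collects the threads. */
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);

    printf("all threads joined\n");
    return 0;
}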
Fig. 22. How three different architectures partition issue slots: a) super-
scalar; b) multithreaded superscalar; and c) SMT
The rows of squares represent issue slots. The processor either finds an in-
struction to execute (filled box) or it allows the slots to remain unused (empty box)
[30].
C) Chip multiprocessor (CMP) - the idea is to put several microprocessors on a single die (see, for example, Fig. 23). The performance of a small-scale CMP scales close to linearly with the number of microprocessors and is likely to exceed the performance of an equivalent multiprocessor system. A CMP is an attractive option when moving to a new process technology, since the new process allows us to shrink and duplicate our best existing microprocessor on the same silicon die, thus doubling the performance at the same power [30, 29].
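The qualifier "close to linearly" matters: the achievable speedup is bounded by the serial fraction of the workload, as expressed by Amdahl's law (quoted here as standard background, not from [30, 29]):

    S(N) = \frac{1}{(1-p) + p/N},

where p is the fraction of the work that can be parallelized and N the number of on-die processors; doubling the processor count doubles performance only when p is close to 1.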
Twenty-five years ago, information processing was associated with large mainframe computers. At the end of the last century, this shifted towards information processing based on personal computers. These trends continue towards the miniaturization of products. Nowadays, more and more information processing devices are portable computers integrated into larger products. These new types of information technology applications are called ubiquitous computing, pervasive computing, and ambient computing. Embedded systems are one of the origins of these three areas, and they provide a major part of the necessary technology [34].
Such systems often incorporate analog components and can, in the future, also include optical and microelectromechanical system components [37].
Short time to market, large gate counts, and high performance characterize today's VLSI design environment. SoC technology holds the key to the previously mentioned complex applications by enabling high-performance embedded processing solutions at a low single-chip cost.
To quickly create SoC designs with the required complexity, designers must use predesigned intellectual property (IP) blocks, also referred to as macros, cores, or virtual components. For SoC designs, this means reusing previously designed cores wherever possible. The more design reuse we have, the shorter the SoC's time to market [38].
From the system architect's point of view, quick SoC assembly using cores is not an easy job, for the following reasons: CPU selection, deciding which functions will be performed in hardware versus software, integrating cores into the SoC, achieving correct timing, the physical design of large systems, testing and system verification, and others [39].
More and more modern information systems require an analog input-output interface. A system's inputs typically come from analog sensors. To allow easy processing, such systems convert these signals as quickly as possible into digital format. The system subsequently reconverts these signals back into analog outputs through actuators such as lamps, motors, speakers, and displays [40]. Examples of such systems include everyday products like TVs, phones, PCs, and PDAs. Such products also include the equally pervasive but invisible engine control units that manage, for example, an internal combustion engine's functions [41].
A complete system includes not only electronic functions but also sensors and actuators. Microelectromechanical system (MEMS) technology makes it possible to build these systems in silicon, thus allowing new levels of integration among the system's electronic, mechanical, optical, and/or fluidic elements [41, 40].
According to ITRS predictions [3], by the end of the decade SoCs using 50 nm transistors and operating below 1 V will grow to 4 billion transistors running at 10 GHz. The major design problem accompanying these chips will be the challenge of providing correct function and reliable operation of the interacting components. Fig. 25 shows the possible trade-offs involving area, time T, and power in processor design. The power and area axes are typically optimized for server processors [8].
Fig. 24. Performance of microprocessors, mainframes, supercomputers and minicomputers over time
Fig. 25. Area, performance (time), and power trade-off trends in server and client processor designs
7 Conclusion
As technology scales, important new opportunities emerge for VLSI ICs design-
ers. Understanding technology trends and specific applications is the main criterion
for designing efficient and effective chips. There are several difficult and exciting
challenges facing the design of complex ICs. To continue its phenomenal historical
growth and continue to follow Moore's Law, the semiconductor industry will require advances on all fronts - from front-end process technology and lithography to the design of innovative high-performance processor architectures and SoC solutions. The roadmap's goal is to bring together experts from each of these fields to determine what those challenges are and, potentially, how to solve them.
The presented discussion shows that there are a lot of challenging problems left in systems research. If we look for progress, we need to think hard, and for a long time, about where to direct our efforts.
References
[3] The 2003 International Technology Roadmap for Semiconductors. [Online]. Available: http://public.irts.net
[4] A. Mezhiba and E. Friedman, Power Distribution Networks in High-speed Integrated
Circuits. Boston: Kluwer Academic Publishers, 2004.
[5] M. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design. New York: Wiley-Interscience, 2002.
[6] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, 3rd ed. Amsterdam: Morgan Kaufmann Pub., 2003.
[7] D. Adams, High Performance Memory Testing: Design Principles, Fault Modeling and Self-Test. Boston: Kluwer Academic Publishers, 2003.
[8] M. Flynn et al., “Deep-submicron microprocessor design issues,” IEEE Micro, vol. 19, no. 4, pp. 11–22, 1999.
[9] V. Zaccaria et al., Power Estimation and Optimization Methodologies for VLIW Based Embedded Systems. Boston: Kluwer Academic Publishers, 2003.
[10] Varadarajan et al., “Low power design issues,” in The Computer Engineering Handbook, V. Oklobdzija, Ed. Boca Raton: CRC Press, 2002.
[11] S. Borkar, “Design challenges of technology scaling,” IEEE Micro, vol. 19, no. 4,
July - August 1999.
[12] S. Kang and Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design. Boston: McGraw-Hill, 2003, vol. 3/1.
[13] K. Seno, “Implementation level impact on low power design,” in The Computer Engineering Handbook, V. Oklobdzija, Ed. Boca Raton: CRC Press, 2002.
[14] N. Nicolici and B. M. Al-Hashimi, Power-Constrained Testing of VLSI Circuits. Boston: Kluwer Academic Publishers, 2003.
[15] Semiconductor industry association. (03) The National Technology Roadmap for
Semiconductors. [Online]. Available: http://www.sematech.org/
[16] C. Constantinescu, “Trends and challenges in VLSI circuit reliability,” IEEE Micro, vol. 19, no. 4, pp. 14–19, July–August 1999.
[17] P. Lala, Self-Checking and Fault-Tolerant Digital System Design. San Francisco: Morgan Kaufmann Pub., 2001.
[18] M. Stojcev et al., “Implementation of self-checking two-level combinational logic on FPGA and CPLD circuits,” Microelectronics Reliability, vol. 44, no. 1, pp. 173–178, January 2004.
[19] M. Esonu et al., “Fault tolerant design methodology for systolic array architectures,” IEE Proc. Comput. Digit. Tech., vol. 141, no. 1, pp. 17–28, 1994.
[20] B. Johnson, Design and Analysis of Fault Tolerant Systems. Reading, MA: Addison-Wesley, 1990.
[21] K. Mohanram et al., “Synthesis of low-cost parity-based partially self-checking circuits,” Journal of Electronic Testing: Theory and Applications (JETTA), vol. 16, no. 1/2, pp. 145–153, 2001.
[22] K. De et al., “RSYN: A system for automated synthesis of reliable multilevel circuits,” IEEE Trans. VLSI Systems, vol. 2, no. 2, pp. 186–195, 1994.
[23] S. Eggers et al., “Simultaneous multithreading: A platform for next-generation processors,” IEEE Micro, vol. 17, no. 5, pp. 12–19, 1997.
[24] J.-P. Soininen and H. Heusala, “A design methodology for NoC-based systems,” in Networks on Chip, A. Jantsch and H. Tenhunen, Eds. Boston: Kluwer Academic Publishers, 2003.
[25] Y. Patt, “Requirements, bottlenecks, and good fortune: Agents for microprocessor
evolution,” Proc. of the IEEE, vol. 89, no. 11, pp. 1553–1559, 2001.
[26] J. Shen and M. Lipasti, Modern Processor Design: Fundamentals of Superscalar
Processors. New York: McGraw Hill Book Comp., 2003.
[27] R. Rau and J. Fisher, “Instruction-level parallel processing: History, overview and
perspective,” The Journal of Supercomputing, vol. 7, no. 1, pp. 1–56, 1993.
[28] N. Carter, Computer Architecture. New York: McGraw Hill Book Company, 2002.
[29] A. Moshovos and G. S. Sohi, “Microarchitectural innovations: Boosting microprocessor performance beyond semiconductor technology scaling,” Proc. of the IEEE, vol. 89, no. 11, pp. 1560–1575, 2001.
[30] D. Burger and J. Goodman, “Billion-transistor architectures: There and back again,”
IEEE Computer, vol. 37, no. 3, pp. 22–28, 2004.
[31] T. Claasen, “System on a chip: Changing IC design today and in the future,” IEEE Micro, vol. 23, no. 3, pp. 20–26, 2003.
[32] R. Gupta and Y. Zorian, “Introducing core-based system design,” IEEE Design and Test of Computers, vol. 14, no. 4, pp. 15–25, 1997.
[33] P. Rashinkar et al., System-on-a-Chip: Methodology and Techniques. Boston:
Kluwer Academic Pub., 2001.
[34] P. Marwedel, Embedded System Design. Boston: Kluwer Academic Pub., 2003.
[35] H. Al-Asaad et al., “Online BIST for embedded systems,” IEEE Design & Test of Computers, vol. 15, no. 4, pp. 17–24, 1998.
[36] R. Leupers, Code Optimization Techniques for Embedded Processors: Methods, Al-
gorithms, and Tools. Boston: Kluwer Academic Pub., 2000.
[37] H. Chang et al., Surviving the SoC Revolution: A Guide to Platform-Based Design.
Boston: Kluwer Academic Pub., 1999.
[38] M. Birnbaum and H. Sachs, “How VSIA answers the SoC dilemma,” IEEE Computer, vol. 32, no. 6, pp. 42–50, 1999.
[39] R. Bergamaschi et al., “Automating the design of SoCs using cores,” IEEE Design & Test of Computers, vol. 18, no. 5, pp. 32–45, 2001.
[40] R. Waser, Nanoelectronics and Information Technology: Advanced Electronic Materials and Novel Devices. Weinheim: John Wiley and Sons, 2003.
[41] B. Murari, “Integrating nanoelectronic components into electronic microsystems,”
IEEE Micro, vol. 23, no. 3, pp. 36–44, 2003.
[42] A. Jantsch, “NoCs: A new contract between hardware and software,” in Proc. of the Euromicro Symposium on Digital System Design (DSD 2003), Belek-Antalya, Turkey, 2003, pp. 10–16.
[43] M. Pflanz and H. Vierhaus, “Online check and recovery techniques for dependable embedded processors,” IEEE Micro, vol. 21, no. 5, pp. 24–40, 2001.
[44] L. Benini and G. De Micheli, “Networks on chips: A new SoC paradigm,” IEEE Computer, vol. 35, no. 1, pp. 70–78, 2002.
[45] T. Austin, “Mobile supercomputers,” IEEE Computer, vol. 37, no. 5, pp. 81–83,
2004.