The Limits of Semiconductor Technology and Oncoming Challenges in Computer Microarchitectures and Architectures
Abstract: In the last three decades the world of computers, and especially that of microprocessors, has advanced at exponential rates in both productivity and performance. The integrated circuit industry has followed a steady path of constantly shrinking device geometries and of the increased functionality that larger chips provide. The technology that enabled this exponential growth is a combination of advancements in process technology, microarchitecture, architecture, and design and development tools. Together, these performance and functionality improvements have resulted in a history of new technology generations every two to three years, commonly referred to as "Moore's Law". Each new generation has approximately doubled logic circuit density and increased performance by about 40%. This paper gives an overview of some of the microarchitectural techniques that are typical of contemporary high-performance microprocessors. The techniques are classified into those that increase the concurrency of instruction processing while maintaining the appearance of sequential processing (pipelining, superscalar execution, out-of-order execution, etc.), and those that exploit program behavior (memory hierarchies, branch predictors, trace caches, etc.). In addition, the paper discusses microarchitectural techniques likely to be used in the near future, such as microarchitectures with multiple sequencers and thread-level speculation, and microarchitectural techniques intended to minimize power consumption.
Keywords: Embedded systems, computer microarchitectures, semiconductor technology.
1 Introduction
During the past 40 years the semiconductor VLSI IC industry has distinguished itself both by the rapid pace of performance improvements in its products and by a
steady path of constantly shrinking device geometries and increasing chip size.
Technology scaling has been the primary driver behind improving the performance characteristics of ICs. The speed and integration density of ICs have improved dramatically. Exploitation of the billion-transistor capacity of a single VLSI IC requires new system paradigms and significant improvements in design productivity. The structural complexity and functional diversity of such ICs are the challenges facing design teams. Structural complexity can be increased by adopting more productive design methods and by putting more resources into design work. The functional diversity of information technology products will increase as well. Next-generation products will be based on computers, but the full exploitation of silicon capacity will require drastic improvements in design productivity and system architecture [1].
Together, these performance and functionality improvements are generally identified with a history of new technology generations and with the growth of the microprocessor, frequently described as "Moore's Law". Moore's Law states that each new generation approximately doubles logic circuit density and increases performance by 40% while quadrupling memory capacity [2]. According to International Technology Roadmap for Semiconductors (ITRS) projections, the number of transistors per chip and the local clock frequencies of high-performance microprocessors will continue to grow exponentially over the next 10 years as well. The 2003 ITRS predicts that by 2014 the microprocessor gate length will be 35 nm, the supply voltage will drop to 0.4 V, and the clock frequency will rise to almost 30 GHz. Fig. 1 presents some of these predictions. As a consequence, experts expect that in the next 10 years the transistor count of microprocessors will increase to 1 billion, providing about 100 000 MIPS [3].
The aim of this paper is to present, in more detail, the trends and challenges in semiconductor technology and in the microarchitecture and architecture of contemporary microprocessors.
The pace of IC technology over the past forty years has been well characterized by Moore's Law. It was noted in 1965 by Gordon Moore, research director of Fairchild Semiconductor, that the integration density of commercial integrated circuits had doubled approximately every year [4]. From the chronology in Table 1, we see that the first microchip was invented in 1959; its complexity was one transistor. In 1964 complexity had grown to 32 transistors, and in 1965 a chip in the Fairchild R&D lab reached 64 transistors. Moore predicted that chip complexity would double every year, based on the data for 1959, 1964, and 1965 [5].
In 1975 the prediction was revised to suggest a new, slower rate of growth: a doubling of the IC transistor count every two years. This trend of exponential growth of IC complexity is commonly referred to as Moore's Law I. (Some people state Moore's Law as predicting a doubling every 18 months.)
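These growth rates are easy to compare quantitatively. As a back-of-the-envelope illustration (the formula below is ours, not taken from [4] or [5]), if the transistor count doubles every T years, then

    N(t) = N_0 \cdot 2^{\,t/T},

so over one decade the original one-year doubling gives a factor of 2^{10} \approx 1000, an 18-month doubling gives 2^{10/1.5} \approx 100, and the revised two-year doubling gives 2^{5} = 32.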
As a result, since the beginning of commercial production of ICs in the early 1960s, circuit complexity has risen from a few transistors to hundreds of millions of transistors per chip.
Table 3. Semiconductor Industry Association Roadmap (SIAR) prediction summary for high-end processors
Specification / year          1997     1999     2001     2003     2006     2009     2012
Feature size (micron)         0.25     0.18     0.15     0.13     0.1      0.07     0.05
Supply voltage (V)            1.8-2.5  1.5-1.8  1.2-1.5  1.2-1.5  0.9-1.2  0.6-0.9  0.5-0.6
Transistors/chip (millions)   11       21       40       76       200      520      1400
DRAM bits/chip (mega)         167      1070     1700     4290     17200    68700    275000
Die size (mm^2)               300      340      385      430      520      620      750
Global clock freq. (MHz)      750      1200     1400     1600     2000     2500     3000
Local clock freq. (MHz)       750      1250     1500     2100     3500     6000     10000
Maximum power/chip (W)        70       90       110      130      160      170      175
These exponential growth rates have been the major driving force of the computer revolution during the past period.
One of the key drivers behind the industry's ability to double transistor counts every 18 to 24 months is the continuous reduction in linewidths (see Fig. 3). Shrinking linewidths not only enables more components to fit onto an IC (typically 2x per linewidth generation) but also lowers costs (typically by 30% per linewidth generation).
Shrinking linewidths have slowed the growth of die size to about 1.14x per year, versus 1.38x to 1.58x per year for transistor counts, and since the mid-nineties accelerating linewidth shrinks have halted and even reversed the growth of die size.
Of the resulting complexity increase per technology generation, one factor of 2 comes from shrinking linewidths, another factor of 2 comes from an increase in chip area, and a final factor of 2 comes from device and circuit cleverness.
In 1996 Intel augmented Moore's Law (the number of transistors on a processor doubles approximately every 18 months) with Moore's Law II.
Moore's Law II says that as the sophistication of chips increases, the cost of fabrication rises exponentially (Fig. 6).
For example, in 1986 Intel manufactured the 386, which counted 250 000 transistors, in fabs costing $200 million; by 1996, the fabs producing the Pentium processor, which counted 6 million transistors, cost over a billion dollars.
As CMOS technology scales, supply voltages are driven steadily towards voltage reduction, and this plays a very important role in power saving [13].
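The leverage that voltage reduction provides can be seen from the standard first-order expression for dynamic (switching) power in CMOS, quoted here as general background rather than from the cited references:

    P_{dyn} = \alpha \, C_L \, V_{DD}^{2} \, f,

where \alpha is the switching activity factor, C_L the switched load capacitance, V_{DD} the supply voltage, and f the clock frequency. Because the dependence on V_{DD} is quadratic, lowering the supply from 1.2 V to 0.6 V alone cuts switching power by a factor of four.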
Present-day general-purpose microprocessor designers are faced with the daunting task of reducing power dissipation, since power dissipation is quickly becoming a bottleneck for future technologies.
For all integrated circuits used in battery-powered portable devices, power consumption is the main issue. Furthermore, power consumption is also a major issue for high-performance integrated circuits because of heat dissipation. Consequently, power consumption is a dramatic problem for all integrated circuits designed today [10, 13].
Advances in CMOS technology, however, are driving the operating voltage of integrated circuits increasingly lower. The forecast of operating voltages in CMOS technology is shown in Fig. 9. From these general trends, it is clear that circuits will need to operate at 0.5 V and even below within the next ten years [15].
In order to evaluate how past technologies have met the performance goal, product frequencies have been plotted over time, as shown in Fig. 10 [3]. Assuming a technology generation spans two to three years, the data show that microprocessor frequency has doubled every generation. Several factors account for this. Consider the data plotted on the right-hand y-axis in Fig. 10. The average number of gate delays in a clock period is decreasing, both because the new microarchitectures use pipeline stages containing fewer levels of static gates and because advanced circuit techniques reduce the critical path delays even further. This is the main reason that the frequency doubles with every technology generation. One may suspect that this frequency increase comes at the expense of overdesigned or oversized transistors; Fig. 11 shows how transistor size actually scales across different process technologies.
According to the previous discussion, we can conclude that the twofold frequency improvement in each technology generation is primarily due to two factors [11] (a rough estimate of their combined effect is given below): a) the reduced number of gate delays in a clock period, which makes the pipeline stages shorter, and b) advanced circuit design techniques that reduce the average gate delay by more than 30% per generation.
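A rough estimate (our own illustrative numbers, assuming the roughly 30% gate-delay reduction quoted above and a comparable reduction in the number of gate delays per clock period) shows how the two factors combine to roughly double the frequency:

    \frac{f_{new}}{f_{old}} = \frac{n_{old}\, t_{old}}{n_{new}\, t_{new}} \approx \frac{1}{0.7 \times 0.7} \approx 2,

where n is the number of gate delays in a clock period and t is the average gate delay.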
Shrinking geometries, lower supply voltages, and higher frequencies have a negative impact on reliability. Together, they increase the number of occurrences of intermittent and transient faults [16, 17, 18].
Faults experienced by semiconductor devices fall into three main categories:
permanent, intermittent, and transient [19, 20].
Permanent faults reflect irreversible physical changes. The improvement of
semiconductor design and manufacturing techniques has significantly decreased
the rate of occurrence of permanent faults. Figure 12 shows the evolution of permanent-fault rates for CMOS microprocessors and static and dynamic memories over the past decade. The semiconductor industry is widely adopting copper interconnects. This trend has a positive impact on the rate of occurrence of permanent faults, as copper provides a higher electromigration threshold than aluminium does [16, 12, 17, 18].
Intermittent faults occur because of unstable or marginal hardware; they can be activated by environmental changes, such as higher or lower temperature or voltage. Intermittent faults often precede the occurrence of permanent faults.
Transient faults occur because of temporary environmental conditions. Sev-
eral phenomena induce transient faults: neutron and alpha particles; power supply
and interconnect noise; electromagnetic interference; and electrostatic discharge.
Higher VLSI integration and lower supply voltages have contributed to higher occurrence rates for particle-induced transients, also known as soft errors. Fig. 13 plots measured neutron- and alpha-induced soft error rates (SERs) for CMOS SRAMs as a function of memory capacity [16].
Fig. 12. The evolution of permanent-fault rates for CMOS microprocessors and SRAM and DRAM memories over the past decade
Fault avoidance and fault tolerance are the main approaches used to increase the reliability of VLSI circuits [16, 17, 20]. Fault avoidance relies on improved materials, manufacturing processes, and circuit design. For instance, lower-alpha-emission interconnect and packaging materials contribute to low SERs. Silicon-on-insulator is a commonly used process solution for lowering circuit sensitivity to particle-induced transients [16].
Fault tolerance is implementable at the circuit or system level. It relies on concurrent error detection (CED), error recovery, error-correcting codes (ECC), and space or time redundancy. Designers have successfully built both hardware and software implementations [17, 18].
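As a concrete, if simplified, illustration of space redundancy, the C sketch below implements a triple modular redundancy (TMR) majority voter in software. It is our own minimal example, not a circuit or algorithm taken from [17, 18]; real designs vote in hardware, but the masking principle is the same.

#include <stdio.h>
#include <inttypes.h>

/* Bitwise majority vote over three redundant copies of a value.
 * Each output bit follows the two copies that agree, so a transient
 * fault corrupting any single copy is masked. */
static uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

int main(void)
{
    uint32_t good   = 0xCAFEBABEu;
    uint32_t upset  = good ^ (1u << 7);   /* single-bit soft error in one copy */
    uint32_t result = tmr_vote(good, upset, good);

    printf("voted value: 0x%08" PRIX32 " (fault %s)\n",
           result, result == good ? "masked" : "NOT masked");
    return 0;
}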
Intermittent and transient faults are expected to represent the main source of errors experienced by VLSI circuits. In general, the semiconductor industry is approaching a new stage in the development and manufacturing of VLSI circuits. Failure avoidance, based on design and process technologies, will not fully control intermittent and transient faults. Fault-tolerant solutions, presently employed in custom-designed systems, will become widely used in off-the-shelf ICs tomorrow, i.e. in mainstream commercial applications [21]. Designers will have to embed these solutions into VLSI circuits, especially microprocessors, in order to provide better fault and error handling and to avoid silent data corruption [16, 22].
As an example of transient errors, we will consider the influence of changes in the supply voltage, referred to as power supply noise. Power supply noise adversely affects circuit operation through the following mechanisms: a) signal uncertainty; b) on-chip clock jitter; c) noise margin degradation; and d) degradation of gate oxide reliability. For correct circuit operation, the supply levels have to be maintained within a certain range near the nominal voltage levels. This range is called the power noise margin. The primary objective in the design of the power distribution system is to supply sufficient current to each transistor on an integrated circuit while ensuring that the power noise does not exceed the target noise margins. As an illustration, the evolution of the average current of the high-performance Intel family of microprocessors is given in Fig. 14 [3, 4].
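To first order (a standard model, consistent with the treatment in power-distribution texts such as [4], though the notation here is ours), the supply noise seen by an on-chip gate is the sum of a resistive and an inductive drop across the distribution network:

    \Delta V \approx I R + L \frac{di}{dt},

where I is the current drawn, R the effective resistance of the power grid and package, and L the loop inductance. The design task is to keep \Delta V inside the power noise margin even as both the average current (Fig. 14) and the transient current slope di/dt (Fig. 15) continue to grow.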
Fig. 14. Evolution of the average current in high-performance Intel family microprocessors
Fig. 15. Evolution of transient current over time
Fig. 16. The National Technology Roadmap for Semiconductors: (a) total transistors per chip, (b) on-chip local clock
As can be seen from Fig. 17.a, the compiled program uses the instruction set architecture (ISA) to tell the microprocessor what it (the program) needs to have done, and the microprocessor uses the ISA to know what must be carried out on behalf of the program. The ISA is implemented by a set of hardware structures collectively referred to as the microprocessor's microarchitecture. If we take our levels of transformation and include the algorithm and the language in the microprocessor, the microprocessor then becomes the thing that uses device technology to solve the problem (see Fig. 17.b) [25].
In order to execute a program in parallel, three tasks must be carried out: a) the dependences between operations must be determined; b) the operations that are independent of any operation that has not yet completed must be determined; and c) these independent operations must be scheduled for execution at a particular time on a particular functional unit. Fig. 19 shows the breakdown of these three tasks between the compiler and the runtime hardware for the three classes of architecture.
Current superscalars can execute four or more instructions per cycle. In prac-
tice, however, they achieve only one or two, because current applications have low
ILP.
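The effect of low ILP is visible even in ordinary code. In the hypothetical C fragments below (our own illustration, not taken from the cited sources), the first loop forms a single serial dependence chain, so even a four-issue machine retires roughly one useful addition per cycle, while the second exposes four independent chains that wide issue hardware can overlap (at the cost of a slightly different floating-point rounding order).

/* Low ILP: every addition depends on the previous one. */
double sum_serial(const double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];              /* dependence chain: s -> s -> s -> ... */
    return s;
}

/* Higher ILP: four independent accumulators can execute in parallel. */
double sum_unrolled(const double *x, int n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    int i;
    for (i = 0; i + 3 < n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}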
Fig. 19. Division of responsibilities between the compiler and the hard-
ware for the three classes of architecture
A similar effect could be achieved by pipelining the functional units and the instruction issue hardware n times, in this way speeding up the clock rate by a factor of n while issuing only one instruction per cycle. This strategy is termed superpipelining [27, 26].
Superscalar and VLIW machines represent two different approaches to the
same ultimate goal, which is achieving high performance via instruction-level par-
allel processing. The two approaches have evolved through different historical
paths and from different perspectives. It has been suggested that these two ap-
proaches are quite synergistic and there is strong motivation for pursuing poten-
tial integration of the two approaches [26]. Let us now briefly point out the main features of VLIW and superscalar processors. VLIW processors rely on the
compiler to schedule instructions for parallel execution by placing multiple opera-
tions in a single long instruction word. All of the operations in a VLIW instruction
are executed in the same cycle, allowing the compiler to control which instruction
to execute in any given cycle. VLIW processors can be relatively simple, allow-
ing them to be implemented at high clock speeds, but they are generally unable to
maintain compatibility between generations because any change to the processor
implementation requires programs to be recompiled if they are to execute correctly
[28]. Superscalar processors, on the other hand, contain hardware that examines a sequential program to locate instructions that can be executed in parallel. This allows them to maintain compatibility between generations and to achieve speedups on programs that were compiled for sequential processors, but the hardware can examine only a limited window of instructions when selecting those to execute in parallel, which can limit performance [28].
VLIW architectures have the following properties: a) there is one central con-
trol logic unit issuing a single long instruction per cycle; b) each long instruction
consists of many tightly coupled independent operations; c) each operation requires
a small statically predictable number of cycles to execute; d) operations can be
pipelined.
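To make properties a) and b) more concrete, the C sketch below models a long instruction word as a fixed set of operation slots that the compiler fills with independent operations, padding unused slots with NOPs. The opcodes, the four-slot format, and the register numbers are purely illustrative and do not correspond to any real VLIW ISA.

#include <stdio.h>
#include <stdint.h>

/* One operation slot inside the long instruction word. */
typedef enum { OP_NOP, OP_ADD, OP_MUL, OP_LOAD } opcode_t;

typedef struct {
    opcode_t op;
    uint8_t  dst, src1, src2;            /* register numbers */
} operation_t;

/* A hypothetical 4-slot VLIW word: all four operations issue in the
 * same cycle, and the compiler guarantees they are independent. */
typedef struct {
    operation_t slot[4];
} vliw_word_t;

int main(void)
{
    /* Compiler-scheduled bundle for:  r1 = r2 + r3;  r4 = load r5;  r6 = r7 * r8; */
    vliw_word_t bundle = { .slot = {
        { OP_ADD,  1, 2, 3 },
        { OP_LOAD, 4, 5, 0 },
        { OP_MUL,  6, 7, 8 },
        { OP_NOP,  0, 0, 0 },            /* no fourth independent operation found */
    } };

    for (int i = 0; i < 4; i++)
        printf("slot %d: opcode %d -> r%d\n", i, bundle.slot[i].op, bundle.slot[i].dst);
    return 0;
}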
Figure 20 illustrates how instruction processing is conceptually carried out in a modern, high-performance processor that exploits ILP [26, 29]. Instructions are fetched, decoded, and renamed in program order. In any given cycle, we may actually fetch or decode multiple instructions; many current processors fetch up to four instructions simultaneously. Branch prediction is used to predict the path through the code, so that fetching can run ahead of execution. After decode, an instruction is allowed to execute once its input data become available, provided that sufficient execution resources are available. Once an instruction executes, it is allowed to complete. In the last step, instructions commit in program order.
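The following toy C simulation (entirely our own construction, with invented instructions, latencies, and a two-wide issue limit) traces this flow for a five-instruction program: instructions are considered in program order, each one starts executing as soon as its inputs are ready, and results are committed strictly in program order. Running it shows, for example, the second load beginning execution before the earlier, dependent add.

#include <stdio.h>

#define N 5
#define ISSUE_WIDTH 2

typedef struct {
    const char *text;
    int src[2];        /* indices of producing instructions, -1 if none */
    int latency;       /* execution latency in cycles (made-up numbers)  */
    int done_cycle;    /* cycle in which the result is available, 0 = not issued */
} instr_t;

int main(void)
{
    instr_t prog[N] = {
        { "I0: load r1 <- [a]",    {-1, -1}, 3, 0 },
        { "I1: add  r2 <- r1+4",   { 0, -1}, 1, 0 },
        { "I2: load r3 <- [b]",    {-1, -1}, 3, 0 },
        { "I3: mul  r4 <- r3*r3",  { 2, -1}, 2, 0 },
        { "I4: add  r5 <- r2+r4",  { 1,  3}, 1, 0 },
    };

    int committed = 0;
    for (int cycle = 1; committed < N && cycle < 100; cycle++) {
        /* Issue: up to ISSUE_WIDTH not-yet-issued instructions whose sources
         * are available (results are forwarded in their completion cycle). */
        int issued = 0;
        for (int i = 0; i < N && issued < ISSUE_WIDTH; i++) {
            if (prog[i].done_cycle != 0)
                continue;
            int ready = 1;
            for (int s = 0; s < 2; s++) {
                int p = prog[i].src[s];
                if (p >= 0 && (prog[p].done_cycle == 0 || prog[p].done_cycle > cycle))
                    ready = 0;
            }
            if (ready) {
                prog[i].done_cycle = cycle + prog[i].latency;
                printf("cycle %2d: issue  %s\n", cycle, prog[i].text);
                issued++;
            }
        }
        /* Commit: retire finished instructions in program order only. */
        while (committed < N &&
               prog[committed].done_cycle != 0 &&
               prog[committed].done_cycle <= cycle) {
            printf("cycle %2d: commit %s\n", cycle, prog[committed].text);
            committed++;
        }
    }
    return 0;
}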
A promising approach is to organize the processor as a collection of processing elements (PEs), each executing a separate thread or flow of control. By designing the processor as a collection of PEs, (a) the number of global wires is reduced, and (b) very little communication occurs through global wires. Thus, much of the communication occurring in the multi-PE processor is local in nature and occurs through short wires. The commonly used
model for control flow among threads is the parallel threads model. The fork in-
struction specifies the creation of new threads and their starting addresses, while
the join instruction serves as a synchronizing point and collects the threads. The
thread sequencing model is illustrated in Fig. 21 [30].
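In software, the same model corresponds to the familiar fork/join idiom. The C sketch below uses POSIX threads purely as an analogy for the hardware fork and join instructions described above (the thread bodies and the thread count are our own invention).

#include <pthread.h>
#include <stdio.h>

/* Each "thread of control" runs an independent piece of work. */
static void *worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld: doing its share of the work\n", id);
    return NULL;
}

int main(void)
{
    pthread_t t[4];

    /* "fork": create new threads and give them their starting points. */
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);

    /* "join": the synchronizing point that collects the threads. */
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);

    printf("all threads joined\n");
    return 0;
}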
Fig. 22. How three different architectures partition issue slots: a) super-
scalar; b) multithreaded superscalar; and c) SMT
The rows of squares represent issue slots. The processor either finds an in-
struction to execute (filled box) or it allows the slots to remain unused (empty box)
[30].
C) Chip multiprocessor (CMP) - the idea is to put several microprocessors on a single die (see, for example, Fig. 23). The performance of a small-scale CMP scales close to linearly with the number of microprocessors and is likely to exceed the performance of an equivalent multiprocessor system. A CMP is an attractive option when moving to a new process technology, since the new process allows us to shrink and duplicate our best existing microprocessor on the same silicon die, thus doubling the performance at the same power [30, 29].
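The qualifier "close to linearly" matters: the achievable speedup is bounded by the serial fraction of the workload, as expressed by Amdahl's law (quoted here as standard background, not from [30, 29]):

    S(N) = \frac{1}{(1-p) + p/N},

where p is the fraction of the work that can be parallelized and N the number of on-die processors; doubling the processor count doubles performance only when p is close to 1.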
Twenty-five years ago, information processing was associated with large mainframe computers. At the end of the last century, this shifted towards information processing based on personal computers. These trends continue towards the miniaturization of products. Nowadays, more and more information processing devices are portable computers integrated into larger products. These new types of information technology applications are called ubiquitous computing, pervasive computing, and ambient computing. Embedded systems are one of the origins of these three areas, and they provide a major part of the necessary technology [34].
Such systems often incorporate analog components and can, in the future, also include optical and microelectromechanical system components [37].
Short time to market, large gate counts, and high performance characterize today's VLSI design environment. SoC technology holds the key to the previously mentioned complex applications by enabling high-performance embedded processing solutions at a low single-chip cost.
To quickly create SoC designs with the required complexity, designers must use predesigned intellectual property (IP) blocks, also referred to as macros, cores, or virtual components. For SoC designs, this means reusing previously designed cores wherever possible. The more design reuse we have, the shorter the SoC's time to market [38].
From the system architect's point of view, quick SoC assembly using cores is not an easy job, for the following reasons: CPU selection, deciding which functions will be performed in hardware versus software, integrating cores into the SoC, achieving correct timing, the physical design of large systems, testing and system verification, and others [39].
More and more modern information systems require an analog input-output interface. A system's inputs typically come from analog sensors. To allow easy processing, such systems convert these signals as quickly as possible into digital format. The system subsequently reconverts these signals back into analog outputs through actuators such as lamps, motors, speakers, and displays [40]. Examples of such systems include everyday products like TVs, phones, PCs, and PDAs. Such products also include the equally pervasive but invisible engine control units that manage, for example, an internal combustion engine's functions [41].
A complete system includes not only electronic functions but also sensors and actuators. Microelectromechanical system (MEMS) technology makes it possible to build these systems in silicon, thus allowing new levels of integration among the system's electronic, mechanical, optical, and/or fluidic elements [41, 40].
According to ITRS predictions [3], by the end of the decade SoCs using 50 nm transistors and operating below 1 V will grow to 4 billion transistors running at 10 GHz. The major design problem accompanying these chips will be the challenge of providing correct function and reliable operation of the interacting components. Fig. 25 shows the possible trade-offs involving area, time T, and power in processor design. The power and area axes are typically optimized for server processors [8].
Fig. 24. Performance of microprocessors, mainframes, supercomputers and minicomputers over time
Fig. 25. Area, performance (time), and power trade-off trends in server and client processor designs
7 Conclusion
As technology scales, important new opportunities emerge for VLSI ICs design-
ers. Understanding technology trends and specific applications is the main criterion
for designing efficient and effective chips. There are several difficult and exciting
challenges facing the design of complex ICs. To continue its phenomenal historical
growth and continue to follow Moore's Law, the semiconductor industry will require advances on all fronts - from front-end process technology and lithography to the design of innovative high-performance processor architectures and SoC solutions. The roadmap's goal is to bring together experts from each of these fields to determine what those challenges are and, potentially, how to solve them.
The presented discussion shows that there are a lot of challenging problems left in systems research. If we look for progress, we need to think hard, and for a long time, about where to direct our efforts.
References
[3] The 2003 International Technology Roadmap for Semiconductors. [Online]. Available: http://public.irts.net
[4] A. Mezhiba and E. Friedman, Power Distribution Networks in High-speed Integrated
Circuits. Boston: Kluwer Academic Publishers, 2004.
[5] M. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design. New York: Wiley-Interscience, 2002.
[6] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, 3rd ed. Amsterdam: Morgan Kaufmann Pub., 2003.
[7] D. Adams, High Performance Memory Testing: Design Principles, Fault Modeling and Self-Test. Boston: Kluwer Academic Publishers, 2003.
[8] M. Flynn et al., “Deep-submicron microprocessor design issues,” IEEE Micro, vol. 19, no. 4, pp. 11–22, 1999.
[9] V. Zaccaria et al., Power Estimation and Optimization Methodologies for VLIW Based Embedded Systems. Boston: Kluwer Academic Publishers, 2003.
[10] Varadarajan et al., “Low power design issues,” in The Computer Engineering Handbook, V. Oklobdzija, Ed. Boca Raton: CRC Press, 2002.
[11] S. Borkar, “Design challenges of technology scaling,” IEEE Micro, vol. 19, no. 4,
July - August 1999.
[12] S. Kang and Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design. Boston: McGraw-Hill, 2003, vol. 3/1.
[13] K. Seno, “Implementation level impact on low power design,” in The Computer Engineering Handbook, V. Oklobdzija, Ed. Boca Raton: CRC Press, 2002.
[14] N. Nicolici and B. M. Al-Hashimi, Power-Constrained Testing of VLSI Circuits. Boston: Kluwer Academic Publishers, 2003.
[15] Semiconductor industry association. (03) The National Technology Roadmap for
Semiconductors. [Online]. Available: http://www.sematech.org/
[16] C. Constantinescu, “Trends and challenges in VLSI circuit reliability,” IEEE Micro, vol. 19, no. 4, pp. 14–19, July–August 1999.
[17] P. Lala, Self-Checking and Fault-Tolerant Digital System Design. San Francisco: Morgan Kaufmann Pub., 2001.
[18] M. Stojcev et al., “Implementation of self-checking two-level combinational logic on FPGA and CPLD circuits,” Microelectronics Reliability, vol. 44, no. 1, pp. 173–178, January 2004.
[19] M. Esonu et al., “Fault tolerant design methodology for systolic array architectures,” IEE Proc. Comput. Digit. Tech., vol. 141, no. 1, pp. 17–28, 1994.
[20] B. Johnson, Design and Analysis of Fault Tolerant Systems. Reading, MA: Addison-Wesley, 1990.
[21] K. Mohanram et al., “Synthesis of low-cost parity-based partially self-checking circuits,” Journal of Electronic Testing: Theory and Applications (JETTA), vol. 16, no. 1/2, pp. 145–153, 2001.
[22] K. De et al., “RSYN: A system for automated synthesis of reliable multilevel circuits,” IEEE Trans. VLSI Systems, vol. 2, no. 2, pp. 186–195, 1994.
[23] S. Eggers et al., “Simultaneous multithreading: A platform for next-generation processors,” IEEE Micro, vol. 17, no. 5, pp. 12–19, 1997.
[24] J.-P. Soininen and H. Heusala, “A design methodology for NoC-based systems,” in Networks on Chip, A. Jantsch and H. Tenhunen, Eds. Boston: Kluwer Academic Publishers, 2003.
[25] Y. Patt, “Requirements, bottlenecks, and good fortune: Agents for microprocessor
evolution,” Proc. of the IEEE, vol. 89, no. 11, pp. 1553–1559, 2001.
[26] J. Shen and M. Lipasti, Modern Processor Design: Fundamentals of Superscalar
Processors. New York: McGraw Hill Book Comp., 2003.
[27] R. Rau and J. Fisher, “Instruction-level parallel processing: History, overview and
perspective,” The Journal of Supercomputing, vol. 7, no. 1, pp. 1–56, 1993.
[28] N. Carter, Computer Architecture. New York: McGraw Hill Book Company, 2002.
[29] A. Moshovos and G. S. Sohi, “Microarchitectural innovations: Boosting microprocessor performance beyond semiconductor technology scaling,” Proc. of the IEEE, vol. 89, no. 11, pp. 1560–1575, 2001.
[30] D. Burger and J. Goodman, “Billion-transistor architectures: There and back again,”
IEEE Computer, vol. 37, no. 3, pp. 22–28, 2004.
[31] T. Claasen, “System on a chip: Changing IC design today and in the future,” IEEE Micro, vol. 23, no. 3, pp. 20–26, 2003.
[32] R. Gupta and Y. Zorian, “Introducing core-based system design,” IEEE Design and Test of Computers, vol. 14, no. 4, pp. 15–25, 1997.
[33] P. Rashinkar et al., System-on-a-Chip: Methodology and Techniques. Boston:
Kluwer Academic Pub., 2001.
[34] P. Marwedel, Embedded System Design. Boston: Kluwer Academic Pub., 2003.
[35] H. Al-Asaad et al., “Online BIST for embedded systems,” IEEE Design & Test of Computers, vol. 15, no. 4, pp. 17–24, 1998.
[36] R. Leupers, Code Optimization Techniques for Embedded Processors: Methods, Al-
gorithms, and Tools. Boston: Kluwer Academic Pub., 2000.
[37] H. Chang et al., Surviving the SoC Revolution: A Guide to Platform-Based Design.
Boston: Kluwer Academic Pub., 1999.
[38] M. Birnbaum and H. Sachs, “How VSIA answers the SoC dilemma,” IEEE Computer, vol. 32, no. 6, pp. 42–50, 1999.
[39] R. Bergamaschi et al., “Automating the design of SoCs using cores,” IEEE Design & Test of Computers, vol. 18, no. 5, pp. 32–45, 2001.
[40] R. Waser, Nanoelectronics and Information Technology: Advanced Electronic Materials and Novel Devices. Weinheim: John Wiley and Sons, 2003.
[41] B. Murari, “Integrating nanoelectronic components into electronic microsystems,”
IEEE Micro, vol. 23, no. 3, pp. 36–44, 2003.
[42] A. Jantsch, “NoCs: A new contract between hardware and software,” in Proc. of the Euromicro Symposium on Digital System Design (DSD 2003), Belek-Antalya, Turkey, 2003, pp. 10–16.
[43] M. Pflanz and H. Vierhaus, “Online check and recovery techniques for dependable embedded processors,” IEEE Micro, vol. 21, no. 5, pp. 24–40, 2001.
[44] L. Benini and G. De Micheli, “Networks on chips: A new SoC paradigm,” IEEE Computer, vol. 35, no. 1, pp. 70–78, 2002.
[45] T. Austin, “Mobile supercomputers,” IEEE Computer, vol. 37, no. 5, pp. 81–83,
2004.